Differences between prior distribution and prior predictive distribution?
While studying Bayesian statistics, I am having trouble understanding the difference between the prior distribution and the prior predictive distribution. The prior distribution itself is more or less fine to understand, but I find the purpose of the prior predictive distribution vague, and I don't see how it differs from the prior distribution.
machine-learning bayesian inference data-mining hierarchical-bayesian
asked 7 hours ago by Changhee Kang (new contributor)
2 Answers
Let $Y$ be a random variable representing the (maybe future) data. We have a (parametric) model for $Y$ with $Y \sim f(y \mid \theta), \quad \theta \in \Theta$, where $\Theta$ is the parameter space. Then we have a prior distribution represented by $\pi(\theta)$. Given an observation of $Y$, the posterior distribution of $\theta$ is
$$
f(\theta \mid y) = \frac{f(y \mid \theta)\, \pi(\theta)}{\int_\Theta f(y \mid \theta)\, \pi(\theta)\; d\theta}.
$$
The prior predictive distribution of $Y$ is then the (modeled) distribution of $Y$ marginalized over the prior, that is, integrated over $\pi(\theta)$:
$$
f(y) = \int_\Theta f(y \mid \theta)\, \pi(\theta)\; d\theta,
$$
that is, the denominator in Bayes' theorem above. This is also called the preposterior distribution of $Y$. It tells you what data (that is, which values of $Y$) you expect to see before learning more about $\theta$. This has many uses, for instance in the design of experiments; for examples, see Experimental Design on Testing Proportions or Intersections of chemistry and statistics.
Another use is as a way to understand the prior distribution better. Say you are interested in modeling the variation in the weight of elephants, and your prior distribution leads to a prior predictive with substantial probability above 20 tons. Then you might want to rethink: even the largest elephants seldom weigh more than 6 tons, so substantial probability above 20 tons seems wrong. One interesting paper in this direction is Gelman (which does not use this terminology ...).
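As a concrete illustration of such a prior predictive check, here is a minimal Monte Carlo sketch in Python; the particular prior and sampling distribution (and all the numbers in them) are hypothetical choices made up for this example, not something taken from the elephant story above.

```python
import numpy as np

# Hypothetical model for an elephant's weight Y (in tonnes):
#   theta ~ Normal(4, 2)        (prior on the mean weight)
#   Y | theta ~ Normal(theta, 1)
rng = np.random.default_rng(0)
n_draws = 100_000

theta = rng.normal(loc=4.0, scale=2.0, size=n_draws)  # draws from the prior pi(theta)
y = rng.normal(loc=theta, scale=1.0)                   # one Y per theta: draws from f(y)

# The y's are samples from the prior predictive f(y) = integral of f(y|theta) pi(theta) dtheta.
print("P(Y > 20 tonnes) ~", (y > 20).mean())
print("P(Y >  6 tonnes) ~", (y > 6).mean())
# If these probabilities look implausible for real elephants, the prior
# (not only the likelihood) needs rethinking.
```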
Finally, preposterior concepts are typically not useful with uninformative priors; they require that the prior modeling be taken seriously. One example is the following: let $Y \sim \mathcal{N}(\theta, 1)$ with the flat prior $\pi(\theta) = 1$. Then the prior predictive of $Y$ is
$$
f(y) = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-\frac12 (y - \theta)^2}\; d\theta = 1,
$$
so it is itself uniform (and improper), and hence not very useful.
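For what it's worth, here is a quick numerical check of that last integral (a sketch using scipy's quadrature; nothing here is specific to the answer beyond the flat-prior normal model):

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

# For any fixed y, integrate the N(y | theta, 1) density over theta:
# the result is 1 regardless of y, so the "prior predictive" is flat in y.
for y in (-3.0, 0.0, 7.5):
    val, _ = quad(lambda theta: stats.norm.pdf(y, loc=theta, scale=1.0),
                  -np.inf, np.inf)
    print(f"y = {y:5.1f}:  integral = {val:.6f}")
```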
answered 5 hours ago by kjetil b halvorsen, edited 4 hours ago by Christoph Hanck
Predictive here means predictive for observations: the prior distribution is a distribution for the parameters, whereas the prior predictive distribution is a distribution for the observations.
If $X$ denotes an observation and we use the model (or likelihood) $p(x \mid \theta)$, then a prior distribution is a distribution for $\theta$, for example $p_\beta(\theta)$, where $\beta$ is a set of hyperparameters. Note that there is no conditioning on $\beta$, so the hyperparameters are considered fixed; this is not the case in hierarchical models, but that is not the point here.
The prior predictive distribution is the distribution of $X$ "averaged" over $\theta$:
$$
p_\beta(x) = \int p(x \mid \theta)\, p_\beta(\theta)\, d\theta.
$$
This distribution is prior as it does not rely on any observations.
We can also define in the same way the posterior predictive distribution: if we have a sample $X = (X_1, \dots, X_n)$, the posterior predictive distribution is
\begin{align*}
p_\beta(x \mid X) &= \int p(x \mid X, \theta)\, p_\beta(\theta \mid X)\, d\theta \\
&= \int p(x \mid \theta)\, p_\beta(\theta \mid X)\, d\theta,
\end{align*}
using that a new observation $x$ is independent of $X$ given $\theta$. Thus the posterior predictive distribution is constructed in the same way as the prior predictive distribution, but whereas the prior predictive weights the model with $p_\beta(\theta)$, the posterior predictive weights it with $p_\beta(\theta \mid X)$, that is, with our "updated" knowledge about $\theta$.
Example: Beta-Binomial
Suppose our model is $X \mid \theta \sim \mathrm{Bin}(n_1, \theta)$, i.e. $P(X = x \mid \theta) = \binom{n_1}{x} \theta^x (1-\theta)^{n_1 - x}$.
We assume a beta prior distribution for $\theta$, $\beta(a, b)$, where $(a, b)$ is the set of hyperparameters.
Then the prior predictive distribution of $X$ is the beta-binomial distribution with parameters $(n_1, a, b)$. This discrete distribution gives the probability of $k$ successes out of $n_1$ trials given hyperparameters $(a, b)$ on the probability of success.
Now suppose we observe $n_1$ draws $(x_1, \dots, x_{n_1})$ with $x$ successes.
Since the binomial and beta distributions are conjugate, we have
\begin{align*}
p(\theta \mid X = x) &\propto \theta^x (1-\theta)^{n_1 - x} \times \theta^{a-1} (1-\theta)^{b-1} \\
&\propto \theta^{a + x - 1} (1-\theta)^{n_1 + b - x - 1} \\
&\propto \beta(a + x,\, n_1 + b - x).
\end{align*}
Thus $\theta \mid x$ also follows a beta distribution, and the predictive distribution of a new observation, $p(\tilde{x} \mid x, a, b)$, is again beta-binomial, but this time with parameters $(a + x,\, b + n_1 - x)$ rather than $(a, b)$.
So, with a $\beta(a, b)$ prior distribution and a $\mathrm{Bin}(n_1, \theta)$ likelihood, if we observe $x$ successes out of $n_1$ trials, the posterior predictive distribution is a beta-binomial with parameters $(n_2, a + x, b + n_1 - x)$. Note that $n_2$ and $n_1$ play different roles, since the posterior predictive answers the question:
Given my current knowledge of $\theta$ after observing $x$ successes out of $n_1$ trials, i.e. $\theta \mid x \sim \beta(a + x,\, b + n_1 - x)$, what is the probability of observing $k$ successes out of $n_2$ additional trials?
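For concreteness, here is a small sketch of this beta-binomial setup in Python using scipy.stats.betabinom; the particular values of $a$, $b$, $n_1$, $n_2$ and $x$ are made up for illustration.

```python
from scipy import stats

a, b = 2, 2        # hyperparameters of the Beta(a, b) prior on theta (illustrative)
n1, n2 = 10, 10    # observed trials and future trials (illustrative)
x = 7              # observed successes out of n1 (illustrative)

# Prior predictive for the n1 observed trials: BetaBinomial(n1, a, b)
prior_pred = stats.betabinom(n1, a, b)
# Posterior predictive for n2 additional trials: BetaBinomial(n2, a + x, b + n1 - x)
post_pred = stats.betabinom(n2, a + x, b + n1 - x)

print("prior predictive     P(7 successes in n1 trials):", prior_pred.pmf(7))
print("posterior predictive P(7 successes in n2 trials):", post_pred.pmf(7))
# The posterior predictive puts more mass near 7 because the observed 7/10
# successes have shifted the weighting over theta from the prior to the posterior.
```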
I hope this is useful and clear.
answered 5 hours ago by winperikle
Yeap, I believe I have understood what you have explained here. Thank you very much.
– Changhee Kang, 3 hours ago