
Do Bayesian credible intervals treat the estimated parameter as a random variable?


Related:
  • From a Bayesian probability perspective, why doesn't a 95% confidence interval contain the true parameter with 95% probability?
  • How is data generated in the Bayesian framework and what is the nature of the parameter that generates the data?
  • Mathematical proof that the posterior probability that a CI contains the true parameter is in $\{0, 1\}$
  • Is It Ever Appropriate to Treat a Bayesian Credible Interval as a Frequentist Confidence Interval?
  • Is bias a frequentist concept or a Bayesian concept?
  • How to use simulation to check the correctness of my Bayesian model?
  • Interpretation of confidence interval in Bayesian terms




























I read the following paragraph on Wikipedia recently:




Bayesian intervals treat their bounds as fixed and the estimated parameter as a random variable, whereas frequentist confidence intervals treat their bounds as random variables and the parameter as a fixed value.




However, I am not sure whether this is true. My interpretation of the credible interval was that it encapsulated our own uncertainty about the true value of the estimated parameter but that the estimated parameter itself did have some kind of 'true' value.



This is slightly different to saying that the estimated parameter is a 'random variable'. Am I wrong?










Tags: bayesian, empirical-bayes






asked 8 hours ago by Johnny Breen














  • I would not defend every word choice, but the Wikipedia quote is essentially correct. Bayesian inference begins with a prior probability distribution on the parameter, taken to be a random variable.
    – BruceET, 6 hours ago










  • The sentence is confusing. In a Bayesian perspective, the parameter $\theta$ is treated as random, while the estimator of the parameter $\hat\theta(x)$ is not. What is the estimated parameter?
    – Xi'an, 5 hours ago










  • I agree that it is confusing. To take an example, consider the simple beta-binomial model. My question is: how do we interpret the posterior beta distribution of the parameter $p$? Are we saying that it reflects the fact that $p$ itself is literally a random variable, or does it reflect our own uncertainty about what $p$ could be?
    – Johnny Breen, 5 hours ago
















2 Answers

Consider the situation in which you have $n = 20$ observations of a binary (two-outcome) process. Often the two possible outcomes on each trial are called Success and Failure.



Frequentist confidence interval. Suppose you observe $x = 15$ successes in the $n = 20$ trials. View the number $X$ of Successes as a random variable $X \sim \mathsf{Binom}(n = 20,\, p),$ where the success probability $p$ is an unknown constant. The Wald 95% frequentist confidence interval
is based on $\hat p = 15/20 = 0.75,$ an estimate of $p.$
Using a normal approximation, this CI is of the form $\hat p \pm 1.96\sqrt{\hat p(1-\hat p)/n},$ or
$(0.560, 0.940).$ [The somewhat improved Agresti-Coull
style of 95% CI is $(0.526, 0.890).$]
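
These interval endpoints are quick to verify in R. A minimal sketch, assuming the common 'add two successes and two failures' form of the Agresti-Coull interval:

n <- 20;  x <- 15
p.hat <- x / n                                                        # 0.75
p.hat + c(-1, 1) * 1.96 * sqrt(p.hat * (1 - p.hat) / n)               # Wald: 0.560 0.940
p.tilde <- (x + 2) / (n + 4)                                          # shrink the estimate toward 1/2
p.tilde + c(-1, 1) * 1.96 * sqrt(p.tilde * (1 - p.tilde) / (n + 4))   # about 0.526 0.890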



A common interpretation is that the procedure that
produces such an interval will produce lower and upper confidence limits that include the true value of $p$ in 95% of instances over the long run. [The advantage of the Agresti-Coull interval is that the long run proportion of such inclusions is nearer to 95% than for the Wald interval.]
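
The long-run claim itself can be checked by simulation; here is a minimal sketch, assuming a hypothetical true value $p = 0.75$ for the data-generating process:

set.seed(2019)                                   # for reproducibility
B <- 100000;  n <- 20;  p.true <- 0.75           # hypothetical true p
x.sim <- rbinom(B, n, p.true)                    # B simulated 20-trial experiments
p.hat <- x.sim / n
half  <- 1.96 * sqrt(p.hat * (1 - p.hat) / n)    # Wald half-width for each experiment
mean(abs(p.hat - p.true) <= half)                # proportion of Wald CIs covering p.true

For the Wald interval this proportion tends to fall somewhat short of 95% at small $n$, which is exactly the defect the Agresti-Coull interval mitigates.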



Bayesian credible interval. The Bayesian approach
begins by treating $p$ as a random variable. Prior to seeing data, if we have no prior experience with the kind of binomial experiment being conducted and no personal
opinion as to the distribution of $p,$ we may choose
the 'flat' or 'noninformative' uniform distribution,
saying $p \sim \mathsf{Unif}(0, 1) \equiv \mathsf{Beta}(1, 1).$



Then, given 15 successes in 20 binomial trials, we find the posterior distribution of $p$ as
proportional to the product of the prior density and the binomial likelihood function:



$$f(p \mid x) \propto p^{1-1}(1-p)^{1-1} \times p^{15}(1-p)^{5} \propto p^{16-1}(1-p)^{6-1},$$

where the symbol $\propto$ (read 'proportional to')
indicates that we are omitting 'norming' constant
factors of the distributions, which do not contain $p.$
Without the norming factor, a density function or PMF
is called the 'kernel' of the distribution.



Here we recognize that the kernel of the posterior distribution is that of the distribution $\mathsf{Beta}(16, 6).$ Then a 95% Bayesian posterior interval
or credible interval is found by cutting 2.5% from each tail of the posterior distribution. Here is the result from R:
$(0.528, 0.887).$ [For information about beta distributions, see Wikipedia.]



qbeta(c(.025, .975), 16, 6)   # 2.5% and 97.5% quantiles of the Beta(16, 6) posterior
[1] 0.5283402 0.8871906


If we believe the prior to be reasonable, and that
the 20-trial binomial experiment was fairly conducted,
then logically we must expect the Bayesian
interval estimate to give useful information about
the experiment at hand, with no reference to a hypothetical long-run future.
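
This directness is easy to see in practice: with the $\mathsf{Beta}(16, 6)$ posterior in hand, probability statements about $p$ itself can be computed exactly or by simulation. A small sketch:

p.post <- rbeta(100000, 16, 6)      # simulated draws from the posterior of p
mean(p.post > 0.5)                  # posterior probability that p exceeds 1/2
1 - pbeta(0.5, 16, 6)               # same quantity, computed exactly
quantile(p.post, c(.025, .975))     # matches the qbeta interval up to simulation error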



Notice that this Bayesian credible interval
is numerically similar to the Agresti-Coull confidence interval. However, as you point out,
the interpretations of the two types of interval estimates (frequentist and Bayesian) are not the same.



Informative prior. Before we saw the data, if we had reason to believe
that $p \approx 2/3,$ then we might have chosen the
distribution $\mathsf{Beta}(8, 4)$ as the prior distribution. [This distribution has mean 2/3, standard deviation about 0.13, and puts about 95% of its
probability in the interval $(0.39, 0.89).$]



qbeta(c(.025, .975), 8, 4)    # central 95% of the Beta(8, 4) prior
[1] 0.3902574 0.8907366
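
The stated prior mean and standard deviation follow from the usual beta-distribution formulas, $E(p) = a/(a+b)$ and $SD(p) = \sqrt{ab/[(a+b)^2(a+b+1)]},$ checked here:

a <- 8;  b <- 4
a / (a + b)                                 # prior mean: 0.667
sqrt(a * b / ((a + b)^2 * (a + b + 1)))     # prior SD: about 0.13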


In that case, multiplying the prior by the likelihood gives the posterior kernel of $\mathsf{Beta}(23, 7),$
so that the 95% Bayesian credible interval is
$(0.603, 0.897).$ The posterior distribution is a melding of the information in the prior and the likelihood, which are in rough agreement, so the resulting Bayesian interval
estimate is shorter than the interval from
the flat prior.



qbeta(c(.025, .975), 23, 7)   # 95% credible interval from the Beta(23, 7) posterior
[1] 0.6027531 0.8970164


Notes: (1) The beta prior and binomial likelihood function
are 'conjugate', that is, mathematically compatible in a way that allows us to find the posterior distribution in closed form. Sometimes there is no
prior distribution that is conjugate with the likelihood. Then it may be necessary to use numerical integration to find the posterior distribution.
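
For instance, a simple grid approximation handles a non-conjugate prior. In this sketch the truncated-normal prior is purely hypothetical, chosen only to illustrate the mechanics:

p.grid <- seq(0.0005, 0.9995, by = 0.001)        # fine grid on (0, 1)
prior  <- dnorm(p.grid, mean = 0.6, sd = 0.2)    # hypothetical non-conjugate prior (unnormalized)
like   <- dbinom(15, 20, p.grid)                 # binomial likelihood for x = 15, n = 20
post   <- prior * like / sum(prior * like)       # normalized posterior over the grid
cdf    <- cumsum(post)
c(p.grid[which(cdf >= 0.025)[1]],                # approximate 95% credible interval
  p.grid[which(cdf >= 0.975)[1]])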



(2) A Bayesian credible interval from a noninformative prior essentially depends on the likelihood function. Much of frequentist inference also depends on the likelihood function. Thus it is not
a surprise that a Bayesian credible interval from a flat prior may be numerically similar to a frequentist confidence interval based on the same likelihood.






– answered 5 hours ago by BruceET (edited 4 hours ago)

Your interpretation is correct. In my opinion that particular passage in the Wikipedia article obfuscates a simple concept with opaque technical language. The article's opening passage is much clearer: a credible interval "is an interval within which an unobserved parameter value falls with a particular subjective probability".

The technical term "random variable" is misleading, especially from a Bayesian point of view. It's still used just out of tradition; take a look at Shafer's intriguing historical study When to call a variable random about its origins. From a Bayesian point of view, "random" simply means "unknown" or "uncertain" (for whatever reason), and "variable" is a misnomer for "quantity" or "value". For example, when we try to assess our uncertainty about the speed of light $c$ from a measurement or experiment, we speak of $c$ as a "random variable"; but it's obviously not "random" (and what does "random" mean?), nor is it "variable": it's just a physical constant whose exact value we're uncertain about.

See § 16.4 (and other places) in Jaynes's book for an illuminating discussion of this topic.

In frequentist theory the term "random variable" may have a different meaning, though; I'm not an expert in that theory, so I won't try to define it here. There is literature showing that frequentist confidence intervals and Bayesian intervals can be quite different; see for example Confidence intervals vs Bayesian intervals or https://www.ncbi.nlm.nih.gov/pubmed/6830080.






– answered 4 hours ago by pglpm (edited 3 hours ago)














  • (+1) Jaynes has a lot to say that is important, but I think that the linked paper Confidence Intervals vs Bayesian Intervals is largely a polemic, and it may have been more relevant in the past when Bayesian methods were less accepted.
    – Michael Lew, 2 hours ago













