
Why does the mean tend to be more stable in different samples than the median?




Section 1.7.2 of Discovering Statistics Using R by Andy Field et al., while listing the virtues of the mean versus the median, states:




... the mean tends to be stable in different samples.




This comes after explaining the median's many virtues, e.g.:




... The median is relatively unaffected by extreme scores at either end of the distribution ...




Given that the median is relatively unaffected by extreme scores, I would have expected it to be more stable across samples, so the author's assertion puzzled me. To check it, I ran a simulation: I generated one million random numbers, drew 1000 samples of 100 numbers each, computed the mean and median of each sample, and then computed the standard deviation of those 1000 sample means and sample medians.



nums = rnorm(n = 10^6, mean = 0, sd = 1)
hist(nums)
length(nums)
means <- vector(mode = "numeric")
medians <- vector(mode = "numeric")
# Braces are needed so that all three statements run inside the loop
for (i in 1:10^3) {
  b <- sample(x = nums, 10^2)
  medians[i] <- median(b)
  means[i] <- mean(b)
}
sd(means)
>> [1] 0.0984519
sd(medians)
>> [1] 0.1266079
p1 <- hist(means, col = rgb(0, 0, 1, 1/4))
p2 <- hist(medians, col = rgb(1, 0, 0, 1/4), add = TRUE)


As you can see, the sample means are more tightly distributed than the sample medians.



[Figure: overlaid histograms of the 1000 sample means (blue) and sample medians (red)]



In the attached image the red histogram is for the medians; it is shorter and has fatter tails, which also supports the author's assertion.



I'm flabbergasted by this, though! How can the median, which is supposedly more stable, ultimately vary more across samples? It seems paradoxical! Any insights would be appreciated.





















  • Yeah, but try it by sampling from nums <- rt(n = 10**6, 1.1). That $t_{1.1}$ distribution will produce a bunch of extreme values, not necessarily balanced between positive and negative (an extreme value is just as likely to be followed by another of the same sign as by one that balances it), and these cause a gigantic variance in $\bar{x}$. This is what the median shields against. The normal distribution is unlikely to produce values extreme enough to stretch the distribution of $\bar{x}$ wider than that of the median.
    – Dave
    9 hours ago







  • The author's statement is not generally true. (We have received many questions here related to errors in this author's books, so this is not a surprise.) The standard counterexamples are found among the "stable distributions", where the mean is anything but "stable" (in any reasonable sense of the term) and the median is far more stable.
    – whuber
    8 hours ago






  • "... the mean tends to be stable in different samples." is a nonsense statement: "stability" is not well defined. The (sample) mean is indeed quite stable within a single sample, because there it is a nonrandom quantity. If the data are unstable (highly variable?), the mean is unstable too.
    – AdamO
    7 hours ago























Tags: mean, median







asked 9 hours ago by Alok Lal, a new contributor




3 Answers



















Comment: Just to echo back your simulation, using a distribution for which the SDs of the sample means and medians show the opposite result:



Specifically, nums now come from a Laplace distribution (also called the 'double exponential'), which can be simulated as the difference of two exponential distributions with the same rate (here the default rate 1). [Perhaps see Wikipedia on the Laplace distribution.]



set.seed(2019)
nums = rexp(10^6) - rexp(10^6)
means <- vector(mode = "numeric")
medians <- vector(mode = "numeric")
# As in the question's code, the loop body needs braces
for (i in 1:10^3) {
  b <- sample(x = nums, 10^2)
  medians[i] <- median(b)
  means[i] <- mean(b)
}
sd(means)
[1] 0.1442126
sd(medians)
[1] 0.1095946   # <-- smaller

hist(nums, prob = TRUE, br = 70, ylim = c(0, .5), col = "skyblue2")
curve(.5 * exp(-abs(x)), add = TRUE, col = "red")


[Figure: histogram of nums with the Laplace density 0.5*exp(-|x|) overlaid in red]



Note: Another easy possibility, explicitly mentioned in @whuber's link, is the Cauchy distribution, which can be simulated as Student's t distribution with one degree of freedom, rt(10^6, 1). However, its tails are so heavy that making a nice histogram of the data is problematic.
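To see just how extreme the Cauchy case is, here is a quick sketch of the same experiment in Python/NumPy (my own illustration, not part of the original answer). The key fact: the mean of n i.i.d. standard Cauchy draws is itself standard Cauchy, so the sample means never settle down, while the sample medians do.

```python
import numpy as np

rng = np.random.default_rng(2019)

# 1000 samples of size 100 from a standard Cauchy (= Student's t with 1 df)
samples = rng.standard_cauchy(size=(1000, 100))

means = samples.mean(axis=1)
medians = np.median(samples, axis=1)

# The medians cluster tightly around 0; the means are wildly dispersed,
# since each sample mean is again a standard Cauchy draw.
print("sd of means:  ", means.std())
print("sd of medians:", medians.std())
```

For n = 100 the asymptotic standard deviation of the Cauchy sample median is $1/(2 f(0) \sqrt{n}) = \pi/20 \approx 0.16$, while the empirical SD of the means is orders of magnitude larger.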






answered 7 hours ago by BruceET (edited 7 hours ago)





















As @whuber and others have said, the statement is not true in general. If you are willing to be more intuitive (I can't keep up with the deep math geeks around here), you might look at other ways in which the mean and median are, or are not, stable. For these examples, assume an odd number of points so I can keep my descriptions consistent and simple.

1. Imagine you have a spread of points on a number line. Now take all of the points above the middle and move them up to 10x their values. The median is unchanged, but the mean moves significantly. So the median seems more stable.

2. Now imagine the points are fairly spread out, and move the center point up and down. A one-unit move changes the median by one but barely moves the mean. Now the median seems less stable: more sensitive to small movements of a single point.

3. Now imagine taking the highest point and moving it smoothly down toward the lowest point. The mean moves smoothly the whole way. The median does not: it stays put until your moving point drops below it, then it tracks the point exactly, then it parks at the next point down and stops moving as you continue downward. The median's response is piecewise, with abrupt kinks where its rate of change jumps, which is far less smooth than the mean's response. Unstable, in that sense.

So different transformations of your points make either the mean or the median look less smooth or less stable in some sense. The math heavy-hitters here have shown you distributions to sample from, which matches your experiment more closely, but hopefully this intuition helps as well.
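Point 3 is easy to see numerically. A tiny sketch (my own addition, in Python for brevity; the thread's R would work just as well): in the five-point set {1, 2, 3, 4, top}, slide top downward and watch the mean move continuously while the median sticks at 3, then tracks the moving point, then parks at 2.

```python
import numpy as np

# Five points; only the last one moves. 'top' slides from 10 down to 0.
for top in (10.0, 3.5, 2.5, 0.0):
    d = np.array([1.0, 2.0, 3.0, 4.0, top])
    print(f"top={top:4.1f}  mean={d.mean():.2f}  median={np.median(d):.2f}")
```

The mean changes by exactly one fifth of the movement at every step, while the median is flat for top > 3, equals top for 2 < top < 3, and is flat again at 2 below that.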







Suppose you have $n$ data points from some underlying continuous distribution with mean $\mu$ and variance $\sigma^2 < \infty$. Let $f$ be the density function of this distribution and let $m$ be its median. To simplify this result further, let $\tilde{f}$ be the corresponding standardised density function, given by $\tilde{f}(z) = \sigma \cdot f(\mu + \sigma z)$ for all $z \in \mathbb{R}$. The variance of the sample mean and the asymptotic variance of the sample median are given respectively by:

$$\mathbb{V}(\bar{X}_n) = \frac{\sigma^2}{n}
\quad \quad \quad \quad \quad
\mathbb{V}(\tilde{X}_n) \rightarrow \frac{\sigma^2}{n} \cdot \frac{1}{4} \cdot \tilde{f} \Big( \frac{m-\mu}{\sigma} \Big)^{-2}.$$

We therefore have:

$$\frac{\mathbb{V}(\bar{X}_n)}{\mathbb{V}(\tilde{X}_n)} \rightarrow 4 \cdot \tilde{f} \Big( \frac{m-\mu}{\sigma} \Big)^2.$$

As you can see, the relative size of the variances of the sample mean and sample median is determined (asymptotically) by the standardised density value at the true median. Thus, for large $n$ we have the asymptotic correspondence:

$$\mathbb{V}(\bar{X}_n) < \mathbb{V}(\tilde{X}_n)
\quad \quad \iff \quad \quad
f_* \equiv \tilde{f} \Big( \frac{m-\mu}{\sigma} \Big) < \frac{1}{2}.$$

That is, for large $n$, and speaking asymptotically, the variance of the sample mean will be lower than the variance of the sample median if and only if the standardised density at the standardised median value is less than one-half. The data in your simulation were generated from a normal distribution, for which $f_* = 1/\sqrt{2\pi} \approx 0.3989 < 1/2$. Thus, it is unsurprising that you found a higher variance for the sample median in that example.
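As a worked check of this criterion (my own addition, not part of the original answer), apply it to the Laplace distribution used in another answer's simulation. The standard Laplace density is $f(x) = \frac{1}{2} e^{-|x|}$, with $\mu = m = 0$ and $\sigma^2 = 2$, so

$$f_* = \tilde{f}(0) = \sigma \cdot f(0) = \sqrt{2} \cdot \frac{1}{2} = \frac{1}{\sqrt{2}} \approx 0.7071 > \frac{1}{2},$$

and the criterion predicts a smaller variance for the sample median, with an asymptotic variance ratio of $4 f_*^2 = 2$ against the mean. This matches the direction of the Laplace simulation in this thread, just as $f_* = 1/\sqrt{2\pi} \approx 0.3989 < 1/2$ matches the normal simulation in the question.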































                2












                $begingroup$

                As @whuber and others have said, the statement is not true in general. And if you’re willing to be more intuitive — I can’t keep up with the deep math geeks around here — you might look at other ways mean and median are stable or not. For these examples, assume an odd number of points so I can keep my descriptions consistent and simple.



                1. Imagine you have a spread of points on a number line. Now take all of the points above the middle and move them up to 10× their values. The median is unchanged, but the mean moves significantly. So the median seems more stable.


                2. Now imagine these points are fairly spread out, and move the center point up and down. A one-unit move changes the median by one unit but barely moves the mean. The median now seems less stable and more sensitive to small movements of a single point.


                3. Now imagine taking the highest point and moving it smoothly down toward the lowest point. The mean moves smoothly the whole way. The median does not: it stays put until your moving point drops below the previous median, then it follows the point down until the point passes the next point below, and then it stops at that point and stays there as you keep moving your point down. This is very digital-like, and the median is clearly not as smooth as the mean: its rate of change jumps discontinuously. Unstable.


                So different transformations of your points cause either mean or median to look less smooth or stable in some sense. The math heavy-hitters here have shown you distributions from which you can sample, which more closely matches your experiment, but hopefully this intuition helps as well.
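                A tiny R sketch of point 3, with five hypothetical points and the top one replaced by a point moved smoothly downward:

```r
# Five fixed points; replace the top one with a point p moved from +10 to -10
x <- c(-2, -1, 0, 1, 2)
path <- seq(10, -10, by = -0.5)
stats <- sapply(path, function(p) {
  pts <- c(x[1:4], p)
  c(mean = mean(pts), median = median(pts))
})
# The mean falls at a constant rate the whole way; the median is flat,
# then tracks the moving point between 0 and -1, then is flat again.
range(diff(stats["mean", ]))    # constant -0.1 steps
range(diff(stats["median", ]))  # mixture of 0 and -0.5 steps
```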















                $endgroup$


















                    answered 6 hours ago









                    Wayne

                    16.7k 2 gold badges 40 silver badges 80 bronze badges





















                        1












                        $begingroup$

                        Suppose you have $n$ data points from some underlying continuous distribution with mean $\mu$ and variance $\sigma^2 < \infty$. Let $f$ be the density function for this distribution and let $m$ be its median. To simplify this result further, let $\tilde{f}$ be the corresponding standardised density function, given by $\tilde{f}(z) = \sigma \cdot f(\mu + \sigma z)$ for all $z \in \mathbb{R}$. The asymptotic variances of the sample mean and sample median are given respectively by:



                        $$\mathbb{V}(\bar{X}_n) = \frac{\sigma^2}{n}
                        \quad \quad \quad \quad \quad
                        \mathbb{V}(\tilde{X}_n) \rightarrow \frac{\sigma^2}{n} \cdot \frac{1}{4} \cdot \tilde{f} \Big( \frac{m-\mu}{\sigma} \Big)^{-2}.$$



                        We therefore have:



                        $$\frac{\mathbb{V}(\bar{X}_n)}{\mathbb{V}(\tilde{X}_n)} \rightarrow 4 \cdot \tilde{f} \Big( \frac{m-\mu}{\sigma} \Big)^{2}.$$



                        As you can see, the relative size of the variances of the sample mean and sample median is determined (asymptotically) by the standardised density value at the true median. Thus, for large $n$ we have the asymptotic correspondence:



                        $$\mathbb{V}(\bar{X}_n) < \mathbb{V}(\tilde{X}_n)
                        \quad \quad \iff \quad \quad
                        f_* \equiv \tilde{f} \Big( \frac{m-\mu}{\sigma} \Big) < \frac{1}{2}.$$



                        That is, for large $n$, and speaking asymptotically, the variance of the sample mean will be lower than the variance of the sample median if and only if the standardised density at the standardised median value is less than one-half. The data in your simulation example were generated from a normal distribution, for which $f_* = 1 / \sqrt{2 \pi} = 0.3989423 < 1/2$. Thus, it is unsurprising that you found a higher variance for the sample median in that example.
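                        A quick simulation check of this result for the normal case (a sketch; the sample size and replication count are arbitrary): since $f_* = 1/\sqrt{2\pi} < 1/2$, the ratio of variances should approach $4 f_*^2 = 2/\pi \approx 0.637$.

```r
# Simulate many normal samples and compare the variance of the sample
# mean with the variance of the sample median.
set.seed(1)
n <- 10^3       # sample size
reps <- 10^4    # number of simulated samples
means   <- replicate(reps, mean(rnorm(n)))
medians <- replicate(reps, median(rnorm(n)))
var(means) / var(medians)   # near 4 * dnorm(0)^2 = 2/pi ~ 0.6366
```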















                        $endgroup$


















                            answered 6 hours ago









                            Ben

                            33.6k 2 gold badges 40 silver badges 146 bronze badges




















                                Alok Lal is a new contributor. Be nice, and check out our Code of Conduct.








