


Are there 99 percentiles, or 100 percentiles? And are they groups of numbers, or dividers or pointers to individual numbers?











5














$begingroup$


Are there 99 percentiles, or 100 percentiles? And are they groups of numbers, or divider lines, or pointers to individual numbers?



I suppose the same question would apply for quartiles or any quantile.



I have read that, given n items, the index i of the number at a particular percentile p is i = (p / 100) * n.

That suggests to me that there are 100 percentiles: if you have 100 numbers (i = 1 to i = 100), then each would have its own index (1 to 100).

If you had 200 numbers, there would still be 100 percentiles, but what would each one be? Each could refer to a group of two numbers. Or they could be 100 dividers, excluding either the far-left or the far-right divider, because otherwise you would get 101 dividers. Or they could be pointers to individual numbers, so the first percentile would refer to the second number, (1/100)*200 = 2, and the hundredth percentile would refer to the 200th number, (100/100)*200 = 200.
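To make the "pointer" reading concrete, here is a minimal sketch of the index rule i = (p / 100) * n for 200 numbers. The function name and the choice to round fractional positions up are assumptions made for illustration; the rule as quoted does not say what to do when i is not a whole number.

```python
def index_at_percentile(p, n):
    """1-based position from i = (p / 100) * n.

    Rounding up to a whole position is an assumption of this sketch;
    integer arithmetic is used to avoid floating-point surprises.
    """
    return -(-p * n // 100)  # ceiling of p * n / 100


values = list(range(1, 201))  # 200 sorted numbers, purely illustrative
n = len(values)

first = index_at_percentile(1, n)    # (1/100) * 200 = 2
last = index_at_percentile(100, n)   # (100/100) * 200 = 200
print(first, values[first - 1])
print(last, values[last - 1])

# p = 1..100 each map to a distinct valid position; p = 0 maps to
# position 0, which does not exist in 1-based indexing.
positions = [index_at_percentile(p, n) for p in range(0, 101)]
print(positions[0], positions[1], positions[100])
```

Under this rule the percentiles numbered 1 to 100 all point at real items, while a "0th" percentile would point at nothing, which is one way to see where the count of 100 comes from.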



I have sometimes heard that there are 99 percentiles, though.

Google shows the Oxford dictionary, which gives two definitions of percentile: "each of the 100 equal groups into which a population can be divided according to the distribution of values of a particular variable." and "each of the 99 intermediate values of a random variable which divide a frequency distribution into 100 such groups."

Wikipedia says "the 20th percentile is the value below which 20% of the observations may be found". But does it actually mean "the value below or equal to which 20% of the observations may be found", i.e. "the value for which 20% of the values are <= it"? If it were just < and not <=, then the 100th percentile would be the value below which 100% of the values may be found. I have heard that used as an argument that there can be no 100th percentile, because you can't have a number in the data with 100% of the numbers below it. But I think that argument may be incorrect, because it rests on an error: the definition of a percentile involves <=, not < (or >=, not >). So the hundredth percentile would be the final number, and it would be >= 100% of the numbers.
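The difference between the two readings is easy to check numerically. This is only a sketch: `percent_below` is a hypothetical helper written for this post, not a function from any library.

```python
def percent_below(data, x, strict=True):
    """Percentage of observations v with v < x (strict) or v <= x (not strict)."""
    if strict:
        count = sum(1 for v in data if v < x)
    else:
        count = sum(1 for v in data if v <= x)
    return 100 * count / len(data)


data = list(range(1, 101))  # 100 distinct values, for illustration only
top = max(data)

# With "<", even the largest value has only 99% of the data below it.
print(percent_below(data, top, strict=True))

# With "<=", the largest value covers 100% of the data.
print(percent_below(data, top, strict=False))
```

So whether a 100th percentile can exist in a finite dataset depends entirely on which comparison the definition uses.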
















$endgroup$











  • 1




    $begingroup$
    I think it unlikely 100 would be a reasonable answer due to its asymmetric treatment of the extremes. Cases can be made for either 99 (as in the definition you quote) or 101.
    $endgroup$
    – whuber
    8 hours ago






  • 2




    $begingroup$
    Historically quantiles — as we now say generically — were first summary points, and then by extension the bins, classes or intervals they delimit. So three quartiles, including the median, define four bins, and so forth.
    $endgroup$
    – Nick Cox
    7 hours ago










  • $begingroup$
    @NickCox do you have a source for that?
    $endgroup$
    – barlop
    5 hours ago










  • $begingroup$
    @whuber You write "I think it unlikely 100 would be a reasonable answer due to its asymmetric treatment of the extremes." <-- can you elaborate on that?
    $endgroup$
    – barlop
    5 hours ago











  • $begingroup$
    @whuber You write "Cases can be made for either 99 (as in the definition you quote) or 101" <-- but percent means per 100, so how can you have 101? And if there are 101, would you number them 1st, 2nd, ..., 101st, or 0th, 1st, ..., 100th? 0th seems problematic because the th/st suffix is for counting, and counting starts from 1. Even in computer science, where you index from 0, counting still works as 0 = no items and 1 = the first item!
    $endgroup$
    – barlop
    5 hours ago
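The counts discussed in these comments (three quartiles for four bins, and 99 versus 101 for percentiles) can be reproduced with Python's standard library. This is a sketch only: `statistics.quantiles` returns cut points using its default "exclusive" method, which is one convention among several, and adding the minimum and maximum as a 0th and 100th point is an assumption made here to illustrate the 101 count.

```python
import statistics

data = list(range(1, 201))  # 200 illustrative values

# Cut points that split the data into n equal-probability groups:
# n groups always need n - 1 interior cut points.
quartile_cuts = statistics.quantiles(data, n=4)
percentile_cuts = statistics.quantiles(data, n=100)
print(len(quartile_cuts))    # 3 cut points define 4 bins
print(len(percentile_cuts))  # 99 cut points define 100 bins

# Treating the minimum and maximum as the 0th and 100th points
# (an assumption of this sketch) gives 101 points, numbered 0..100.
all_points = [min(data)] + percentile_cuts + [max(data)]
print(len(all_points))
```

On this view, 99 counts the interior dividers, 100 counts the bins, and 101 counts the dividers plus both endpoints.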


















quantiles






asked 8 hours ago









barlop

2 Answers


















3
















$begingroup$

One nice way to treat this is to start with simple math and work backwards to the more complicated case of real data. Let's start with PDFs, CDFs, and inverse CDFs (also known as quantile functions). The $x$th quantile of a distribution with pdf $f$ and cdf $F$ is $F^{-1}(x)$. Suppose the $z$th percentile is $F^{-1}(z/100)$. For a uniform(0,1) distribution, the 100th and 0th percentiles are ill-defined since $F^{-1}$ is only unique when $F$ is not constant. For a normal distribution, they do not exist (or they "are" $\pm\infty$).
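As a rough illustration of percentiles as values of an inverse CDF, the sketch below evaluates $F^{-1}(z/100)$ for a uniform(0,1) and a standard normal distribution. The helper names are hypothetical. For the uniform case it assumes $F$ is restricted to $[0,1]$, where $F^{-1}(p) = p$; whether the endpoints should count is exactly what is in dispute. For the normal case it relies on `statistics.NormalDist.inv_cdf`, which rejects probabilities of 0 and 1.

```python
from statistics import NormalDist, StatisticsError


def uniform_percentile(z):
    """z-th percentile of uniform(0,1), assuming F is restricted to [0, 1]."""
    if not 0 <= z <= 100:
        raise ValueError("z must be between 0 and 100")
    return z / 100


def normal_percentile(z, dist=NormalDist()):
    """z-th percentile of a normal distribution, or None where none is finite."""
    try:
        return dist.inv_cdf(z / 100)
    except StatisticsError:
        # inv_cdf only accepts probabilities strictly between 0 and 1
        return None


for z in (0, 20, 50, 100):
    print(z, uniform_percentile(z), normal_percentile(z))
```

The normal distribution has no finite 0th or 100th percentile, while the bounded uniform distribution returns its endpoints under the stated restriction.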



For continuous distributions, non-extreme quantiles exist and are unique. For a discrete distribution such as the Poisson distribution, most percentiles don't exist because for most $z/100$, there is no $y$ with $F(y) = z/100$.
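A small check of the Poisson claim, under assumptions: the rate 2.0 is arbitrary, the helper names are made up for this sketch, and the "smallest $y$ with $F(y) \ge p$" rule is the usual generalized inverse, which is a different convention from the exact-equality definition used above.

```python
import math


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), by direct summation."""
    return sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k + 1))


def poisson_quantile(p, lam):
    """Smallest integer y with F(y) >= p (generalized inverse), for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    y = 0
    while poisson_cdf(y, lam) < p:
        y += 1
    return y


lam = 2.0  # arbitrary illustrative rate

# Percentiles z for which some y satisfies F(y) = z/100 (up to float tolerance).
exact_hits = [
    z for z in range(1, 100)
    if any(math.isclose(poisson_cdf(y, lam), z / 100, abs_tol=1e-12) for y in range(50))
]
print(exact_hits)

# The generalized inverse still returns an answer for every p in (0, 1).
print(poisson_quantile(0.5, lam))
```

With exact equality, no percentile exists for this rate; with the generalized inverse, every percentile from the 1st to the 99th is defined, at the cost of many percentiles sharing the same value.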



When it comes to real data, all distributions are discrete. (The empirical CDF of runif(100) or np.random.random(100) has 100 jumps, each of size 1/100, one at each sampled value.) We still want to have a useful concept of quantiles and percentiles. So, we can define a quantile as any consistent estimator of the corresponding theoretical quantile. For example, the median (the 50th percentile or 0.5 quantile) of the sample 3, 4, 5, 6, 7, 8 can be any number between 5 and 6. If you draw 2n samples from a unif(3,8) distribution, sort them, and take any number between the nth and (n+1)th sorted values, you will converge on 5.5 as n increases.
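The convergence claim can be tried out with a short simulation. This is a sketch under assumptions: the seed and sample sizes are arbitrary, `middle_pair` is a hypothetical helper, and Python's `random` and `statistics` modules stand in for runif or NumPy.

```python
import random
import statistics

sample = [3, 4, 5, 6, 7, 8]
# Any value between the two middle items is a valid median of this sample.
print(statistics.median_low(sample), statistics.median_high(sample))
# The conventional choice is the midpoint of that interval.
print(statistics.median(sample))


def middle_pair(n, rng):
    """Draw 2n values from unif(3, 8) and return the nth and (n+1)th sorted values."""
    draws = sorted(rng.uniform(3, 8) for _ in range(2 * n))
    return draws[n - 1], draws[n]


rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (10, 1000, 100000):
    lo, hi = middle_pair(n, rng)
    # The interval [lo, hi] shrinks and settles near 5.5 as n grows.
    print(n, round(lo, 4), round(hi, 4))
```

Because both ends of the middle interval approach 5.5, any rule that picks a point inside it gives a consistent estimator of the theoretical median.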



(In practice, I would always compute the median of 3, 4, 5, 6, 7, 8 as 5.5 because that particular median is also a trimmed mean, which means it has other good properties as an estimator.)


















$endgroup$










  • 1




    $begingroup$
    Your first paragraph has some incorrect information: $F^{-1}$ is indeed unique in many cases, including for the uniform distribution on $[0,1]$ (when $F$ is restricted to $[0,1]$ itself). This has little to do with $F$ being "constant." I think you are making misleading arguments that mix up the roles of continuity, invertibility, and boundedness of support of distributions. Introducing estimators and referring to them also as "quantiles" is interesting but threatens to make things even more confusing.
    $endgroup$
    – whuber
    7 hours ago



















1
















$begingroup$

I was taught that an observation in the nth percentile is greater than n% of the observations in the dataset under consideration, which to me implies that there is no 0th or 100th percentile. No observation can be greater than 100% of observations because it forms part of that 100% (and a similar logic applies in the case of 0).



But I unfortunately have no source for this that I can point you to.


















$endgroup$










  • 2




    $begingroup$
    Do you have an authoritative reference for what you remember being taught? Note that you are implicitly adopting a definition of "percentile" as being a group of numbers. The other definition quoted in the question is that the percentile is a boundary between such groups.
    $endgroup$
    – whuber
    8 hours ago











  • $begingroup$
    @whuber Unfortunately not. And yes, I see the distinction.
    $endgroup$
    – mkt
    8 hours ago











  • $begingroup$
    That doesn't make sense to me. Suppose your data is 2,2,2,2,2,2,2,2,2,2,2. Then an item in one quantile is equal to an item to its left in a prior quantile, so an item in the nth quantile is not greater than everything in the quantiles to its left. So an item in the nth percentile is not greater than n% of the observations in the dataset. It is >= n% of the observations, but not simply >. And hence you can have a 100th percentile. What do you make of that logic?
    $endgroup$
    – barlop
    5 hours ago











  • $begingroup$
    Many definitions come under strain if all values are identical!
    $endgroup$
    – Nick Cox
    3 hours ago












Your Answer








StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "65"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/4.0/"u003ecc by-sa 4.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);



);







barlop is a new contributor. Be nice, and check out our Code of Conduct.









draft saved

draft discarded
















StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstats.stackexchange.com%2fquestions%2f430391%2fare-there-99-percentiles-or-100-percentiles-and-are-they-groups-of-numbers-or%23new-answer', 'question_page');

);

Post as a guest















Required, but never shown


























2 Answers
2






active

oldest

votes








2 Answers
2






active

oldest

votes









active

oldest

votes






active

oldest

votes









3
















$begingroup$

One nice way to treat this is to start with simple math and work backwards to the more complicated case of real data. Let's start with PDF's, CDF's, and inverse CDF's (also known as quantile functions). The $x$th quantile of a distribution with pdf $f$ and cdf $F$ is $F^-1(x)$. Suppose the $z$th percentile is $F^-1(z/100)$. For a uniform 0,1 distribution, the 100th and 0th percentiles are ill-defined since $F^-1$ is only unique when $F$ is not constant. For a normal distribution, they do not exist (or they "are" $pm infty$).



For continuous distributions, non-extreme quantiles exist and are unique. For a discrete distribution such as the Poisson distribution, most percentiles don't exist because for most $z/100$, there is no $y$ with $F(y) = z/100$.



When it comes to real data, all distributions are discrete. (The empirical CDF of runif(100) or np.random.random(100) has 100 increments clustered around 0.5.) We still want to have a useful concept of quantiles and percentiles. So, we can define a quantile as any consistent estimator of the corresponding theoretical quantile. For example, the median (the 50th percentile or 0.5 quantile) of the sample 3,4, 5, 6, 7, 8 can be any number between 5 and 6. If you draw 2n samples from a unif(3,8) distribution and take any number between the nth and (n+1)th sample, you will converge on 5.5 as n increases.



(In practice, I would always compute the median of 3, 4, 5, 6, 7, 8 as 5.5 because that particular median is also a trimmed mean, which means it has other good properties as an estimator.)






share|cite|improve this answer












$endgroup$










  • 1




    $begingroup$
    Your first paragraph has some incorrect information: $F^-1$ is indeed unique in many cases, including for the uniform distribution on $[0,1]$ (when $F$ is restricted to $[0,1]$ itself). This has little to do with $F$ being "constant." I think you are making misleading arguments that mix up the roles of continuity, invertibility, and boundedness of support of distributions. Introducing estimators and referring to them also as "quantiles" is interesting but threatens to make things even more confusing.
    $endgroup$
    – whuber
    7 hours ago
















3
















$begingroup$

One nice way to treat this is to start with simple math and work backwards to the more complicated case of real data. Let's start with PDF's, CDF's, and inverse CDF's (also known as quantile functions). The $x$th quantile of a distribution with pdf $f$ and cdf $F$ is $F^-1(x)$. Suppose the $z$th percentile is $F^-1(z/100)$. For a uniform 0,1 distribution, the 100th and 0th percentiles are ill-defined since $F^-1$ is only unique when $F$ is not constant. For a normal distribution, they do not exist (or they "are" $pm infty$).



For continuous distributions, non-extreme quantiles exist and are unique. For a discrete distribution such as the Poisson distribution, most percentiles don't exist because for most $z/100$, there is no $y$ with $F(y) = z/100$.



When it comes to real data, all distributions are discrete. (The empirical CDF of runif(100) or np.random.random(100) has 100 increments clustered around 0.5.) We still want to have a useful concept of quantiles and percentiles. So, we can define a quantile as any consistent estimator of the corresponding theoretical quantile. For example, the median (the 50th percentile or 0.5 quantile) of the sample 3,4, 5, 6, 7, 8 can be any number between 5 and 6. If you draw 2n samples from a unif(3,8) distribution and take any number between the nth and (n+1)th sample, you will converge on 5.5 as n increases.



(In practice, I would always compute the median of 3, 4, 5, 6, 7, 8 as 5.5 because that particular median is also a trimmed mean, which means it has other good properties as an estimator.)






share|cite|improve this answer












$endgroup$










  • 1




    $begingroup$
    Your first paragraph has some incorrect information: $F^-1$ is indeed unique in many cases, including for the uniform distribution on $[0,1]$ (when $F$ is restricted to $[0,1]$ itself). This has little to do with $F$ being "constant." I think you are making misleading arguments that mix up the roles of continuity, invertibility, and boundedness of support of distributions. Introducing estimators and referring to them also as "quantiles" is interesting but threatens to make things even more confusing.
    $endgroup$
    – whuber
    7 hours ago














3














3










3







$begingroup$

One nice way to treat this is to start with simple math and work backwards to the more complicated case of real data. Let's start with PDF's, CDF's, and inverse CDF's (also known as quantile functions). The $x$th quantile of a distribution with pdf $f$ and cdf $F$ is $F^-1(x)$. Suppose the $z$th percentile is $F^-1(z/100)$. For a uniform 0,1 distribution, the 100th and 0th percentiles are ill-defined since $F^-1$ is only unique when $F$ is not constant. For a normal distribution, they do not exist (or they "are" $pm infty$).



For continuous distributions, non-extreme quantiles exist and are unique. For a discrete distribution such as the Poisson distribution, most percentiles don't exist because for most $z/100$, there is no $y$ with $F(y) = z/100$.
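As a rough illustration of the point about discrete distributions, the Python sketch below tabulates the CDF of a Poisson distribution and checks which values $z/100$ it hits exactly. The rate of 3, the tolerance, and the helper names `poisson_cdf` and `generalized_quantile` are assumptions made for this example. The second helper uses the common "smallest $y$ with $F(y) \ge p$" convention, which is not the strict inverse discussed above; it is included only to show one way a percentile can still be assigned.

```python
import math


def poisson_cdf(y, lam):
    """P(Y <= y) for Y ~ Poisson(lam), by direct summation of the pmf."""
    return sum(math.exp(-lam) * lam**k / math.factorial(k) for k in range(y + 1))


def generalized_quantile(p, lam):
    """Smallest integer y with F(y) >= p, for 0 < p < 1 (assumed convention)."""
    y = 0
    while poisson_cdf(y, lam) < p:
        y += 1
    return y


lam = 3.0  # hypothetical rate, chosen only for illustration

# Percentiles z for which some y satisfies F(y) = z/100 (up to rounding error).
exact_hits = [
    z
    for z in range(1, 100)
    if any(abs(poisson_cdf(y, lam) - z / 100) < 1e-12 for y in range(60))
]
print("percentiles hit exactly:", exact_hits)

# The generalized inverse still returns a value for every z from 1 to 99.
print("median under the >= convention:", generalized_quantile(0.5, lam))
```

With this rate the CDF jumps from about 0.42 at $y = 2$ to about 0.65 at $y = 3$, so no $y$ gives exactly 0.5, while the "smallest $y$" convention returns 3.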



When it comes to real data, all distributions are discrete. (The empirical CDF of runif(100) or np.random.random(100) is a step function with 100 jumps, each of size 1/100, one at each sampled value.) We still want to have a useful concept of quantiles and percentiles. So, we can define a quantile as any consistent estimator of the corresponding theoretical quantile. For example, the median (the 50th percentile or 0.5 quantile) of the sample 3, 4, 5, 6, 7, 8 can be any number between 5 and 6. If you draw 2n samples from a unif(3,8) distribution, sort them, and take any number between the nth and (n+1)th smallest values, you will converge on 5.5 as n increases.



(In practice, I would always compute the median of 3, 4, 5, 6, 7, 8 as 5.5 because that particular median is also a trimmed mean, which means it has other good properties as an estimator.)
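The two claims above can be checked with a short Python sketch using only the standard library. The seed, the sample sizes, and the helper name `middle_order_statistics` are assumptions made for the example; it is an illustration, not a proof of consistency.

```python
import random
import statistics

sample = [3, 4, 5, 6, 7, 8]

# Any value in [5, 6] splits the sample in half; these are three common choices.
low = statistics.median_low(sample)    # 5
high = statistics.median_high(sample)  # 6
mid = statistics.median(sample)        # 5.5, the average of the two middle values


def middle_order_statistics(n, rng):
    """Draw 2n values from unif(3, 8) and return the nth and (n+1)th smallest."""
    draws = sorted(rng.uniform(3, 8) for _ in range(2 * n))
    return draws[n - 1], draws[n]


rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
for n in (10, 1_000, 100_000):
    lo, hi = middle_order_statistics(n, rng)
    print(f"n={n}: any median estimate lies in [{lo:.4f}, {hi:.4f}]")
```

As n grows, the interval between the two middle order statistics shrinks and both endpoints approach 5.5, so every choice inside the interval estimates the same theoretical median.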






$endgroup$











edited 7 hours ago

answered 7 hours ago

eric_kernfeld

3,453 reputation, 1 gold badge, 10 silver badges, 32 bronze badges










  • 1




    $begingroup$
    Your first paragraph has some incorrect information: $F^{-1}$ is indeed unique in many cases, including for the uniform distribution on $[0,1]$ (when $F$ is restricted to $[0,1]$ itself). This has little to do with $F$ being "constant." I think you are making misleading arguments that mix up the roles of continuity, invertibility, and boundedness of support of distributions. Introducing estimators and referring to them also as "quantiles" is interesting but threatens to make things even more confusing.
    $endgroup$
    – whuber
    7 hours ago





































1







$begingroup$

I was taught that an observation in the nth percentile is greater than n% of the observations in the dataset under consideration, which to me implies that there is no 0th or 100th percentile. No observation can be greater than 100% of observations because it forms part of that 100% (and a similar logic applies in the case of 0).



But I unfortunately have no source for this that I can point you to.
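To make the definition above concrete, here is a minimal Python sketch. The function names `pct_strictly_below` and `pct_at_or_below` are hypothetical, and the "at or below" variant is added only for contrast; it is not part of the definition I was taught.

```python
def pct_strictly_below(x, data):
    """Percentage of observations in data that are strictly less than x."""
    return 100.0 * sum(v < x for v in data) / len(data)


def pct_at_or_below(x, data):
    """Percentage of observations in data that are less than or equal to x."""
    return 100.0 * sum(v <= x for v in data) / len(data)


data = [3, 4, 5, 6, 7, 8]

# The largest observation exceeds 5 of the 6 values, so it cannot reach 100.
print(pct_strictly_below(max(data), data))

# With heavy ties the two conventions disagree completely.
ties = [2] * 11
print(pct_strictly_below(2, ties), pct_at_or_below(2, ties))
```

Under the strict version the largest observation always falls short of 100, because it is never greater than itself; the non-strict version assigns it 100.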






$endgroup$











edited 7 hours ago

answered 8 hours ago

mkt

8,451 reputation, 6 gold badges, 32 silver badges, 93 bronze badges










  • 2




    $begingroup$
    Do you have an authoritative reference for what you remember being taught? Note that you are implicitly adopting a definition of "percentile" as being a group of numbers. The other definition quoted in the question is that the percentile is a boundary between such groups.
    $endgroup$
    – whuber
    8 hours ago











  • $begingroup$
    @whuber Unfortunately not. And yes, I see the distinction.
    $endgroup$
    – mkt
    8 hours ago











  • $begingroup$
    That doesn't make sense to me. Suppose your data is 2,2,2,2,2,2,2,2,2,2,2, so an item in one quantile is equal to an item to its left in a prior quantile. Then an item in the nth quantile is not greater than all the items in the quantiles to its left, so an item in the nth percentile is not greater than n% of observations in the dataset. It's >= n% of observations in the dataset, but not simply >. And hence you can have a 100th percentile. What do you make of that logic?
    $endgroup$
    – barlop
    5 hours ago











  • $begingroup$
    Many definitions come under strain if all values are identical!
    $endgroup$
    – Nick Cox
    3 hours ago






















