What to bootstrap for hypothesis testing
$begingroup$
I have a small question about the concept behind hypothesis testing using the bootstrap. Assume that I need to evaluate the difference in means between two independent populations, a and b. My doubt is the following:
Should I bootstrap each population separately and then take the difference of the means?
Mean[BOOT(a)-BOOT(b)]
Alternatively, should I compute the difference:
Mean(a)-Mean(b)
and then apply the bootstrap?
BOOT[Mean(a)-Mean(b)]
I used this code, following the second approach:
set.seed(123)
a <- rnorm(100)
b <- rnorm(100)
hist(a)
hist(b)
# Elementwise differences: this pairs a[i] with b[i]
c = a-b
hist(c)
# Bootstrap the mean of dati_oss using R resamples
boot_1 = function(R,dati_oss) {
  n = length(dati_oss)
  media_boot = vector("numeric",R)
  for(i in 1:R) {
    # Draw n indices with replacement and store the resample mean
    ind = sample(1:n,replace=TRUE)
    media_boot[i] = mean(dati_oss[ind])
  }
  return(media_boot)
}
res=boot_1(500000,c)
hist(res)
# Observed mean, bootstrap mean, bias, standard error, 95% percentile interval
stat = matrix(c(mean(c), mean(res), mean(res)-mean(c), sqrt(var(res)),
                as.vector(quantile(res, c(0.025,0.975)))), 1, 6)
colnames(stat) = c("Observed", "Mean-boot", "Bias", "SE", "0.95LCI", "0.95UCI")
row.names(stat) = c("Mean")
stat
r hypothesis-testing bootstrap
$endgroup$
edited 8 hours ago by gung♦
asked 9 hours ago by an.dr.ea
$begingroup$
Did you try running your code? I get an error. Also, your code doesn't actually match your method 2.
$endgroup$
– gung♦
8 hours ago
$begingroup$
@gung where did you get the error?
$endgroup$
– an.dr.ea
8 hours ago
$begingroup$
After running res=boot_1(500000,c).
$endgroup$
– gung♦
8 hours ago
$begingroup$
@MichaelM Thanks, I just wanted to know if approach 1 or 2 is correct.
$endgroup$
– an.dr.ea
8 hours ago
$begingroup$
@gung I don't have this problem; try with fewer reps: res = boot_1(5000, c)
$endgroup$
– an.dr.ea
8 hours ago
2 Answers
$begingroup$
The basic principle to apply, quoting @MichaelChernick, is: "Sampling with replacement behaves on the original sample the way the original sample behaves on a population."
Think about how you analyzed the original sample. You took the sample, calculated the mean of each of the two groups, and took the difference between those means as an estimate of the a-b difference.
So you proceed similarly with each bootstrapped resample: resample from the original sample, calculate the mean of each group, and determine the difference between the means of the two groups as represented in the resample. Do this a large number of times to estimate the distribution of a-b differences. Compare the mean of the bootstrapped a-b differences against the a-b difference found in the original sample to estimate the bias in the original a-b difference.
Note that the way you design the resampling might depend on the original study design. If you had two independent populations from which you took samples, then the resampling should proceed comparably, within each of the populations. If you sampled from a mixed population in which individual cases were labeled as belonging to population a versus b, then you should resample from a pool of all the cases in the original sample.
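The resampling described above, for the case of two independent samples, can be sketched in R as follows. This is only a minimal illustration under stated assumptions: the data are simulated as in the question, and the function name boot_diff_means and the choice of 2000 resamples are hypothetical choices made for this sketch.

```r
# Sketch: bootstrap the difference in means of two independent samples.
# boot_diff_means and R = 2000 are illustrative assumptions.
boot_diff_means <- function(a, b, R = 2000) {
  diffs <- numeric(R)
  for (i in seq_len(R)) {
    # Resample within each group separately, keeping the group sizes fixed
    a_star <- sample(a, size = length(a), replace = TRUE)
    b_star <- sample(b, size = length(b), replace = TRUE)
    # Same statistic as on the original sample: difference of group means
    diffs[i] <- mean(a_star) - mean(b_star)
  }
  diffs
}

set.seed(123)
a <- rnorm(100)
b <- rnorm(100)

observed <- mean(a) - mean(b)
diffs <- boot_diff_means(a, b)

# Observed difference, estimated bias, bootstrap SE, 95% percentile interval
c(Observed = observed,
  Bias     = mean(diffs) - observed,
  SE       = sd(diffs),
  quantile(diffs, c(0.025, 0.975)))
```

If the cases had instead been drawn from one mixed population and labeled afterwards, the loop would resample whole cases (value together with label) from the pooled data and recompute the group means inside each resample.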
$endgroup$
answered 6 hours ago by EdM
$begingroup$
Without going into the coding of it, consider what happens if you calculate Mean(a)-Mean(b) first (proposal 2). You calculate Mean(a) and get a single number, then you calculate Mean(b) and get a single number. Then you take your two numbers, and calculate Mean(a)-Mean(b) to get a single value for your difference. No matter how many times you sample/bootstrap this single value, you will get the same number. Try it out!
By contrast, if you repeatedly resample from your samples of population A and population B and then calculate the difference of the resampled means (proposal 1), you will get a slightly different combination nearly every time you resample (assuming you don't reset the random seed at the wrong point in your code), so you will get a range of values for the difference.
I would consider proposal 1 to be a form of bootstrapping, but I wouldn't say the same about proposal 2!
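To try this out, the two proposals can be compared side by side. The sketch below assumes data simulated as in the question; the names res_p1 and res_p2 and the 1000 resamples are hypothetical choices for illustration.

```r
set.seed(123)
a <- rnorm(100)
b <- rnorm(100)
R <- 1000

# Proposal 2: the "sample" being resampled is a single number
d <- mean(a) - mean(b)
# Index explicitly, because sample() can treat a length-one numeric x as 1:x
res_p2 <- replicate(R, mean(d[sample.int(length(d), replace = TRUE)]))

# Proposal 1: resample each group, then take the difference of the means
res_p1 <- replicate(R, mean(sample(a, replace = TRUE)) -
                       mean(sample(b, replace = TRUE)))

length(unique(res_p2))  # 1: every "resample" returns d itself
sd(res_p2)              # 0: no spread to build a test or interval from
sd(res_p1)              # positive: a usable bootstrap distribution
```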
$endgroup$
answered 6 hours ago by Izy