



Can AIC be used on out-of-sample data in cross-validation to select a model over another?































Following Gelman's 2017 publication entitled "Understanding predictive information criteria for Bayesian models", I understand that cross-validation and information criteria (the Bayesian information criterion and Akaike's information criterion) can be used separately. With a large enough sample, one would usually use cross-validation with some measure of predictive accuracy to select a given model over the others. With smaller samples, AIC and BIC might be preferred, computed on the training data (without cross-validation). My confusion is whether AIC and BIC can be used along with cross-validation: for example, can AIC and BIC be evaluated on the left-out fold in a 10-fold cross-validation? The idea is to use out-of-sample information criteria that account for model complexity (AIC) as well as model fit (BIC).
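To make the two routes I have in mind concrete, here is a minimal sketch (simulated data; Python with statsmodels and scikit-learn, chosen purely for illustration and not taken from the paper): information criteria computed on the full training data versus 10-fold cross-validation with an out-of-sample measure. My question is whether AIC/BIC could instead be computed on the left-out folds.

```python
# Minimal sketch of the two standard selection routes (simulated data;
# statsmodels / scikit-learn used purely for illustration).
import numpy as np
import statsmodels.api as sm
from scipy.stats import norm
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=(n, 3))
y = 1.0 + 2.0 * x[:, 0] + rng.normal(size=n)   # only the first predictor matters

def design(x, k):
    """Design matrix with an intercept and the first k predictors."""
    return sm.add_constant(x[:, :k])

# Route 1: information criteria computed once on the full (training) data.
for k in (1, 3):
    fit = sm.OLS(y, design(x, k)).fit()
    print(f"{k} predictors: AIC = {fit.aic:.1f}, BIC = {fit.bic:.1f}")

# Route 2: 10-fold CV with an out-of-sample measure
# (here the predictive log-likelihood of each held-out fold).
for k in (1, 3):
    ll = 0.0
    for train, test in KFold(n_splits=10, shuffle=True, random_state=0).split(x):
        fit = sm.OLS(y[train], design(x[train], k)).fit()
        mu = fit.predict(design(x[test], k))
        sigma = np.sqrt(fit.scale)   # residual std. dev. estimated on the training folds
        ll += norm.logpdf(y[test], mu, sigma).sum()
    print(f"{k} predictors: out-of-fold log-likelihood = {ll:.1f}")
```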

















      cross-validation modeling model-selection aic
















      asked 8 hours ago









      Arman
























          1 Answer








































          Can AIC and BIC be used on the left-out fold in a 10-fold cross-validation?




          No, that would not make sense. AIC and cross-validation (CV) offer estimates of the model's log-likelihood* of new, unseen data from the same population from which the current data sample has been drawn. They do it in two different ways.



          1. AIC measures the log-likelihood of the entire sample at once, based on parameters estimated using the entire sample, and subsequently adjusts for overfitting (which occurs when the log-likelihood of new data is estimated by the log-likelihood of the very sample on which the estimation was done) via $p$ in $\text{AIC} = -2(\text{loglik} - p)$. Here $\text{loglik}$ is the log-likelihood of the sample data according to the model and $p$ is the number of the model's degrees of freedom (a measure of the model's flexibility);

          2. CV measures the log-likelihood on hold-out subsamples based on parameters estimated on training subsamples. Hence, unlike in the case of AIC, there is no overfitting.** Therefore, there is no need to replace the CV estimates of the log-likelihood on the hold-out subsamples (folds) with a penalized log-likelihood such as AIC; a small numerical sketch follows the footnotes below.

          Analogous logic holds for BIC.



          *CV can be used for other functions of the data in place of log-likelihood, too, but for comparability with AIC, I keep the discussion focused on log-likelihood.



          **Actually, CV offers a slightly pessimistic estimate of the log-likelihood because the training subsamples are smaller than the entire sample, so the model has somewhat larger estimation variance than it would have had it been estimated on the entire sample. In leave-one-out CV the problem is negligible, as the training subsamples are almost as large as the entire sample; in $K$-fold CV the problem can be noticeable for small $K$ but decreases as $K$ grows.
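As a rough numerical check of this point, consider the following sketch (simulated data; Python with statsmodels and scikit-learn, all names and data purely illustrative). Since $\text{AIC} = -2(\text{loglik} - p)$, the quantity $-\text{AIC}/2$ from the full-sample fit and the summed out-of-fold log-likelihood from 10-fold CV both estimate the same out-of-sample log-likelihood, so they should be of similar size, with the CV figure typically slightly lower for the reason given in the second footnote.

```python
# Rough check: -AIC/2 from the full-sample fit and the summed out-of-fold
# log-likelihood from 10-fold CV estimate the same quantity.
# (Simulated data; statsmodels / scikit-learn used purely for illustration.)
import numpy as np
import statsmodels.api as sm
from scipy.stats import norm
from sklearn.model_selection import KFold

rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=(n, 2))
y = 0.5 + x[:, 0] - 0.5 * x[:, 1] + rng.normal(size=n)
X = sm.add_constant(x)

full_fit = sm.OLS(y, X).fit()
print(f"AIC-based estimate of the out-of-sample log-likelihood: {-full_fit.aic / 2:.1f}")

cv_ll = 0.0
for train, test in KFold(n_splits=10, shuffle=True, random_state=1).split(X):
    fit = sm.OLS(y[train], X[train]).fit()
    cv_ll += norm.logpdf(y[test], fit.predict(X[test]), np.sqrt(fit.scale)).sum()
print(f"10-fold CV estimate of the out-of-sample log-likelihood: {cv_ll:.1f}")
```

(Note that statsmodels may count the model's parameters slightly differently from the $p$ above, e.g. regarding the error variance, so the two numbers agree only roughly; the point is that they are estimates of the same quantity, not an exact match.)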






                answered 7 hours ago, edited 7 hours ago
                Richard Hardy