LASSO Regression - p-values and coefficients
I've run a LASSO in R using cv.glmnet. I would like to generate p-values for the coefficients that are selected.

I found that boot.lasso.proj produces bootstrapped p-values: https://rdrr.io/rforge/hdi/man/boot.lasso.proj.html

While boot.lasso.proj produces p-values, I assume it is fitting its own lasso, and I don't see a way to get the coefficients out of it. Would it be safe to use the p-values from hdi with the coefficients produced by cv.glmnet?
Tags: p-value, lasso
asked 12 hours ago by jpryan28 (new contributor), edited 9 hours ago by Richard Hardy
What is hdi here? – Richard Hardy, 9 hours ago

hdi is the R package that contains boot.lasso.proj: rdrr.io/rforge/hdi – jpryan28, 4 hours ago
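For reference, here is a minimal sketch of what the hdi route could look like, using the same design matrix and response you would pass to cv.glmnet. This is hedged: the argument names and returned components follow the package's man pages, so verify them against your installed version.

library(hdi)

# x: n x p predictor matrix, y: response (the same inputs given to cv.glmnet)
fit <- boot.lasso.proj(x, y, B = 1000)  # bootstrapped de-sparsified lasso
fit$pval        # individual p-values
fit$pval.corr   # multiplicity-corrected p-values

Note that these p-values refer to hdi's own de-sparsified (projected) estimator, not to the coefficients returned by cv.glmnet, which is exactly the mismatch asked about above.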
2 Answers
To expand on what Ben Bolker notes in a comment on another answer, the issue of what a frequentist p-value means for a regression coefficient in LASSO is not at all easy. What's the actual null hypothesis against which you are testing the coefficient values? How do you take into account the fact that LASSO performed on multiple samples from the same population may return wholly different sets of predictors, particularly with the types of correlated predictors that often are seen in practice? How do you take into account that you have used the outcome values as part of the model-building process, for example in the cross-validation or other method you used to select the level of penalty and thus the number of retained predictors?
These issues are discussed on this site. This page is one good place to start, with links to the R hdi package that you mention and also to the selectiveInference package, which is also discussed on this page. Statistical Learning with Sparsity covers inference for LASSO in Chapter 6, with references to the literature as of a few years ago.
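As a concrete starting point, here is a hedged sketch in the spirit of the selectiveInference package's own examples; x and y are placeholders for a predictor matrix and response, and the lambda rescaling between glmnet and fixedLassoInf is the usual gotcha, so check the package documentation before relying on it.

library(glmnet)
library(selectiveInference)

n    <- nrow(x)
gfit <- glmnet(x, y, standardize = FALSE)

# glmnet minimizes (1/(2n))*RSS + lambda*||beta||_1, while fixedLassoInf
# uses (1/2)*RSS + lambda*||beta||_1, so its penalty lam maps to s = lam/n
lam  <- 0.1
beta <- coef(gfit, s = lam / n, exact = TRUE, x = x, y = y)[-1]

# p-values and confidence intervals conditional on the selected active set
out <- fixedLassoInf(x, y, beta, lam)
out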
Please don't simply use the p-values returned by those or any other methods for LASSO as plug-and-play results. It's important to think about why, and whether, you need p-values and what they really mean in LASSO. If your main interest is in prediction rather than inference, measures of predictive performance would be much more useful to you and to your audience.
answered 5 hours ago by EdM
Recall that LASSO functions as an elimination process: it keeps the "best" feature space selected by cross-validation. One possible remedy is to take the final selected feature space and feed it back into an lm call. This way, you can compute the statistical significance of the final selected X variables. For instance, see the following code:
library(ISLR)
library(glmnet)

ds <- na.omit(Hitters)
X  <- as.matrix(ds[, 1:10])

# leave-one-out CV (nfolds = nrow(ds)) to choose the penalty;
# parallel = TRUE needs a registered backend (e.g. doParallel)
lM_LASSO <- cv.glmnet(X, y = log(ds$Salary),
                      intercept = TRUE, alpha = 1, nfolds = nrow(ds),
                      parallel = TRUE)
opt_lam <- lM_LASSO$lambda.min

# refit at the selected lambda to get the sparse coefficient vector
lM_LASSO <- glmnet(X, y = log(ds$Salary),
                   intercept = TRUE, alpha = 1, lambda = opt_lam)
W <- as.matrix(coef(lM_LASSO))
W
1
(Intercept) 4.5630727825
AtBat -0.0021567122
Hits 0.0115095746
HmRun 0.0055676901
Runs 0.0003147141
RBI 0.0001307846
Walks 0.0069978218
Years 0.0485039070
CHits 0.0003636287
# keep the selected predictors, drop the intercept, and refit with lm
keep_X <- rownames(W)[W != 0]
keep_X <- keep_X[keep_X != "(Intercept)"]
X <- X[, keep_X]
summary(lm(log(ds$Salary) ~ X))
Call:
lm(formula = log(ds$Salary) ~ X)
Residuals:
Min 1Q Median 3Q Max
-2.23409 -0.45747 0.06435 0.40762 3.02005
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.5801734 0.1559086 29.377 < 2e-16 ***
XAtBat -0.0025470 0.0010447 -2.438 0.01546 *
XHits 0.0126216 0.0039645 3.184 0.00164 **
XHmRun 0.0057538 0.0103619 0.555 0.57919
XRuns 0.0003510 0.0048428 0.072 0.94228
XRBI 0.0002455 0.0045771 0.054 0.95727
XWalks 0.0072372 0.0026936 2.687 0.00769 **
XYears 0.0487293 0.0206030 2.365 0.01877 *
XCHits 0.0003622 0.0001564 2.316 0.02138 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.6251 on 254 degrees of freedom
Multiple R-squared: 0.5209, Adjusted R-squared: 0.5058
F-statistic: 34.52 on 8 and 254 DF, p-value: < 2.2e-16
Note that the coefficients are a little different from those derived from the glmnet model. Finally, you can use the stargazer package to print a well-formatted table. In this case, we have:
stargazer::stargazer(lm(log(ds$Salary)~X),type = "text")
===============================================
Dependent variable:
---------------------------
log(ds$Salary)
-----------------------------------------------
XAtBat -0.003**
(0.001)
XHits 0.013***
(0.004)
XHmRun 0.006
(0.010)
XRuns 0.0004
(0.005)
XRBI 0.0002
(0.005)
XWalks 0.007***
(0.003)
XYears 0.049**
(0.021)
XCHits 0.0004**
(0.0002)
Constant 4.580***
(0.156)
-----------------------------------------------
Observations 263
R2 0.521
Adjusted R2 0.506
Residual Std. Error 0.625 (df = 254)
F Statistic 34.521*** (df = 8; 254)
===============================================
Note: *p<0.1; **p<0.05; ***p<0.01
Bootstrap
Using a bootstrap approach, I compare the above standard errors with the bootstrapped ones as a robustness check:
library(boot)

# refit the lasso at the previously selected penalty on each resample
W_boot <- function(ds, indices) {
  ds_boot <- ds[indices, ]
  X <- as.matrix(ds_boot[, 1:10])
  y <- log(ds_boot$Salary)  # use the resampled outcome, not the original ds
  lM_LASSO <- glmnet(X, y = y,
                     intercept = TRUE, alpha = 1, lambda = opt_lam)
  W <- as.matrix(coef(lM_LASSO))
  return(W)
}

results <- boot(data = ds, statistic = W_boot, R = 10000)

se1 <- summary(lm(log(ds$Salary) ~ X))$coef[, 2]  # lm standard errors
se2 <- apply(results$t, 2, sd)                    # bootstrap standard errors
se2 <- se2[W != 0]

plot(se2 ~ se1)
abline(a = 0, b = 1)
There seems to be a small bias for the intercept. Otherwise, the ad hoc approach seems to be justified. In any case, you may want to check this thread for further discussion.
edited 5 hours ago, answered 7 hours ago by majeed simaan

I have very serious concerns about this approach (I know it's widely done), because it doesn't take the selection process (nor the process of selecting the penalty by cross-validation) into account. Can you point to references that assert that this approach actually has correct frequentist properties? (I can imagine that it's OK in particular cases where you have lots of data and the parameters have either zero (or very small) effects or very large effects ...) – Ben Bolker, 7 hours ago

Good point. I don't have a specific paper to justify this methodology, but my thinking is that it should not be that different from the "true" standard errors. As you mentioned above, running a bootstrap would serve as a more robust solution. I reran the code with the addition of the bootstrap and compared the results with the lm standard errors. Ignoring the intercept, the results seem to be consistent. – majeed simaan, 5 hours ago