Question
8a Random x vector
8b betas
8c
8d
8e
- STUDY!
8f
\(R^2, BIC, CP, Adjusted R^2\)
Plot \(R^2, BIC, CP, Adjusted R^2\)
Other Solutions:

Question

p262

In this exercise, we will generate simulated data, and will then use this data to perform best subset selection.

Use the rnorm() function to generate a predictor X of length n = 100, as well as a noise vector ε of length n = 100.
Generate a response vector Y of length n = 100 according to the model Y = β0 +β1X +β2X2 +β3X3 +ε, where β0, β1, β2, and β3 are constants of your choice.
Use the regsubsets() function to perform best subset selection in order to choose the best model containing the predictors X, X2, . . . , X10. What is the best model obtained according to Cp, BIC, and adjusted R2? Show some plots to provide evidence for your answer, and report the coefficients of the best model ob- tained. Note you will need to use the data.frame() function to create a single data set containing both X and Y . 2b setting.
Repeat (c), using forward stepwise selection and also using back- wards stepwise selection. How does your answer compare to the results in (c)?
Now fit a lasso model to the simulated data, again using X,X2, . . . , X 10 as predictors. Use cross-validation to select the optimal value of λ. Create plots of the cross-validation error as a function of λ. Report the resulting coefficient estimates, and discuss the results obtained.
Now generate a response vector Y according to the model Y = β0 + β7X7 + ε, and perform best subset selection and the lasso. Discuss the results obtained

library(ISLR)

8a Random x vector

set.seed(112)
x <- rnorm(100)
epsilon <- rnorm(100)

8b betas

beta0 = 1
beta1 = 1
beta2 = 1
beta3 = 1

beta0 = 3
beta1 = 2
beta2 = -3
beta3 = 0.3

Y <- beta0 + beta1 * x + beta2 * x^2 + beta3 * x^3 + epsilon

8c

library(leaps)
regfit.full=regsubsets(Y~., poly(x, 10))
reg.summary=summary(regfit.full)

names(reg.summary)

## [1] "which"  "rsq"    "rss"    "adjr2"  "cp"     "bic"    "outmat" "obj"

reg.summary$rsq

## [1] 0.7978113 0.9092099 0.9376857 0.9402457 0.9415817 0.9423889 0.9426592
## [8] 0.9427198

reg.summary$bic

## [1] -150.6451 -226.1050 -259.1357 -258.7256 -256.3816 -253.1678 -249.0329
## [8] -244.5335

reg.summary$cp

## [1] 218.238194  47.104395   4.847869   2.869052   2.792670   3.538211   5.118088
## [8]   7.023931

reg.summary$adjr2

## [1] 0.7957482 0.9073380 0.9357383 0.9377298 0.9384744 0.9386720 0.9382963
## [8] 0.9376842

par(mfrow=c(2,2))
plot(reg.summary$rsq)
plot(reg.summary$bic)
plot(reg.summary$cp)
plot(reg.summary$adjr2)

Answer: polynomial 3

8d

regfit.full=regsubsets(Y~., poly(x, 10), method = "forward")
reg.summary=summary(regfit.full)

names(reg.summary)

## [1] "which"  "rsq"    "rss"    "adjr2"  "cp"     "bic"    "outmat" "obj"

reg.summary$rsq

## [1] 0.7978113 0.9092099 0.9376857 0.9402457 0.9415817 0.9423889 0.9426592
## [8] 0.9427198

reg.summary$bic

## [1] -150.6451 -226.1050 -259.1357 -258.7256 -256.3816 -253.1678 -249.0329
## [8] -244.5335

reg.summary$cp

## [1] 218.238194  47.104395   4.847869   2.869052   2.792670   3.538211   5.118088
## [8]   7.023931

reg.summary$adjr2

## [1] 0.7957482 0.9073380 0.9357383 0.9377298 0.9384744 0.9386720 0.9382963
## [8] 0.9376842

par(mfrow=c(2,2))
plot(reg.summary$rsq)
plot(reg.summary$bic)
plot(reg.summary$cp)
plot(reg.summary$adjr2)

regfit.full=regsubsets(Y~., poly(x, 10), method = "backward")
reg.summary=summary(regfit.full)

names(reg.summary)

## [1] "which"  "rsq"    "rss"    "adjr2"  "cp"     "bic"    "outmat" "obj"

reg.summary$rsq

## [1] 0.7978113 0.9092099 0.9376857 0.9402457 0.9415817 0.9423889 0.9426592
## [8] 0.9427198

reg.summary$bic

## [1] -150.6451 -226.1050 -259.1357 -258.7256 -256.3816 -253.1678 -249.0329
## [8] -244.5335

reg.summary$cp

## [1] 218.238194  47.104395   4.847869   2.869052   2.792670   3.538211   5.118088
## [8]   7.023931

reg.summary$adjr2

## [1] 0.7957482 0.9073380 0.9357383 0.9377298 0.9384744 0.9386720 0.9382963
## [8] 0.9376842

par(mfrow=c(2,2))
plot(reg.summary$rsq)
plot(reg.summary$bic)
plot(reg.summary$cp)
plot(reg.summary$adjr2)

Answer: Looks the same

8e

STUDY!

library(glmnet)

## Loading required package: Matrix

## Loaded glmnet 4.0-2

set.seed(1)

df = data.frame(x = x,
                y = Y)
# train = sample(1:nrow(df), nrow(df)*.7)

x = model.matrix(y ~ poly(x, 10, raw = T), data = df)[, -1]
mod.lasso = cv.glmnet(x, Y, alpha = 1)

best.lambda = mod.lasso$lambda.min
best.lambda

## [1] 0.01324949

plot(mod.lasso)

full.model = glmnet(x, Y, alpha = 1)
predict(full.model, s = best.lambda, type = "coefficients")

## 11 x 1 sparse Matrix of class "dgCMatrix"
##                                    1
## (Intercept)             2.9416141471
## poly(x, 10, raw = T)1   1.7941318291
## poly(x, 10, raw = T)2  -2.9890600901
## poly(x, 10, raw = T)3   0.4157700821
## poly(x, 10, raw = T)4   .           
## poly(x, 10, raw = T)5   .           
## poly(x, 10, raw = T)6   .           
## poly(x, 10, raw = T)7   .           
## poly(x, 10, raw = T)8   .           
## poly(x, 10, raw = T)9   .           
## poly(x, 10, raw = T)10 -0.0001158552

#dim(train)
#dim(test)

#cv.out <- cv.glmnet(x[train], y[train], alpha=1)

8f

beta7 = 7

x <- rnorm(100)
Y <- beta0 + beta7 * x^7 + epsilon

df = data.frame(y = Y, x = x)
                
bss.fit <- regsubsets(y ~ poly(x, 10, raw = T), data = df, nvmax = 19)
summary(bss.fit)

## Subset selection object
## Call: regsubsets.formula(y ~ poly(x, 10, raw = T), data = df, nvmax = 19)
## 10 Variables  (and intercept)
##                        Forced in Forced out
## poly(x, 10, raw = T)1      FALSE      FALSE
## poly(x, 10, raw = T)2      FALSE      FALSE
## poly(x, 10, raw = T)3      FALSE      FALSE
## poly(x, 10, raw = T)4      FALSE      FALSE
## poly(x, 10, raw = T)5      FALSE      FALSE
## poly(x, 10, raw = T)6      FALSE      FALSE
## poly(x, 10, raw = T)7      FALSE      FALSE
## poly(x, 10, raw = T)8      FALSE      FALSE
## poly(x, 10, raw = T)9      FALSE      FALSE
## poly(x, 10, raw = T)10     FALSE      FALSE
## 1 subsets of each size up to 10
## Selection Algorithm: exhaustive
##           poly(x, 10, raw = T)1 poly(x, 10, raw = T)2 poly(x, 10, raw = T)3
## 1  ( 1 )  " "                   " "                   " "                  
## 2  ( 1 )  " "                   " "                   " "                  
## 3  ( 1 )  " "                   " "                   " "                  
## 4  ( 1 )  " "                   " "                   "*"                  
## 5  ( 1 )  " "                   "*"                   "*"                  
## 6  ( 1 )  " "                   "*"                   " "                  
## 7  ( 1 )  "*"                   "*"                   "*"                  
## 8  ( 1 )  "*"                   "*"                   "*"                  
## 9  ( 1 )  "*"                   "*"                   "*"                  
## 10  ( 1 ) "*"                   "*"                   "*"                  
##           poly(x, 10, raw = T)4 poly(x, 10, raw = T)5 poly(x, 10, raw = T)6
## 1  ( 1 )  " "                   " "                   " "                  
## 2  ( 1 )  " "                   "*"                   " "                  
## 3  ( 1 )  " "                   " "                   "*"                  
## 4  ( 1 )  "*"                   " "                   " "                  
## 5  ( 1 )  "*"                   " "                   "*"                  
## 6  ( 1 )  "*"                   "*"                   "*"                  
## 7  ( 1 )  "*"                   " "                   "*"                  
## 8  ( 1 )  "*"                   " "                   "*"                  
## 9  ( 1 )  "*"                   "*"                   "*"                  
## 10  ( 1 ) "*"                   "*"                   "*"                  
##           poly(x, 10, raw = T)7 poly(x, 10, raw = T)8 poly(x, 10, raw = T)9
## 1  ( 1 )  "*"                   " "                   " "                  
## 2  ( 1 )  "*"                   " "                   " "                  
## 3  ( 1 )  "*"                   "*"                   " "                  
## 4  ( 1 )  "*"                   "*"                   " "                  
## 5  ( 1 )  "*"                   " "                   " "                  
## 6  ( 1 )  "*"                   " "                   " "                  
## 7  ( 1 )  "*"                   " "                   " "                  
## 8  ( 1 )  "*"                   " "                   "*"                  
## 9  ( 1 )  "*"                   " "                   "*"                  
## 10  ( 1 ) "*"                   "*"                   "*"                  
##           poly(x, 10, raw = T)10
## 1  ( 1 )  " "                   
## 2  ( 1 )  " "                   
## 3  ( 1 )  " "                   
## 4  ( 1 )  " "                   
## 5  ( 1 )  " "                   
## 6  ( 1 )  "*"                   
## 7  ( 1 )  "*"                   
## 8  ( 1 )  "*"                   
## 9  ( 1 )  "*"                   
## 10  ( 1 ) "*"

par(mfrow=c(1,1))
plot(bss.fit, scale = "r2")

plot(bss.fit, scale = "adjr2")

plot(bss.fit, scale = "Cp")

plot(bss.fit, scale = "bic")

\(R^2, BIC, CP, Adjusted R^2\)

reg.summary=summary(bss.fit)

names(reg.summary)

## [1] "which"  "rsq"    "rss"    "adjr2"  "cp"     "bic"    "outmat" "obj"

reg.summary$rsq

##  [1] 0.9999861 0.9999872 0.9999873 0.9999875 0.9999877 0.9999878 0.9999878
##  [8] 0.9999878 0.9999878 0.9999878

reg.summary$bic

##  [1] -1109.217 -1112.552 -1109.068 -1106.251 -1102.627 -1098.919 -1094.378
##  [8] -1089.772 -1085.167 -1080.562

reg.summary$cp

##  [1]  5.1452773 -0.5757435  0.3823426  0.7453179  1.8590822  3.0567305
##  [7]  5.0000982  7.0000732  9.0000255 11.0000000

reg.summary$adjr2

##  [1] 0.9999860 0.9999869 0.9999869 0.9999870 0.9999870 0.9999870 0.9999868
##  [8] 0.9999867 0.9999866 0.9999864

Plot \(R^2, BIC, CP, Adjusted R^2\)

par(mfrow=c(1,1))
plot(reg.summary$rsq)

plot(reg.summary$bic)

plot(reg.summary$cp)

plot(reg.summary$adjr2)

ISLR Q6.8 - Best Subset Selection

Question

8a Random x vector

8b betas

8c

8d

8e

STUDY!

8f

\(R^2, BIC, CP, Adjusted R^2\)

Plot \(R^2, BIC, CP, Adjusted R^2\)

Other Solutions: