# Migrating from pense version 1.x to 2.x

#### 2021-04-14

Version 2.x released September 2020 offers many new features and improved speed. These changes make pense 2.x incompatible with code written for pense 1.x and results will not be identical!

The most visible changes are to functions pense() and pensem(), which now only fit models but do not evaluate prediction performance anymore. Instead, the new functions pense_cv() and pensem_cv() are now used for fitting models and estimating their prediction performance.

First we need to load the new version of the pense package:

library(pense)
packageVersion("pense")
#> [1] '2.0.3'

This guide uses the following simulated data:

set.seed(1234)
x <- matrix(rt(50 * 10, df = 5), ncol = 10)
y <- 0.5 * x[, 1] - 2 * x[, 2] + 1.5 * x[, 3] + rt(nrow(x), df = 3)

# The new _cv() family of functions

The most basic usage in old versions was to call function pense() to fit models with nlambda different penalization levels and evaluate each model’s prediction performance with K-fold cross-validation. In the new version, this will now raise an error:

set.seed(1234)
fitted_with_cv <- pense(x, y, alpha = 0.6, nlambda = 40, warm_reset = 5L, cv_k = 5)
#> Error: The cv_k argument of pense() was deprecated in pense 2.0.0 and is now defunct.
#> Please use pense_cv() instead.

As the error message says, if model fitting and cross-validation is required, use pense_cv() instead. The simple solution is to replace pense() with pense_cv():

set.seed(1234)
fitted_with_cv <- pense_cv(x, y, alpha = 0.6, nlambda = 40, warm_reset = 5L, cv_k = 5)
#> Warning: The warm_reset argument of pense() is deprecated as of pense 2.0.0.
#> Please use the nlambda_enpy argument instead.

However, this raises a warning that the argument warm_reset= is also deprecated in favor of argument nlambda_enpy=. The new version uses more consistent and self-explaining naming for arguments. Therefore, the most basic way for computing regularized S-estimates of linear regression and estimating their prediction performance via CV is the following:

set.seed(1234)
fitted_with_cv <- pense_cv(x, y, alpha = 0.6, nlambda = 40, nlambda_enpy = 5L, cv_k = 5)

To only fit the models, without estimating prediction performance, use pense() (note the absence of the cv_k= argument):

fitted_no_cv <- pense(x, y, alpha = 0.6, nlambda = 40, nlambda_enpy = 5L)

## Structure of returned objects

The structure of the returned objects is different from pense versions 1.x. In old versions, the estimated coefficients were stored as a sparse matrix object, with one column per fit (i.e., per penalization level). In new versions, estimates are stored as a list with one entry per penalization level. For fitted models with prediction performance, as in fitted_with_cv, the returned object additionally contains information on the prediction performance of all fitted models:

str(fitted_no_cv, max.level = 1)
#> List of 4
#>  $estimates:List of 40 #>$ call     : language pense(x = x, y = y, alpha = 0.6, nlambda = 40, nlambda_enpy = 5L)
#>  $alpha : num 0.6 #>$ lambda   : num [1:40] 3.82 3.16 2.61 2.16 1.78 ...
#>  - attr(*, "class")= chr [1:2] "pense" "pense_fit"
str(fitted_with_cv, max.level = 1)
#> List of 7
#>  $call : language pense_cv(x = x, y = y, cv_k = 5, alpha = 0.6, nlambda = 40, nlambda_enpy = 5L) #>$ lambda         : num [1:40] 3.82 3.16 2.61 2.16 1.78 ...
#>  $alpha : num 0.6 #>$ cvres          :'data.frame': 40 obs. of  3 variables:
#>  $cv_replications: num [1:40, 1] 2.96 2.9 2.79 2.67 2.5 ... #>$ cv_measure     : chr "tau_size"
#>  $estimates :List of 40 #> - attr(*, "class")= chr [1:2] "pense" "pense_cvfit" # Extracting coefficient estimates As in previous versions, the coefficient estimates are best accessed via the coefficients() method. For fits with estimated prediction performance, the method returns the coefficients of the model yielding the lowest prediction error: coefficients(fitted_with_cv) #> (Intercept) X1 X2 X3 X4 X5 #> 0.10172752 0.52970226 -1.70085495 1.40631808 -0.09281353 0.14095462 #> X6 X7 X8 X9 X10 #> 0.06652198 -0.03318776 0.00000000 0.19113248 0.05875205 Fits computed with pense(), however, do not have information about prediction performance, hence coefficients(fitted_no_cv) would not know what penalization level to use. In this case, the desired penalization level must be specified manually: coefficients(fitted_no_cv, lambda = fitted_no_cv$lambda[10])
#> (Intercept)          X1          X2          X3          X4          X5
#>   0.3406859   0.2959405  -1.0300811   1.0751673   0.0000000   0.0000000
#>          X6          X7          X8          X9         X10
#>   0.2256299   0.0000000   0.0000000   0.0000000   0.0000000

An important difference to previous versions is that new versions do not correct for the bias introduced by the Ridge penalty. The correction, which was adopted from the original LS elastic net paper (Zou and Hastie 2005), was dropped as it does not adequately counter the effects of the Ridge penalty on the S-estimates. Specifying the correction= argument results in an error in new versions of the package:

coefficients(fitted_with_cv, correction = TRUE)
#> Error: The correction argument of coef() was deprecated in pense 2.0.0 and is now defunct.

The same applies to functions residuals() and predict() for extracting residuals of the fits and using the estimated coefficients for prediction, respectively.

Many of the optional arguments to pense() and pensem() have been renamed. Options to control the algorithms have been regrouped to remove ambiguity and redundancies. For example, previous versions had several options to specify the numerical tolerance used in different parts of the algorithm. This could have lead to unstable algorithms and undesired results. In new versions, the numerical tolerance is only specified on the top level in the call to pense() and friends.

## Controlling the PENSE algorithm

In previous versions of the package, all options for the PENSE algorithm have been set via the pense(options=) argument and the pense_options() function.

In new versions, the options are either given directly to the pense() function, or supplied via arguments algorithm_opts or mscale_opts:

pense versions 1.x pense versions 2.x and above
pense_options(eps=) pense(eps=)
pense_options(delta=) pense(bdp=)
pense_options(cc=) pense(cc=)
pense_options(maxit=) mm_algorithm_options(max_it=)
pense_options(mscale_maxit=) mscale_algorithm_options(max_it=)
pense_options(mscale_eps=) mscale_algorithm_options(eps=)
all other options unsupported

For example, in previous versions the breakdown point of the PENSE estimate was set with pense_options(delta=):

pense(x, y, alpha = 0.6, nlambda = 40, nlambda_enpy = 5L, options = pense_options(delta = 0.33))

In the new versions, the breakdown point is set directly in the call to pense():

pense(x, y, alpha = 0.6, nlambda = 40, nlambda_enpy = 5L, bdp = 0.33)

## Controlling the PENSEM algorithm

As with pense(), options for controlling the algorithm used by pensem_cv() are moved from mstep_options() to:

pense versions 1.x pense versions 2.x and above
mstep_options(cc=) pensem_cv(cc=)
mstep_options(eps=) pensem_cv(eps=)
mstep_options(maxit=) mm_algorithm_options(max_it=)
all other options unsupported

## Selecting the EN algorithm

In previous versions, the EN algorithm (the workhorse to compute PENSE estimates), was specified via pense(en_options=). In new versions, the user can select the EN algorithm separately for computing PENSE estimates and for computing starting points via ENPY separately. Therefore, the algorithm and it’s options are now set through arguments algorithm_opts= and enpy_opts=.

pense(x, y, alpha = 0.6, nlambda = 40, nlambda_enpy = 5L,
algorithm_opts = mm_algorithm_options(en_algorithm_opts = en_lars_options()),
enpy_opts = enpy_options(en_algorithm_opts = en_admm_options()))

Moreover, functions en_options_aug_lars() and en_options_dal() are replaced by en_lars_options() and en_dal_options(), respectively. The new functions accept similar arguments, but with slightly different names. They do not accept the numerical tolerance eps anymore, as this is now set directly in pense() and friends. The LARS algorithm now always uses the Gram matrix, afforded by a more efficient implementation (argument use_gram is now ignored). Similarly, the DAL algorithm always uses a conjugate gradient pre-conditioner, doesn’t print status information, and adaptively chooses the initial step size as either eta_start_conservative or eta_start_aggressive, based on the change in penalization level. Hence, arguments verbosity, preconditioner, and eta_start are now defunct:

pense versions 1.x pense versions 2.x and above
en_options_dal(eps=) pense(eps=)
en_options_dal(maxit=) en_dal_options(max_it=)
en_options_dal(eta_mult=) en_dal_options(eta_multiplier=)
en_options_dal(eta_start=) en_dal_options(eta_start_conservative=, eta_start_aggressive=)
all other options unsupported

# References

Zou, Hui, and Trevor Hastie. 2005. “Regularization and Variable Selection via the Elastic Net.” Journal of the Royal Statistical Society. Series B (Statistical Methodology) 67: 301–20. http://www.jstor.org/stable/3647580.