An implementation of LARS, a stage-wise homotopy-based algorithm for l1-regularized linear regression (LASSO) and l1+l2 regularized linear regression (Elastic Net).

Other Alias

mlpack::regression::LARS

SYNOPSIS


Public Member Functions


LARS (const bool useCholesky, const double lambda1=0.0, const double lambda2=0.0, const double tolerance=1e-16)
Set the parameters to LARS.
LARS (const bool useCholesky, const arma::mat &gramMatrix, const double lambda1=0.0, const double lambda2=0.0, const double tolerance=1e-16)
Set the parameters to LARS, and pass in a precalculated Gram matrix.
const std::vector< size_t > & ActiveSet () const
Access the set of active dimensions.
const std::vector< arma::vec > & BetaPath () const
Access the set of coefficients after each iteration; the solution is the last element.
const std::vector< double > & LambdaPath () const
Access the set of values for lambda1 after each iteration; the solution is the last element.
const arma::mat & MatUtriCholFactor () const
Access the upper triangular Cholesky factor.
void Regress (const arma::mat &data, const arma::vec &responses, arma::vec &beta, const bool transposeData=true)
Run LARS.
std::string ToString () const

Private Member Functions


void Activate (const size_t varInd)
Add dimension varInd to active set.
void CholeskyDelete (const size_t colToKill)

void CholeskyInsert (const arma::vec &newX, const arma::mat &X)

void CholeskyInsert (double sqNormNewX, const arma::vec &newGramCol)

void ComputeYHatDirection (const arma::mat &matX, const arma::vec &betaDirection, arma::vec &yHatDirection)

void Deactivate (const size_t activeVarInd)
Remove activeVarInd'th element from active set.
void GivensRotate (const arma::vec::fixed< 2 > &x, arma::vec::fixed< 2 > &rotatedX, arma::mat &G)

void Ignore (const size_t varInd)
Add dimension varInd to the ignore set (it is never removed).
void InterpolateBeta ()

Private Attributes


std::vector< size_t > activeSet
Active set of dimensions.
std::vector< arma::vec > betaPath
Solution path.
bool elasticNet
True if this is the elastic net problem.
std::vector< size_t > ignoreSet
Set of ignored variables (for dimensions in span{active set dimensions}).
std::vector< bool > isActive
Active set membership indicator (for each dimension).
std::vector< bool > isIgnored
Membership indicator for set of ignored variables.
double lambda1
Regularization parameter for l1 penalty.
double lambda2
Regularization parameter for l2 penalty.
std::vector< double > lambdaPath
Value of lambda_1 for each solution in solution path.
bool lasso
True if this is the LASSO problem.
const arma::mat & matGram
Reference to the Gram matrix we will use.
arma::mat matGramInternal
Gram matrix.
arma::mat matUtriCholFactor
Upper triangular Cholesky factor; initially a 0x0 matrix.
double tolerance
Tolerance for main loop.
bool useCholesky
Whether or not to use Cholesky decomposition when solving the linear system.

Detailed Description

An implementation of LARS, a stage-wise homotopy-based algorithm for l1-regularized linear regression (LASSO) and l1+l2 regularized linear regression (Elastic Net).

Let $ X $ be a matrix where each row is a point and each column is a dimension and let $ y $ be a vector of responses.

The Elastic Net problem is to solve

\[ \min_{\beta} \frac{1}{2} \| y - X \beta \|_2^2 + \lambda_1 \| \beta \|_1 + \frac{\lambda_2}{2} \| \beta \|_2^2 \]

where $ \beta $ is the vector of regression coefficients.

If $ \lambda_1 > 0 $ and $ \lambda_2 = 0 $, the problem is the LASSO. If $ \lambda_1 > 0 $ and $ \lambda_2 > 0 $, the problem is the elastic net. If $ \lambda_1 = 0 $ and $ \lambda_2 > 0 $, the problem is ridge regression. If $ \lambda_1 = 0 $ and $ \lambda_2 = 0 $, the problem is unregularized linear regression.

Note: This algorithm is not recommended for use (in terms of efficiency) when $ \lambda_1 = 0 $.
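As a quick orientation before the member documentation, here is a minimal usage sketch: it fits a LASSO model on random data and reads back the coefficients. The header path is an assumption (it may differ across mlpack versions); the class and method names are those documented below.

  #include <mlpack/methods/lars/lars.hpp>  // assumed header location

  using namespace mlpack::regression;

  int main()
  {
    // 75 points in 10 dimensions; mlpack matrices are column-major,
    // so each column is a point and each row is a dimension.
    arma::mat data(10, 75, arma::fill::randn);
    arma::vec responses(75, arma::fill::randn);

    // lambda1 > 0 and lambda2 = 0, so this is the LASSO problem.
    LARS lars(true /* useCholesky */, 0.1 /* lambda1 */);

    arma::vec beta;
    lars.Regress(data, responses, beta);  // transposeData defaults to true

    beta.print("estimated coefficients");
    return 0;
  }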

For more details, see the following papers:

@article{efron2004least,
  title={Least angle regression},
  author={Efron, B. and Hastie, T. and Johnstone, I. and Tibshirani, R.},
  journal={The Annals of Statistics},
  volume={32},
  number={2},
  pages={407--499},
  year={2004},
  publisher={Institute of Mathematical Statistics}
}

@article{zou2005regularization,
  title={Regularization and variable selection via the elastic net},
  author={Zou, H. and Hastie, T.},
  journal={Journal of the Royal Statistical Society Series B},
  volume={67},
  number={2},
  pages={301--320},
  year={2005},
  publisher={Royal Statistical Society}
}


 

Definition at line 99 of file lars.hpp.

Constructor & Destructor Documentation

mlpack::regression::LARS::LARS (const bool useCholesky, const double lambda1 = 0.0, const double lambda2 = 0.0, const double tolerance = 1e-16)

Set the parameters to LARS. Both lambda1 and lambda2 default to 0.

Parameters:

useCholesky Whether or not to use Cholesky decomposition when solving the linear system (as opposed to using the full Gram matrix).
lambda1 Regularization parameter for l1-norm penalty.
lambda2 Regularization parameter for l2-norm penalty.
tolerance Run until the maximum correlation of elements in (X^T y) is less than this.
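For reference, the lambda1 and lambda2 settings of this constructor select the problem described in the detailed description above; a minimal sketch (the values are arbitrary):

  LARS lassoModel(true, 0.2);            // lambda1 > 0, lambda2 = 0: LASSO
  LARS elasticNetModel(true, 0.2, 0.4);  // lambda1 > 0, lambda2 > 0: elastic net
  LARS ridgeModel(true, 0.0, 0.4);       // lambda1 = 0, lambda2 > 0: ridge regression
  LARS olsModel(true);                   // lambda1 = 0, lambda2 = 0: unregularized linear regression

As noted above, the last two configurations (lambda1 = 0) are not an efficient way to solve those problems with this class.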

mlpack::regression::LARS::LARS (const bool useCholesky, const arma::mat &gramMatrix, const double lambda1 = 0.0, const double lambda2 = 0.0, const double tolerance = 1e-16)

Set the parameters to LARS, and pass in a precalculated Gram matrix. Both lambda1 and lambda2 default to 0.

Parameters:

useCholesky Whether or not to use Cholesky decomposition when solving the linear system (as opposed to using the full Gram matrix).
gramMatrix Gram matrix.
lambda1 Regularization parameter for l1-norm penalty.
lambda2 Regularization parameter for l2-norm penalty.
tolerance Run until the maximum correlation of elements in (X^T y) is less than this.
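If the Gram matrix is already available (for example, because it is reused across several regressions), it can be supplied to this constructor to avoid recomputation. The sketch below assumes the expected Gram matrix is X^T X of the row-major (one point per row) data, which for a column-major mlpack matrix is data * trans(data); verify this against your mlpack version. The matrix is held by reference, so it must outlive the LARS object.

  arma::mat data(10, 75, arma::fill::randn);   // column-major: 10 dimensions, 75 points
  arma::vec responses(75, arma::fill::randn);

  // Assumed orientation: Gram matrix of the row-major data, i.e. a 10 x 10
  // matrix equal to data * trans(data) for a column-major mlpack matrix.
  arma::mat gram = data * arma::trans(data);

  LARS lars(false /* useCholesky */, gram, 0.1 /* lambda1 */);
  arma::vec beta;
  lars.Regress(data, responses, beta);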

Member Function Documentation

void mlpack::regression::LARS::Activate (const size_t varInd) [private]

Add dimension varInd to active set.

Parameters:

varInd Dimension to add to active set.

const std::vector<size_t>& mlpack::regression::LARS::ActiveSet () const [inline]

Access the set of active dimensions.

Definition at line 155 of file lars.hpp.

References activeSet.

const std::vector<arma::vec>& mlpack::regression::LARS::BetaPath () const [inline]

Access the set of coefficients after each iteration; the solution is the last element.

Definition at line 159 of file lars.hpp.

References betaPath.

void mlpack::regression::LARS::CholeskyDelete (const size_t colToKill) [private]

void mlpack::regression::LARS::CholeskyInsert (const arma::vec &newX, const arma::mat &X) [private]

void mlpack::regression::LARS::CholeskyInsert (double sqNormNewX, const arma::vec &newGramCol) [private]

void mlpack::regression::LARS::ComputeYHatDirection (const arma::mat &matX, const arma::vec &betaDirection, arma::vec &yHatDirection) [private]

void mlpack::regression::LARS::Deactivate (const size_t activeVarInd) [private]

Remove activeVarInd'th element from active set.

Parameters:

activeVarInd Index of element to remove from active set.

void mlpack::regression::LARS::GivensRotate (const arma::vec::fixed< 2 > &x, arma::vec::fixed< 2 > &rotatedX, arma::mat &G) [private]

void mlpack::regression::LARS::Ignore (const size_t varInd) [private]

Add dimension varInd to the ignore set (it is never removed).

Parameters:

varInd Dimension to add to the ignore set.

void mlpack::regression::LARS::InterpolateBeta () [private]

const std::vector<double>& mlpack::regression::LARS::LambdaPath () const [inline]

Access the set of values for lambda1 after each iteration; the solution is the last element.

Definition at line 163 of file lars.hpp.

References lambdaPath.
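After Regress() has been called, the full regularization path can be inspected through BetaPath() and LambdaPath(); a short sketch, continuing the usage pattern above:

  #include <iostream>

  arma::mat data(10, 75, arma::fill::randn);
  arma::vec responses(75, arma::fill::randn);

  LARS lars(true, 0.1);
  arma::vec beta;
  lars.Regress(data, responses, beta);

  // One coefficient vector and one lambda1 value per iteration; the last
  // entries correspond to the solution returned in beta.
  const std::vector<arma::vec>& betas = lars.BetaPath();
  const std::vector<double>& lambdas = lars.LambdaPath();
  for (size_t i = 0; i < betas.size(); ++i)
  {
    std::cout << "lambda1 = " << lambdas[i] << ", nonzero coefficients = "
              << arma::accu(betas[i] != 0) << std::endl;
  }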

const arma::mat& mlpack::regression::LARS::MatUtriCholFactor () const [inline]

Access the upper triangular Cholesky factor.

Definition at line 166 of file lars.hpp.

References matUtriCholFactor.

void mlpack::regression::LARS::Regress (const arma::mat &data, const arma::vec &responses, arma::vec &beta, const bool transposeData = true)

Run LARS. The input matrix (like all MLPACK matrices) should be column-major -- each column is an observation and each row is a dimension. However, because LARS is more efficient on a row-major matrix, this method will (internally) transpose the matrix. If this transposition is not necessary (i.e., you want to pass in a row-major matrix), pass 'false' for the transposeData parameter.

Parameters:

data Column-major input data (or row-major input data if transposeData = false).
responses A vector of targets.
beta Vector to store the solution (the coefficients) in.
transposeData Set to false if the data is already row-major.
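When the data are already stored row-major (one point per row), the internal transposition can be skipped by passing false; a minimal sketch:

  // 75 points in 10 dimensions, already row-major (one point per row).
  arma::mat rowMajorData(75, 10, arma::fill::randn);
  arma::vec responses(75, arma::fill::randn);

  LARS lars(true /* useCholesky */, 0.1 /* lambda1 */);
  arma::vec beta;
  // Pass false so the matrix is used as-is rather than transposed internally.
  lars.Regress(rowMajorData, responses, beta, false);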

std::string mlpack::regression::LARS::ToString () const

Member Data Documentation

std::vector<size_t> mlpack::regression::LARS::activeSet [private]

Active set of dimensions.

Definition at line 204 of file lars.hpp.

Referenced by ActiveSet().

std::vector<arma::vec> mlpack::regression::LARS::betaPath [private]

Solution path.

Definition at line 198 of file lars.hpp.

Referenced by BetaPath().

bool mlpack::regression::LARS::elasticNet [private]

True if this is the elastic net problem.

Definition at line 190 of file lars.hpp.

std::vector<size_t> mlpack::regression::LARS::ignoreSet [private]

Set of ignored variables (for dimensions in span{active set dimensions}).

Definition at line 212 of file lars.hpp.

std::vector<bool> mlpack::regression::LARS::isActive [private]

Active set membership indicator (for each dimension).

Definition at line 207 of file lars.hpp.

std::vector<bool> mlpack::regression::LARS::isIgnored [private]

Membership indicator for set of ignored variables.

Definition at line 215 of file lars.hpp.

double mlpack::regression::LARS::lambda1 [private]

Regularization parameter for l1 penalty.

Definition at line 187 of file lars.hpp.

double mlpack::regression::LARS::lambda2 [private]

Regularization parameter for l2 penalty.

Definition at line 192 of file lars.hpp.

std::vector<double> mlpack::regression::LARS::lambdaPath [private]

Value of lambda_1 for each solution in solution path.

Definition at line 201 of file lars.hpp.

Referenced by LambdaPath().

bool mlpack::regression::LARS::lasso [private]

True if this is the LASSO problem.

Definition at line 185 of file lars.hpp.

const arma::mat& mlpack::regression::LARS::matGram [private]

Reference to the Gram matrix we will use.

Definition at line 176 of file lars.hpp.

arma::mat mlpack::regression::LARS::matGramInternal [private]

Gram matrix.

Definition at line 173 of file lars.hpp.

arma::mat mlpack::regression::LARS::matUtriCholFactor [private]

Upper triangular Cholesky factor; initially a 0x0 matrix.

Definition at line 179 of file lars.hpp.

Referenced by MatUtriCholFactor().

double mlpack::regression::LARS::tolerance [private]

Tolerance for main loop.

Definition at line 195 of file lars.hpp.

bool mlpack::regression::LARS::useCholesky [private]

Whether or not to use Cholesky decomposition when solving the linear system.

Definition at line 182 of file lars.hpp.

Author

Generated automatically by Doxygen for MLPACK from the source code.