Using Stagewise (L1) as a Feature Selection Tool to Enhance Kernelized Ridge Regression (L2) Outputs for Regression Problems

  • Sara Sajwani

Student thesis: Master's Thesis

Abstract

In machine learning, the aim is to fit a model to training data. The model should be accurate and should generalize to other data sets of the same category. Such models are then used for classification or prediction. The complexity of the model depends on the data, which may be high-dimensional (many variables) or low-dimensional (few variables). Three regularization methods are prevalent: LASSO (L1), Ridge/Tikhonov (L2), and elastic net. The choice among them depends on the purpose and the nature of the data. This thesis covers the L1 and L2 regularization methods, primarily as tools for feature selection in high-dimensional data. The data is first tuned using L1 to remove redundant variables, and the best subset is then used to run the kernelized L2 algorithm. The feature selection strength is compared against forward selection (FWS). The R² value and the mean squared error (MSE) are used to determine the accuracy of the algorithm. Experiments show that this combined model is as accurate as a full L2 model while using fewer predictor variables.
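The two-stage procedure described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes incremental forward stagewise regression as the L1-style selector (as the title suggests), an RBF kernel for the ridge stage, and synthetic data in which only three of twenty features matter. The function names, step size, iteration count, kernel width and penalty `lam` are hypothetical choices made for illustration only.

```python
import numpy as np

def forward_stagewise(X, y, step=0.01, n_iter=400):
    """Incremental forward stagewise regression (assumes standardized columns)."""
    n_samples, n_features = X.shape
    beta = np.zeros(n_features)
    residual = y - y.mean()
    for _ in range(n_iter):
        # Pick the feature most correlated with the current residual.
        corr = X.T @ residual / n_samples
        j = int(np.argmax(np.abs(corr)))
        delta = step * np.sign(corr[j])
        beta[j] += delta                      # nudge its coefficient by a small step
        residual = residual - delta * X[:, j]
    return beta

def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def kernel_ridge_predict(X_train, y_train, X_test, lam=1.0):
    """Kernelized ridge (L2) regression: solve (K + lam * I) alpha = y, then predict."""
    gamma = 1.0 / X_train.shape[1]            # assumed kernel width heuristic
    y_mean = y_train.mean()
    K = rbf_kernel(X_train, X_train, gamma)
    alpha = np.linalg.solve(K + lam * np.eye(len(X_train)), y_train - y_mean)
    return rbf_kernel(X_test, X_train, gamma) @ alpha + y_mean

def r2_and_mse(y_true, y_pred):
    """Return the coefficient of determination and the mean squared error."""
    mse = float(np.mean((y_true - y_pred) ** 2))
    return 1.0 - mse / float(np.var(y_true)), mse

# Synthetic data (an assumption): 20 features, only the first three are informative.
rng = np.random.default_rng(0)
X = rng.standard_normal((600, 20))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 1.5 * X[:, 2] + 0.5 * rng.standard_normal(600)

X_train, X_test, y_train, y_test = X[:450], X[450:], y[:450], y[450:]
mean, std = X_train.mean(axis=0), X_train.std(axis=0)
X_train, X_test = (X_train - mean) / std, (X_test - mean) / std

# Stage 1: stagewise selection keeps the features with nonzero coefficients.
beta = forward_stagewise(X_train, y_train)
selected = np.flatnonzero(beta)

# Stage 2: kernelized ridge on the selected subset, compared with all features.
r2_sel, mse_sel = r2_and_mse(
    y_test, kernel_ridge_predict(X_train[:, selected], y_train, X_test[:, selected]))
r2_full, mse_full = r2_and_mse(
    y_test, kernel_ridge_predict(X_train, y_train, X_test))

print("selected features:", selected.tolist())
print("subset model: R2 = %.3f, MSE = %.3f" % (r2_sel, mse_sel))
print("full model:   R2 = %.3f, MSE = %.3f" % (r2_full, mse_full))
```

Stopping the stagewise loop early is what makes it act as a selector here: features that never receive a step keep a zero coefficient and are dropped before the kernelized ridge stage. In practice the step size, the number of iterations and the ridge penalty would be tuned, for example by cross-validation.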
Date of Award: Dec 2019
Original language: American English

Keywords

  • Regularization
  • Regression
  • Feature Selection
  • Forward Stagewise
  • Kernels
