A new ensemble modeling method for multivariate calibration of near infrared spectroscopy
writer:Kaiyi Wang, Xihui Bian*, Xiaoyao Tan, Haitao Wang, Yankun Li
keywords:Ensemble, Monte Carlo resampling, Least absolute shrinkage and selection operator, Near infrared spectroscopy, Partial least squares
source:Journal
specific source:Analytical Methods, 2021, 13 (11): 1374-1380
Issue time:2021
Ensemble modeling has gained increasing attention for improving the performance of quantitative models in near infrared (NIR) spectroscopy analysis. Based on Monte Carlo (MC) resampling, the least absolute shrinkage and selection operator (LASSO) and partial least squares (PLS), a new ensemble strategy named MC-LASSO-PLS is proposed for NIR spectral multivariate calibration. In this method, the training subsets for building the sub-models are generated by sampling both samples and variables, which ensures the diversity of the sub-models. In detail, a certain number of samples are randomly selected from the training set to form a sample subset. LASSO is then used to shrink the variables of the sample subset, yielding the training subset on which a PLS sub-model is built. This process is repeated N times to obtain N sub-models. Finally, the predictions of the sub-models are combined by simple averaging to produce the final prediction. The prediction ability of the proposed method was compared with that of LASSO-PLS, MC-PLS and PLS models on NIR spectra of corn, blend oil and orange juice samples. The results demonstrate the superiority of MC-LASSO-PLS in prediction ability.