Abstract
Data heterogeneity has become a challenging problem in modern data analysis. Classic statistical modeling methods, which assume the data are independent and identically distributed, often perform unsatisfactorily on heterogeneous data. This work is motivated by a multivariate calibration problem from a soil characterization study in which the samples were collected from five different locations. Newly proposed and existing signal regression models are applied to this calibration problem, with the models adapted to handle the spatially clustered structure of the data. Compared with a variety of other methods, e.g., kernel ridge regression, random forests, and partial least squares, our newly proposed varying-coefficient signal regression model is highly competitive and often outperforms the other methods in terms of external prediction error.
Original language | English |
---|---|
Article number | 104386 |
Journal | Chemometrics and Intelligent Laboratory Systems |
Volume | 217 |
State | Published - Oct 15 2021 |
Scopus Subject Areas
- Analytical Chemistry
- Software
- Computer Science Applications
- Process Chemistry and Technology
- Spectroscopy
Keywords
- Multivariate calibration
- P-splines
- Signal regression
- Varying-coefficient model