835 research outputs found

    PCA model building with missing data: New proposals and a comparative study

    Full text link
    [EN] This paper introduces new methods for building principal component analysis (PCA) models with missing data: projection to the model plane (PMP), known data regression (KDR), KDR with principal component regression (PCR), KDR with partial least squares regression (PLS) and trimmed scores regression (TSR). These methods are adapted from their PCA model exploitation versions to deal with the more general problem of PCA model building when the training set has missing values. A comparative study is carried out comparing these new methods with the standard ones, such as the modified nonlinear iterative partial least squares (NIPALS) algorithm, the iterative algorithm (IA), the data augmentation method (DA) and the nonlinear programming approach (NLP). The performance is assessed using the mean squared prediction error of the reconstructed matrix and the cosines between the actual principal components and the ones extracted by each method. Four data sets, two simulated and two real ones, with several percentages of missing data, are used to perform the comparison. Research in this study was partially supported by the Spanish Ministry of Science and Innovation and FEDER funds from the European Union through grant DPI2011-28112-C04-02, and the Spanish Ministry of Economy and Competitiveness through grant ECO2013-43353-R. The authors gratefully acknowledge Salvador Garcia-Munoz for providing the Phi toolbox (version 1.7) to perform the nonlinear programming approach (NLP) method. Folch-Fortuny, A.; Arteaga Moreno, FJ.; Ferrer Riquelme, AJ. (2015). PCA model building with missing data: New proposals and a comparative study. Chemometrics and Intelligent Laboratory Systems. 146:77-88. https://doi.org/10.1016/j.chemolab.2015.05.006
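The iterative algorithm (IA) mentioned in the abstract is the simplest of these model-building schemes and can be sketched in a few lines of NumPy: fill the missing cells with an initial guess, fit a PCA model by truncated SVD, replace the missing cells with the model reconstruction, and repeat until the imputed values stop changing. This is a minimal illustration of the idea, not the paper's implementation; the function name, initialization and convergence test are assumptions.

```python
import numpy as np

def pca_iterative_imputation(X, n_components=2, n_iter=100, tol=1e-6):
    """Iteratively impute missing values (NaN) in X using a PCA reconstruction."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Initialize missing entries with the column means of the observed values
    col_means = np.nanmean(X, axis=0)
    X_filled = np.where(missing, col_means, X)
    for _ in range(n_iter):
        mean = X_filled.mean(axis=0)
        Xc = X_filled - mean
        # Truncated SVD of the centered matrix gives the current PCA model
        U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
        X_hat = (U[:, :n_components] * s[:n_components]) @ Vt[:n_components] + mean
        # Update only the missing cells with the model reconstruction
        change = np.max(np.abs(X_filled[missing] - X_hat[missing])) if missing.any() else 0.0
        X_filled[missing] = X_hat[missing]
        if change < tol:
            break
    return X_filled
```

On exactly low-rank data this converges to a fixed point where the imputed cells are consistent with the fitted PCA model; the observed cells are never modified.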

    Robust PCA as Bilinear Decomposition with Outlier-Sparsity Regularization

    Full text link
    Principal component analysis (PCA) is widely used for dimensionality reduction, with well-documented merits in various applications involving high-dimensional data, including computer vision, preference measurement, and bioinformatics. In this context, the fresh look advocated here permeates benefits from variable selection and compressive sampling, to robustify PCA against outliers. A least-trimmed squares estimator of a low-rank bilinear factor analysis model is shown closely related to that obtained from an ℓ0-(pseudo)norm-regularized criterion encouraging sparsity in a matrix explicitly modeling the outliers. This connection suggests robust PCA schemes based on convex relaxation, which lead naturally to a family of robust estimators encompassing Huber's optimal M-class as a special case. Outliers are identified by tuning a regularization parameter, which amounts to controlling sparsity of the outlier matrix along the whole robustification path of (group) least-absolute shrinkage and selection operator (Lasso) solutions. Beyond its neat ties to robust statistics, the developed outlier-aware PCA framework is versatile to accommodate novel and scalable algorithms to: i) track the low-rank signal subspace robustly, as new data are acquired in real time; and ii) determine principal components robustly in (possibly) infinite-dimensional feature spaces. Synthetic and real data tests corroborate the effectiveness of the proposed robust PCA schemes, when used to identify aberrant responses in personality assessment surveys, as well as unveil communities in social networks, and intruders from video surveillance data. Comment: 30 pages, submitted to IEEE Transactions on Signal Processing

    Estimation of Incident Photosynthetically Active Radiation From Moderate Resolution Imaging Spectrometer Data

    Get PDF
    Incident photosynthetically active radiation (PAR) is a key variable needed by almost all terrestrial ecosystem models. Unfortunately, the spatial and temporal resolutions of the current incident PAR products estimated from remotely sensed data are not sufficient for carbon cycle modeling and various applications. In this study, the authors develop a new method based on the look-up table approach for estimating instantaneous incident PAR from the polar-orbiting Moderate Resolution Imaging Spectrometer (MODIS) data. Since the top-of-atmosphere (TOA) radiance depends on both surface reflectance and atmospheric properties that largely determine the incident PAR, our first step is to estimate surface reflectance. The approach assumes known aerosol properties for the observations with minimum blue reflectance from a temporal window of each pixel. Their inverted surface reflectance is then interpolated to determine the surface reflectance of other observations. The second step is to calculate PAR by matching the computed TOA reflectance from the look-up table with the TOA values of the satellite observations. Both the direct and diffuse PAR components, as well as the total shortwave radiation, are determined in exactly the same fashion. The calculation of a daily average PAR value from one or two instantaneous PAR values is also explored. Ground measurements from seven FLUXNET sites are used for validating the algorithm. The results indicate that this approach can produce a reasonable PAR product at 1 km resolution and is suitable for global applications, although more quantitative validation activities are still needed.
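The matching step in the second stage can be illustrated with a toy look-up table: TOA reflectance is precomputed for a grid of candidate atmospheric states, each associated with a known incident PAR, and the observed TOA value is interpolated onto that grid. The table values and the monotonic TOA-to-PAR relationship below are made up purely for illustration; a real LUT is built with a radiative transfer code over many atmospheric and geometric variables.

```python
import numpy as np

def par_from_lut(toa_obs, lut_toa, lut_par):
    """Interpolate an observed TOA reflectance onto a precomputed table.

    lut_toa : TOA reflectances simulated for a grid of atmospheric states,
              sorted in ascending order.
    lut_par : incident PAR (W/m^2) associated with each table entry.
    """
    return float(np.interp(toa_obs, lut_toa, lut_par))

# Toy table: hazier atmospheres -> brighter TOA, lower surface PAR
lut_toa = np.array([0.10, 0.20, 0.30])
lut_par = np.array([400.0, 300.0, 200.0])
```

For example, an observed TOA reflectance of 0.25 falls midway between the last two nodes and interpolates to a PAR of 250 W/m^2.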

    Airborne gravity and precise positioning for geologic applications

    Get PDF
    Airborne gravimetry has become an important geophysical tool primarily because of advancements in methodology and instrumentation made in the past decade. Airborne gravity is especially useful when measured in conjunction with other geophysical data, such as magnetics, radar, and laser altimetry. The aerogeophysical survey over the West Antarctic ice sheet described in this paper is one such interdisciplinary study. This paper outlines in detail the instrumentation, survey and data processing methodology employed to perform airborne gravimetry from the multi-instrumented Twin Otter aircraft. Precise positioning from carrier-phase Global Positioning System (GPS) observations is combined with measurements of acceleration made by the gravity meter in the aircraft to obtain the free-air gravity anomaly measurement at aircraft altitude. GPS data are processed using the Kinematic and Rapid Static (KARS) software program, and aircraft vertical acceleration and corrections for gravity data reduction are calculated from the GPS position solution. Accuracies for the free-air anomaly are determined from crossover analysis after significant editing (2.98 mGal rms) and from a repeat track (1.39 mGal rms). The aerogeophysical survey covered a 300,000 km2 region in West Antarctica over the course of five field seasons. The gravity data from the West Antarctic survey reveal the major geologic structures of the West Antarctic rift system, including the Whitmore Mountains, the Byrd Subglacial Basin, the Sinuous Ridge, the Ross Embayment, and Siple Dome. These measurements, in conjunction with magnetics and ice-penetrating radar, provide the information required to reveal the tectonic fabric and history of this important region.
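The core gravity reduction described here can be sketched as a short correction chain: remove the aircraft's vertical acceleration from the meter reading, apply the Eotvos correction for motion over the rotating Earth, and subtract normal gravity continued to flight altitude. The sign conventions and the standard free-air gradient of 0.3086 mGal/m are textbook values; the function below is an illustration of the reduction, not the processing code used in the survey.

```python
def free_air_anomaly(g_meter, a_vert, eotvos, gamma0, h, fa_gradient=0.3086):
    """Free-air gravity anomaly (mGal) at aircraft altitude.

    g_meter : gravity meter reading (mGal)
    a_vert  : aircraft vertical acceleration from GPS positions (mGal)
    eotvos  : Eotvos correction for platform motion (mGal)
    gamma0  : normal (theoretical) gravity on the ellipsoid (mGal)
    h       : aircraft ellipsoidal height (m)
    """
    # Subtract platform acceleration, apply the Eotvos correction, then
    # remove normal gravity continued upward with the free-air gradient.
    return g_meter - a_vert + eotvos - (gamma0 - fa_gradient * h)
```

The accuracy figures in the abstract (2.98 mGal crossover rms, 1.39 mGal repeat-track rms) reflect how well this chain, plus filtering and editing, removes the aircraft dynamics, which are orders of magnitude larger than the geologic signal.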

    High-Dimensional Linear and Functional Analysis of Multivariate Grapevine Data

    Get PDF
    Variable selection plays a major role in multivariate high-dimensional statistical modeling. Hence, we need to select a consistent model, which avoids overfitting in prediction, enhances model interpretability and identifies relevant variables. We explore various continuous, nearly unbiased, sparse and accurate techniques for linear models based on coefficient paths, such as penalized maximum likelihood, nonconvex penalties, and iterative Sure Independence Screening (SIS). The convex penalized (pseudo-) likelihood approach based on the elastic net uses a mixture of the ℓ1 (Lasso) and ℓ2 (ridge regression) penalties to simultaneously achieve automatic variable selection, continuous shrinkage, and selection of groups of correlated variables. Variable selection using coefficient paths for the minimax concave penalty (MCP) starts applying penalization at the same rate as Lasso, and then smoothly relaxes the rate down to zero as the absolute value of the coefficient increases. The sure screening method is based on correlation learning, which computes component-wise estimators using AIC for tuning the regularization parameter of the penalized likelihood Lasso. To reflect the functional nature of spectral data, we use the Functional Data approach by approximating the finite linear combination of basis functions using B-splines. MCP, SIS and Functional regression are based on the intuition that the predictors are independent. However, the high-dimensional grapevine dataset suffers from ill-conditioning of the covariance matrix due to multicollinearity. Under collinearity, the Elastic-Net Regularization path via Coordinate Descent yields the best result to control the sparsity of the model, with cross-validation to reduce bias in variable selection. Iterative stepwise multiple linear regression reduces complexity and enhances the predictability of the model by selecting only significant predictors.
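The elastic-net coordinate descent highlighted above can be sketched directly: each pass cycles over the coefficients, soft-thresholding the per-feature correlation with the partial residual (the ℓ1 part) and shrinking by the ridge term (the ℓ2 part). This is a bare-bones illustration assuming standardized predictors and a fixed iteration count, with no cross-validation; it is not the glmnet-style solver one would use in practice.

```python
import numpy as np

def elastic_net_cd(X, y, lam=0.1, alpha=0.5, n_iter=200):
    """Coordinate descent for
        (1/2n)||y - Xb||^2 + lam * (alpha*||b||_1 + (1-alpha)/2 * ||b||_2^2).
    Assumes the columns of X are standardized."""
    n, p = X.shape
    b = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual with feature j's current contribution removed
            r = y - X @ b + X[:, j] * b[j]
            rho = X[:, j] @ r / n
            denom = (X[:, j] @ X[:, j]) / n + lam * (1.0 - alpha)
            # Soft-threshold (l1) then shrink (l2)
            b[j] = np.sign(rho) * max(abs(rho) - lam * alpha, 0.0) / denom
    return b
```

With `alpha=1` this reduces to the Lasso update and with `alpha=0` to ridge regression, which is exactly the mixture behavior the abstract relies on under multicollinearity.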

    Vol. 15, No. 1 (Full Issue)

    Get PDF

    Robust and Regularized Algorithms for Vehicle Tractive Force Prediction and Mass Estimation

    Get PDF
    This work provides novel robust and regularized algorithms for parameter estimation with applications in vehicle tractive force prediction and mass estimation. Given a large record of real-world data from test runs on public roads, recursive algorithms adjusted the unknown vehicle parameters under a broad variation of statistical assumptions for two linear gray-box models.
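A standard recursive least-squares (RLS) update is the workhorse behind this kind of online parameter adjustment. The sketch below fits a two-parameter linear gray-box model F = m·a + c (vehicle mass and a lumped resistance offset) from streaming samples; it is an assumption about the general approach, not the paper's specific robust or regularized algorithms.

```python
import numpy as np

class RecursiveLeastSquares:
    """Plain RLS with exponential forgetting. theta holds the unknown
    parameters, e.g. [mass, lumped resistance] in F = m*a + c."""

    def __init__(self, n_params, forgetting=0.99, p0=1e3):
        self.theta = np.zeros(n_params)       # parameter estimate
        self.P = np.eye(n_params) * p0        # inverse information matrix
        self.lam = forgetting                 # forgetting factor in (0, 1]

    def update(self, phi, y):
        """One step with regressor phi (e.g. [a, 1]) and measurement y (e.g. F)."""
        phi = np.asarray(phi, dtype=float)
        Pphi = self.P @ phi
        k = Pphi / (self.lam + phi @ Pphi)    # gain vector
        self.theta = self.theta + k * (y - phi @ self.theta)
        self.P = (self.P - np.outer(k, Pphi)) / self.lam
        return self.theta
```

The forgetting factor lets the estimate track slowly varying parameters (e.g. mass changes at a stop), at the cost of higher variance under noise; robust variants typically replace the residual term with a bounded influence function.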