On the Procrustean analogue of individual differences scaling (INDSCAL)
In this paper, individual differences scaling (INDSCAL) is revisited, considering INDSCAL as embedded within a hierarchy of individual difference scaling models. We explore the members of this family, distinguishing (i) models, (ii) the role of identification and substantive constraints, (iii) criteria for fitting models and (iv) algorithms to optimise the criteria. Model formulations may be based either on data in the form of proximities or on configurational matrices. In its configurational version, individual difference scaling may be formulated as a form of generalized Procrustes analysis. Algorithms are introduced for fitting the new models. An application from sensory evaluation illustrates the performance of the methods and their solutions.
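The generalized Procrustes connection rests on the classical orthogonal Procrustes problem: finding the rotation that best matches one configuration to another. A minimal NumPy sketch of that single-pair building block (an illustration only, not the paper's fitting algorithms):

```python
import numpy as np

def procrustes_rotation(A, B):
    """Solve the orthogonal Procrustes problem: find the orthogonal Q
    minimizing ||A Q - B||_F, via the SVD of A^T B."""
    U, _, Vt = np.linalg.svd(A.T @ B)
    return U @ Vt

# Two configurations of 5 points in 2D; B is a rotated copy of A.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 2))
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
B = A @ R
Q = procrustes_rotation(A, B)
print(np.allclose(A @ Q, B))  # True: the rotation is recovered exactly
```

Generalized Procrustes analysis iterates such rotations over many configurations toward a common consensus; the SVD step above is the core subproblem solved at each pass.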
Exploratory factor analysis of large data matrices
Nowadays, the most interesting applications have data with many more variables than observations and require dimension reduction. With such data, standard exploratory factor analysis (EFA) cannot be applied. Recently, a generalized EFA (GEFA) model was proposed to deal with any type of data: both vertical data (fewer variables than observations) and horizontal data (more variables than observations). The associated algorithm, GEFALS, is very efficient, but still cannot handle data with thousands of variables. The present work modifies GEFALS and proposes a new, very fast version, GEFAN. This is achieved by aligning the dimensions of the parameter matrices to their ranks, thus avoiding redundant calculations. The GEFALS and GEFAN algorithms are compared numerically on well-known data.
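The rank-alignment idea can be illustrated in plain NumPy: for horizontal data, working with the small n × n Gram matrix instead of the huge p × p covariance keeps every computation sized to the rank of the data. A hedged sketch of that trick (an illustration of the principle, not the GEFAN algorithm itself):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 50, 5000                 # far more variables than observations
X = rng.standard_normal((n, p))
X -= X.mean(axis=0)             # column-center the data

# The p x p covariance has rank at most n - 1, so classic EFA breaks
# down; the n x n Gram matrix carries the same nonzero spectrum while
# keeping every matrix sized to the rank of the data.
G = X @ X.T                     # n x n instead of p x p
evals, evecs = np.linalg.eigh(G)
order = np.argsort(evals)[::-1]
evals, evecs = evals[order], evecs[:, order]
k = 3
scores = evecs[:, :k] * np.sqrt(np.maximum(evals[:k], 0.0))
print(scores.shape)  # (50, 3): a low-dimensional representation
```

The eigendecomposition here costs O(n^3) rather than O(p^3), which is the kind of saving that rank-aligned parameter matrices deliver.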
Some inequalities contrasting principal component and factor analyses solutions
Principal component analysis (PCA) and factor analysis (FA) are two time-honored dimension reduction methods. In this paper, some inequalities are presented to contrast PCA and FA solutions for the same data set. For this purpose, we take advantage of the recently established matrix decomposition (MD) formulation of FA. In summary, the resulting inequalities show that (a) FA gives a better fit to the data than PCA, (b) PCA extracts a larger amount of common “information” than FA, and (c) for each variable, its unique variance in FA is larger than its residual variance in PCA minus the one in FA. The resulting inequalities can be useful in suggesting whether PCA or FA should be used for a particular data set. The answers can also be valid for the classic FA formulation not relying on the MD-FA definition, as both “types” of FA provide almost equal solutions. Additionally, the inequalities give a theoretical explanation of some empirically observed tendencies in PCA and FA solutions, e.g., that the absolute values of PCA loadings tend to be larger than those of FA loadings, and that the unique variances in FA tend to be larger than the residual variances of PCA.
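The second inequality, that PCA extracts a larger amount of common variance than FA, can be checked numerically. The sketch below uses iterated principal-axis factoring as a simple stand-in FA estimator (not the paper's MD-FA formulation) on an exactly one-factor correlation matrix:

```python
import numpy as np

def pca_loadings(R, k):
    """Top-k principal component loadings of a correlation matrix R."""
    evals, evecs = np.linalg.eigh(R)
    idx = np.argsort(evals)[::-1][:k]
    return evecs[:, idx] * np.sqrt(evals[idx])

def paf_loadings(R, k, n_iter=200):
    """Iterated principal-axis factoring: a simple FA estimator that
    replaces the diagonal of R by communality estimates (an
    illustrative stand-in, not the MD-FA method of the paper)."""
    h2 = 1 - 1 / np.diag(np.linalg.inv(R))   # initial communalities (SMC)
    for _ in range(n_iter):
        Rr = R.copy()
        np.fill_diagonal(Rr, h2)
        L = pca_loadings(Rr, k)
        h2 = np.clip((L ** 2).sum(axis=1), 0.0, 1.0)
    return L

# one-factor correlation structure: R = lam lam' off the diagonal
rng = np.random.default_rng(2)
lam = rng.uniform(0.5, 0.9, size=(6, 1))
R = lam @ lam.T
np.fill_diagonal(R, 1.0)

Lp = pca_loadings(R, 1)
Lf = paf_loadings(R, 1)
# total common variance extracted: PCA >= FA, since FA works with a
# reduced-diagonal matrix whose top eigenvalue cannot exceed that of R
print((Lp ** 2).sum(), (Lf ** 2).sum())
```

The direction of the comparison follows from Weyl's inequality: the reduced matrix differs from R by a nonnegative diagonal, so its leading eigenvalue is no larger.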
Sparse exploratory factor analysis
Sparse principal component analysis has been a very active research area over the last decade. At the same time, there are very few works on sparse factor analysis. We propose a new contribution to the area by exploring a procedure for sparse factor analysis in which the unknown parameters are found simultaneously.
An improved estimation procedure for robust PARAFAC model fitting
Different techniques exist to analyze multi-way data, but PARAFAC is one of the most popular. The usual way of parameter estimation in PARAFAC is an alternating least squares (ALS) procedure, which yields least-squares solutions and provides consistent outcomes. Alongside these desirable features, the ALS procedure suffers from several major flaws which can be particularly problematic for large-scale problems: slow convergence and sensitivity to degeneracy conditions such as over-factoring, collinearity, bad initialization and local minima. Furthermore, it is well known that algorithms relying on least squares break down easily in the presence of outliers. The issue of non-robustness of the ALS procedure was addressed by [1], and software for computing the robust PARAFAC model is available in the R package rrcov3way (see [4]). The other issues were addressed in a number of works proposing algorithms more efficient than ALS. However, these often do not provide stable results, because the increased speed may come at the expense of accuracy. An integrated algorithm which seems to combine improved speed and stability was proposed by [2] and [3]. The purpose of this work is to elaborate on this algorithm by adding capabilities for dealing with outliers which might be present in the data. The idea of a robust version of ALS is to identify enough "good" observations and to perform the classical ALS on these observations only. This is repeated until no significant change is observed. Finally, a reweighting step is carried out to improve the efficiency of the estimators. To identify the "good" observations, a robust version of principal component analysis on the unfolded array is used. Clearly, the robust procedure will be much more time consuming than the classical one, since it repeats the ALS optimization many times.
Therefore, any improvement in the performance of the parameter estimation procedure will contribute to the performance of the complete robust procedure. We combine the robust PARAFAC procedure proposed by [1] with the highly efficient estimation algorithm INT2 proposed by [2] in order to obtain a fast, outlier-robust PARAFAC modeling technique, which we call R-INT2. As before, it starts with robust principal component analysis to identify any outlying points and then iterates using the INT2 algorithm until no significant change is observed. After convergence, a reweighting step with INT2 is conducted, which produces the final solution. The performance of the newly proposed algorithm R-INT2 for robust estimation of trilinear PARAFAC models is demonstrated in a brief simulation study comparing classical ALS, the robust version based on ALS ([1]) and the new R-INT2. First of all, we want to verify that R-INT2 works well on data sets with and without contamination, identifying the outliers at least as well as the robust ALS algorithm while retrieving solutions of good statistical quality. At the same time, we want to verify that convergence is improved significantly and thus that the computational time is reduced. Future work should bring a thorough investigation of the properties of the algorithm, comparison with the many existing fast alternatives, and a study of possible combinations with other computational algorithms.
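For reference, the classical ALS procedure that the robust variants build on can be sketched compactly. The following is an illustrative NumPy implementation of plain PARAFAC-ALS for the trilinear model (not the robust ALS of [1] nor the INT2/R-INT2 algorithms):

```python
import numpy as np

def khatri_rao(B, C):
    """Column-wise Kronecker product of B (J x R) and C (K x R)."""
    J, R = B.shape
    K = C.shape[0]
    return (B[:, None, :] * C[None, :, :]).reshape(J * K, R)

def parafac_als(X, R, n_iter=100):
    """Plain ALS for the trilinear PARAFAC model of a 3-way array X,
    X[i,j,k] ~ sum_r A[i,r] B[j,r] C[k,r].  Each step solves a linear
    least-squares problem for one factor matrix with the others fixed."""
    I, J, K = X.shape
    rng = np.random.default_rng(0)
    A = rng.standard_normal((I, R))
    B = rng.standard_normal((J, R))
    C = rng.standard_normal((K, R))
    X1 = X.reshape(I, J * K)                     # mode-1 unfolding
    X2 = np.moveaxis(X, 1, 0).reshape(J, I * K)  # mode-2 unfolding
    X3 = np.moveaxis(X, 2, 0).reshape(K, I * J)  # mode-3 unfolding
    for _ in range(n_iter):
        A = X1 @ np.linalg.pinv(khatri_rao(B, C)).T
        B = X2 @ np.linalg.pinv(khatri_rao(A, C)).T
        C = X3 @ np.linalg.pinv(khatri_rao(A, B)).T
    return A, B, C
```

The robust scheme described above wraps exactly this kind of inner loop: it runs the fit on the currently trusted observations, re-flags outliers, and repeats, which is why a faster inner solver such as INT2 pays off so directly.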
Simple Structure Detection Through Bayesian Exploratory Multidimensional IRT Models
In modern validity theory, a major concern is the construct validity of a test, which is commonly assessed through confirmatory or exploratory factor analysis. In the framework of Bayesian exploratory Multidimensional Item Response Theory (MIRT) models, we discuss two methods aimed at investigating the underlying structure of a test, in order to verify whether the latent model adheres to a chosen simple factorial structure. This purpose is achieved without imposing hard constraints on the discrimination parameter matrix to address the rotational indeterminacy. The first approach prescribes a two-step procedure: the parameter estimates are obtained through an unconstrained MCMC sampler, and the simple structure is then inspected in a post-processing step based on the Consensus Simple Target Rotation technique. In the second approach, both rotational invariance and simple structure retrieval are addressed within the MCMC sampling scheme, by introducing a sparsity-inducing prior on the discrimination parameters. Through simulation as well as real-world studies, we demonstrate that the proposed methods are able to correctly infer the underlying sparse structure and to retrieve interpretable solutions.
Sparse Exploratory Factor Analysis
Sparse principal component analysis has been a very active research area in the last decade. It produces component loadings with many zero entries, which facilitates their interpretation and helps avoid redundant variables. Classic factor analysis is another popular dimension reduction technique which shares similar interpretation problems and could greatly benefit from sparse solutions. Unfortunately, there are very few works considering sparse versions of classic factor analysis. Our goal is to contribute further in this direction. We revisit the most popular procedures for exploratory factor analysis, maximum likelihood and least squares. Sparse factor loadings are obtained for them by, first, adopting a special reparameterization and, second, introducing additional ℓ₁-norm penalties into the standard factor analysis problems. As a result, we propose sparse versions of the major factor analysis procedures. We illustrate the developed algorithms on well-known psychometric problems. Our sparse solutions are critically compared to those obtained by other existing methods.
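The mechanism by which a sparsity-inducing ℓ₁-type penalty produces zero loadings is soft-thresholding, the penalty's proximal operator. A small illustration on a hypothetical loading matrix (the mechanism only, not the paper's reparameterized ML or LS procedures):

```python
import numpy as np

def soft_threshold(L, lam):
    """Proximal operator of the l1-norm penalty: entries smaller in
    magnitude than lam become exactly zero; larger ones shrink by lam."""
    return np.sign(L) * np.maximum(np.abs(L) - lam, 0.0)

# a hypothetical 4-variable, 2-factor loading matrix (made-up numbers)
L = np.array([[ 0.90,  0.05],
              [ 0.80, -0.10],
              [ 0.07,  0.85],
              [-0.02,  0.75]])
print(soft_threshold(L, 0.15))
# small cross-loadings vanish exactly; dominant loadings shrink slightly
```

Exact zeros, rather than merely small values, are what make the resulting loadings directly interpretable without post-hoc rounding.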
Sparsest factor analysis for clustering variables: a matrix decomposition approach
We propose a new procedure for sparse factor analysis (FA) in which each variable loads on only one common factor. Thus, the loading matrix has a single nonzero element in each row and zeros elsewhere. Such a loading matrix is the sparsest possible for a given number of variables and common factors. For this reason, the proposed method is named sparsest FA (SSFA). It may also be called FA-based variable clustering, since the variables loading the same common factor can be classified into a cluster. In SSFA, all model parts of FA (common factors, their correlations, loadings, unique factors, and unique variances) are treated as fixed unknown parameter matrices, and their least squares function is minimized through a specific data matrix decomposition. A useful feature of the algorithm is that the matrix of common factor scores is re-parameterized using a QR decomposition in order to estimate factor correlations efficiently. A simulation study shows that the proposed procedure can exactly identify the true sparsest models. Real data examples demonstrate the usefulness of the variable clustering performed by SSFA.
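The single-nonzero-per-row structure is easy to picture: given factor scores, each variable is attached to whichever factor fits it best, and the variables sharing a factor form a cluster. An illustrative sketch of that structure (not the SSFA estimation algorithm, which fits all model parts jointly by matrix decomposition):

```python
import numpy as np

def sparsest_loadings(X, F):
    """Build a loading matrix with one nonzero per row: regress each
    variable of X (n x p) on each column of the factor scores F (n x k)
    separately, and keep only the best-fitting factor's coefficient.
    (An illustration of the sparsest structure, not SSFA itself.)"""
    p = X.shape[1]
    k = F.shape[1]
    L = np.zeros((p, k))
    for j in range(p):
        # per-factor simple-regression coefficients and their fits
        b = (F * X[:, [j]]).sum(axis=0) / (F ** 2).sum(axis=0)
        sse = ((X[:, [j]] - F * b) ** 2).sum(axis=0)
        best = int(np.argmin(sse))
        L[j, best] = b[best]
    return L
```

Reading off the position of the nonzero in each row then gives the variable clustering directly.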
Semi-sparse PCA
It is well known that classical exploratory factor analysis (EFA) of data with more observations than variables has several types of indeterminacy. We study the factor indeterminacy and show some new aspects of this problem by considering EFA as a specific data matrix decomposition. We adopt a new approach to the EFA estimation and achieve a new characterization of the factor indeterminacy problem. A new alternative model is proposed, which gives determinate factors and can be seen as a semi-sparse principal component analysis (PCA). An alternating algorithm is developed, in which each step solves a Procrustes problem. It is demonstrated that the new model/algorithm can act as a specific sparse PCA and as a low-rank-plus-sparse matrix decomposition. Numerical examples with several large data sets illustrate the versatility of the new model, and the performance and behaviour of its algorithmic implementation.
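The low-rank-plus-sparse role mentioned at the end can be sketched generically by alternating a truncated SVD with soft-thresholding of the residual, a standard scheme in that literature (an illustrative sketch only, not the semi-sparse PCA algorithm of this abstract):

```python
import numpy as np

def low_rank_plus_sparse(X, rank, lam, n_iter=100):
    """Decompose X ~ L + S by alternating a truncated SVD (low-rank
    part L) with soft-thresholding of the residual (sparse part S).
    A generic alternating scheme, not the paper's Procrustes-based
    algorithm."""
    S = np.zeros_like(X)
    for _ in range(n_iter):
        U, sv, Vt = np.linalg.svd(X - S, full_matrices=False)
        L = (U[:, :rank] * sv[:rank]) @ Vt[:rank]   # best rank-r fit
        R = X - L                                    # residual
        S = np.sign(R) * np.maximum(np.abs(R) - lam, 0.0)
    return L, S
```

On data that really are a low-rank background plus a few large spikes, the sparse part isolates the spikes while the SVD step recovers the background.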