PhyloPars: estimation of missing parameter values using phylogeny.

Author: Brandt B.W.
Bruggeman J.
Heringa J.
Publication venue
Publication date: 01/01/2009
Field of study

phylogen

Oxford University Research Archive

Comment on "Spatio-temporal filling of missing points in geophysical data sets" by D. Kondrashov and M. Ghil, Nonlin. Processes Geophys., 13, 151–159, 2006

Author: Schneider T.
Publication venue: European Geosciences Union
Publication date: 15/01/2007
Field of study

Kondrashov and Ghil (2006) (KG hereafter) describe a method for imputing missing values in incomplete datasets that can exploit both spatial and temporal covariability to estimate missing values from available values. Temporal covariability has not been exploited as widely as spatial covariability in imputing missing values in geophysical datasets, but, as KG show, doing so can improve estimates of missing values. However, there are several inaccuracies in KG’s paper. Since similar inaccuracies have surfaced in other recent papers, for example, in the literature on paleo-climate reconstructions, I would like to point them out here

HAL-INSU

Caltech Authors

Random Forests with Missing Values in the Covariates

Author: Hothorn Torsten
Rieger Anna
Strobl Carolin
Publication venue
Publication date: 01/01/2010
Field of study

Open Access LMU (Ludwig-Maximilians-Univ. München)

Missing Value Imputation With Unsupervised Backpropagation

Author: Gashler Michael S.
Martinez Tony
Morris Richard
Smith Michael R.
Publication venue
Publication date: 18/12/2013
Field of study

Many data mining and data analysis techniques operate on dense matrices or complete tables of data. Real-world data sets, however, often contain unknown values. Even many classification algorithms that are designed to operate with missing values still exhibit deteriorated accuracy. One approach to handling missing values is to fill in (impute) the missing values. In this paper, we present a technique for unsupervised learning called Unsupervised Backpropagation (UBP), which trains a multi-layer perceptron to fit to the manifold sampled by a set of observed point-vectors. We evaluate UBP with the task of imputing missing values in datasets, and show that UBP is able to predict missing values with significantly lower sum-squared error than other collaborative filtering and imputation techniques. We also demonstrate with 24 datasets and 9 supervised learning algorithms that classification accuracy is usually higher when randomly-withheld values are imputed using UBP, rather than with other methods

arXiv.org e-Print Archive

CiteSeerX

Approximating Clustering of Fingerprint Vectors with Missing Values

Author: A. Figueroa
C.H. Papadimitriou
G. Ausiello
Giancarlo Mauri
Gianluca Della Vedova
L. Valinsky
M. Chlebík
P. Alimonti
Paola Bonizzoni
R. Drmanac
Riccardo Dondi
S. Drmanac
Publication venue: Springer Science and Business Media LLC
Publication date: 23/11/2005
Field of study

The problem of clustering fingerprint vectors is an interesting problem in Computational Biology that has been proposed in (Figueroa et al. 2004). In this paper we show some improvements in closing the gaps between the known lower bounds and upper bounds on the approximability of some variants of the biological problem. Namely, we are able to prove that the problem is APX-hard even when each fingerprint contains only two unknown positions. Moreover, we have studied some variants of the original problem, and we give two 2-approximation algorithms for the IECMV and OECMV problems when the number of unknown entries for each vector is at most a constant. Comment: 13 pages, 4 figures

arXiv.org e-Print Archive

Crossref

Capturing Missing Tuples and Missing Values

Author: Fan Wenfei
Geerts Floris
Publication venue
Publication date: 01/01/2010
Field of study

Edinburgh Research Explorer

Institutional Repository Universiteit Antwerpen

Generalized canonical correlation analysis with missing values

Author: Takane Y.
Velden M. van de
Publication venue
Publication date
Field of study

Two new methods for dealing with missing values in generalized canonical correlation analysis are introduced. The first approach, which does not require iterations, is a generalization of the Test Equating method available for principal component analysis. In the second approach, missing values are imputed in such a way that the generalized canonical correlation analysis objective function does not increase in subsequent steps. Convergence is achieved when the value of the objective function remains constant. By means of a simulation study, we assess the performance of the new methods. We compare the results with those of two available methods: the missing-data passive method, introduced in Gifi's homogeneity analysis framework, and the GENCOM algorithm developed by Green and Carroll. Keywords: generalized canonical correlation analysis; missing values

Research Papers in Economics