3 research outputs found
Graph Embedding with Data Uncertainty
spectral-based subspace learning is a common data preprocessing step in many
machine learning pipelines. The main aim is to learn a meaningful low
dimensional embedding of the data. However, most subspace learning methods do
not take into consideration possible measurement inaccuracies or artifacts that
can lead to data with high uncertainty. Thus, learning directly from raw data
can be misleading and can negatively impact the accuracy. In this paper, we
propose to model artifacts in training data using probability distributions;
each data point is represented by a Gaussian distribution centered at the
original data point and having a variance modeling its uncertainty. We
reformulate the Graph Embedding framework to make it suitable for learning from
distributions and we study as special cases the Linear Discriminant Analysis
and the Marginal Fisher Analysis techniques. Furthermore, we propose two
schemes for modeling data uncertainty based on pair-wise distances in an
unsupervised and a supervised contexts.Comment: 20 pages, 4 figure