A random matrix analysis and improvement of semi-supervised learning for large dimensional data

Couillet, Romain; Mai, Xiaoyi



Authors: Romain Couillet, Xiaoyi Mai
Publication date: 9 November 2017

Abstract

This article provides an original understanding of the behavior of a class of graph-oriented semi-supervised learning algorithms in the limit where the data are both high-dimensional and numerous. It is demonstrated that the intuition at the root of these methods collapses in this limit and that, as a result, most of them become inconsistent. Corrective measures and a new data-driven parametrization scheme are proposed, along with a theoretical analysis of the asymptotic performance of the resulting approach. The theoretical performance on Gaussian mixture models is shown throughout the article to match the behavior on real datasets surprisingly closely, suggesting that the proposed analysis is relevant to practical data. Accordingly, significant performance gains are observed on practical data classification using the proposed parametrization.


Available Versions

HAL-CentraleSupelec

oai:HAL:hal-04409163v1

Last time updated on 02/02/2024