Learning from Noisy Label Distributions

Author: A Culotta
CM Bishop
F Pedregosa
TG Dietterich
Publication date: 10/08/2017

In this paper, we consider a novel machine learning problem: learning a classifier from noisy label distributions. In this problem, each instance, represented by a feature vector, belongs to at least one group. Instead of the true label of each instance, we observe the label distribution of the instances associated with a group, where the label distribution is distorted by unknown noise. Our goals are to (1) estimate the true label of each instance, and (2) learn a classifier that predicts the true label of a new instance. We propose a probabilistic model that treats the true label distributions of groups and the parameters that represent the noise as hidden variables. The model can be learned using a variational Bayesian method. In numerical experiments, we show that the proposed model outperforms existing methods in terms of the estimation of the true labels of instances. Comment: Accepted in ICANN201
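
The problem setting described in this abstract can be illustrated with a small toy data generator. This is only a sketch of the setup, not the paper's model: it assumes the unknown noise acts as a row-stochastic confusion matrix on each group's label distribution, and the names `group_label_distribution`, `distort`, and `make_toy_problem`, along with the default confusion values, are hypothetical choices made for illustration.

```python
import random


def group_label_distribution(labels, num_classes):
    """Return the empirical label distribution of a list of integer labels."""
    counts = [0] * num_classes
    for y in labels:
        counts[y] += 1
    total = len(labels)
    return [c / total for c in counts]


def distort(distribution, confusion):
    """Apply an assumed row-stochastic confusion matrix to a distribution.

    observed[k] = sum_j distribution[j] * confusion[j][k]
    """
    num_classes = len(distribution)
    return [
        sum(distribution[j] * confusion[j][k] for j in range(num_classes))
        for k in range(num_classes)
    ]


def make_toy_problem(num_instances=12, num_groups=3, confusion=None, seed=0):
    """Generate hidden true labels, group memberships, and noisy group distributions."""
    if confusion is None:
        # Hypothetical noise: class 0 is reported as class 1 20% of the time, etc.
        confusion = [[0.8, 0.2], [0.3, 0.7]]
    num_classes = len(confusion)
    rng = random.Random(seed)

    # Hidden true label of each instance (what the learner must estimate).
    true_labels = [rng.randrange(num_classes) for _ in range(num_instances)]

    # Each instance belongs to at least one group.
    groups = {g: [] for g in range(num_groups)}
    for i in range(num_instances):
        how_many = rng.randint(1, num_groups)
        for g in rng.sample(range(num_groups), how_many):
            groups[g].append(i)

    # Only the distorted per-group label distributions are observed.
    observed = {}
    for g, members in groups.items():
        if members:  # skip groups that happen to be empty
            clean = group_label_distribution(
                [true_labels[i] for i in members], num_classes
            )
            observed[g] = distort(clean, confusion)
    return true_labels, groups, observed


if __name__ == "__main__":
    labels, groups, observed = make_toy_problem()
    for g, dist in observed.items():
        print(g, [round(p, 3) for p in dist])
```

In this toy setup a learner would see only `groups` and `observed` (plus feature vectors, omitted here) and would have to recover `true_labels`, which corresponds to goal (1) in the abstract.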

arXiv.org e-Print Archive

Crossref

Predicting the age of social network users from user-generated texts with word embeddings

Author: Alekseev A.
Nikolenko S.
Publication date: 01/01/2017

© 2016 FRUCT. Many web-based applications, such as advertising or recommender systems, often depend critically on demographic information, which may be unavailable for new or anonymous users. We study the problem of predicting demographic information from user-generated texts, using a Russian-language dataset from a large social network. We evaluate the efficiency of age prediction algorithms based on word2vec word embeddings and conduct a comprehensive experimental evaluation, comparing these algorithms with each other and with classical baseline approaches.

Kazan Federal University Digital Repository