Semi-supervised and target-dependent sentiment classification for micro-blogs
The wealth of opinions expressed in micro-blogs, such as tweets, has motivated researchers to develop techniques for automatic opinion detection.
However, the accuracy of such techniques is still limited. Moreover, current techniques focus on detecting sentiment polarity regardless of the topic (target) discussed. Detecting sentiment towards a specific target, referred to as target-dependent sentiment classification, has not received adequate attention from researchers. A literature review has shown that all target-dependent approaches use supervised learning techniques. Such techniques need a large amount of labeled data. However, labeling data in social media is cumbersome and error-prone. The research presented in this paper addresses this issue by employing semi-supervised learning techniques for target-dependent sentiment classification. Semi-supervised learning techniques make use of labeled as well as unlabeled data. In this paper, we present a new semi-supervised learning technique that uses fewer labeled micro-blogs than supervised learning techniques require. Experimental results have shown that the proposed technique provides comparable accuracy.
Facultad de Informátic
DC Proximal Newton for Non-Convex Optimization Problems
We introduce a novel algorithm for solving learning problems where both the
loss function and the regularizer are non-convex but belong to the class of
difference of convex (DC) functions. Our contribution is a new general purpose
proximal Newton algorithm that is able to deal with such a situation. The
algorithm consists in obtaining a descent direction from an approximation of
the loss function and then in performing a line search to ensure sufficient
descent. A theoretical analysis is provided showing that the iterates of the
proposed algorithm admit stationary points of the DC objective function as
limit points. Numerical experiments show that our approach is more
efficient than the current state of the art for a problem with a convex loss
function and a non-convex regularizer. We also illustrate the benefit of
our algorithm on a high-dimensional transductive learning problem where both
the loss function and the regularizer are non-convex.
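The two-step scheme described above (a descent direction from an approximation of the objective, then a line search) can be sketched as follows. This is a simplified illustration, not the authors' algorithm: it replaces the Newton-type Hessian approximation with a scalar step size, and it assumes a least-squares loss with a capped-l1 regularizer, written as the difference of two convex functions, lam*|x| minus lam*max(|x| - theta, 0). All function names, parameters, and data are hypothetical.

```python
import numpy as np

def objective(A, b, x, lam, theta):
    # Least-squares loss plus the capped-l1 penalty lam * min(|x_i|, theta).
    r = A @ x - b
    return 0.5 * r @ r + lam * np.minimum(np.abs(x), theta).sum()

def soft_threshold(z, tau):
    # Proximal operator of tau * ||.||_1.
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def dc_proximal_descent(A, b, lam=0.1, theta=1.0, iters=50, sigma=1e-4):
    x = np.zeros(A.shape[1])
    # Scalar curvature estimate (assumption): 1 / Lipschitz constant of the gradient.
    t = 1.0 / np.linalg.norm(A, 2) ** 2
    history = [objective(A, b, x, lam, theta)]
    for _ in range(iters):
        grad = A.T @ (A @ x - b)
        # Subgradient of the convex part that is subtracted in the DC split.
        s = lam * np.sign(x) * (np.abs(x) > theta)
        # Descent direction: proximal step on the linearized model, minus x.
        d = soft_threshold(x - t * (grad - s), t * lam) - x
        # Backtracking line search enforcing sufficient descent.
        alpha, f_old = 1.0, history[-1]
        for _ in range(30):
            f_new = objective(A, b, x + alpha * d, lam, theta)
            if f_new <= f_old - sigma * alpha * (d @ d) / t:
                x = x + alpha * d
                break
            alpha *= 0.5
        history.append(objective(A, b, x, lam, theta))
    return x, history

# Synthetic demo data (hypothetical).
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = A @ np.array([2.0, 0.0, 0.0, -1.0, 0.0])
x_hat, history = dc_proximal_descent(A, b)
```

Because a step is accepted only when the sufficient-descent test passes, the recorded objective values in `history` never increase.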
Regularized Optimal Transport and the Rot Mover's Distance
This paper presents a unified framework for smooth convex regularization of
discrete optimal transport problems. In this context, the regularized optimal
transport turns out to be equivalent to a matrix nearness problem with respect
to Bregman divergences. Our framework thus naturally generalizes a previously
proposed regularization based on the Boltzmann-Shannon entropy related to the
Kullback-Leibler divergence, and solved with the Sinkhorn-Knopp algorithm. We
call the regularized optimal transport distance the rot mover's distance in
reference to the classical earth mover's distance. We develop two generic
schemes that we respectively call the alternate scaling algorithm and the
non-negative alternate scaling algorithm, to efficiently compute the
regularized optimal plans depending on whether the domain of the regularizer
lies within the non-negative orthant or not. These schemes are based on
Dykstra's algorithm with alternate Bregman projections, and further exploit the
Newton-Raphson method when applied to separable divergences. We enhance the
separable case with a sparse extension to deal with high data dimensions. We
also instantiate our proposed framework and discuss the inherent specificities
for well-known regularizers and statistical divergences in the machine learning
and information geometry communities. Finally, we demonstrate the merits of our
methods with experiments using synthetic data to illustrate the effect of
different regularizers and penalties on the solutions, as well as real-world
data for a pattern recognition application to audio scene classification.
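For orientation, the entropy-regularized special case mentioned above, solved with Sinkhorn-Knopp scaling, can be sketched as below. This illustrates only that previously proposed baseline, not the alternate scaling algorithms of the paper; the function name, the regularization strength `eps`, and the toy histograms are assumptions made for the example.

```python
import numpy as np

def sinkhorn(a, b, C, eps=0.1, iters=500):
    # Gibbs kernel derived from the cost matrix C.
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(iters):
        # Alternately rescale rows and columns to match the marginals.
        u = a / (K @ v)
        v = b / (K.T @ u)
    # Transport plan diag(u) K diag(v).
    return u[:, None] * K * v[None, :]

# Toy problem (hypothetical): two histograms on three points of the unit interval.
a = np.array([0.5, 0.3, 0.2])
b = np.array([0.2, 0.2, 0.6])
pts = np.array([0.0, 0.5, 1.0])
C = (pts[:, None] - pts[None, :]) ** 2
P = sinkhorn(a, b, C, eps=0.5)
```

The rows of `P` sum to `a` and its columns to `b` up to numerical tolerance; smaller values of `eps` approach the unregularized plan but typically need more iterations.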
Deep Generative Models for Reject Inference in Credit Scoring
Credit scoring models based on accepted applications may be biased and their
consequences can have a statistical and economic impact. Reject inference is
the process of attempting to infer the creditworthiness status of the rejected
applications. In this research, we use deep generative models to develop two
new semi-supervised Bayesian models for reject inference in credit scoring, in
which we model the data generating process to be dependent on a Gaussian
mixture. The goal is to improve the classification accuracy in credit scoring
models by adding rejected applications. Our proposed models infer the unknown
creditworthiness of the rejected applications by exact enumeration of the two
possible outcomes of the loan (default or non-default). The efficient
stochastic gradient optimization technique used in deep generative models makes
our models suitable for large data sets. Finally, the experiments in this
research show that our proposed models perform better than classical and
alternative machine learning models for reject inference in credit scoring.
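One common way to handle an unobserved binary label by exact enumeration, in semi-supervised deep generative models generally, is to weight a per-label lower bound by the classifier's probabilities and add the entropy of those probabilities. The sketch below assumes that form; the abstract does not give the exact objective, so the function name and inputs are hypothetical, and the per-label bound values are taken as given rather than computed from a model.

```python
import numpy as np

def unlabeled_bound(bound_per_label, q_y):
    # bound_per_label[i, y]: lower bound on log p(x_i, y) for y in {default, non-default}.
    # q_y[i, y]: classifier probability of outcome y for rejected application i.
    q = np.clip(q_y, 1e-12, 1.0)
    entropy = -(q_y * np.log(q)).sum(axis=1)
    # Enumerate both outcomes: expectation of the per-label bound plus entropy.
    return (q_y * bound_per_label).sum(axis=1) + entropy

# Hypothetical values for a single rejected application.
bounds = np.array([[-1.0, -3.0]])
uniform_q = np.array([[0.5, 0.5]])
value_uniform = unlabeled_bound(bounds, uniform_q)

# With q proportional to exp(bound), the enumeration attains log-sum-exp of the bounds.
log_sum_exp = np.log(np.exp(bounds).sum(axis=1))
best_q = np.exp(bounds - log_sum_exp[:, None])
value_best = unlabeled_bound(bounds, best_q)
```

With only two outcomes the sum is cheap to evaluate exactly, which is why no sampling over the label is needed.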