93,338 research outputs found
Deep Active Learning Explored Across Diverse Label Spaces
abstract: Deep learning architectures have been widely explored in computer vision and have
depicted commendable performance in a variety of applications. A fundamental challenge
in training deep networks is the requirement of large amounts of labeled training
data. While gathering large quantities of unlabeled data is cheap and easy, annotating
the data is an expensive process in terms of time, labor and human expertise.
Thus, developing algorithms that minimize the human effort in training deep models
is of immense practical importance. Active learning algorithms automatically identify
salient and exemplar samples from large amounts of unlabeled data and can augment
maximal information to supervised learning models, thereby reducing the human annotation
effort in training machine learning models. The goal of this dissertation is to
fuse ideas from deep learning and active learning and design novel deep active learning
algorithms. The proposed learning methodologies explore diverse label spaces to
solve different computer vision applications. Three major contributions have emerged
from this work; (i) a deep active framework for multi-class image classication, (ii)
a deep active model with and without label correlation for multi-label image classi-
cation and (iii) a deep active paradigm for regression. Extensive empirical studies
on a variety of multi-class, multi-label and regression vision datasets corroborate the
potential of the proposed methods for real-world applications. Additional contributions
include: (i) a multimodal emotion database consisting of recordings of facial
expressions, body gestures, vocal expressions and physiological signals of actors enacting
various emotions, (ii) four multimodal deep belief network models and (iii)
an in-depth analysis of the effect of transfer of multimodal emotion features between
source and target networks on classification accuracy and training time. These related
contributions help comprehend the challenges involved in training deep learning
models and motivate the main goal of this dissertation.Dissertation/ThesisDoctoral Dissertation Electrical Engineering 201
Deep Q-Learning for Nash Equilibria: Nash-DQN
Model-free learning for multi-agent stochastic games is an active area of
research. Existing reinforcement learning algorithms, however, are often
restricted to zero-sum games, and are applicable only in small state-action
spaces or other simplified settings. Here, we develop a new data efficient
Deep-Q-learning methodology for model-free learning of Nash equilibria for
general-sum stochastic games. The algorithm uses a local linear-quadratic
expansion of the stochastic game, which leads to analytically solvable optimal
actions. The expansion is parametrized by deep neural networks to give it
sufficient flexibility to learn the environment without the need to experience
all state-action pairs. We study symmetry properties of the algorithm stemming
from label-invariant stochastic games and as a proof of concept, apply our
algorithm to learning optimal trading strategies in competitive electronic
markets.Comment: 16 pages, 4 figure
Nuevos Modelos de Aprendizaje Híbrido para Clasificación y Ordenamiento Multi-Etiqueta
En la última década, el aprendizaje multi-etiqueta se ha convertido en una importante tarea de investigación, debido en gran parte al creciente número de problemas reales que contienen datos multi-etiqueta. En esta tesis se estudiaron dos problemas sobre datos multi-etiqueta, la mejora del rendimiento de los algoritmos en datos multi-etiqueta complejos y la mejora del rendimiento de los algoritmos a partir de datos no etiquetados. El primer problema fue tratado mediante métodos de estimación de atributos. Se evaluó la efectividad de los métodos de estimación de atributos propuestos en la mejora del rendimiento de los algoritmos de vecindad, mediante la parametrización de las funciones de distancias empleadas para recuperar los ejemplos más cercanos. Además, se demostró la efectividad de los métodos de estimación en la tarea de selección de atributos. Por otra parte, se desarrolló un algoritmo de vecindad inspirado en el enfoque de clasifcación basada en gravitación de datos. Este algoritmo garantiza un balance adecuado entre eficiencia y efectividad en su solución ante datos multi-etiqueta complejos. El segundo problema fue resuelto mediante técnicas de aprendizaje activo, lo cual permite reducir los costos del etiquetado de datos y del entrenamiento de un mejor modelo. Se propusieron dos estrategias de aprendizaje activo. La primer estrategia resuelve el problema de aprendizaje activo multi-etiqueta de una manera efectiva y eficiente, para ello se combinaron dos medidas que representan la utilidad de un ejemplo no etiquetado. La segunda estrategia propuesta se enfocó en la resolución del problema de aprendizaje activo multi-etiqueta en modo de lotes, para ello se formuló un problema multi-objetivo donde se optimizan tres medidas, y el problema de optimización planteado se resolvió mediante un algoritmo evolutivo. Como resultados complementarios derivados de esta tesis, se desarrolló una herramienta computacional que favorece la implementación de métodos de aprendizaje activo y la experimentación en esta tarea de estudio. Además, se propusieron dos aproximaciones que permiten evaluar el rendimiento de las técnicas de aprendizaje activo de una manera más adecuada y robusta que la empleada comunmente en la literatura. Todos los métodos propuestos en esta tesis han sido evaluados en un marco experimental
adecuado, se utilizaron numerosos conjuntos de datos y se compararon
los rendimientos de los algoritmos frente a otros métodos del estado del arte. Los
resultados obtenidos, los cuales fueron verificados mediante la aplicación de test
estadísticos no paramétricos, demuestran la efectividad de los métodos propuestos
y de esta manera comprueban las hipótesis planteadas en esta tesis.In the last decade, multi-label learning has become an important area of research
due to the large number of real-world problems that contain multi-label data. This
doctoral thesis is focused on the multi-label learning paradigm. Two problems were
studied, rstly, improving the performance of the algorithms on complex multi-label
data, and secondly, improving the performance through unlabeled data.
The rst problem was solved by means of feature estimation methods. The e ectiveness
of the feature estimation methods proposed was evaluated by improving
the performance of multi-label lazy algorithms. The parametrization of the distance
functions with a weight vector allowed to recover examples with relevant
label sets for classi cation. It was also demonstrated the e ectiveness of the feature
estimation methods in the feature selection task. On the other hand, a lazy
algorithm based on a data gravitation model was proposed. This lazy algorithm
has a good trade-o between e ectiveness and e ciency in the resolution of the
multi-label lazy learning.
The second problem was solved by means of active learning techniques. The active
learning methods allowed to reduce the costs of the data labeling process and
training an accurate model. Two active learning strategies were proposed. The
rst strategy e ectively solves the multi-label active learning problem. In this
strategy, two measures that represent the utility of an unlabeled example were
de ned and combined. On the other hand, the second active learning strategy proposed
resolves the batch-mode active learning problem, where the aim is to select a
batch of unlabeled examples that are informative and the information redundancy
is minimal. The batch-mode active learning was formulated as a multi-objective
problem, where three measures were optimized. The multi-objective problem was
solved through an evolutionary algorithm.
This thesis also derived in the creation of a computational framework to develop
any active learning method and to favor the experimentation process in the active
learning area. On the other hand, a methodology based on non-parametric
tests that allows a more adequate evaluation of active learning performance was
proposed. All methods proposed were evaluated by means of extensive and adequate experimental
studies. Several multi-label datasets from di erent domains were used, and
the methods were compared to the most signi cant state-of-the-art algorithms. The
results were validated using non-parametric statistical tests. The evidence showed
the e ectiveness of the methods proposed, proving the hypotheses formulated at
the beginning of this thesis
Text Classification
There is an abundance of text data in this world but most of it is raw. We need to extract information from this data to make use of it. One way to extract this information from raw text is to apply informative labels drawn from a pre-defined fixed set i.e. Text Classification. In this thesis, we focus on the general problem of text classification, and work towards solving challenges associated to binary/multi-class/multi-label classification. More specifically, we deal with the problem of (i) Zero-shot labels during testing; (ii) Active learning for text screening; (iii) Multi-label classification under low supervision; (iv) Structured label space; (v) Classifying pairs of words in raw text i.e. Relation Extraction. For (i), we use a zero-shot classification model that utilizes independently learned semantic embeddings. Regarding (ii), we propose a novel active learning algorithm that reduces problem of bias in naive active learning algorithms. For (iii), we propose neural candidate-selector architecture that starts from a set of high-recall candidate labels to obtain high-precision predictions. In the case of (iv), we proposed an attention based neural tree decoder that recursively decodes an abstract into the ontology tree. For (v), we propose using second-order relations that are derived by explicitly connecting pairs of words via context token(s) for improved relation extraction. We use a wide variety of both traditional and deep machine learning tools. More specifically, we used traditional machine learning models like multi-valued linear regression and logistic regression for (i, ii), deep convolutional neural networks for (iii), recurrent neural networks for (iv) and transformer networks for (v)
Multi-label Rule Learning
Research on multi-label classification is concerned with developing and evaluating algorithms that learn a predictive model for the automatic assignment of data points to a subset of predefined class labels. This is in contrast to traditional classification settings, where individual data points cannot be assigned to more than a single class. As many practical use cases demand a flexible categorization of data, where classes must not necessarily be mutually exclusive, multi-label classification has become an established topic of machine learning research. Nowadays, it is used for the assignment of keywords to text documents, the annotation of multimedia files, such as images, videos, or audio recordings, as well as for diverse applications in biology, chemistry, social network analysis, or marketing. During the past decade, increasing interest in the topic has resulted in a wide variety of different multi-label classification methods. Following the principles of supervised learning, they derive a model from labeled training data, which can afterward be used to obtain predictions for yet unseen data. Besides complex statistical methods, such as artificial neural networks, symbolic learning approaches have not only been shown to provide state-of-the-art performance in many applications but are also a common choice in safety-critical domains that demand human-interpretable and verifiable machine learning models. In particular, rule learning algorithms have a long history of active research in the scientific community. They are often argued to meet the requirements of interpretable machine learning due to the human-legible representation of learned knowledge in terms of logical statements. This work presents a modular framework for implementing multi-label rule learning methods. It does not only provide a unified view of existing rule-based approaches to multi-label classification, but also facilitates the development of new learning algorithms. Two novel instantiations of the framework are investigated to demonstrate its flexibility. Whereas the first one relies on traditional rule learning techniques and focuses on interpretability, the second one is based on a generalization of the gradient boosting framework and focuses on predictive performance rather than the simplicity of models. Motivated by the increasing demand for highly scalable learning algorithms that are capable of processing large amounts of training data, this work also includes an extensive discussion of algorithmic optimizations and approximation techniques for the efficient induction of rules. As the novel multi-label classification methods that are presented in this work can be viewed as instantiations of the same framework, they can both benefit from most of these principles. Their effectiveness and efficiency are compared to existing baselines experimentally
- …