Deep Active Learning Explored Across Diverse Label Spaces
abstract: Deep learning architectures have been widely explored in computer vision and have
shown commendable performance in a variety of applications. A fundamental challenge
in training deep networks is the requirement of large amounts of labeled training
data. While gathering large quantities of unlabeled data is cheap and easy, annotating
the data is an expensive process in terms of time, labor and human expertise.
Thus, developing algorithms that minimize the human effort in training deep models
is of immense practical importance. Active learning algorithms automatically identify
salient and exemplar samples from large amounts of unlabeled data and can add
maximal information to supervised learning models, thereby reducing the human annotation
effort in training machine learning models. The goal of this dissertation is to
fuse ideas from deep learning and active learning and design novel deep active learning
algorithms. The proposed learning methodologies explore diverse label spaces to
solve different computer vision applications. Three major contributions have emerged
from this work: (i) a deep active framework for multi-class image classification, (ii)
a deep active model with and without label correlation for multi-label image
classification and (iii) a deep active paradigm for regression. Extensive empirical studies
on a variety of multi-class, multi-label and regression vision datasets corroborate the
potential of the proposed methods for real-world applications. Additional contributions
include: (i) a multimodal emotion database consisting of recordings of facial
expressions, body gestures, vocal expressions and physiological signals of actors enacting
various emotions, (ii) four multimodal deep belief network models and (iii)
an in-depth analysis of the effect of transfer of multimodal emotion features between
source and target networks on classification accuracy and training time. These related
contributions help comprehend the challenges involved in training deep learning
models and motivate the main goal of this dissertation.
Dissertation/Thesis; Doctoral Dissertation Electrical Engineering 201
Redundancy-Adaptive Multimodal Learning for Imperfect Data
Multimodal models trained on complete modality data often exhibit a
substantial decrease in performance when faced with imperfect data containing
corruptions or missing modalities. To address this robustness challenge, prior
methods have explored various approaches from aspects of augmentation,
consistency or uncertainty, but these approaches come with associated drawbacks
related to data complexity, representation, and learning, potentially
diminishing their overall effectiveness. In response to these challenges, this
study introduces a novel approach, Redundancy-Adaptive Multimodal
Learning (RAML). RAML efficiently harnesses information redundancy across
multiple modalities to combat the issues posed by imperfect data while
remaining compatible with the complete modality. Specifically, RAML achieves
redundancy-lossless information extraction through separate unimodal
discriminative tasks and enforces a proper norm constraint on each unimodal
feature representation. Furthermore, RAML explicitly enhances multimodal fusion
by leveraging fine-grained redundancy among unimodal features to learn
correspondences between corrupted and untainted information. Extensive
experiments on various benchmark datasets under diverse conditions have
consistently demonstrated that RAML outperforms state-of-the-art methods by a
significant margin.
A detection-based pattern recognition framework and its applications
The objective of this dissertation is to present a detection-based pattern recognition framework and demonstrate its applications in automatic speech recognition and broadcast news video story segmentation.
Inspired by the studies of modern cognitive psychology and real-world pattern recognition systems, a detection-based pattern recognition framework is proposed to provide an alternative solution for some complicated pattern recognition problems. The primitive features are first detected and the task-specific knowledge hierarchy is constructed level by level; then a variety of heterogeneous information sources are combined together and the high-level context is incorporated as additional information at certain stages.
A detection-based framework is a "divide-and-conquer" design paradigm for pattern recognition problems: it decomposes a conceptually difficult problem into many elementary sub-problems that can be handled directly and reliably. Information fusion strategies integrate the evidence from a lower level to form the evidence at a higher level, and this fusion procedure continues until the top level is reached. Generally, a detection-based framework has many advantages: (1) more flexibility in both detector design and fusion strategies, as these two parts can be optimized separately; (2) parallel and distributed computational components in primitive feature detection, so that in such a component-based framework any primitive component can be replaced by a new one while other components remain unchanged; (3) incremental information integration; (4) high-level context information as additional information sources, which can be combined with bottom-up processing at any stage.
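The level-by-level fusion described above can be illustrated with a small sketch. Everything in it is an assumption made for illustration, not the dissertation's method: detectors are modeled as equal-variance Gaussian scorers that emit log-likelihood ratios, and fusion is a weighted sum of those ratios. The function names are hypothetical.

```python
import math

def detector_llr(score, mean_present, mean_absent, std):
    """Log-likelihood ratio of 'present' vs 'absent' for one primitive detector.

    Assumes (for illustration only) Gaussian score distributions with equal std.
    """
    def log_gauss(x, mu):
        return -0.5 * ((x - mu) / std) ** 2
    return log_gauss(score, mean_present) - log_gauss(score, mean_absent)

def fuse(llrs, weights=None):
    """Combine lower-level evidence into higher-level evidence (weighted sum)."""
    weights = weights or [1.0] * len(llrs)
    return sum(w * l for w, l in zip(weights, llrs))

def posterior(llr, prior=0.5):
    """Turn fused evidence into a probability that the pattern is present."""
    log_odds = llr + math.log(prior / (1 - prior))
    return 1 / (1 + math.exp(-log_odds))

# Bottom level: three primitive detectors, grouped into two mid-level nodes.
mid_a = fuse([detector_llr(0.9, 1.0, 0.0, 0.5), detector_llr(0.7, 1.0, 0.0, 0.5)])
mid_b = fuse([detector_llr(0.2, 1.0, 0.0, 0.5)])

# Top level: fuse the mid-level evidence; each node can be swapped independently.
top = fuse([mid_a, mid_b], weights=[0.5, 0.5])
decision = posterior(top) > 0.5
```

Because each detector only has to return a score, any one of them can be replaced without touching the fusion code, which mirrors advantage (2) in the list above.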
This dissertation presents the basic principles, criteria, and techniques for detector design and hypothesis verification based on statistical detection and decision theory. In addition, evidence fusion strategies are investigated. Several novel detection algorithms and evidence fusion methods are proposed, and their effectiveness is demonstrated in automatic speech recognition and broadcast news video segmentation systems. We believe such a detection-based framework can be employed in more applications in the future.
Ph.D.
Committee Chair: Lee, Chin-Hui; Committee Member: Clements, Mark; Committee Member: Ghovanloo, Maysam; Committee Member: Romberg, Justin; Committee Member: Yuan, Min
Active Learning in Physics: From 101, to Progress, and Perspective
Active Learning (AL) is a family of machine learning (ML) algorithms that
predates the current era of artificial intelligence. Unlike traditional
approaches that require labeled samples for training, AL iteratively selects
unlabeled samples to be annotated by an expert. This protocol aims to
prioritize the most informative samples, leading to improved model performance
compared to training with all labeled samples. In recent years, AL has gained
increasing attention, particularly in the field of physics. This paper presents
a comprehensive and accessible introduction to the theory of AL, reviewing the
latest advancements across various domains. Additionally, we explore the
potential integration of AL with quantum ML, envisioning a synergistic fusion
of these two fields rather than viewing AL as a mere extension of classical ML
into the quantum realm.
Comment: 15 page
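The iterative select-and-annotate protocol described in this abstract can be sketched as a pool-based loop. This is a minimal illustration under stated assumptions, not the paper's algorithm: the model is a toy one-dimensional nearest-centroid classifier, the query rule is smallest-margin uncertainty sampling, and `oracle` is a hypothetical stand-in for the human expert.

```python
import random

def oracle(x):
    """Hypothetical annotator: returns the true label of x."""
    return 1 if x > 0.6 else 0

def fit_centroids(labeled):
    """Fit a nearest-centroid classifier: the mean of each class."""
    means = {}
    for cls in (0, 1):
        xs = [x for x, y in labeled if y == cls]
        means[cls] = sum(xs) / len(xs)
    return means

def margin(x, means):
    """A small margin means the model is unsure which class x belongs to."""
    return abs(abs(x - means[0]) - abs(x - means[1]))

def predict(x, means):
    return 0 if abs(x - means[0]) <= abs(x - means[1]) else 1

def active_learning(pool, seed_points, budget):
    """Repeatedly query the oracle for the most uncertain unlabeled sample."""
    labeled = [(x, oracle(x)) for x in seed_points]
    unlabeled = [x for x in pool if x not in seed_points]
    for _ in range(budget):
        means = fit_centroids(labeled)
        query = min(unlabeled, key=lambda x: margin(x, means))
        unlabeled.remove(query)
        labeled.append((query, oracle(query)))
    return fit_centroids(labeled), labeled

rng = random.Random(0)
pool = [rng.random() for _ in range(200)]
seeds = [min(pool), max(pool)]  # one seed per class for this toy data
model, labeled = active_learning(pool, seeds, budget=10)
accuracy = sum(predict(x, model) == oracle(x) for x in pool) / len(pool)
```

Only 12 of the 200 pool samples are ever sent to the annotator; the rest stay unlabeled, which is the cost saving the protocol is after.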