1,113 research outputs found

    Deep Active Learning Explored Across Diverse Label Spaces

    Get PDF
    abstract: Deep learning architectures have been widely explored in computer vision and have depicted commendable performance in a variety of applications. A fundamental challenge in training deep networks is the requirement of large amounts of labeled training data. While gathering large quantities of unlabeled data is cheap and easy, annotating the data is an expensive process in terms of time, labor and human expertise. Thus, developing algorithms that minimize the human effort in training deep models is of immense practical importance. Active learning algorithms automatically identify salient and exemplar samples from large amounts of unlabeled data and can augment maximal information to supervised learning models, thereby reducing the human annotation effort in training machine learning models. The goal of this dissertation is to fuse ideas from deep learning and active learning and design novel deep active learning algorithms. The proposed learning methodologies explore diverse label spaces to solve different computer vision applications. Three major contributions have emerged from this work; (i) a deep active framework for multi-class image classication, (ii) a deep active model with and without label correlation for multi-label image classi- cation and (iii) a deep active paradigm for regression. Extensive empirical studies on a variety of multi-class, multi-label and regression vision datasets corroborate the potential of the proposed methods for real-world applications. Additional contributions include: (i) a multimodal emotion database consisting of recordings of facial expressions, body gestures, vocal expressions and physiological signals of actors enacting various emotions, (ii) four multimodal deep belief network models and (iii) an in-depth analysis of the effect of transfer of multimodal emotion features between source and target networks on classification accuracy and training time. These related contributions help comprehend the challenges involved in training deep learning models and motivate the main goal of this dissertation.Dissertation/ThesisDoctoral Dissertation Electrical Engineering 201

    Redundancy-Adaptive Multimodal Learning for Imperfect Data

    Full text link
    Multimodal models trained on complete modality data often exhibit a substantial decrease in performance when faced with imperfect data containing corruptions or missing modalities. To address this robustness challenge, prior methods have explored various approaches from aspects of augmentation, consistency or uncertainty, but these approaches come with associated drawbacks related to data complexity, representation, and learning, potentially diminishing their overall effectiveness. In response to these challenges, this study introduces a novel approach known as the Redundancy-Adaptive Multimodal Learning (RAML). RAML efficiently harnesses information redundancy across multiple modalities to combat the issues posed by imperfect data while remaining compatible with the complete modality. Specifically, RAML achieves redundancy-lossless information extraction through separate unimodal discriminative tasks and enforces a proper norm constraint on each unimodal feature representation. Furthermore, RAML explicitly enhances multimodal fusion by leveraging fine-grained redundancy among unimodal features to learn correspondences between corrupted and untainted information. Extensive experiments on various benchmark datasets under diverse conditions have consistently demonstrated that RAML outperforms state-of-the-art methods by a significant margin

    A detection-based pattern recognition framework and its applications

    Get PDF
    The objective of this dissertation is to present a detection-based pattern recognition framework and demonstrate its applications in automatic speech recognition and broadcast news video story segmentation. Inspired by the studies of modern cognitive psychology and real-world pattern recognition systems, a detection-based pattern recognition framework is proposed to provide an alternative solution for some complicated pattern recognition problems. The primitive features are first detected and the task-specific knowledge hierarchy is constructed level by level; then a variety of heterogeneous information sources are combined together and the high-level context is incorporated as additional information at certain stages. A detection-based framework is a â divide-and-conquerâ design paradigm for pattern recognition problems, which will decompose a conceptually difficult problem into many elementary sub-problems that can be handled directly and reliably. Some information fusion strategies will be employed to integrate the evidence from a lower level to form the evidence at a higher level. Such a fusion procedure continues until reaching the top level. Generally, a detection-based framework has many advantages: (1) more flexibility in both detector design and fusion strategies, as these two parts can be optimized separately; (2) parallel and distributed computational components in primitive feature detection. In such a component-based framework, any primitive component can be replaced by a new one while other components remain unchanged; (3) incremental information integration; (4) high level context information as additional information sources, which can be combined with bottom-up processing at any stage. This dissertation presents the basic principles, criteria, and techniques for detector design and hypothesis verification based on the statistical detection and decision theory. In addition, evidence fusion strategies were investigated in this dissertation. Several novel detection algorithms and evidence fusion methods were proposed and their effectiveness was justified in automatic speech recognition and broadcast news video segmentation system. We believe such a detection-based framework can be employed in more applications in the future.Ph.D.Committee Chair: Lee, Chin-Hui; Committee Member: Clements, Mark; Committee Member: Ghovanloo, Maysam; Committee Member: Romberg, Justin; Committee Member: Yuan, Min

    Active Learning in Physics: From 101, to Progress, and Perspective

    Full text link
    Active Learning (AL) is a family of machine learning (ML) algorithms that predates the current era of artificial intelligence. Unlike traditional approaches that require labeled samples for training, AL iteratively selects unlabeled samples to be annotated by an expert. This protocol aims to prioritize the most informative samples, leading to improved model performance compared to training with all labeled samples. In recent years, AL has gained increasing attention, particularly in the field of physics. This paper presents a comprehensive and accessible introduction to the theory of AL reviewing the latest advancements across various domains. Additionally, we explore the potential integration of AL with quantum ML, envisioning a synergistic fusion of these two fields rather than viewing AL as a mere extension of classical ML into the quantum realm.Comment: 15 page
    • …
    corecore