
11 research outputs found

An End-to-End Approach for Training Neural Network Binary Classifiers on Metrics Based on the Confusion Matrix

Authors: Candon Kate, Milkessa Yofti, Tsoi Nathan, Vázquez Marynel
Publication date: 29/09/2021

While neural network binary classifiers are often evaluated on metrics such as Accuracy and $F_1$-Score, they are commonly trained with a cross-entropy objective. How can this training-testing gap be addressed? While specific techniques have been adopted to optimize certain confusion matrix based metrics, it is challenging or impossible in some cases to generalize the techniques to other metrics. Adversarial learning approaches have also been proposed to optimize networks via confusion matrix based metrics, but they tend to be much slower than common training methods. In this work, we propose to approximate the Heaviside step function, typically used to compute confusion matrix based metrics, to render these metrics amenable to gradient descent. Our extensive experiments show the effectiveness of our end-to-end approach for binary classification in several domains.
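The idea in the abstract above can be sketched in a few lines. This is only an illustration, not the authors' implementation: the abstract does not say which approximation of the Heaviside step they use, so the sigmoid below, along with the names `soft_heaviside`, `soft_f1`, `steepness`, and `eps`, is an assumption made for this example.

```python
import math


def soft_heaviside(p, threshold=0.5, steepness=10.0):
    """Smooth stand-in for the step H(p - threshold).

    Assumption: a sigmoid is used here purely for illustration.
    """
    return 1.0 / (1.0 + math.exp(-steepness * (p - threshold)))


def soft_f1(probs, labels, threshold=0.5, steepness=10.0, eps=1e-8):
    """F1 computed from 'soft' confusion-matrix counts.

    Each prediction contributes fractionally to TP/FP/FN, so the score
    varies smoothly with the predicted probabilities.
    """
    tp = fp = fn = 0.0
    for p, y in zip(probs, labels):
        h = soft_heaviside(p, threshold, steepness)
        tp += h * y              # predicted positive, actually positive
        fp += h * (1 - y)        # predicted positive, actually negative
        fn += (1 - h) * y        # predicted negative, actually positive
    return 2 * tp / (2 * tp + fp + fn + eps)


# A training loss could then be taken as 1 - soft_f1(probs, labels).
score = soft_f1([0.9, 0.2, 0.7, 0.4], [1, 0, 1, 1])
```

Because every count is a smooth function of the probabilities, the same expression written in an automatic-differentiation framework would yield gradients, which the hard step function does not.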

arXiv.org e-Print Archive

Attentional Biased Stochastic Gradient for Imbalanced Classification

Authors: Jin Rong, Qi Qi, Xu Yi, Yang Tianbao, Yin Wotao
Publication date: 10/10/2021

In this paper, we present a simple yet effective method (ABSGD) for addressing the data imbalance issue in deep learning. Our method is a simple modification to momentum SGD where we leverage an attentional mechanism to assign an individual importance weight to each gradient in the mini-batch. Unlike many existing heuristic-driven methods for tackling data imbalance, our method is grounded in theoretically justified distributionally robust optimization (DRO), and is guaranteed to converge to a stationary point of an information-regularized DRO problem. The individual-level weight of a sampled data point is systematically proportional to the exponential of a scaled loss value of that point, where the scaling factor is interpreted as the regularization parameter in the framework of information-regularized DRO. Compared with existing class-level weighting schemes, our method can capture the diversity between individual examples within each class. Compared with existing individual-level weighting methods using meta-learning that require three backward propagations for computing mini-batch stochastic gradients, our method is more efficient with only one backward propagation at each iteration as in standard deep learning methods. To balance between the learning of feature extraction layers and the learning of the classifier layer, we employ a two-stage method that uses SGD for pretraining followed by ABSGD for learning a robust classifier and finetuning lower layers. Our empirical studies on several benchmark datasets demonstrate the effectiveness of the proposed method. Comment: 29 pages, 10 figures.
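The weighting rule described in the abstract above, a weight proportional to the exponential of a scaled loss, can be sketched as follows. This is a hedged illustration rather than the ABSGD algorithm itself: the abstract does not specify how the weights are normalized, so normalizing within the mini-batch is an assumption, as are the names `attentional_weights`, `weighted_gradient`, and `lam` (the scaling factor). The momentum update and the two-stage schedule are left out.

```python
import math


def attentional_weights(losses, lam=1.0):
    """Weights proportional to exp(loss / lam).

    Assumption: weights are normalized to sum to 1 within the mini-batch.
    Subtracting the maximum loss first avoids overflow and does not change
    the normalized result.
    """
    m = max(losses)
    exps = [math.exp((loss - m) / lam) for loss in losses]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_gradient(per_example_grads, losses, lam=1.0):
    """Combine per-example gradients using the loss-based weights."""
    weights = attentional_weights(losses, lam)
    dim = len(per_example_grads[0])
    return [
        sum(w * g[k] for w, g in zip(weights, per_example_grads))
        for k in range(dim)
    ]


# Hypothetical mini-batch: the third example has the largest loss,
# so it receives the largest weight.
weights = attentional_weights([0.2, 0.5, 2.0], lam=1.0)
```

Under this sketch a small `lam` concentrates the weight on the hardest examples, while a large `lam` approaches uniform averaging, as in plain mini-batch SGD.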

arXiv.org e-Print Archive

Learning speech embeddings for speaker adaptation and speech understanding

Author: Sari Leda
Publication date: 01/05/2021

In recent years, deep neural network models have gained popularity as a modeling approach for many speech processing tasks including automatic speech recognition (ASR) and spoken language understanding (SLU). In this dissertation, there are two main goals. The first goal is to propose modeling approaches in order to learn speaker embeddings for speaker adaptation or to learn semantic speech embeddings. The second goal is to introduce training objectives that achieve fairness for the ASR and SLU problems. In the case of speaker adaptation, we introduce an auxiliary network to an ASR model and learn to simultaneously detect speaker changes and adapt to the speaker in an unsupervised way. We show that this joint model leads to lower error rates as compared to a two-step approach where the signal is segmented into single speaker regions and then fed into an adaptation model. We then reformulate the speaker adaptation problem from a counterfactual fairness point-of-view and introduce objective functions to match the ASR performance of the individuals in the dataset to that of their counterfactual counterparts. We show that we can achieve a lower error rate in an ASR system while reducing the performance disparity between protected groups. In the second half of the dissertation, we focus on SLU and tackle two problems associated with SLU datasets. The first SLU problem is the lack of large speech corpora. To handle this issue, we propose to use available non-parallel text data so that we can leverage the information in text to guide learning of the speech embeddings. We show that this technique increases the intent classification accuracy as compared to a speech-only system. The second SLU problem is the label imbalance problem in the datasets, which is also related to fairness since a model trained on skewed data usually leads to biased results. To achieve fair SLU, we propose to maximize the F-measure instead of conventional cross-entropy minimization and show that it is possible to increase the number of classes with nonzero recall. In the last two chapters, we provide additional discussions on the impact of these projects from both technical and social perspectives, propose directions for future research, and summarize the findings.

Illinois Digital Environment for Access to Learning and Scholarship Repository

Recommended from our members

Spatio-Temporal Information Extraction Under Uncertainty Using Multi-Source Data Integration and Machine Learning: Applications To Human Settlement Modelling

Author: Uhl Johannes H.
Publication venue: University of Colorado Boulder
Publication date: 15/06/2019

Due to advances in information and communication technology, new ways of acquisition, storage, and analysis of digital data have emerged. This constitutes new opportunities, but also imposes challenges for many scientific disciplines, including the geospatial sciences, where the availability, accessibility, and spatio-temporal granularity and coverage of environmental, geographic, and socioeconomic data are steadily increasing. Multi-source data measuring identical or related processes typically increase the reliability of knowledge derived but also lead to higher levels of discrepancies. In order to fully benefit from the value of such multi-source data, the contained information needs to be extracted effectively and efficiently, employing adequate data integration, mining, and analysis techniques. This work demonstrates how the integration of coherent multi-source geospatial data supports information extraction and analysis to generate new knowledge of both the data itself and the underlying phenomenon, exemplified by the spatio-temporal distribution of human settlements. I present three applications in the field of human settlement modelling where data integration is a key component for knowledge acquisition. These three applications consist of i) a deep-learning based classification framework for fully automated extraction of built-up areas from historical maps in the spatial domain, ii) a machine-learning based time series classification framework for estimating changes in built-up areas in the temporal domain, based on multispectral remote sensing time series data, and iii) a novel framework for an in-depth accuracy assessment of model-generated data, exemplified by the Global Human Settlement Layer, for a detailed analysis of data uncertainty in the spatio-temporal domain, as well as across different scales and aggregation levels, attempting to quantify the fitness-for-use of such data.

CU Scholar Institutional Repository