681 research outputs found
Multitask Active Learning for Graph Anomaly Detection
In the web era, graph machine learning has been widely used on ubiquitous
graph-structured data. As a pivotal component for bolstering web security and
enhancing the robustness of graph-based applications, the significance of graph
anomaly detection is continually increasing. While Graph Neural Networks (GNNs)
have demonstrated efficacy in supervised and semi-supervised graph anomaly
detection, their performance is contingent upon the availability of sufficient
ground truth labels. The labor-intensive nature of identifying anomalies from
complex graph structures poses a significant challenge in real-world
applications. Despite that, the indirect supervision signals from other tasks
(e.g., node classification) are relatively abundant. In this paper, we propose
a novel MultItask acTIve Graph Anomaly deTEction framework, namely MITIGATE.
Firstly, by coupling node classification tasks, MITIGATE obtains the capability
to detect out-of-distribution nodes without known anomalies. Secondly, MITIGATE
quantifies the informativeness of nodes by the confidence difference across
tasks, allowing samples with conflicting predictions to provide informative yet
not excessively challenging information for subsequent training. Finally, to
enhance the likelihood of selecting representative nodes that are distant from
known patterns, MITIGATE adopts a masked aggregation mechanism for distance
measurement, considering both inherent features of nodes and current labeled
status. Empirical studies on four datasets demonstrate that MITIGATE
significantly outperforms the state-of-the-art methods for anomaly detection.
Our code is publicly available at: https://github.com/AhaChang/MITIGATE.
Comment: Preprint. Under review.
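The abstract above describes scoring node informativeness by the confidence difference across tasks. The following is only a minimal sketch of that idea, not the MITIGATE implementation: it assumes confidence is the largest class probability of each task head, and the names `confidence_gap` and `select_queries` are hypothetical.

```python
def confidence(probs):
    """Confidence of one prediction: its largest class probability (an assumption)."""
    return max(probs)


def confidence_gap(cls_probs, anomaly_probs):
    """Per-node absolute difference between the two tasks' confidences."""
    return [abs(confidence(c) - confidence(a))
            for c, a in zip(cls_probs, anomaly_probs)]


def select_queries(cls_probs, anomaly_probs, labeled, budget):
    """Pick up to `budget` unlabeled nodes with the largest confidence gap."""
    gaps = confidence_gap(cls_probs, anomaly_probs)
    candidates = [i for i in range(len(gaps)) if i not in labeled]
    # Largest gap first: the two task heads disagree most on these nodes.
    candidates.sort(key=lambda i: gaps[i], reverse=True)
    return candidates[:budget]


if __name__ == "__main__":
    # Toy probabilities for four nodes: classification head, anomaly head.
    cls_probs = [[0.9, 0.1], [0.5, 0.5], [0.6, 0.4], [0.99, 0.01]]
    anomaly_probs = [[0.6, 0.4], [0.5, 0.5], [0.7, 0.3], [0.51, 0.49]]
    print(select_queries(cls_probs, anomaly_probs, labeled={3}, budget=2))
```

The masked aggregation distance measure mentioned in the abstract is not modeled here; the sketch covers only the cross-task confidence score.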
Achieving Hate Speech Detection in a Low Resource Setting
Online social networks provide people with convenient platforms to communicate and share life moments. However, because of the anonymity these social media platforms offer, cases of online hate speech are increasing. Hate speech is defined by the Cambridge Dictionary as “public speech that expresses hate or encourages violence towards a person or group based on something such as race, religion, sex, or sexual orientation”. Online hate speech has caused serious harm to legitimate users, including mental or emotional stress, reputational damage, and fear for one’s safety. To protect legitimate online users, automatic hate speech detection techniques are deployed on various social media platforms. However, most existing hate speech detection models require a large amount of labeled data for training. In this thesis, we focus on achieving hate speech detection without using many labeled samples. In particular, we consider three scenarios of hate speech detection and propose three corresponding approaches. (i) When we only have limited labeled data for one social media platform, we fine-tune a pre-trained language model to conduct hate speech detection on that specific platform. (ii) When we have data from several social media platforms, each of which has only a small amount of labeled data, we develop a multitask learning model to detect hate speech on several platforms in parallel. (iii) When we aim to conduct hate speech detection on a new social media platform for which we do not have any labeled data, we propose to use domain adaptation to transfer knowledge from other related social media platforms to the new platform. Empirical studies show that our proposed approaches can achieve good performance on hate speech detection in a low resource setting.
Transductive Auxiliary Task Self-Training for Neural Multi-Task Models
Multi-task learning and self-training are two common ways to improve a
machine learning model's performance in settings with limited training data.
Drawing heavily on ideas from those two approaches, we suggest transductive
auxiliary task self-training: training a multi-task model on (i) a combination
of main and auxiliary task training data, and (ii) test instances with
auxiliary task labels which a single-task version of the model has previously
generated. We perform extensive experiments on 86 combinations of languages and
tasks. Our results show that, on average, transductive auxiliary task
self-training improves absolute accuracy by up to 9.56% over the pure
multi-task model for dependency relation tagging and by up to 13.03% for
semantic tagging.
Comment: Camera ready version, to appear at DeepLo 2019 (EMNLP workshop
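The data flow described in this abstract can be sketched as follows. This is an illustrative toy under stated assumptions, not the authors' system: the single-task model is replaced by a hypothetical most-frequent-label lookup tagger, and `train_lookup_tagger` and `build_multitask_training_set` are made-up names.

```python
from collections import Counter, defaultdict


def train_lookup_tagger(pairs):
    """Toy single-task model: map each token to its most frequent label."""
    counts = defaultdict(Counter)
    for token, label in pairs:
        counts[token][label] += 1
    # Unseen tokens fall back to the overall most frequent label.
    default = Counter(label for _, label in pairs).most_common(1)[0][0]
    table = {tok: c.most_common(1)[0][0] for tok, c in counts.items()}
    return lambda token: table.get(token, default)


def build_multitask_training_set(main_train, aux_train, test_tokens):
    """Assemble the data a multi-task model would be trained on.

    (i) main and auxiliary task training data, plus
    (ii) test instances carrying auxiliary labels predicted by a
         single-task auxiliary model (the transductive self-training step).
    """
    aux_model = train_lookup_tagger(aux_train)
    pseudo_aux = [(tok, aux_model(tok)) for tok in test_tokens]
    return {"main": list(main_train), "aux": list(aux_train) + pseudo_aux}


if __name__ == "__main__":
    main_train = [("dog", "NOUN"), ("runs", "VERB")]
    aux_train = [("dog", "animal"), ("runs", "event"), ("dog", "animal")]
    data = build_multitask_training_set(main_train, aux_train, ["runs", "cat"])
    print(data["aux"][-2:])
```

Only auxiliary task labels are predicted for the test instances; main task labels on the test set are never generated or used.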
Efficient Methods for the Design and Training of Neural Networks
The field of artificial intelligence has seen significant advancements with the development of neural networks, which have numerous applications in computer vision, natural language processing, and speech processing. Despite these advancements, designing and training these networks still pose numerous challenges. This thesis aims to address two critical aspects of neural network development, design and training, within the context of computer vision tasks.
The thesis focuses on three main challenges in the development of neural networks. The first challenge is finding an efficient way to perform architecture search in an extremely large or even unlimited search space. To address this challenge, the thesis proposes a Neural Search-space Evolution (NSE) scheme that enables efficient and effective architecture search in large-scale search spaces. The second challenge is to improve the efficiency of self-supervised learning for model pretraining. To address this challenge, the thesis proposes a combinatorial patches approach that significantly improves the efficiency of self-supervised learning. The third challenge is to develop an efficient and versatile multitask model that can leverage the benefits of large-scale multitask training. To address this challenge, the thesis proposes a Unified model for Human-Centric Perceptions (UniHCP) as a simple and scalable solution for a human-centric perception system that unifies multiple human-centric tasks into a neat, efficient, and scalable model.
The results of this thesis demonstrate the effectiveness of the proposed methods in improving the practicality and performance of neural network design and training. The NSE scheme, combinatorial patches approach, and UniHCP have been tested on a broad range of datasets, tasks, and settings, yielding impressive results. These findings affirm the efficacy of the proposed methods in enhancing the efficiency of the design and training process of neural networks.
From Fully-Supervised, Single-Task to Scarcely-Supervised, Multi-Task Deep Learning for Medical Image Analysis
Image analysis based on machine learning has gained prominence with the advent of deep learning, particularly in medical imaging. To be effective in addressing challenging image analysis tasks, however, conventional deep neural networks require large corpora of annotated training data, which are unfortunately scarce in the medical domain, thus often rendering fully-supervised learning strategies ineffective. This thesis devises a series of novel deep learning methods for use in a variety of medical image analysis applications, ranging from fully-supervised, single-task learning to scarcely-supervised, multi-task learning that makes efficient use of annotated training data. Specifically, its main contributions include (1) fully-supervised, single-task learning for the segmentation of pulmonary lobes from chest CT scans and the analysis of scoliosis from spine X-ray images; (2) supervised, single-task, domain-generalized pulmonary segmentation in chest X-ray images and retinal vasculature segmentation in fundoscopic images; (3) largely-unsupervised, multiple-task learning via deep generative modeling for the joint synthesis and classification of medical image data; and (4) partly-supervised, multiple-task learning for the combined segmentation and classification of chest and spine X-ray images.
Semi-supervised training for automatic speech recognition
State-of-the-art automatic speech recognition (ASR) systems use sequence-level objectives like Connectionist Temporal Classification (CTC) and Lattice-free Maximum Mutual Information (LF-MMI) for training neural network-based acoustic models. These methods are known to be most effective with large datasets containing hundreds or thousands of hours of data. It is difficult to obtain large amounts of supervised data other than in a few major languages like English and Mandarin. It is also difficult to obtain supervised data in a myriad of channel and environmental conditions. On the other hand, large amounts of unsupervised audio can be obtained fairly easily. There are enormous amounts of unsupervised data available in broadcast TV, call centers and YouTube for many different languages and in many environmental conditions. The goal of this research is to discover how to best leverage the available unsupervised data for training acoustic models for ASR.
In the first part of this thesis, we extend the Maximum Mutual Information (MMI) training to the semi-supervised training scenario. We show that maximizing Negative Conditional Entropy (NCE) over lattices from unsupervised data, along with state-level Minimum Bayes Risk (sMBR) on supervised data, in a multi-task architecture gives word error rate (WER) improvements without needing any confidence-based filtering.
In the second part of this thesis, we investigate using lattice-based supervision as the numerator graph to incorporate uncertainties in unsupervised data in the LF-MMI training framework. We explore various aspects of creating the numerator graph, including splitting lattices for minibatch training, applying tolerance to frame-level alignments, pruning beam sizes, word LM scale and inclusion of pronunciation variants. We show that the WER recovery rate (WRR) of our proposed approach is 5-10% absolute better than that of the baseline of using the 1-best transcript as supervision, and is stable in the 40-60% range even on large-scale setups and multiple different languages.
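The abstract reports WER recovery rate (WRR) without defining it. A small sketch of one commonly used definition follows, stated here as an assumption rather than taken from the thesis: the fraction of the WER gap between a supervised-only baseline and an oracle system (trained with true transcripts for the unsupervised data) that semi-supervised training closes. The function name and the example numbers are hypothetical.

```python
def wer_recovery_rate(baseline_wer, semisup_wer, oracle_wer):
    """Fraction of the baseline-to-oracle WER gap recovered (assumed definition).

    baseline_wer: WER of the model trained on supervised data only.
    semisup_wer:  WER after adding unsupervised data.
    oracle_wer:   WER if the unsupervised data had true transcripts.
    """
    gap = baseline_wer - oracle_wer
    if gap <= 0:
        raise ValueError("oracle WER must be lower than baseline WER")
    return (baseline_wer - semisup_wer) / gap


if __name__ == "__main__":
    # Made-up WERs, in percent, purely for illustration.
    print(wer_recovery_rate(20.0, 17.0, 15.0))
```

Under this definition, a WRR of 0 means no improvement over the baseline and a WRR of 1 means the semi-supervised system matches the oracle.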
Finally, we explore transfer learning for the scenario where we have unsupervised data in a mismatched domain. First, we look at the teacher-student learning approach for cases where parallel data is available in source and target domains. Here, we train a "student" neural network on the target domain to mimic a "teacher" neural network on the source domain data, but using sequence-level posteriors instead of the traditional approach of using frame-level posteriors. We show that the proposed approach is very effective at dealing with acoustic domain mismatch in multiple scenarios of unsupervised domain adaptation: clean to noisy speech, 8kHz to 16kHz speech, and close-talk microphone to distant microphone. Second, we investigate approaches to mitigate language domain mismatch, and show that a matched language model significantly improves WRR. We finally show that our proposed semi-supervised transfer learning approach works effectively even on large-scale unsupervised datasets with 2000 hours of audio in natural and realistic conditions.