159 research outputs found

Binary Quantizer

Authors: Belbahri Mouloud, Nia Vahid Partovi
Publication venue: University of Waterloo (Waterloo, Ontario, Canada)
Publication date: 24/12/2018

One-bit quantization is a general tool to execute a complex model, such as a deep neural network, on a device with limited resources, such as a cell phone. Naively compressing weights into one bit yields an extensive accuracy loss. One-bit models, therefore, require careful re-training. Here we introduce a class of functions devised to be used as a regularizer for re-training one-bit models. Using a regularization function specifically devised for binary quantization avoids heuristic adjustments to the optimization scheme and saves considerable coding effort.
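
The abstract does not spell out the proposed class of functions. As a minimal sketch only, one plausible regularizer of this kind penalizes each weight's distance from the two binary values -alpha and +alpha. The function names, the `alpha` scale, and the exponent `p` below are illustrative assumptions, not the authors' definitions.

```python
def binary_regularizer(weights, alpha=1.0, p=1):
    """Hypothetical penalty: sum of | |w| - alpha | ** p over all weights.

    The penalty is zero exactly when every weight equals +alpha or -alpha,
    so adding it to the training loss pulls weights toward binary values.
    """
    return sum(abs(abs(w) - alpha) ** p for w in weights)


def binarize(weights, alpha=1.0):
    """Naive one-bit quantization: keep only the sign of each weight."""
    return [alpha if w >= 0 else -alpha for w in weights]


weights = [0.9, -1.2, 0.1]
penalty = binary_regularizer(weights)                 # 0.1 + 0.2 + 0.9
penalty_after = binary_regularizer(binarize(weights))  # already binary
```

In re-training, such a term would typically be added to the task loss with a weighting coefficient, so that the optimizer itself drives weights toward the binary values instead of relying on a hand-written projection step.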

Waterloo Library Journal Publishing Service (University of Waterloo, Canada)

Improved neural machine translation systems for low resource correction tasks

Author: Harer Jacob Alexander
Publication date: 14/02/2020

Recent advances in Neural Machine Translation (NMT) systems have achieved impressive results on language translation tasks. However, the success of these systems has been limited when applied to similar low-resource tasks, such as language correction. In these cases, datasets are often small whilst still containing long sequences, leading to significant overfitting and poor generalization. In this thesis we study issues preventing widespread adoption of NMT systems into low-resource tasks, with a special focus on sequence correction for both code and language. We propose two novel techniques for handling these low-resource tasks. The first uses Generative Adversarial Networks to handle datasets without paired data. This technique allows the use of available unpaired datasets, which are typically much larger than paired datasets since they do not require manual annotation. We first develop a methodology for generation of discrete sequences using a Wasserstein Generative Adversarial Network, and then use this methodology to train an NMT system on unpaired data. Our second technique converts sequences into a tree-structured representation, and performs translation from tree to tree. This improves the handling of very long sequences since it reduces the distance between nodes in the network, and allows the network to take advantage of information contained in the tree structure to reduce overfitting.
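
The claim that a tree representation reduces the distance between nodes can be illustrated with a rough comparison. The sketch below assumes an idealized balanced binary tree with the tokens as leaves; real parse trees and syntax trees are not balanced, and the function names are hypothetical, so the numbers indicate only the order of magnitude.

```python
def chain_distance(n_tokens):
    """Steps between the first and last token in a left-to-right sequence."""
    return max(n_tokens - 1, 0)


def balanced_tree_distance(n_tokens):
    """Longest leaf-to-leaf path in a balanced binary tree with n_tokens leaves.

    The tree depth is ceil(log2(n_tokens)); a path between two leaves climbs
    to the root and back down, so it is at most twice the depth.
    """
    if n_tokens <= 1:
        return 0
    depth = (n_tokens - 1).bit_length()  # integer ceil(log2(n_tokens))
    return 2 * depth


seq_steps = chain_distance(1024)
tree_steps = balanced_tree_distance(1024)
```

Under these assumptions the worst-case distance grows linearly with sequence length in the chain but only logarithmically in the tree, which is the intuition behind the improved handling of long sequences.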

Boston University Institutional Repository (OpenBU)

Combating Unknown Bias with Effective Bias-Conflicting Scoring and Gradient Alignment

Authors: Chen Chen, He Anfeng, Wang Qian-Wei, Xia Shu-Tao, Zhao Bowen
Publication date: 27/11/2022

Models notoriously suffer from dataset biases which are detrimental to robustness and generalization. The identify-emphasize paradigm shows a promising effect in dealing with unknown biases. However, we find that it is still plagued by two challenges: A, the quality of the identified bias-conflicting samples is far from satisfactory; B, the emphasizing strategies yield only suboptimal performance. In this work, for challenge A, we propose an effective bias-conflicting scoring method to boost the identification accuracy with two practical strategies: peer-picking and epoch-ensemble. For challenge B, we point out that the gradient contribution statistics can be a reliable indicator to inspect whether the optimization is dominated by bias-aligned samples. Then, we propose gradient alignment, which employs gradient statistics to balance the contributions of the mined bias-aligned and bias-conflicting samples dynamically throughout the learning process, forcing models to leverage intrinsic features to make fair decisions. Experiments are conducted on multiple datasets in various settings, demonstrating that the proposed solution can alleviate the impact of unknown biases and achieve state-of-the-art performance.
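
The abstract does not give the exact statistic or update rule used for gradient alignment. As a rough sketch of the balancing idea only, the example below rescales the loss weight of bias-conflicting samples by the ratio of the two groups' total gradient contributions. The function name, the `gamma` factor, and the per-sample contribution values are all illustrative assumptions, not the paper's formulation.

```python
def alignment_weights(grad_contrib, is_conflicting, gamma=1.0):
    """Hypothetical per-sample loss weights that balance two sample groups.

    grad_contrib   -- per-sample gradient contribution magnitudes (assumed given)
    is_conflicting -- True where a sample was mined as bias-conflicting
    gamma          -- assumed knob for the target balance (1.0 = equal totals)
    """
    aligned = sum(g for g, c in zip(grad_contrib, is_conflicting) if not c)
    conflicting = sum(g for g, c in zip(grad_contrib, is_conflicting) if c)
    if conflicting == 0:
        # Nothing to up-weight; leave every sample at weight 1.
        return [1.0] * len(grad_contrib)
    ratio = gamma * aligned / conflicting
    return [ratio if c else 1.0 for c in is_conflicting]


# Four bias-aligned samples dominate one bias-conflicting sample.
grad_contrib = [0.5, 0.5, 0.5, 0.5, 0.25]
is_conflicting = [False, False, False, False, True]
weights = alignment_weights(grad_contrib, is_conflicting)

# Total weighted contribution of the bias-conflicting group after rescaling.
rebalanced = sum(w * g for w, g, c in zip(weights, grad_contrib, is_conflicting) if c)
```

Recomputing the ratio at each step or epoch would make the balance dynamic, in the spirit of the abstract's description, since the groups' gradient contributions change as training proceeds.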

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications