Competence-based Curriculum Learning for Neural Machine Translation
Current state-of-the-art NMT systems use large neural networks that are not
only slow to train, but also often require many heuristics and optimization
tricks, such as specialized learning rate schedules and large batch sizes. This
is undesirable as it requires extensive hyperparameter tuning. In this paper,
we propose a curriculum learning framework for NMT that reduces training time,
reduces the need for specialized heuristics or large batch sizes, and results
in overall better performance. Our framework consists of a principled way of
deciding which training samples are shown to the model at different times
during training, based on the estimated difficulty of a sample and the current
competence of the model. Filtering training samples in this manner prevents the
model from getting stuck in bad local optima, making it converge faster and
reach a better solution than the common approach of uniformly sampling training
examples. Furthermore, the proposed method can be easily applied to existing
NMT models by simply modifying their input data pipelines. We show that our
framework can help improve the training time and the performance of both
recurrent neural network models and Transformers, achieving up to a 70%
decrease in training time, while at the same time obtaining accuracy
improvements of up to 2.2 BLEU.
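The sampling scheme the abstract describes can be illustrated in a few lines: a competence schedule grows the fraction of the difficulty-sorted training data the model is allowed to draw from, and each batch is sampled uniformly from that eligible prefix. The square-root schedule and random per-example difficulties below are illustrative choices, not the paper's exact configuration:

```python
import numpy as np

def competence(t, T, c0=0.01):
    """Square-root competence schedule: fraction of the difficulty-sorted
    training data the model may see at step t. Starts at c0, reaches 1 at T."""
    return min(1.0, np.sqrt(t * (1 - c0 ** 2) / T + c0 ** 2))

def sample_batch(examples, difficulties, t, T, batch_size, rng):
    """Sample uniformly from examples whose difficulty CDF value is at most
    the model's current competence."""
    c = competence(t, T)
    # Double argsort turns raw difficulty scores into CDF values in [0, 1].
    cdf = np.argsort(np.argsort(difficulties)) / (len(difficulties) - 1)
    eligible = np.flatnonzero(cdf <= c)
    idx = rng.choice(eligible, size=batch_size, replace=True)
    return [examples[i] for i in idx]

rng = np.random.default_rng(0)
examples = [f"sent_{i}" for i in range(100)]
difficulties = rng.random(100)   # e.g. sentence length or word rarity
early = sample_batch(examples, difficulties, t=1, T=1000, batch_size=8, rng=rng)
late = sample_batch(examples, difficulties, t=1000, T=1000, batch_size=8, rng=rng)
```

Early in training only the easiest few examples are eligible; by step T the sampler reduces to uniform sampling over the whole corpus, which is why the method plugs into an existing NMT model by touching only the input data pipeline.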
Benchmarking Robustness to Adversarial Image Obfuscations
Automated content filtering and moderation is an important tool that allows
online platforms to build thriving user communities that facilitate cooperation
and prevent abuse. Unfortunately, resourceful actors try to bypass automated
filters in a bid to post content that violates platform policies and codes of
conduct. To reach this goal, these malicious actors may obfuscate policy
violating images (e.g. overlay harmful images by carefully selected benign
images or visual patterns) to prevent machine learning models from reaching the
correct decision. In this paper, we invite researchers to tackle this specific
issue and present a new image benchmark. This benchmark, based on ImageNet,
simulates the type of obfuscations created by malicious actors. It goes beyond
ImageNet-C and ImageNet-P by proposing general,
drastic, adversarial modifications that preserve the original content intent.
It aims to tackle a more common adversarial threat than the one considered by
ℓp-norm bounded adversaries. We evaluate 33 pretrained models on the
benchmark and train models with different augmentations, architectures and
training methods on subsets of the obfuscations to measure generalization. We
hope this benchmark will encourage researchers to test their models and methods
and try to find new approaches that are more robust to these obfuscations.
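The overlay-style obfuscations the abstract mentions (blending a benign image or visual pattern over a harmful one) can be approximated with simple alpha blending. The functions below are an illustrative sketch, not the benchmark's actual transformations; the names `overlay_obfuscation` and `robust_accuracy` are ours:

```python
import numpy as np

def overlay_obfuscation(image, pattern, alpha=0.6):
    """Alpha-blend a benign pattern over an image: a simplified stand-in
    for the overlay-style obfuscations described above."""
    blended = (1 - alpha) * image.astype(np.float32) + alpha * pattern.astype(np.float32)
    return np.clip(blended, 0, 255).astype(np.uint8)

def robust_accuracy(predict, images, labels, obfuscate):
    """Fraction of obfuscated images that `predict` still classifies correctly."""
    correct = sum(predict(obfuscate(img)) == y for img, y in zip(images, labels))
    return correct / len(labels)

# Toy usage: blend a constant 'benign' pattern over random 8x8 images.
rng = np.random.default_rng(0)
images = [rng.integers(0, 256, (8, 8, 3), dtype=np.uint8) for _ in range(4)]
pattern = np.full((8, 8, 3), 255, dtype=np.uint8)
obfuscated = overlay_obfuscation(images[0], pattern)
```

Measuring accuracy on obfuscated copies rather than clean inputs is exactly the gap the benchmark is designed to expose: a model can score well on ImageNet while failing badly under such content-preserving modifications.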
Subthalamic Nucleus and Sensorimotor Cortex Activity During Speech Production
The sensorimotor cortex is somatotopically organized to represent the vocal tract articulators such as the lips, tongue, larynx, and jaw. How speech and articulatory features are encoded at the subcortical level, however, remains largely unknown. We analyzed LFP recordings from the subthalamic nucleus (STN) and simultaneous electrocorticography recordings from the sensorimotor cortex of 11 human subjects (1 female) with Parkinson's disease during implantation of deep-brain stimulation (DBS) electrodes while they read aloud three-phoneme words. The initial phonemes involved either articulation primarily with the tongue (coronal consonants) or the lips (labial consonants). We observed significant increases in high-gamma (60-150 Hz) power in both the STN and the sensorimotor cortex that began before speech onset and persisted for the duration of speech articulation. As expected from previous reports, in the sensorimotor cortex, the primary articulators involved in the production of the initial consonants were topographically represented by high-gamma activity. We found that STN high-gamma activity also demonstrated specificity for the primary articulator, although no clear topography was observed. In general, subthalamic high-gamma activity varied along the ventral-dorsal trajectory of the electrodes, with greater high-gamma power recorded in the dorsal locations of the STN. Interestingly, the majority of significant articulator-discriminative activity in the STN occurred before that in sensorimotor cortex. These results demonstrate that articulator-specific speech information is contained within high-gamma activity of the STN, but with different spatial and temporal organization compared with similar information encoded in the sensorimotor cortex.

Authors: Anna Chrabaszcz (University of Pittsburgh, USA); Wolf Julian Neumann (Universität zu Berlin, Germany); Otilia Stretcu (University of Pittsburgh, USA); Witold J. Lipski (University of Pittsburgh, USA); Christina A. Dastolfo-Hromack (University of Pittsburgh, USA); Alan Bush (University of Pittsburgh, USA; Instituto de Física de Buenos Aires, CONICET / Universidad de Buenos Aires, Argentina); Dengyu Wang (Tsinghua University, China; University of Pittsburgh, USA); Donald J. Crammond (University of Pittsburgh, USA); Susan Shaiman (University of Pittsburgh, USA); Michael W. Dickey (University of Pittsburgh, USA); Lori L. Holt (University of Pittsburgh, USA); Robert S. Turner (University of Pittsburgh, USA); Julie A. Fiez (University of Pittsburgh, USA); R. Mark Richardson (University of Pittsburgh, USA)
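Extracting high-gamma (60-150 Hz) power from a raw recording is conventionally done with a band-pass filter followed by a Hilbert-envelope computation. The sketch below shows that standard approach; it is not necessarily the exact preprocessing pipeline used in the study:

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def high_gamma_power(signal, fs, band=(60.0, 150.0), order=4):
    """Band-pass the recording in the high-gamma range, then take the
    squared analytic-signal envelope as instantaneous power."""
    nyq = fs / 2.0
    b, a = butter(order, [band[0] / nyq, band[1] / nyq], btype="band")
    filtered = filtfilt(b, a, signal)          # zero-phase filtering
    return np.abs(hilbert(filtered)) ** 2      # envelope squared = power

# Toy check: a 100 Hz oscillation carries high-gamma power; a 10 Hz one does not.
fs = 1000.0
t = np.arange(0, 1, 1 / fs)
hg_power = high_gamma_power(np.sin(2 * np.pi * 100 * t), fs)
lf_power = high_gamma_power(np.sin(2 * np.pi * 10 * t), fs)
```

Zero-phase filtering (`filtfilt`) matters here because the study compares the timing of articulator-discriminative activity across regions, and a causal filter would shift the apparent onset.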
Contextual Parameter Generation for Knowledge Graph Link Prediction
We consider the task of knowledge graph link prediction. Given a question consisting of a source entity and a relation (e.g., Shakespeare and BornIn), the objective is to predict the most likely answer entity (e.g., England). Recent approaches tackle this problem by learning entity and relation embeddings. However, they often constrain the relationship between these embeddings to be additive (i.e., the embeddings are concatenated and then processed by a sequence of linear functions and element-wise non-linearities). We show that this type of interaction significantly limits representational power. For example, such models cannot handle cases where a different projection of the source entity is used for each relation. We propose to use contextual parameter generation to address this limitation. More specifically, we treat relations as the context in which source entities are processed to produce predictions, by using relation embeddings to generate the parameters of a model operating over source entity embeddings. This allows models to represent more complex interactions between entities and relations. We apply our method to two existing link prediction methods, including the current state-of-the-art, resulting in significant performance gains and establishing a new state-of-the-art for this task. These gains are achieved while also reducing convergence time by up to 28 times.
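The core idea, using the relation embedding to generate the parameters of the model that processes the source entity, can be sketched with a linear parameter generator. This is a deliberately simplified illustration; the paper applies contextual parameter generation to existing link-prediction architectures rather than to the raw bilinear model shown here:

```python
import numpy as np

rng = np.random.default_rng(0)
n_entities, d_ent, d_rel = 50, 16, 8

E = rng.normal(size=(n_entities, d_ent))   # entity embeddings
R = rng.normal(size=(4, d_rel))            # relation embeddings
# Parameter generator: maps a relation embedding to a full projection
# matrix, so each relation applies its OWN transform to the source entity
# (impossible if entity and relation may only interact additively).
G = rng.normal(size=(d_rel, d_ent * d_ent)) * 0.1

def score_all_entities(source_id, relation_id):
    W = (R[relation_id] @ G).reshape(d_ent, d_ent)  # generated parameters
    h = E[source_id] @ W                            # relation-contextual projection
    return E @ h                                    # score every candidate answer

scores = score_all_entities(source_id=3, relation_id=1)
predicted = int(np.argmax(scores))
```

Because the projection matrix `W` is a function of the relation, changing the relation changes the transform itself, not just an additive offset, which is exactly the extra representational power the abstract argues for.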
Agile Modeling: From Concept to Classifier in Minutes
The application of computer vision to nuanced subjective use cases is
growing. While crowdsourcing has served the vision community well for most
objective tasks (such as labeling a "zebra"), it now falters on tasks where
there is substantial subjectivity in the concept (such as identifying "gourmet
tuna"). However, empowering any user to develop a classifier for their concept
is technically difficult: users are neither machine learning experts, nor have
the patience to label thousands of examples. In reaction, we introduce the
problem of Agile Modeling: the process of turning any subjective visual concept
into a computer vision model through real-time user-in-the-loop interactions.
We instantiate an Agile Modeling prototype for image classification and show
through a user study (N=14) that users can create classifiers with minimal
effort in under 30 minutes. We compare this user-driven process with the
traditional crowdsourcing paradigm and find that the crowd's notion of a
concept often differs from the user's, especially as the concepts become more
subjective. Finally, we scale our experiments with simulations of users
training classifiers for ImageNet21k categories to further demonstrate its
efficacy.
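The real-time user-in-the-loop process can be caricatured as an uncertainty-sampling loop: fit a cheap model on the labels collected so far, then route the examples the model is least sure about back to the user. The sketch below is our simplification, with a scripted labeler standing in for the human; it is not the prototype described in the paper:

```python
import numpy as np

def agile_loop(features, user_label, n_seed=10, n_rounds=5, n_query=5, seed=0):
    """Minimal user-in-the-loop sketch: each round fits a cheap linear
    probe on the labels gathered so far, then asks the user to label the
    examples closest to the decision boundary."""
    rng = np.random.default_rng(seed)
    labeled = {int(i): user_label(int(i))
               for i in rng.choice(len(features), n_seed, replace=False)}
    w = np.zeros(features.shape[1])
    for _ in range(n_rounds):
        idx = list(labeled)
        X, y = features[idx], np.array([labeled[i] for i in idx], dtype=float)
        w, *_ = np.linalg.lstsq(X, y, rcond=None)   # cheap linear probe
        margin = np.abs(features @ w - 0.5)          # distance from boundary
        queried = 0
        for i in np.argsort(margin):                 # most uncertain first
            i = int(i)
            if i not in labeled:
                labeled[i] = user_label(i)
                queried += 1
                if queried == n_query:
                    break
    return w, labeled

# Scripted 'user' whose subjective concept is just a threshold on feature 0.
rng = np.random.default_rng(1)
features = rng.random((200, 4))
concept = lambda i: float(features[i, 0] > 0.5)
w, labeled = agile_loop(features, concept)
```

Keeping the model cheap is the point: the user labels tens of examples rather than the thousands a crowdsourcing pipeline would need, and each round's feedback arrives fast enough to feel interactive.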