
    Competence-based Curriculum Learning for Neural Machine Translation

    Current state-of-the-art NMT systems use large neural networks that are not only slow to train, but also often require many heuristics and optimization tricks, such as specialized learning rate schedules and large batch sizes. This is undesirable because it requires extensive hyperparameter tuning. In this paper, we propose a curriculum learning framework for NMT that reduces training time, reduces the need for specialized heuristics or large batch sizes, and results in overall better performance. Our framework consists of a principled way of deciding which training samples are shown to the model at different times during training, based on the estimated difficulty of a sample and the current competence of the model. Filtering training samples in this manner prevents the model from getting stuck in bad local optima, making it converge faster and reach a better solution than the common approach of uniformly sampling training examples. Furthermore, the proposed method can be easily applied to existing NMT models by simply modifying their input data pipelines. We show that our framework can help improve the training time and the performance of both recurrent neural network models and Transformers, achieving up to a 70% decrease in training time while also obtaining accuracy improvements of up to 2.2 BLEU.
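    The mechanism described here is easy to sketch: rank training samples by a difficulty proxy, and at each step sample only from the easiest fraction the model's current competence allows. Below is a minimal Python sketch; the sentence-length difficulty proxy and the square-root competence schedule are illustrative assumptions, not necessarily the paper's exact choices.

        import math
        import random

        def difficulty_cdf(samples, score=len):
            # Rank samples by a difficulty proxy (here: sentence length, an
            # assumption) and map each to its empirical CDF value in (0, 1].
            ranked = sorted(samples, key=score)
            n = len(ranked)
            return [(s, (i + 1) / n) for i, s in enumerate(ranked)]

        def competence(t, total_steps, c0=0.01):
            # Square-root schedule: starts at c0 and reaches 1 at total_steps.
            return min(1.0, math.sqrt(t * (1 - c0 ** 2) / total_steps + c0 ** 2))

        def sample_batch(scored, t, total_steps, batch_size):
            # Only samples whose difficulty CDF is at most the current
            # competence are eligible; sample uniformly within that pool.
            c = competence(t, total_steps)
            pool = [s for s, d in scored if d <= c]
            return random.sample(pool, min(batch_size, len(pool)))

    Because the filter lives entirely in the sampling step, it can wrap an existing input pipeline without touching the model itself, which matches the abstract's claim about ease of adoption.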

    Benchmarking Robustness to Adversarial Image Obfuscations

    Automated content filtering and moderation is an important tool that allows online platforms to build thriving user communities that facilitate cooperation and prevent abuse. Unfortunately, resourceful actors try to bypass automated filters in a bid to post content that violates platform policies and codes of conduct. To reach this goal, these malicious actors may obfuscate policy-violating images (e.g., overlay harmful images with carefully selected benign images or visual patterns) to prevent machine learning models from reaching the correct decision. In this paper, we invite researchers to tackle this specific issue and present a new image benchmark. This benchmark, based on ImageNet, simulates the type of obfuscations created by malicious actors. It goes beyond ImageNet-C and ImageNet-C̄ by proposing general, drastic, adversarial modifications that preserve the original content intent. It aims to tackle a more common adversarial threat than the one considered by ℓp-norm-bounded adversaries. We evaluate 33 pretrained models on the benchmark and train models with different augmentations, architectures, and training methods on subsets of the obfuscations to measure generalization. We hope this benchmark will encourage researchers to test their models and methods and try to find new approaches that are more robust to these obfuscations.
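    As an illustration of the kind of obfuscation the abstract describes, the sketch below alpha-blends a benign overlay onto an image: a human can still recognize the content, but the pixel statistics the model sees change drastically. This is a hypothetical example of one such transform, not the benchmark's actual obfuscation set.

        import numpy as np
        from PIL import Image

        def overlay_obfuscation(image_path, overlay_path, alpha=0.6):
            # Alpha-blend a benign overlay onto the base image. The blend
            # weight alpha controls how strongly the overlay masks the
            # original pixels while the content intent stays recognizable.
            base_img = Image.open(image_path).convert("RGB")
            over_img = Image.open(overlay_path).convert("RGB").resize(base_img.size)
            base = np.asarray(base_img, dtype=np.float32)
            over = np.asarray(over_img, dtype=np.float32)
            blended = (1 - alpha) * base + alpha * over
            return Image.fromarray(blended.astype(np.uint8))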

    Subthalamic Nucleus and Sensorimotor Cortex Activity During Speech Production

    The sensorimotor cortex is somatotopically organized to represent the vocal tract articulators, such as the lips, tongue, larynx, and jaw. How speech and articulatory features are encoded at the subcortical level, however, remains largely unknown. We analyzed LFP recordings from the subthalamic nucleus (STN) and simultaneous electrocorticography recordings from the sensorimotor cortex of 11 human subjects (1 female) with Parkinson's disease during implantation of deep-brain stimulation (DBS) electrodes while they read aloud three-phoneme words. The initial phonemes involved articulation primarily with either the tongue (coronal consonants) or the lips (labial consonants). We observed significant increases in high-gamma (60–150 Hz) power in both the STN and the sensorimotor cortex that began before speech onset and persisted for the duration of speech articulation. As expected from previous reports, in the sensorimotor cortex, the primary articulators involved in the production of the initial consonants were topographically represented by high-gamma activity. We found that STN high-gamma activity also demonstrated specificity for the primary articulator, although no clear topography was observed. In general, subthalamic high-gamma activity varied along the ventral–dorsal trajectory of the electrodes, with greater high-gamma power recorded at the dorsal locations of the STN. Interestingly, the majority of significant articulator-discriminative activity in the STN occurred before that in the sensorimotor cortex. These results demonstrate that articulator-specific speech information is contained within high-gamma activity of the STN, but with different spatial and temporal organization compared with similar information encoded in the sensorimotor cortex.
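    The high-gamma band power that the study centers on is typically extracted by band-pass filtering and taking the analytic-signal envelope. A minimal sketch with SciPy, assuming a standard Butterworth + Hilbert pipeline rather than the authors' exact preprocessing:

        import numpy as np
        from scipy.signal import butter, filtfilt, hilbert

        def high_gamma_power(lfp, fs, band=(60.0, 150.0), order=4):
            # Band-pass the recording to the high-gamma range, then take the
            # Hilbert envelope; squaring gives instantaneous band power.
            # fs must comfortably exceed 300 Hz for the 150 Hz upper edge.
            nyq = fs / 2.0
            b, a = butter(order, [band[0] / nyq, band[1] / nyq], btype="bandpass")
            filtered = filtfilt(b, a, lfp)
            envelope = np.abs(hilbert(filtered))
            return envelope ** 2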

    Contextual Parameter Generation for Knowledge Graph Link Prediction

    We consider the task of knowledge graph link prediction. Given a question consisting of a source entity and a relation (e.g., Shakespeare and BornIn), the objective is to predict the most likely answer entity (e.g., England). Recent approaches tackle this problem by learning entity and relation embeddings. However, they often constrain the relationship between these embeddings to be additive (i.e., the embeddings are concatenated and then processed by a sequence of linear functions and element-wise non-linearities). We show that this type of interaction significantly limits representational power. For example, such models cannot handle cases where a different projection of the source entity is used for each relation. We propose to use contextual parameter generation to address this limitation. More specifically, we treat relations as the context in which source entities are processed to produce predictions, by using relation embeddings to generate the parameters of a model operating over source entity embeddings. This allows models to represent more complex interactions between entities and relations. We apply our method to two existing link prediction methods, including the current state-of-the-art, resulting in significant performance gains and establishing a new state-of-the-art for this task. These gains are achieved while also reducing convergence time by up to 28 times.
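    The core idea, relation embeddings generating the parameters applied to entity embeddings, can be sketched in a few lines of PyTorch. The full dim x dim generated matrix and the dot-product scorer below are simplifying assumptions; the paper applies the technique to existing link-prediction models rather than to this toy scorer.

        import torch
        import torch.nn as nn

        class CPGLinkPredictor(nn.Module):
            def __init__(self, n_entities, n_relations, dim):
                super().__init__()
                self.entity_emb = nn.Embedding(n_entities, dim)
                self.relation_emb = nn.Embedding(n_relations, dim)
                # The generator maps a relation embedding to the weights of a
                # linear map, so each relation induces its own projection of
                # the source entity (impossible with a purely additive model).
                self.param_gen = nn.Linear(dim, dim * dim)
                self.dim = dim

            def forward(self, source, relation):
                e = self.entity_emb(source)                      # (batch, dim)
                w = self.param_gen(self.relation_emb(relation))  # (batch, dim*dim)
                w = w.view(-1, self.dim, self.dim)               # (batch, dim, dim)
                projected = torch.bmm(w, e.unsqueeze(-1)).squeeze(-1)
                # Score all entities as candidate answers via dot product.
                return projected @ self.entity_emb.weight.t()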

    Agile Modeling: From Concept to Classifier in Minutes

    The application of computer vision to nuanced, subjective use cases is growing. While crowdsourcing has served the vision community well for most objective tasks (such as labeling a "zebra"), it now falters on tasks where there is substantial subjectivity in the concept (such as identifying "gourmet tuna"). However, empowering any user to develop a classifier for their concept is technically difficult: users are neither machine learning experts, nor do they have the patience to label thousands of examples. In response, we introduce the problem of Agile Modeling: the process of turning any subjective visual concept into a computer vision model through real-time user-in-the-loop interactions. We instantiate an Agile Modeling prototype for image classification and show through a user study (N=14) that users can create classifiers with minimal effort in under 30 minutes. We compare this user-driven process with the traditional crowdsourcing paradigm and find that the crowd's notion of a concept often differs from the user's, especially as concepts become more subjective. Finally, we scale our experiments with simulations of users training classifiers for ImageNet21k categories to further demonstrate the efficacy of the approach.
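    A user-in-the-loop classifier of the kind described often reduces to a light model over precomputed image embeddings plus uncertainty sampling. The sketch below is a generic active-learning loop under those assumptions, not the prototype's actual implementation; ask_user is a hypothetical stand-in for the real labeling UI, and the loop assumes the user's early labels include both positives and negatives.

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        def agile_loop(embeddings, ask_user, rounds=5, per_round=10):
            # Seed with a random batch of user labels, then repeatedly ask
            # the user about the images the classifier is least sure of.
            labeled = list(np.random.choice(len(embeddings), per_round,
                                            replace=False))
            labels = list(ask_user(labeled))
            clf = LogisticRegression(max_iter=1000)
            for _ in range(rounds):
                clf.fit(embeddings[labeled], labels)
                probs = clf.predict_proba(embeddings)[:, 1]
                # Most uncertain = predicted probability closest to 0.5.
                order = np.argsort(np.abs(probs - 0.5))
                seen = set(labeled)
                new = [i for i in order if i not in seen][:per_round]
                labels += list(ask_user(new))
                labeled += new
            return clf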