
    Competence-based Curriculum Learning for Neural Machine Translation

    Current state-of-the-art NMT systems use large neural networks that are not only slow to train, but also often require many heuristics and optimization tricks, such as specialized learning rate schedules and large batch sizes. This is undesirable because it requires extensive hyperparameter tuning. In this paper, we propose a curriculum learning framework for NMT that reduces training time, reduces the need for specialized heuristics or large batch sizes, and results in overall better performance. Our framework consists of a principled way of deciding which training samples are shown to the model at different times during training, based on the estimated difficulty of a sample and the current competence of the model. Filtering training samples in this manner prevents the model from getting stuck in bad local optima, making it converge faster and reach a better solution than the common approach of uniformly sampling training examples. Furthermore, the proposed method can be easily applied to existing NMT models by simply modifying their input data pipelines. We show that our framework can help improve the training time and the performance of both recurrent neural network models and Transformers, achieving up to a 70% decrease in training time while also obtaining accuracy improvements of up to 2.2 BLEU.
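    The mechanism described here is easy to sketch: rank training samples by a difficulty proxy, and at each step sample only from the easiest fraction the model's current competence allows. Below is a minimal Python sketch; the sentence-length difficulty proxy and the square-root competence schedule are illustrative assumptions, not necessarily the paper's exact choices.

        import math
        import random

        def difficulty_cdf(samples, score=len):
            # Rank samples by a difficulty proxy (here: sentence length, an
            # assumption) and map each to its empirical CDF value in (0, 1].
            ranked = sorted(samples, key=score)
            n = len(ranked)
            return [(s, (i + 1) / n) for i, s in enumerate(ranked)]

        def competence(t, total_steps, c0=0.01):
            # Square-root schedule: starts at c0 and reaches 1 at total_steps.
            return min(1.0, math.sqrt(t * (1 - c0 ** 2) / total_steps + c0 ** 2))

        def sample_batch(scored, t, total_steps, batch_size):
            # Only samples whose difficulty CDF is at most the current
            # competence are eligible; sample uniformly within that pool.
            c = competence(t, total_steps)
            pool = [s for s, d in scored if d <= c]
            return random.sample(pool, min(batch_size, len(pool)))

    Because the filter lives entirely in the sampling step, it can wrap an existing input pipeline without touching the model itself, which matches the abstract's claim about ease of adoption.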

    Benchmarking Robustness to Adversarial Image Obfuscations

    Automated content filtering and moderation is an important tool that allows online platforms to build thriving user communities that facilitate cooperation and prevent abuse. Unfortunately, resourceful actors try to bypass automated filters in a bid to post content that violates platform policies and codes of conduct. To reach this goal, these malicious actors may obfuscate policy-violating images (e.g., overlay harmful images with carefully selected benign images or visual patterns) to prevent machine learning models from reaching the correct decision. In this paper, we invite researchers to tackle this specific issue and present a new image benchmark. This benchmark, based on ImageNet, simulates the type of obfuscations created by malicious actors. It goes beyond ImageNet-C and ImageNet-C̄ by proposing general, drastic, adversarial modifications that preserve the original content intent. It aims to tackle a more common adversarial threat than the one considered by ℓp-norm-bounded adversaries. We evaluate 33 pretrained models on the benchmark and train models with different augmentations, architectures, and training methods on subsets of the obfuscations to measure generalization. We hope this benchmark will encourage researchers to test their models and methods and try to find new approaches that are more robust to these obfuscations.
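    As an illustration of the kind of obfuscation the abstract describes, the sketch below alpha-blends a benign overlay onto an image: a human can still recognize the content, but the pixel statistics the model sees change drastically. This is a hypothetical example of one such transform, not the benchmark's actual obfuscation set.

        import numpy as np
        from PIL import Image

        def overlay_obfuscation(image_path, overlay_path, alpha=0.6):
            # Alpha-blend a benign overlay onto the base image. The blend
            # weight alpha controls how strongly the overlay masks the
            # original pixels while the content intent stays recognizable.
            base_img = Image.open(image_path).convert("RGB")
            over_img = Image.open(overlay_path).convert("RGB").resize(base_img.size)
            base = np.asarray(base_img, dtype=np.float32)
            over = np.asarray(over_img, dtype=np.float32)
            blended = (1 - alpha) * base + alpha * over
            return Image.fromarray(blended.astype(np.uint8))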

    Subthalamic Nucleus and Sensorimotor Cortex Activity During Speech Production

    The sensorimotor cortex is somatotopically organized to represent the vocal tract articulators, such as the lips, tongue, larynx, and jaw. How speech and articulatory features are encoded at the subcortical level, however, remains largely unknown. We analyzed LFP recordings from the subthalamic nucleus (STN) and simultaneous electrocorticography recordings from the sensorimotor cortex of 11 human subjects (1 female) with Parkinson's disease during implantation of deep-brain stimulation (DBS) electrodes while they read aloud three-phoneme words. The initial phonemes involved articulation primarily with either the tongue (coronal consonants) or the lips (labial consonants). We observed significant increases in high-gamma (60–150 Hz) power in both the STN and the sensorimotor cortex that began before speech onset and persisted for the duration of speech articulation. As expected from previous reports, in the sensorimotor cortex, the primary articulators involved in the production of the initial consonants were topographically represented by high-gamma activity. We found that STN high-gamma activity also demonstrated specificity for the primary articulator, although no clear topography was observed. In general, subthalamic high-gamma activity varied along the ventral–dorsal trajectory of the electrodes, with greater high-gamma power recorded at the dorsal locations of the STN. Interestingly, the majority of significant articulator-discriminative activity in the STN occurred before that in the sensorimotor cortex. These results demonstrate that articulator-specific speech information is contained within high-gamma activity of the STN, but with different spatial and temporal organization compared with similar information encoded in the sensorimotor cortex.
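    The high-gamma band power that the study centers on is typically extracted by band-pass filtering and taking the analytic-signal envelope. A minimal sketch with SciPy, assuming a standard Butterworth + Hilbert pipeline rather than the authors' exact preprocessing:

        import numpy as np
        from scipy.signal import butter, filtfilt, hilbert

        def high_gamma_power(lfp, fs, band=(60.0, 150.0), order=4):
            # Band-pass the recording to the high-gamma range, then take the
            # Hilbert envelope; squaring gives instantaneous band power.
            # fs must comfortably exceed 300 Hz for the 150 Hz upper edge.
            nyq = fs / 2.0
            b, a = butter(order, [band[0] / nyq, band[1] / nyq], btype="bandpass")
            filtered = filtfilt(b, a, lfp)
            envelope = np.abs(hilbert(filtered))
            return envelope ** 2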

    Contextual Parameter Generation for Knowledge Graph Link Prediction

    We consider the task of knowledge graph link prediction. Given a question consisting of a source entity and a relation (e.g., Shakespeare and BornIn), the objective is to predict the most likely answer entity (e.g., England). Recent approaches tackle this problem by learning entity and relation embeddings. However, they often constrain the relationship between these embeddings to be additive (i.e., the embeddings are concatenated and then processed by a sequence of linear functions and element-wise non-linearities). We show that this type of interaction significantly limits representational power. For example, such models cannot handle cases where a different projection of the source entity is used for each relation. We propose to use contextual parameter generation to address this limitation. More specifically, we treat relations as the context in which source entities are processed to produce predictions, by using relation embeddings to generate the parameters of a model operating over source entity embeddings. This allows models to represent more complex interactions between entities and relations. We apply our method to two existing link prediction methods, including the current state-of-the-art, resulting in significant performance gains and establishing a new state-of-the-art for this task. These gains are achieved while also reducing convergence time by up to 28 times.
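    The core idea, relation embeddings generating the parameters applied to entity embeddings, can be sketched in a few lines of PyTorch. The full dim x dim generated matrix and the dot-product scorer below are simplifying assumptions; the paper applies the technique to existing link-prediction models rather than to this toy scorer.

        import torch
        import torch.nn as nn

        class CPGLinkPredictor(nn.Module):
            def __init__(self, n_entities, n_relations, dim):
                super().__init__()
                self.entity_emb = nn.Embedding(n_entities, dim)
                self.relation_emb = nn.Embedding(n_relations, dim)
                # The generator maps a relation embedding to the weights of a
                # linear map, so each relation induces its own projection of
                # the source entity (impossible with a purely additive model).
                self.param_gen = nn.Linear(dim, dim * dim)
                self.dim = dim

            def forward(self, source, relation):
                e = self.entity_emb(source)                      # (batch, dim)
                w = self.param_gen(self.relation_emb(relation))  # (batch, dim*dim)
                w = w.view(-1, self.dim, self.dim)               # (batch, dim, dim)
                projected = torch.bmm(w, e.unsqueeze(-1)).squeeze(-1)
                # Score all entities as candidate answers via dot product.
                return projected @ self.entity_emb.weight.t()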

    Agile Modeling: From Concept to Classifier in Minutes

    The application of computer vision to nuanced, subjective use cases is growing. While crowdsourcing has served the vision community well for most objective tasks (such as labeling a "zebra"), it now falters on tasks where there is substantial subjectivity in the concept (such as identifying "gourmet tuna"). However, empowering any user to develop a classifier for their concept is technically difficult: users are neither machine learning experts, nor do they have the patience to label thousands of examples. In response, we introduce the problem of Agile Modeling: the process of turning any subjective visual concept into a computer vision model through real-time user-in-the-loop interactions. We instantiate an Agile Modeling prototype for image classification and show through a user study (N=14) that users can create classifiers with minimal effort in under 30 minutes. We compare this user-driven process with the traditional crowdsourcing paradigm and find that the crowd's notion of a concept often differs from the user's, especially as concepts become more subjective. Finally, we scale our experiments with simulations of users training classifiers for ImageNet21k categories to further demonstrate the efficacy of the approach.
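    A user-in-the-loop classifier of the kind described often reduces to a light model over precomputed image embeddings plus uncertainty sampling. The sketch below is a generic active-learning loop under those assumptions, not the prototype's actual implementation; ask_user is a hypothetical stand-in for the real labeling UI, and the loop assumes the user's early labels include both positives and negatives.

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        def agile_loop(embeddings, ask_user, rounds=5, per_round=10):
            # Seed with a random batch of user labels, then repeatedly ask
            # the user about the images the classifier is least sure of.
            labeled = list(np.random.choice(len(embeddings), per_round,
                                            replace=False))
            labels = list(ask_user(labeled))
            clf = LogisticRegression(max_iter=1000)
            for _ in range(rounds):
                clf.fit(embeddings[labeled], labels)
                probs = clf.predict_proba(embeddings)[:, 1]
                # Most uncertain = predicted probability closest to 0.5.
                order = np.argsort(np.abs(probs - 0.5))
                seen = set(labeled)
                new = [i for i in order if i not in seen][:per_round]
                labels += list(ask_user(new))
                labeled += new
            return clf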