1,024 research outputs found

    Generation-Distillation for Efficient Natural Language Understanding in Low-Data Settings

    Over the past year, the emergence of transfer learning with large-scale language models (LM) has led to dramatic performance improvements across a broad range of natural language understanding tasks. However, the size and memory footprint of these large LMs make them difficult to deploy in many scenarios (e.g. on mobile phones). Recent research points to knowledge distillation as a potential solution, showing that when training data for a given task is abundant, it is possible to distill a large (teacher) LM into a small task-specific (student) network with minimal loss of performance. However, when such data is scarce, there remains a significant performance gap between large pretrained LMs and smaller task-specific models, even when training via distillation. In this paper, we bridge this gap with a novel training approach, called generation-distillation, that leverages large finetuned LMs in two ways: (1) to generate new (unlabeled) training examples, and (2) to distill their knowledge into a small network using these examples. Across three low-resource text classification datasets, we achieve comparable performance to BERT while using 300x fewer parameters, and we outperform prior approaches to distillation for text classification while using 3x fewer parameters. Comment: EMNLP 2019 Workshop on Deep Learning for Low-resource NL
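
The distillation step in the abstract above can be illustrated with a common soft-target formulation: the student is trained to match the teacher's temperature-softened class probabilities. This is only a minimal sketch. The abstract does not state the exact loss, so the cross-entropy form, the `temperature` parameter, and the function names `softmax` and `distillation_loss` are assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, softened by a temperature (assumed parameter)."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between teacher and student soft targets (hypothetical helper).

    The loss is smallest when the student's softened distribution
    equals the teacher's.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -sum(pt * math.log(ps) for pt, ps in zip(p_teacher, p_student))


# Usage: the teacher labels an unlabeled (possibly generated) example with
# logits, and the student is penalized for disagreeing with them.
teacher = [2.0, 0.0, -1.0]
agreeing_student = [2.0, 0.0, -1.0]
disagreeing_student = [-1.0, 0.0, 2.0]
print(distillation_loss(agreeing_student, teacher))
print(distillation_loss(disagreeing_student, teacher))
```

In this setup the generated examples need no gold labels, because the teacher's output distribution serves as the training target.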

    QuAC : Question Answering in Context

    We present QuAC, a dataset for Question Answering in Context that contains 14K information-seeking QA dialogs (100K questions in total). The dialogs involve two crowd workers: (1) a student who poses a sequence of freeform questions to learn as much as possible about a hidden Wikipedia text, and (2) a teacher who answers the questions by providing short excerpts from the text. QuAC introduces challenges not found in existing machine comprehension datasets: its questions are often more open-ended, unanswerable, or only meaningful within the dialog context, as we show in a detailed qualitative evaluation. We also report results for a number of reference models, including a recently state-of-the-art reading comprehension architecture extended to model dialog context. Our best model underperforms humans by 20 F1, suggesting that there is significant room for future work on this data. Dataset, baseline, and leaderboard available at http://quac.ai. Comment: EMNLP Camera Read

    Technical Note: PDE-constrained Optimization Formulation for Tumor Growth Model Calibration

    We discuss solution algorithms for calibrating a tumor growth model using imaging data, posed as a deterministic inverse problem. The forward model consists of a nonlinear and time-dependent reaction-diffusion partial differential equation (PDE) whose unknown parameters (diffusivity and proliferation rate) are spatial fields. We use a dimension-independent, globalized, inexact Newton Conjugate Gradient algorithm to solve the PDE-constrained optimization problem. The required gradient and Hessian actions are also presented using the adjoint method and Lagrangian formalism.
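
The Lagrangian and adjoint construction mentioned in the abstract above can be sketched as follows. The abstract does not give the equations, so the specific forms here are assumptions: a logistic reaction term, a least-squares misfit against imaging data $d$, a generic regularizer $R$, and a homogeneous Neumann boundary condition. The symbols $u$ (tumor cell density), $D$ (diffusivity), $\rho$ (proliferation rate), and $p$ (adjoint variable) are chosen for illustration.

```latex
% Assumed forward model (logistic reaction-diffusion):
\begin{align}
  \partial_t u - \nabla\cdot(D\nabla u) - \rho\,u(1-u) &= 0
      && \text{in } \Omega\times(0,T], \\
  D\nabla u\cdot n &= 0 && \text{on } \partial\Omega\times(0,T], \\
  u(\cdot,0) &= u_0 && \text{in } \Omega.
\end{align}

% Assumed objective and Lagrangian:
\begin{align}
  J(u,D,\rho) &= \tfrac12\int_0^T\!\!\int_\Omega (u-d)^2\,dx\,dt + R(D,\rho), \\
  \mathcal{L}(u,D,\rho,p) &= J(u,D,\rho)
      + \int_0^T\!\!\int_\Omega p\,\bigl[\partial_t u - \nabla\cdot(D\nabla u)
      - \rho\,u(1-u)\bigr]\,dx\,dt.
\end{align}

% Setting the variation of L with respect to u to zero and integrating by
% parts in time and space gives the adjoint equation, solved backward in time:
\begin{align}
  -\partial_t p - \nabla\cdot(D\nabla p) - \rho\,(1-2u)\,p &= -(u-d)
      && \text{in } \Omega\times[0,T), \\
  D\nabla p\cdot n &= 0 && \text{on } \partial\Omega\times[0,T), \\
  p(\cdot,T) &= 0 && \text{in } \Omega.
\end{align}

% Gradients with respect to the parameter fields:
\begin{align}
  g_D(x) &= \int_0^T \nabla u\cdot\nabla p\,dt + \partial_D R, \\
  g_\rho(x) &= -\int_0^T p\,u(1-u)\,dt + \partial_\rho R.
\end{align}
```

Under these assumptions, each gradient evaluation costs one forward solve and one adjoint solve, independent of the number of discretized parameters.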

    New prospects for computational hydraulics by leveraging high-performance heterogeneous computing techniques

    In the last two decades, computational hydraulics has undergone rapid development following the advancement of data acquisition and computing technologies. Using a finite-volume Godunov-type hydrodynamic model, this work demonstrates the promise of modern high-performance computing technology to achieve real-time flood modeling at a regional scale. The software is implemented for high-performance heterogeneous computing using the OpenCL programming framework, and developed to support simulations across multiple GPUs using a domain decomposition technique and across multiple systems through an efficient implementation of the Message Passing Interface (MPI) standard. The software is applied to a convective-storm-induced flood event in Newcastle upon Tyne, demonstrating high computational performance across a GPU cluster and good agreement with crowd-sourced observations. Issues relating to data availability, complex urban topography and differences in drainage capacity affect results for a small number of areas.
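
To show what a finite-volume Godunov-type update looks like, here is a minimal one-dimensional sketch for the shallow water equations on a flat, frictionless bed. It is not the software described above: the abstract does not name the Riemann solver or the discretization details, so the Rusanov (local Lax-Friedrichs) flux, the first-order update, the transmissive boundaries, and the names `physical_flux`, `rusanov_flux`, and `step` are all assumptions made for illustration.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def velocity(h, hu):
    """Depth-averaged velocity, with a guard for dry cells."""
    return hu / h if h > 0.0 else 0.0


def physical_flux(h, hu):
    """Shallow water flux for the state (h, hu) on a flat bed."""
    u = velocity(h, hu)
    return (hu, hu * u + 0.5 * G * h * h)


def rusanov_flux(left, right):
    """Approximate interface flux between two neighbouring cell states."""
    (hl, hul), (hr, hur) = left, right
    fl, fr = physical_flux(hl, hul), physical_flux(hr, hur)
    # Largest local wave speed |u| + sqrt(g h) on either side of the interface.
    smax = max(abs(velocity(hl, hul)) + math.sqrt(G * max(hl, 0.0)),
               abs(velocity(hr, hur)) + math.sqrt(G * max(hr, 0.0)))
    return (0.5 * (fl[0] + fr[0]) - 0.5 * smax * (hr - hl),
            0.5 * (fl[1] + fr[1]) - 0.5 * smax * (hur - hul))


def step(cells, dx, dt):
    """Advance all cells by one time step; cells is a list of (h, hu) tuples."""
    padded = [cells[0]] + list(cells) + [cells[-1]]  # transmissive ghost cells
    fluxes = [rusanov_flux(padded[i], padded[i + 1])
              for i in range(len(padded) - 1)]
    ratio = dt / dx
    return [(h - ratio * (fluxes[i + 1][0] - fluxes[i][0]),
             hu - ratio * (fluxes[i + 1][1] - fluxes[i][1]))
            for i, (h, hu) in enumerate(cells)]


# Usage: an idealized dam break, 2 m of still water next to 1 m of still water.
cells = [(2.0, 0.0)] * 20 + [(1.0, 0.0)] * 20
for _ in range(10):
    cells = step(cells, dx=1.0, dt=0.05)
print([round(h, 3) for h, _ in cells[15:25]])
```

Because each cell update depends only on its two interface fluxes, the loop over cells is independent per cell, which is the property that makes this family of schemes amenable to GPU execution and domain decomposition.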

    The Republic Of Adaria V. The Republic Of Bobbia, Kingdom Of Cazalia, Commonwealth Of Dingoth, State Of Ephraim, And Kingdom Of Finbar

    The Republic of Adaria, the Republic of Bobbia, the Kingdom of Cazalia, the Commonwealth of Dingoth, the State of Ephraim, and the Kingdom of Finbar submit the present dispute to this Court by Special Agreement, dated September 1, 2006, pursuant to article 40(1) of the Court's Statute.

    The influence of tropospheric biennial oscillation on mid-tropospheric CO_2

    Mid-tropospheric CO_2 retrieved from the Atmospheric Infrared Sounder (AIRS) was used to investigate CO_2 interannual variability over the Indo-Pacific region. A signal with periodicity around two years was found for the AIRS mid-tropospheric CO_2 for the first time, which is related to the Tropospheric Biennial Oscillation (TBO) associated with the strength of the monsoon. During a strong (weak) monsoon year, the Western Walker Circulation is strong (weak), resulting in enhanced (diminished) CO_2 transport from the surface to the mid-troposphere. As a result, there are positive (negative) CO_2 anomalies in the mid-troposphere over the Indo-Pacific region. We simulated the influence of the TBO on the mid-tropospheric CO_2 over the Indo-Pacific region using the MOZART-2 model, and results were consistent with observations, although we found the TBO signal in the model CO_2 to be smaller than that in the AIRS observations.

    Superconductivity and Field-Induced Magnetism in Pr_{2-x}Ce_xCuO_4 Single Crystals

    We report muon-spin rotation/relaxation (muSR) measurements on single crystals of the electron-doped high-T_c superconductor Pr_{2-x}Ce_xCuO_4. In zero external magnetic field, superconductivity is found to coexist with Cu spins that are static on the muSR time scale. In an applied field, we observe a Knight shift that is primarily due to the magnetic moment induced on the Pr ions. Below the superconducting transition temperature T_c, an additional source of static magnetic order appears throughout the sample. This finding is consistent with antiferromagnetic ordering of the Cu spins in the presence of vortices. We also find that the temperature dependence of the in-plane magnetic penetration depth in the vortex state resembles that of the hole-doped cuprates at temperatures above ~ 0.2 T_c. Comment: 4 pages, 5 figure

    The Impact of New Estimates of Mixing Ratio and Flux-based Halogen Scenarios on Ozone Evolution

    The evolution of ozone in the 21st century has been shown to be mainly impacted by the halogen emissions scenario and predicted changes in the circulation of the stratosphere. New estimates of mixing ratio and flux-based emission scenarios have been produced from the SPARC Lifetime Assessment 2013. Simulations using the Goddard Earth Observing System Chemistry-Climate Model (GEOSCCM) are conducted using this new A1 2014 halogen scenario and compared to ones using the A1 2010 scenario. This updated version of GEOSCCM includes a realistic representation of the Quasi-Biennial Oscillation and improvements related to the breakup of the Antarctic polar vortex. We will present results for the ozone evolution over the recent past and the 21st century under the A1 2010, A1 2014 mixing-ratio, and A1 2014 flux-based halogen scenarios. Implications of the uncertainties in these estimates, as well as those from possible circulation changes, will be discussed.