Generation-Distillation for Efficient Natural Language Understanding in Low-Data Settings
Over the past year, the emergence of transfer learning with large-scale
language models (LM) has led to dramatic performance improvements across a
broad range of natural language understanding tasks. However, the size and
memory footprint of these large LMs makes them difficult to deploy in many
scenarios (e.g. on mobile phones). Recent research points to knowledge
distillation as a potential solution, showing that when training data for a
given task is abundant, it is possible to distill a large (teacher) LM into a
small task-specific (student) network with minimal loss of performance.
However, when such data is scarce, there remains a significant performance gap
between large pretrained LMs and smaller task-specific models, even when
training via distillation. In this paper, we bridge this gap with a novel
training approach, called generation-distillation, that leverages large
finetuned LMs in two ways: (1) to generate new (unlabeled) training examples,
and (2) to distill their knowledge into a small network using these examples.
Across three low-resource text classification datasets, we achieve comparable
performance to BERT while using 300x fewer parameters, and we outperform prior
approaches to distillation for text classification while using 3x fewer
parameters. Comment: EMNLP 2019 Workshop on Deep Learning for Low-resource NL
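The two steps named in the abstract above (generate unlabeled examples, then distill the teacher's knowledge on them) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state which distillation loss is used, so the temperature-softened cross-entropy below is an assumption, and `generate`, `teacher`, and `generation_distillation_batch` are hypothetical names standing in for a finetuned generator LM and a finetuned classifier LM.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy of the student's softened predictions against the
    teacher's softened predictions (assumed loss, not from the abstract)."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(p_teacher, p_student))


def generation_distillation_batch(generate, teacher, n_examples):
    """Step (1): sample unlabeled texts from a generator.
    Step (2): label each text with the teacher's logits.
    `generate` and `teacher` are hypothetical callables."""
    texts = [generate() for _ in range(n_examples)]
    return [(text, teacher(text)) for text in texts]
```

A small student network would then be trained to minimize `distillation_loss` over the returned pairs; by Gibbs' inequality the loss is smallest when the student's softened distribution matches the teacher's.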
QuAC : Question Answering in Context
We present QuAC, a dataset for Question Answering in Context that contains
14K information-seeking QA dialogs (100K questions in total). The dialogs
involve two crowd workers: (1) a student who poses a sequence of freeform
questions to learn as much as possible about a hidden Wikipedia text, and (2) a
teacher who answers the questions by providing short excerpts from the text.
QuAC introduces challenges not found in existing machine comprehension
datasets: its questions are often more open-ended, unanswerable, or only
meaningful within the dialog context, as we show in a detailed qualitative
evaluation. We also report results for a number of reference models, including
a recent state-of-the-art reading comprehension architecture extended to
model dialog context. Our best model underperforms humans by 20 F1, suggesting
that there is significant room for future work on this data. Dataset, baseline,
and leaderboard available at http://quac.ai. Comment: EMNLP Camera Read
Technical Note: PDE-constrained Optimization Formulation for Tumor Growth Model Calibration
We discuss solution algorithms for calibrating a tumor growth model using
imaging data posed as a deterministic inverse problem. The forward model
consists of a nonlinear and time-dependent reaction-diffusion partial
differential equation (PDE) with unknown parameters (diffusivity and
proliferation rate) being spatial fields. We use a dimension-independent
globalized, inexact Newton Conjugate Gradient algorithm to solve the
PDE-constrained optimization. The required gradient and Hessian actions are
also presented using the adjoint method and Lagrangian formalism.
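As a rough illustration of the kind of problem described above, one generic form of such a calibration problem is sketched below. The abstract specifies only a nonlinear, time-dependent reaction-diffusion PDE with spatially varying diffusivity and proliferation rate; the logistic reaction term, the single final-time observation $u^{\mathrm{obs}}$, the no-flux boundary condition, and the regularization term $R$ are assumptions made here for concreteness, and all symbols are illustrative.

```latex
\begin{aligned}
\min_{D,\,\rho}\quad & J(D,\rho) = \frac{1}{2}\int_{\Omega}
    \bigl(u(x,T) - u^{\mathrm{obs}}(x)\bigr)^{2}\,dx + R(D,\rho) \\
\text{subject to}\quad
  & \partial_t u = \nabla\cdot\bigl(D(x)\,\nabla u\bigr) + \rho(x)\,u\,(1-u)
    && \text{in } \Omega\times(0,T], \\
  & D\,\nabla u\cdot n = 0 && \text{on } \partial\Omega\times(0,T], \\
  & u(x,0) = u_0(x) && \text{in } \Omega .
\end{aligned}
```

In the Lagrangian formalism, the PDE constraint is appended with an adjoint variable $p$ (boundary and initial conditions are omitted here for brevity):

```latex
\mathcal{L}(u,D,\rho,p) = J(D,\rho)
  + \int_{0}^{T}\!\!\int_{\Omega} p\,
    \Bigl(\partial_t u - \nabla\cdot(D\,\nabla u) - \rho\,u\,(1-u)\Bigr)\,dx\,dt .
```

Setting the variation of $\mathcal{L}$ with respect to $u$ to zero gives the adjoint equation for $p$, and the variations with respect to $D$ and $\rho$ give the gradient used by the Newton Conjugate Gradient iteration.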
New prospects for computational hydraulics by leveraging high-performance heterogeneous computing techniques
In the last two decades, computational hydraulics has undergone rapid development following advances in data acquisition and computing technologies. Using a finite-volume Godunov-type hydrodynamic model, this work demonstrates the promise of modern high-performance computing technology to achieve real-time flood modeling at a regional scale. The software is implemented for high-performance heterogeneous computing using the OpenCL programming framework, and developed to support simulations across multiple GPUs using a domain decomposition technique and across multiple systems through an efficient implementation of the Message Passing Interface (MPI) standard. The software is applied to a convective-storm-induced flood event in Newcastle upon Tyne, demonstrating high computational performance across a GPU cluster and good agreement with crowd-sourced observations. Issues relating to data availability, complex urban topography and differences in drainage capacity affect results for a small number of areas.
The Republic Of Adaria V. The Republic Of Bobbia, Kingdom Of Cazalia, Commonwealth Of Dingoth, State Of Ephraim, And Kingdom Of Finbar
The Republic of Adaria, the Republic of Bobbia, the Kingdom of Cazalia, the Commonwealth of Dingoth, the State of Ephraim, and the Kingdom of Finbar submit the present dispute to this Court by Special Agreement, dated September 1, 2006, pursuant to article 40(1) of the Court's Statute.
The influence of tropospheric biennial oscillation on mid-tropospheric CO_2
Mid-tropospheric CO_2 retrieved from the Atmospheric Infrared Sounder (AIRS) was used to investigate CO_2 interannual variability over the Indo-Pacific region. A signal with a periodicity of around two years was found in the AIRS mid-tropospheric CO_2 for the first time; it is related to the Tropospheric Biennial Oscillation (TBO) associated with the strength of the monsoon. During a strong (weak) monsoon year, the Western Walker Circulation is strong (weak), resulting in enhanced (diminished) CO_2 transport from the surface to the mid-troposphere. As a result, there are positive (negative) CO_2 anomalies in the mid-troposphere over the Indo-Pacific region. We simulated the influence of the TBO on the mid-tropospheric CO_2 over the Indo-Pacific region using the MOZART-2 model, and results were consistent with observations, although the TBO signal in the model CO_2 is smaller than that in the AIRS observations.
Superconductivity and Field-Induced Magnetism in PrCeCuO Single Crystals
We report muon-spin rotation/relaxation (muSR) measurements on single
crystals of the electron-doped high-T_c superconductor PrCeCuO.
In zero external magnetic field, superconductivity is found to coexist with Cu
spins that are static on the muSR time scale. In an applied field, we observe a
Knight shift that is primarily due to the magnetic moment induced on the Pr
ions. Below the superconducting transition temperature T_c, an additional
source of static magnetic order appears throughout the sample. This finding is
consistent with antiferromagnetic ordering of the Cu spins in the presence of
vortices. We also find that the temperature dependence of the in-plane magnetic
penetration depth in the vortex state resembles that of the hole-doped cuprates
at temperatures above ~ 0.2 T_c. Comment: 4 pages, 5 figure
The Impact of New Estimates of Mixing Ratio and Flux-based Halogen Scenarios on Ozone Evolution
The evolution of ozone in the 21st century has been shown to be mainly impacted by the halogen emissions scenario and predicted changes in the circulation of the stratosphere. New estimates of mixing ratio and flux-based emission scenarios have been produced from the SPARC Lifetime Assessment 2013. Simulations using the Goddard Earth Observing System Chemistry-Climate Model (GEOSCCM) are conducted using this new A1 2014 halogen scenario and compared to ones using the A1 2010 scenario. This updated version of GEOSCCM includes a realistic representation of the Quasi-Biennial Oscillation and improvements related to the breakup of the Antarctic polar vortex. We will present results of the ozone evolution over the recent past and the 21st century under the A1 2010 scenario, the A1 2014 mixing ratio scenario, and an A1 2014 flux-based halogen scenario. Implications of the uncertainties in these estimates, as well as those from possible circulation changes, will be discussed.