2,525 research outputs found
Universal Language Model Fine-tuning for Text Classification
Inductive transfer learning has greatly impacted computer vision, but
existing approaches in NLP still require task-specific modifications and
training from scratch. We propose Universal Language Model Fine-tuning
(ULMFiT), an effective transfer learning method that can be applied to any task
in NLP, and introduce techniques that are key for fine-tuning a language model.
Our method significantly outperforms the state-of-the-art on six text
classification tasks, reducing the error by 18-24% on the majority of datasets.
Furthermore, with only 100 labeled examples, it matches the performance of
training from scratch on 100x more data. We open-source our pretrained models
and code.
Comment: ACL 2018, fixed denominator in Equation 3, line
Learning to select data for transfer learning with Bayesian Optimization
Domain similarity measures can be used to gauge adaptability and select
suitable data for transfer learning, but existing approaches define ad hoc
measures that are deemed suitable for respective tasks. Inspired by work on
curriculum learning, we propose to learn data selection measures using
Bayesian Optimization and evaluate them across models, domains and tasks. Our
learned measures outperform existing domain similarity measures significantly
on three tasks: sentiment analysis, part-of-speech tagging, and parsing. We
show the importance of complementing similarity with diversity, and that
learned measures are -- to some degree -- transferable across models, domains,
and even tasks.
Comment: EMNLP 2017. Code available at:
https://github.com/sebastianruder/learn-to-select-dat
The geology and geophysics of the Oslo rift
The regional geology and geophysical characteristics of the Oslo graben are reviewed. The graben is part of a Permian-age failed continental rift. Alkali olivine, tholeiitic, and monzonitic intrusives as well as basaltic lavas outline the extent of the graben. Geophysical evidence indicates that rifting activity covered a much greater area in Paleozoic time, possibly including the northern Skagerrak Sea as well as the Oslo graben itself. Many of the surficial geologic characteristics in the southern part of the rift have since been eroded or covered by sedimentation. Geophysical data reveal a gravity maximum along the strike of the Oslo graben, local emplacements of magnetic material throughout the Skagerrak and the graben, and a slight mantle upwarp beneath the rift zone. Petrologic and geophysical maps which depict regional structure are included in the text. An extensive bibliography of pertinent literature published in English between 1960 and 1980 is also provided.
Simple diamagnetic monotonicities for Schroedinger operators with inhomogeneous magnetic fields of constant direction
Under certain simplifying conditions we detect monotonicity properties of the
ground-state energy and the canonical-equilibrium density matrix of a spinless
charged particle in the Euclidean plane subject to a perpendicular, possibly
inhomogeneous magnetic field and an additional scalar potential. Firstly, we
point out a simple condition warranting that the ground-state energy does not
decrease when the magnetic field and/or the potential is increased pointwise.
Secondly, we consider the case in which both the magnetic field and the
potential are constant along one direction in the plane and give a genuine
path-integral argument for corresponding monotonicities of the density-matrix
diagonal and the absolute value of certain off-diagonals. Our results
complement to some degree results of M. Loss and B. Thaller [Commun. Math.
Phys. 186 (1997) 95] and L. Erdos [J. Math. Phys. 38 (1997) 1289].
Multi-task Learning of Pairwise Sequence Classification Tasks Over Disparate Label Spaces
We combine multi-task learning and semi-supervised learning by inducing a
joint embedding space between disparate label spaces and learning transfer
functions between label embeddings, enabling us to jointly leverage unlabelled
data and auxiliary, annotated datasets. We evaluate our approach on a variety
of sequence classification tasks with disparate label spaces. We outperform
strong single and multi-task baselines and achieve a new state-of-the-art for
topic-based sentiment analysis.
Comment: To appear at NAACL 2018 (long
Towards a continuous modeling of natural language domains
Humans continuously adapt their style and language to a variety of domains.
However, a reliable definition of 'domain' has eluded researchers thus far.
Additionally, the notion of discrete domains stands in contrast to the
multiplicity of heterogeneous domains that humans navigate, many of which
overlap. In order to better understand the change and variation of human
language, we draw on research in domain adaptation and extend the notion of
discrete domains to the continuous spectrum. We propose representation
learning-based models that can adapt to continuous domains and detail how these
can be used to investigate variation in language. To this end, we propose to
use dialogue modeling as a test bed due to its proximity to language modeling
and its social component.
Comment: 5 pages, 3 figures, published in Uphill Battles in Language
Processing workshop, EMNLP 201
A major crustal feature in the southeastern United States inferred from the MAGSAT equivalent source anomaly field
The MAGSAT equivalent-source anomaly field evaluated at 325 km altitude depicts a prominent anomaly centered over southeast Georgia, which is adjacent to the high-amplitude positive Kentucky anomaly. To overcome the satellite resolution constraint in studying this anomaly, conventional geophysical data were included in the analysis: Bouguer gravity, seismic reflection and refraction, aeromagnetic, and in-situ stress-strain measurements. This integrated geophysical approach infers more specifically the nature and extent of the crustal and/or lithospheric source of the Georgia MAGSAT anomaly. The physical properties and tectonic evolution of the area are all important in the interpretation.