Mimicking Word Embeddings using Subword RNNs
Word embeddings improve generalization over lexical features by placing each
word in a lower-dimensional space, using distributional information obtained
from unlabeled data. However, the effectiveness of word embeddings for
downstream NLP tasks is limited by out-of-vocabulary (OOV) words, for which
embeddings do not exist. In this paper, we present MIMICK, an approach to
generating OOV word embeddings compositionally, by learning a function from
spellings to distributional embeddings. Unlike prior work, MIMICK does not
require re-training on the original word embedding corpus; instead, learning is
performed at the type level. Intrinsic and extrinsic evaluations demonstrate
the power of this simple approach. On 23 languages, MIMICK improves performance
over a word-based baseline for tagging part-of-speech and morphosyntactic
attributes. It is competitive with (and complementary to) a supervised
character-based model in low-resource settings.
Comment: EMNLP 201
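The idea of learning a function from spellings to embeddings at the type level can be illustrated with a toy sketch. It is not the MIMICK model: the subword RNN is swapped for a ridge regression over character trigram counts, and the vocabulary, vectors, and every function name (`char_ngrams`, `fit_mimic`, `embed`) are hypothetical, chosen only for illustration.

```python
import numpy as np

def char_ngrams(word, n=3):
    """Character n-grams of a word padded with boundary markers."""
    padded = f"<{word}>"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def build_index(words, n=3):
    """Map every n-gram seen in the known vocabulary to a column index."""
    grams = sorted({g for w in words for g in char_ngrams(w, n)})
    return {g: i for i, g in enumerate(grams)}

def featurize(word, index, n=3):
    """Count vector of known n-grams; unseen n-grams are ignored."""
    x = np.zeros(len(index))
    for g in char_ngrams(word, n):
        if g in index:
            x[index[g]] += 1.0
    return x

def fit_mimic(vocab, index, l2=1e-2):
    """Ridge regression from spelling features to pretrained vectors.

    Training uses one example per word type (no corpus is needed),
    which is the sense of "type level" assumed here.
    """
    words = list(vocab)
    X = np.stack([featurize(w, index) for w in words])
    Y = np.stack([vocab[w] for w in words])
    A = X.T @ X + l2 * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ Y)

def embed(word, index, W):
    """Predict an embedding for any spelling, including OOV words."""
    return featurize(word, index) @ W

# Toy "pretrained" embeddings (made-up values, 2 dimensions).
vocab = {
    "walk": np.array([1.0, 0.0]),
    "walked": np.array([0.9, 0.2]),
    "talk": np.array([0.8, 0.1]),
    "talked": np.array([0.7, 0.3]),
}
index = build_index(vocab)
W = fit_mimic(vocab, index)
oov_vector = embed("walking", index, W)  # "walking" is not in vocab
```

The OOV word receives a vector because it shares trigrams such as `wal` and `alk` with known words; a character-level RNN would play the role of `featurize` plus `W` in the approach the abstract describes.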
Morphological Priors for Probabilistic Neural Word Embeddings
Word embeddings allow natural language processing systems to share
statistical information across related words. These embeddings are typically
based on distributional statistics, making it difficult for them to generalize
to rare or unseen words. We propose to improve word embeddings by incorporating
morphological information, capturing shared sub-word features. Unlike previous
work that constructs word embeddings directly from morphemes, we combine
morphological and distributional information in a unified probabilistic
framework, in which the word embedding is a latent variable. The morphological
information provides a prior distribution on the latent word embeddings, which
in turn condition a likelihood function over an observed corpus. This approach
yields improvements on intrinsic word similarity evaluations, and also in the
downstream task of part-of-speech tagging.
Comment: Appeared at the Conference on Empirical Methods in Natural Language
Processing (EMNLP 2016, Austin
Evidence for a Z < 8 Origin of the Source Subtracted Near Infrared Background
This letter extends our previous fluctuation analysis of the near infrared
background at 1.6 microns to the 1.1 micron (F110W) image of the Hubble Ultra
Deep Field. When all detectable sources are removed, the ratio of fluctuation
power in the two images is consistent with the ratio expected for faint, z<8,
sources, and is inconsistent with the expected ratio for galaxies with z>8. We
also use numerically redshifted model galaxy spectral energy distributions for
50 and 10 million year old galaxies to predict the expected fluctuation power
at 3.6 microns and 4.5 microns to compare with recent Spitzer observations. The
predicted fluctuation power for galaxies at z = 0-12 matches the observed
Spitzer fluctuation power while the predicted power for z>13 galaxies is much
higher than the observed values. As was found in the 1.6 micron (F160W)
analysis, the fluctuation power in the source subtracted F110W image is two
orders of magnitude below the power in the image with all sources present. This
leads to the conclusion that the 0.8--1.8 micron near infrared background is
due to resolved galaxies in the redshift range z<8, with the majority of power
in the redshift range of 0.5--1.5.
Comment: Accepted for publication in the Astrophysical Journal
WFMOS - Sounding the Dark Cosmos
Vast sound waves traveling through the relativistic plasma during the first
million years of the universe imprint a preferred scale in the density of
matter. We now have the ability to detect this characteristic fingerprint in
the clustering of galaxies at various redshifts and use it to measure the
acceleration of the expansion of the Universe. The Wide-Field Multi-Object
Spectrograph (WFMOS) would use this test to shed significant light on the true
nature of dark energy, the mysterious source of this cosmic acceleration. WFMOS
would also revolutionise studies of the kinematics of the Milky Way and provide
deep insights into the clustering of galaxies at redshifts up to z~4. In this
article we discuss the recent progress in large galaxy redshift surveys and
detail how WFMOS will help unravel the mystery of dark energy.
Comment: 6 pages, pure pdf. An introduction to WFMOS and Baryon Acoustic
Oscillations for a general audience
The Clustering of Luminous Red Galaxies in the Sloan Digital Sky Survey Imaging Data
We present the 3D real space clustering power spectrum of a sample of
~600,000 luminous red galaxies (LRGs) measured by the Sloan Digital Sky Survey
(SDSS), using photometric redshifts. This sample of galaxies ranges from
redshift z=0.2 to 0.6 over 3,528 deg^2 of the sky, probing a volume of 1.5
(Gpc/h)^3, making it the largest volume ever used for galaxy clustering
measurements. We measure the angular clustering power spectrum in eight
redshift slices and combine these into a high precision 3D real space power
spectrum from k=0.005 (h/Mpc) to k=1 (h/Mpc). We detect power on gigaparsec
scales, beyond the turnover in the matter power spectrum, on scales
significantly larger than those accessible to current spectroscopic redshift
surveys. We also find evidence for baryonic oscillations, both in the power
spectrum, as well as in fits to the baryon density, at a 2.5 sigma confidence
level. The statistical power of these data to constrain cosmology is ~1.7 times
better than previous clustering analyses. Varying the matter density and baryon
fraction, we find \Omega_M = 0.30 \pm 0.03, and \Omega_b/\Omega_M = 0.18 \pm
0.04. The detection of baryonic oscillations also allows us to measure the
comoving distance to z=0.5; we find a best fit distance of 1.73 \pm 0.12 Gpc,
corresponding to a 6.5% error on the distance. These results demonstrate the
ability to make precise clustering measurements with photometric surveys
(abridged).
Comment: 23 pages, 27 figures, submitted to MNRAS
The DESI Experiment, a whitepaper for Snowmass 2013
The Dark Energy Spectroscopic Instrument (DESI) is a massively multiplexed
fiber-fed spectrograph that will make the next major advance in dark energy in
the timeframe 2018-2022. On the Mayall telescope, DESI will obtain spectra and
redshifts for at least 18 million emission-line galaxies, 4 million luminous
red galaxies and 3 million quasi-stellar objects, in order to: probe the
effects of dark energy on the expansion history using baryon acoustic
oscillations (BAO), measure the gravitational growth history through
redshift-space distortions, measure the sum of neutrino masses, and investigate
the signatures of primordial inflation. The resulting 3-D galaxy maps at z<2
and Lyman-alpha forest at z>2 will make 1%-level measurements of the distance
scale in 35 redshift bins, thus providing unprecedented constraints on
cosmological models.
Comment: 14 pages, 4 figures, a White Paper for Snowmass 2013
An Observational Determination of the Proton to Electron Mass Ratio in the Early Universe
In an effort to resolve the discrepancy between two measurements of the
fundamental constant mu, the proton to electron mass ratio, at early times in
the universe we reanalyze the same data used in the earlier studies. Our
analysis of the molecular hydrogen absorption lines in archival VLT/UVES
spectra of the damped Lyman alpha systems in the QSOs Q0347-383 and Q0405-443
yields a combined measurement of a (Delta mu)/mu value of (-7 +/- 8) x 10^{-6},
consistent with no change in the value of mu over a time span of 11.5
gigayears. Here we define (Delta mu) as (mu_z - mu_0) where mu_z is the value
of mu at a redshift of z and mu_0 is the present day value. Our null result is
consistent with the recent measurements of King et al. 2009, (Delta mu)/mu =
(2.6 +/- 3.0) x 10^{-6}, and inconsistent with the positive detection of a
change in mu by Reinhold et al. 2006. Both of the previous studies and this
study are based on the same data but with differing analysis methods.
Improvements in the wavelength calibration over the UVES pipeline calibration
are a key element in both of the null results. This leads to the conclusion that
the fundamental constant mu is unchanged to an accuracy of 10^{-5} over the
last 80% of the age of the universe, well into the matter dominated epoch. This
limit provides constraints on models of dark energy that invoke rolling scalar
fields and also limits the parameter space of supersymmetric or string theory
models of physics. New instruments, both planned and under construction, will
provide opportunities to greatly improve the accuracy of these measurements.
Comment: Accepted for publication in the Astrophysical Journal
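The quoted measurement can be written out explicitly. The abstract defines only (Delta mu); normalizing by the present-day value mu_0 is an assumption made here for concreteness, and the significance is a simple ratio of the central value to its quoted uncertainty.

```latex
\[
\frac{\Delta\mu}{\mu} \equiv \frac{\mu_z - \mu_0}{\mu_0},
\qquad
\frac{\Delta\mu}{\mu} = (-7 \pm 8)\times 10^{-6}
\;\Longrightarrow\;
\frac{\lvert -7 \rvert}{8} \approx 0.9\,\sigma \ \text{from zero}.
\]
```

A deviation of less than one standard deviation is what "consistent with no change" means in this context.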
A Theoretical Study of Models for X2Y2 Zintl Ions
Ab initio and extended Hückel calculations have been used to discuss the bonding scheme in X₂Y₂ neutral and ionic main group clusters. A qualitative analysis suggests that two different electron counts, 20 and 22, are possible for the butterfly structures of these systems. This results from two orbital crossings in the correlation diagram for the tetrahedral (T_d) -> butterfly (C_2v) -> square-planar (D_2h) transformation. Detailed ab initio computations substantiate this analysis and show that the 20-electron butterfly structure becomes increasingly favored over the tetrahedral one in X₂Y₂ clusters when the 2 atoms have increasing electronegativity difference. These results are in agreement with the known structures for the Pb₂Sb₂²⁻ and Sb₂Bi₂²⁻ clusters (tetrahedral-like) and the Tl₂Te₂²⁻ one (butterfly-like).