761 research outputs found

Mimicking Word Embeddings using Subword RNNs

Author: Eisenstein Jacob
Guthrie Robert
Pinter Yuval
Publication date: 01/01/2017

Word embeddings improve generalization over lexical features by placing each word in a lower-dimensional space, using distributional information obtained from unlabeled data. However, the effectiveness of word embeddings for downstream NLP tasks is limited by out-of-vocabulary (OOV) words, for which embeddings do not exist. In this paper, we present MIMICK, an approach to generating OOV word embeddings compositionally, by learning a function from spellings to distributional embeddings. Unlike prior work, MIMICK does not require re-training on the original word embedding corpus; instead, learning is performed at the type level. Intrinsic and extrinsic evaluations demonstrate the power of this simple approach. On 23 languages, MIMICK improves performance over a word-based baseline for tagging part-of-speech and morphosyntactic attributes. It is competitive with (and complementary to) a supervised character-based model in low-resource settings. Comment: EMNLP 201

arXiv.org e-Print Archive

Crossref

Morphological Priors for Probabilistic Neural Word Embeddings

Author: Bhatia Parminder
Eisenstein Jacob
Guthrie Robert
Publication date: 01/01/2016

Word embeddings allow natural language processing systems to share statistical information across related words. These embeddings are typically based on distributional statistics, making it difficult for them to generalize to rare or unseen words. We propose to improve word embeddings by incorporating morphological information, capturing shared sub-word features. Unlike previous work that constructs word embeddings directly from morphemes, we combine morphological and distributional information in a unified probabilistic framework, in which the word embedding is a latent variable. The morphological information provides a prior distribution on the latent word embeddings, which in turn condition a likelihood function over an observed corpus. This approach yields improvements on intrinsic word similarity evaluations, and also in the downstream task of part-of-speech tagging. Comment: Appeared at the Conference on Empirical Methods in Natural Language Processing (EMNLP 2016, Austin

arXiv.org e-Print Archive

Crossref

Evidence for a Z < 8 Origin of the Source Subtracted Near Infrared Background

Author: Daniel Eisenstein
Madau P.
Marcia Rieke
Robert C. Kennicutt
Rodger I. Thompson
Salvaterra R.
Xiaohui Fan
Publication venue: 'University of Chicago Press'
Publication date: 04/06/2007

This letter extends our previous fluctuation analysis of the near infrared background at 1.6 microns to the 1.1 micron (F110W) image of the Hubble Ultra Deep Field. When all detectable sources are removed, the ratio of fluctuation power in the two images is consistent with the ratio expected for faint, z<8, sources, and is inconsistent with the expected ratio for galaxies with z>8. We also use numerically redshifted model galaxy spectral energy distributions for 50 and 10 million year old galaxies to predict the expected fluctuation power at 3.6 microns and 4.5 microns to compare with recent Spitzer observations. The predicted fluctuation power for galaxies at z = 0-12 matches the observed Spitzer fluctuation power, while the predicted power for z>13 galaxies is much higher than the observed values. As was found in the 1.6 micron (F160W) analysis, the fluctuation power in the source subtracted F110W image is two orders of magnitude below the power in the image with all sources present. This leads to the conclusion that the 0.8-1.8 micron near infrared background is due to resolved galaxies in the redshift range z<8, with the majority of power in the redshift range of 0.5-1.5. Comment: Accepted for publication in the Astrophysical Journa

arXiv.org e-Print Archive

Crossref

WFMOS - Sounding the Dark Cosmos

Author: Bassett Bruce A.
Eisenstein Daniel J.
Nichol Robert C.
the WFMOS Feasibility Study Dark Energy Team
Publication venue: 'Wiley'
Publication date: 01/01/2005

Vast sound waves traveling through the relativistic plasma during the first million years of the universe imprint a preferred scale in the density of matter. We now have the ability to detect this characteristic fingerprint in the clustering of galaxies at various redshifts and use it to measure the acceleration of the expansion of the Universe. The Wide-Field Multi-Object Spectrograph (WFMOS) would use this test to shed significant light on the true nature of dark energy, the mysterious source of this cosmic acceleration. WFMOS would also revolutionise studies of the kinematics of the Milky Way and provide deep insights into the clustering of galaxies at redshifts up to z~4. In this article we discuss the recent progress in large galaxy redshift surveys and detail how WFMOS will help unravel the mystery of dark energy. Comment: 6 pages, pure pdf. An introduction to WFMOS and Baryon Acoustic Oscillations for a general audienc

arXiv.org e-Print Archive

Portsmouth University Research Portal (Pure)

The Clustering of Luminous Red Galaxies in the Sloan Digital Sky Survey Imaging Data

Author: Alexey Makarov
Daniel J. Eisenstein
David J. Schlegel
David W. Hogg
Donald G. York
Donald P. Schneider
Douglas P. Finkbeiner
Gillian R. Knapp
James E. Gunn
Jon Loveday
Jonathan Brinkmann
Max Tegmark
Michael A. Strauss
Michael R. Blanton
Neta A. Bahcall
Nikhil Padmanabhan
Robert C. Nichol
Robert H. Lupton
Uroš Seljak
Željko Ivezić
Publication venue: 'Wiley'
Publication date: 12/05/2006

We present the 3D real space clustering power spectrum of a sample of ~600,000 luminous red galaxies (LRGs) measured by the Sloan Digital Sky Survey (SDSS), using photometric redshifts. This sample of galaxies ranges from redshift z=0.2 to 0.6 over 3,528 deg^2 of the sky, probing a volume of 1.5 (Gpc/h)^3, making it the largest volume ever used for galaxy clustering measurements. We measure the angular clustering power spectrum in eight redshift slices and combine these into a high precision 3D real space power spectrum from k=0.005 (h/Mpc) to k=1 (h/Mpc). We detect power on gigaparsec scales, beyond the turnover in the matter power spectrum, on scales significantly larger than those accessible to current spectroscopic redshift surveys. We also find evidence for baryonic oscillations, both in the power spectrum and in fits to the baryon density, at a 2.5 sigma confidence level. The statistical power of these data to constrain cosmology is ~1.7 times better than previous clustering analyses. Varying the matter density and baryon fraction, we find \Omega_M = 0.30 \pm 0.03, and \Omega_b/\Omega_M = 0.18 \pm 0.04. The detection of baryonic oscillations also allows us to measure the comoving distance to z=0.5; we find a best fit distance of 1.73 \pm 0.12 Gpc, corresponding to a 6.5% error on the distance. These results demonstrate the ability to make precise clustering measurements with photometric surveys (abridged). Comment: 23 pages, 27 figures, submitted to MNRA

arXiv.org e-Print Archive

Crossref

Portsmouth University Research Portal (Pure)

CERN Document Server

The DESI Experiment, a whitepaper for Snowmass 2013

Author: Bebek Chris
Beers Timothy
Blum Robert
Cahn Robert
representing the DESI collaboration
Eisenstein Daniel
Flaugher Brenna
Honscheid Klaus
Kron Richard
Lahav Ofer
Levi Michael
McDonald Patrick
Roe Natalie
Schlegel David
Publication date: 04/08/2013

The Dark Energy Spectroscopic Instrument (DESI) is a massively multiplexed fiber-fed spectrograph that will make the next major advance in dark energy in the timeframe 2018-2022. On the Mayall telescope, DESI will obtain spectra and redshifts for at least 18 million emission-line galaxies, 4 million luminous red galaxies and 3 million quasi-stellar objects, in order to: probe the effects of dark energy on the expansion history using baryon acoustic oscillations (BAO), measure the gravitational growth history through redshift-space distortions, measure the sum of neutrino masses, and investigate the signatures of primordial inflation. The resulting 3-D galaxy maps at z<2 and Lyman-alpha forest at z>2 will make 1%-level measurements of the distance scale in 35 redshift bins, thus providing unprecedented constraints on cosmological models. Comment: 14 pages, 4 figures, a White Paper for Snowmass 201

arXiv.org e-Print Archive

CiteSeerX

An Observational Determination of the Proton to Electron Mass Ratio in the Early Universe

Author: Bechtold Jill
Black John H.
Eisenstein Daniel
Fan Xiaohui
Kennicutt Robert C.
Martins Carlos
Prochaska J. Xavier
Shirley Yancey L.
Thompson Rodger I.
Publication venue: 'IOP Publishing'
Publication date: 01/01/2009

In an effort to resolve the discrepancy between two measurements of the fundamental constant mu, the proton to electron mass ratio, at early times in the universe, we reanalyze the same data used in the earlier studies. Our analysis of the molecular hydrogen absorption lines in archival VLT/UVES spectra of the damped Lyman alpha systems in the QSOs Q0347-383 and Q0405-443 yields a combined measurement of a (Delta mu)/mu value of (-7 +/- 8) x 10^{-6}, consistent with no change in the value of mu over a time span of 11.5 gigayears. Here we define (Delta mu) as (mu_z - mu_0), where mu_z is the value of mu at a redshift of z and mu_0 is the present day value. Our null result is consistent with the recent measurements of King et al. 2009, (Delta mu)/mu = (2.6 +/- 3.0) x 10^{-6}, and inconsistent with the positive detection of a change in mu by Reinhold et al. 2006. Both of the previous studies and this study are based on the same data but with differing analysis methods. Improvements in the wavelength calibration over the UVES pipeline calibration are a key element in both of the null results. This leads to the conclusion that the fundamental constant mu is unchanged to an accuracy of 10^{-5} over the last 80% of the age of the universe, well into the matter dominated epoch. This limit provides constraints on models of dark energy that invoke rolling scalar fields and also limits the parameter space of supersymmetric or string theory models of physics. New instruments, both planned and under construction, will provide opportunities to greatly improve the accuracy of these measurements. Comment: Accepted for publication in the Astrophysical Journa

arXiv.org e-Print Archive

CiteSeerX

Chalmers Research

Chalmers Publication Library

A Theoretical Study of Models for X2Y2 Zintl Ions

Author: Canadell Enric
Cave Robert J.
Davidson Ernest R.
Eisenstein Odile
Sautet Philippe
Publication venue: Scholarship @ Claremont
Publication date: 01/01/1989

Ab initio and extended Hückel calculations have been used to discuss the bonding scheme in X₂Y₂ neutral and ionic main group clusters. A qualitative analysis suggests that two different electron counts, 20 and 22, are possible for the butterfly structures of these systems. This results from two orbital crossings in the correlation diagram for the tetrahedral (T_d) -> butterfly (C_2v) -> square-planar (D_2h) transformation. Detailed ab initio computations substantiate this analysis and show that the 20-electron butterfly structure becomes increasingly favored over the tetrahedral one in X₂Y₂ clusters when the two atoms have increasing electronegativity difference. These results are in agreement with the known structures for the Pb₂Sb₂²⁻ and Sb₂Bi₂²⁻ clusters (tetrahedral-like) and the Tl₂Te₂²⁻ one (butterfly-like).

Scholarship@Claremont