
    Mimicking Word Embeddings using Subword RNNs

    Word embeddings improve generalization over lexical features by placing each word in a lower-dimensional space, using distributional information obtained from unlabeled data. However, the effectiveness of word embeddings for downstream NLP tasks is limited by out-of-vocabulary (OOV) words, for which embeddings do not exist. In this paper, we present MIMICK, an approach to generating OOV word embeddings compositionally, by learning a function from spellings to distributional embeddings. Unlike prior work, MIMICK does not require re-training on the original word embedding corpus; instead, learning is performed at the type level. Intrinsic and extrinsic evaluations demonstrate the power of this simple approach. On 23 languages, MIMICK improves performance over a word-based baseline for tagging part-of-speech and morphosyntactic attributes. It is competitive with (and complementary to) a supervised character-based model in low-resource settings. Comment: EMNLP 201
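
    The idea of learning a type-level function from spellings to embeddings can be illustrated with a much simpler stand-in. The sketch below is not the MIMICK model (which the title describes as a subword RNN); it swaps in plain character trigram averaging, and the function names and toy 2-d vectors are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def char_ngrams(word, n=3):
    """Return the character n-grams of a word padded with boundary markers."""
    padded = f"<{word}>"
    if len(padded) < n:
        return [padded]
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def fit_ngram_table(embeddings, n=3):
    """For each n-gram, average the vectors of the word types that contain it.

    Training uses only the (word type, vector) pairs, not the original corpus.
    """
    sums = {}
    counts = defaultdict(int)
    for word, vec in embeddings.items():
        for gram in set(char_ngrams(word, n)):
            if gram not in sums:
                sums[gram] = [0.0] * len(vec)
            sums[gram] = [s + v for s, v in zip(sums[gram], vec)]
            counts[gram] += 1
    return {g: [s / counts[g] for s in total] for g, total in sums.items()}


def embed_oov(word, table, dim, n=3):
    """Compose a vector for an unseen word from its known n-grams.

    Falls back to a zero vector when no n-gram of the word is in the table.
    """
    known = [table[g] for g in char_ngrams(word, n) if g in table]
    if not known:
        return [0.0] * dim
    return [sum(col) / len(known) for col in zip(*known)]


# Toy, made-up embeddings (assumption: real pre-trained vectors would go here).
toy = {
    "walked": [1.0, 0.0],
    "talked": [1.0, 0.2],
    "cats": [0.0, 1.0],
    "dogs": [0.1, 1.0],
}
table = fit_ngram_table(toy)
vec = embed_oov("balked", table, dim=2)    # shares "alk", "lke", "ked", "ed>" with the -alked words
unknown = embed_oov("zzz", table, dim=2)   # no shared trigrams, so the zero fallback is used
```

    In this toy setup the unseen word "balked" lands near "walked" and "talked" because it shares their suffix trigrams; a learned model such as a character RNN would replace the averaging step.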

    Morphological Priors for Probabilistic Neural Word Embeddings

    Word embeddings allow natural language processing systems to share statistical information across related words. These embeddings are typically based on distributional statistics, making it difficult for them to generalize to rare or unseen words. We propose to improve word embeddings by incorporating morphological information, capturing shared sub-word features. Unlike previous work that constructs word embeddings directly from morphemes, we combine morphological and distributional information in a unified probabilistic framework, in which the word embedding is a latent variable. The morphological information provides a prior distribution on the latent word embeddings, which in turn condition a likelihood function over an observed corpus. This approach yields improvements on intrinsic word similarity evaluations, and also in the downstream task of part-of-speech tagging. Comment: Appeared at the Conference on Empirical Methods in Natural Language Processing (EMNLP 2016, Austin

    Evidence for a Z < 8 Origin of the Source Subtracted Near Infrared Background

    This letter extends our previous fluctuation analysis of the near infrared background at 1.6 microns to the 1.1 micron (F110W) image of the Hubble Ultra Deep Field. When all detectable sources are removed, the ratio of fluctuation power in the two images is consistent with the ratio expected for faint, z<8, sources, and is inconsistent with the expected ratio for galaxies with z>8. We also use numerically redshifted model galaxy spectral energy distributions for 50 and 10 million year old galaxies to predict the expected fluctuation power at 3.6 microns and 4.5 microns to compare with recent Spitzer observations. The predicted fluctuation power for galaxies at z = 0-12 matches the observed Spitzer fluctuation power, while the predicted power for z>13 galaxies is much higher than the observed values. As was found in the 1.6 micron (F160W) analysis, the fluctuation power in the source subtracted F110W image is two orders of magnitude below the power in the image with all sources present. This leads to the conclusion that the 0.8-1.8 micron near infrared background is due to resolved galaxies in the redshift range z<8, with the majority of power in the redshift range of 0.5-1.5. Comment: Accepted for publication in the Astrophysical Journa

    WFMOS - Sounding the Dark Cosmos

    Vast sound waves traveling through the relativistic plasma during the first million years of the universe imprint a preferred scale in the density of matter. We now have the ability to detect this characteristic fingerprint in the clustering of galaxies at various redshifts and use it to measure the acceleration of the expansion of the Universe. The Wide-Field Multi-Object Spectrograph (WFMOS) would use this test to shed significant light on the true nature of dark energy, the mysterious source of this cosmic acceleration. WFMOS would also revolutionise studies of the kinematics of the Milky Way and provide deep insights into the clustering of galaxies at redshifts up to z~4. In this article we discuss the recent progress in large galaxy redshift surveys and detail how WFMOS will help unravel the mystery of dark energy. Comment: 6 pages, pure pdf. An introduction to WFMOS and Baryon Acoustic Oscillations for a general audienc

    The Clustering of Luminous Red Galaxies in the Sloan Digital Sky Survey Imaging Data

    We present the 3D real space clustering power spectrum of a sample of ~600,000 luminous red galaxies (LRGs) measured by the Sloan Digital Sky Survey (SDSS), using photometric redshifts. This sample of galaxies ranges from redshift z=0.2 to 0.6 over 3,528 deg^2 of the sky, probing a volume of 1.5 (Gpc/h)^3, making it the largest volume ever used for galaxy clustering measurements. We measure the angular clustering power spectrum in eight redshift slices and combine these into a high precision 3D real space power spectrum from k=0.005 (h/Mpc) to k=1 (h/Mpc). We detect power on gigaparsec scales, beyond the turnover in the matter power spectrum, on scales significantly larger than those accessible to current spectroscopic redshift surveys. We also find evidence for baryonic oscillations, both in the power spectrum and in fits to the baryon density, at a 2.5 sigma confidence level. The statistical power of these data to constrain cosmology is ~1.7 times better than previous clustering analyses. Varying the matter density and baryon fraction, we find Ω_M = 0.30 ± 0.03 and Ω_b/Ω_M = 0.18 ± 0.04. The detection of baryonic oscillations also allows us to measure the comoving distance to z=0.5; we find a best fit distance of 1.73 ± 0.12 Gpc, corresponding to a 6.5% error on the distance. These results demonstrate the ability to make precise clustering measurements with photometric surveys (abridged). Comment: 23 pages, 27 figures, submitted to MNRA

    The DESI Experiment, a whitepaper for Snowmass 2013

    The Dark Energy Spectroscopic Instrument (DESI) is a massively multiplexed fiber-fed spectrograph that will make the next major advance in dark energy in the timeframe 2018-2022. On the Mayall telescope, DESI will obtain spectra and redshifts for at least 18 million emission-line galaxies, 4 million luminous red galaxies and 3 million quasi-stellar objects, in order to: probe the effects of dark energy on the expansion history using baryon acoustic oscillations (BAO), measure the gravitational growth history through redshift-space distortions, measure the sum of neutrino masses, and investigate the signatures of primordial inflation. The resulting 3-D galaxy maps at z<2 and Lyman-alpha forest at z>2 will make 1%-level measurements of the distance scale in 35 redshift bins, thus providing unprecedented constraints on cosmological models. Comment: 14 pages, 4 figures, a White Paper for Snowmass 201

    An Observational Determination of the Proton to Electron Mass Ratio in the Early Universe

    In an effort to resolve the discrepancy between two measurements of the fundamental constant mu, the proton to electron mass ratio, at early times in the universe, we reanalyze the same data used in the earlier studies. Our analysis of the molecular hydrogen absorption lines in archival VLT/UVES spectra of the damped Lyman alpha systems in the QSOs Q0347-383 and Q0405-443 yields a combined measurement of (Delta mu)/mu = (-7 +/- 8) x 10^{-6}, consistent with no change in the value of mu over a time span of 11.5 gigayears. Here we define (Delta mu) as (mu_z - mu_0), where mu_z is the value of mu at a redshift of z and mu_0 is the present day value. Our null result is consistent with the recent measurements of King et al. 2009, (Delta mu)/mu = (2.6 +/- 3.0) x 10^{-6}, and inconsistent with the positive detection of a change in mu by Reinhold et al. 2006. Both of the previous studies and this study are based on the same data but with differing analysis methods. Improvements in the wavelength calibration over the UVES pipeline calibration are a key element in both of the null results. This leads to the conclusion that the fundamental constant mu is unchanged to an accuracy of 10^{-5} over the last 80% of the age of the universe, well into the matter dominated epoch. This limit provides constraints on models of dark energy that invoke rolling scalar fields and also limits the parameter space of supersymmetric or string theory models of physics. New instruments, both planned and under construction, will provide opportunities to greatly improve the accuracy of these measurements. Comment: Accepted for publication in the Astrophysical Journa
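
    The "null result" wording can be checked with simple arithmetic on the two quoted values. The sketch below assumes the quoted uncertainties are 1-sigma Gaussian errors (the abstract does not say so), and the helper name is hypothetical.

```python
def sigma_from_zero(value, error):
    """Return how many standard errors a measurement lies from zero.

    Assumes `error` is a symmetric 1-sigma uncertainty (an assumption,
    not something stated in the abstract).
    """
    return abs(value) / error


# (Delta mu)/mu values quoted in the abstract above.
this_work = sigma_from_zero(-7e-6, 8e-6)
king_2009 = sigma_from_zero(2.6e-6, 3.0e-6)

# Both measurements lie within one standard error of zero,
# which is what "consistent with no change in mu" means here.
both_null = this_work < 1.0 and king_2009 < 1.0
```

    Because both analyses use the same spectra, the two values are not independent, so they should not be combined as if they were separate measurements.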

    A Theoretical Study of Models for X2Y2 Zintl Ions

    Ab initio and extended Hückel calculations have been used to discuss the bonding scheme in X₂Y₂ neutral and ionic main group clusters. A qualitative analysis suggests that two different electron counts, 20 and 22, are possible for the butterfly structures of these systems. This results from two orbital crossings in the correlation diagram for the tetrahedral (T_d) -> butterfly (C_2v) -> square-planar (D_2h) transformation. Detailed ab initio computations substantiate this analysis and show that the 20-electron butterfly structure becomes increasingly favored over the tetrahedral one in X₂Y₂ clusters when the two atoms have an increasing electronegativity difference. These results are in agreement with the known structures for the Pb₂Sb₂²⁻ and Sb₂Bi₂²⁻ clusters (tetrahedral-like) and the Tl₂Te₂²⁻ one (butterfly-like).