
    A Continuously Growing Dataset of Sentential Paraphrases

    A major challenge in paraphrase research is the lack of parallel corpora. In this paper, we present a new method to collect large-scale sentential paraphrases from Twitter by linking tweets through shared URLs. The main advantage of our method is its simplicity, as it gets rid of the classifier or human in the loop needed to select data before annotation and subsequent application of paraphrase identification algorithms in the previous work. We present the largest human-labeled paraphrase corpus to date of 51,524 sentence pairs and the first cross-domain benchmarking for automatic paraphrase identification. In addition, we show that more than 30,000 new sentential paraphrases can be easily and continuously captured every month at ~70% precision, and demonstrate their utility for downstream NLP tasks through phrasal paraphrase extraction. We make our code and data freely available. Comment: 11 pages, accepted to EMNLP 201
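The URL-linking idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' released code: the function name `candidate_pairs`, the URL regular expression, and the sample tweets are all assumptions made for illustration. It groups tweet texts by the URL they share and emits every pair within a group as a candidate paraphrase pair; such candidates would still need labeling, since the abstract reports roughly 70% precision.

```python
import re
from collections import defaultdict
from itertools import combinations

# Hypothetical URL matcher; real tweets would need more careful URL handling.
URL_PATTERN = re.compile(r"https?://\S+")


def candidate_pairs(tweets):
    """Group tweet texts by shared URL and return candidate sentence pairs.

    Returns a list of (url, sentence_a, sentence_b) tuples. The pairs are
    candidates only; they are not guaranteed to be paraphrases.
    """
    by_url = defaultdict(list)
    for text in tweets:
        urls = set(URL_PATTERN.findall(text))
        # Strip the URLs so only the sentence itself is compared.
        sentence = URL_PATTERN.sub("", text).strip()
        for url in urls:
            # Skip empty sentences and exact duplicates (e.g. retweets).
            if sentence and sentence not in by_url[url]:
                by_url[url].append(sentence)

    pairs = []
    for url, sentences in by_url.items():
        for a, b in combinations(sentences, 2):
            pairs.append((url, a, b))
    return pairs


# Made-up sample data, for illustration only.
tweets = [
    "Storm closes airport for second day http://example.com/a",
    "Airport shut again as storm continues http://example.com/a",
    "New phone launched today http://example.com/b",
]
pairs = candidate_pairs(tweets)
```

Here only the two tweets that link to the same URL form a pair; the third tweet shares its URL with no other tweet and contributes nothing.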

    “I” vs “me”: the urbanization of “post-80s” and “post-90s” Chinese migrant workers

    The difference in self-identity among new-generation migrant workers leads them towards different desires regarding urbanization. It is therefore imperative to explore the influence of self-identity on migrant workers’ willingness to stay. To explore the phenomenon empirically, the study used data from the 2017 China Migrants Dynamics Survey (CMDS). It employed the Heckman two-stage selection model, with machine learning methods as a robustness check. The results showed that the “I” identity has a more significant impact on the urbanization of “post-90s” migrant workers, whereas the “me” identity has a more significant impact on the urbanization of “post-80s” migrant workers. If “post-80s” and “post-90s” migrant workers are grouped uniformly into a single new generation, the differences and characteristics within the group may be concealed. Overall, the findings suggest that policies to promote residence and boost urbanization should be formulated according to the differences in self-identity between migrant workers born in the 1980s and those born in the 1990s.

    An exact solution of spherical mean-field plus orbit-dependent non-separable pairing model with two non-degenerate j-orbits

    An exact solution of the nuclear spherical mean-field plus orbit-dependent non-separable pairing model with two non-degenerate j-orbits is presented. The extended one-variable Heine-Stieltjes polynomials associated with the Bethe ansatz equations of the solution are determined; the sets of their zeros give the solution of the model and can be determined relatively easily. A comparison of the solution to that of the standard pairing interaction with constant interaction strength among pairs in any orbit is made. It is shown that the overlaps of eigenstates of the model with those of the standard pairing model are always large, especially for the ground and the first excited state. However, the quantum phase crossover in the non-separable pairing model cannot be accounted for by the standard pairing interaction. Comment: 5 pages, 1 figure, LaTe

    Learning to Predict the Cosmological Structure Formation

    Matter evolved under the influence of gravity from minuscule density fluctuations. Non-perturbative structure formed hierarchically over all scales, and developed non-Gaussian features in the Universe, known as the Cosmic Web. To fully understand the structure formation of the Universe is one of the holy grails of modern astrophysics. Astrophysicists survey large volumes of the Universe and employ a large ensemble of computer simulations to compare with the observed data in order to extract the full information of our own Universe. However, to evolve trillions of galaxies over billions of years even with the simplest physics is a daunting task. We build a deep neural network, the Deep Density Displacement Model (hereafter D³M), to predict the non-linear structure formation of the Universe from simple linear perturbation theory. Our extensive analysis demonstrates that D³M outperforms the second order perturbation theory (hereafter 2LPT), the commonly used fast approximate simulation method, in point-wise comparison, 2-point correlation, and 3-point correlation. We also show that D³M is able to accurately extrapolate far beyond its training data, and predict structure formation for significantly different cosmological parameters. Our study proves, for the first time, that deep learning is a practical and accurate alternative to approximate simulations of the gravitational structure formation of the Universe. Comment: 8 pages, 5 figures, 1 tabl

    Detecting Galaxy-Filament Alignments in the Sloan Digital Sky Survey III

    Previous studies have shown that the filamentary structures in the cosmic web influence the alignments of nearby galaxies. We study this effect in the LOWZ sample of the Sloan Digital Sky Survey using the "Cosmic Web Reconstruction" filament catalogue. We find that LOWZ galaxies exhibit a small but statistically significant alignment in the direction parallel to the orientation of nearby filaments. This effect is detectable even in the absence of nearby galaxy clusters, which suggests it is an effect from the matter distribution in the filament. A nonparametric regression model suggests that the alignment effect with filaments extends over separations of 30-40 Mpc. We find that galaxies that are bright and early-forming align more strongly with the directions of nearby filaments than those that are faint and late-forming; however, trends with stellar mass are less statistically significant, within the narrow range of stellar mass of this sample. Comment: 14 pages, 13 figures. Accepted to the MNRA