A Continuously Growing Dataset of Sentential Paraphrases

Author: He Hua
Lan Wuwei
Qiu Siyu
Xu Wei
Publication date: 01/01/2017

A major challenge in paraphrase research is the lack of parallel corpora. In this paper, we present a new method to collect large-scale sentential paraphrases from Twitter by linking tweets through shared URLs. The main advantage of our method is its simplicity, as it gets rid of the classifier or human in the loop needed to select data before annotation and subsequent application of paraphrase identification algorithms in the previous work. We present the largest human-labeled paraphrase corpus to date of 51,524 sentence pairs and the first cross-domain benchmarking for automatic paraphrase identification. In addition, we show that more than 30,000 new sentential paraphrases can be easily and continuously captured every month at ~70% precision, and demonstrate their utility for downstream NLP tasks through phrasal paraphrase extraction. We make our code and data freely available. Comment: 11 pages, accepted to EMNLP 201

arXiv.org e-Print Archive

Crossref

“I” vs “me”: the urbanization of “post-80s” and “post-90s” Chinese migrant workers

Author: Aziz Noshaba
He Jun
Xu Siyu
Publication venue: Taylor and Francis Group and Juraj Dobrila University of Pula, Faculty of economics and tourism Dr. Mijo Mirković
Publication date: 01/01/2023

The difference in self-identity among migrant workers of the new generation leads them towards different desires regarding urbanization. It is therefore imperative to explore the influence of self-identity on migrant workers’ willingness to stay. To explore the phenomenon empirically, the current study used data from the 2017 China Migrants Dynamics Survey (CMDS). The study employed the Heckman two-stage selection model, and also employed machine learning methods as a robustness check. The results showed that the “I” identity has a more significant impact on urbanization among “post-90s” migrant workers, whereas the “Me” identity has a more significant impact on urbanization among migrant workers born in the 1980s. If “post-80s” and “post-90s” migrant workers are uniformly grouped together as the new generation, the differences and characteristics within them may be concealed. The overall findings suggest that, based on the differences in self-identity between migrant workers born in the 1980s and those born in the 1990s, there is a need to formulate related policies to promote their residence and boost urbanization.

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

An exact solution of spherical mean-field plus orbit-dependent non-separable pairing model with two non-degenerate j-orbits

Author: Draayer J. P.
He Yingwen
Pan Feng
Yang Siyu
Yuan Shuli
Zhang Yunfeng
Publication venue: Elsevier BV
Publication date: 26/10/2018

An exact solution of nuclear spherical mean-field plus orbit-dependent non-separable pairing model with two non-degenerate j-orbits is presented. The extended one-variable Heine-Stieltjes polynomials associated to the Bethe ansatz equations of the solution are determined, of which the sets of the zeros give the solution of the model, and can be determined relatively easily. A comparison of the solution to that of the standard pairing interaction with constant interaction strength among pairs in any orbit is made. It is shown that the overlaps of eigenstates of the model with those of the standard pairing model are always large, especially for the ground and the first excited state. However, the quantum phase crossover in the non-separable pairing model cannot be accounted for by the standard pairing interaction. Comment: 5 pages, 1 figure, LaTe

arXiv.org e-Print Archive

Louisiana State University

Learning to Predict the Cosmological Structure Formation

Author: Chen Wei
Feng Yu
He Siyu
Ho Shirley
Li Yin
Póczos Barnabás
Ravanbakhsh Siamak
Publication venue: Proceedings of the National Academy of Sciences
Publication date: 01/07/2019

Matter evolved under influence of gravity from minuscule density fluctuations. Non-perturbative structure formed hierarchically over all scales, and developed non-Gaussian features in the Universe, known as the Cosmic Web. To fully understand the structure formation of the Universe is one of the holy grails of modern astrophysics. Astrophysicists survey large volumes of the Universe and employ a large ensemble of computer simulations to compare with the observed data in order to extract the full information of our own Universe. However, to evolve trillions of galaxies over billions of years even with the simplest physics is a daunting task. We build a deep neural network, the Deep Density Displacement Model (hereafter D$^3$M), to predict the non-linear structure formation of the Universe from simple linear perturbation theory. Our extensive analysis demonstrates that D$^3$M outperforms the second order perturbation theory (hereafter 2LPT), the commonly used fast approximate simulation method, in point-wise comparison, 2-point correlation, and 3-point correlation. We also show that D$^3$M is able to accurately extrapolate far beyond its training data, and predict structure formation for significantly different cosmological parameters. Our study proves, for the first time, that deep learning is a practical and accurate alternative to approximate simulations of the gravitational structure formation of the Universe. Comment: 8 pages, 5 figures, 1 tabl

arXiv.org e-Print Archive

eScholarship - University of California

Detecting Galaxy-Filament Alignments in the Sloan Digital Sky Survey III

Author: Blazek Jonathan
Chen Yen-Chi
He Siyu
Ho Shirley
Mandelbaum Rachel
Melchior Peter
Singh Sukhdeep
Publication venue: Oxford University Press (OUP)
Publication date: 21/02/2019

Previous studies have shown the filamentary structures in the cosmic web influence the alignments of nearby galaxies. We study this effect in the LOWZ sample of the Sloan Digital Sky Survey using the "Cosmic Web Reconstruction" filament catalogue. We find that LOWZ galaxies exhibit a small but statistically significant alignment in the direction parallel to the orientation of nearby filaments. This effect is detectable even in the absence of nearby galaxy clusters, which suggests it is an effect from the matter distribution in the filament. A nonparametric regression model suggests that the alignment effect with filaments extends over separations of 30-40 Mpc. We find that galaxies that are bright and early-forming align more strongly with the directions of nearby filaments than those that are faint and late-forming; however, trends with stellar mass are less statistically significant, within the narrow range of stellar mass of this sample. Comment: 14 pages, 13 figures. Accepted to the MNRA

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Princeton University Open Access Repository