#Bieber + #Blast = #BieberBlast: Early Prediction of Popular Hashtag Compounds
Compounding of natural language units is a very common phenomenon. In this
paper, we show, for the first time, that Twitter hashtags, which can be
considered correlates of such linguistic units, undergo compounding. We
identify reasons for this compounding and propose a prediction model that can
identify with 77.07% accuracy whether a pair of hashtags compounding in the
near future (i.e., 2 months after compounding) will become popular. At longer
times, T = 6 and 10 months, the accuracies are 77.52% and 79.13% respectively.
This technique has strong implications for trending-hashtag recommendation,
since newly formed hashtag compounds can be recommended early, even before the
compounding has taken place. Further, humans can predict compounds with an
overall accuracy of only 48.7% (treated as the baseline). Notably, while
humans can discriminate the relatively easier cases, the automatic framework
is successful in classifying the relatively harder cases.
Comment: 14 pages, 4 figures, 9 tables; published in Proceedings of the 19th
ACM Conference on Computer-Supported Cooperative Work and Social Computing
(CSCW 2016)
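The task described above, deciding from early signals whether a hashtag compound will become popular, can be sketched as a binary classifier. This is a hypothetical illustration only: the paper's actual features and learner are not given here, and the counts, weights and bias below are invented for the example.

```python
import math

# Hypothetical sketch (not the paper's actual model): given usage counts for
# two hashtags and early usage of their compound, predict whether the
# compound will become popular.

def features(freq_a, freq_b, freq_ab):
    # Illustrative features: component popularity and early compound uptake.
    return [math.log1p(freq_a), math.log1p(freq_b), math.log1p(freq_ab)]

def predict_popular(freq_a, freq_b, freq_ab,
                    weights=(0.2, 0.2, 0.6), bias=-4.0):
    # Toy linear classifier with hand-set weights; a real system would learn
    # these from labelled compounding events observed on Twitter.
    score = bias + sum(w * f
                       for w, f in zip(weights, features(freq_a, freq_b, freq_ab)))
    return score > 0.0

# Heavily used components with early compound uptake are predicted popular;
# a rarely used compound is not.
print(predict_popular(50000, 20000, 800))  # True
print(predict_popular(300, 200, 2))        # False
```

A learned model would replace the hand-set weights, but the pipeline shape (pair features in, popularity decision out) is the same.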
Bidirectional syntactic priming across cognitive domains: from arithmetic to language and back
Scheepers et al. (2011) showed that the structure of a correctly solved mathematical equation affects how people subsequently complete sentences containing high vs. low relative-clause attachment ambiguities. Here we investigated whether such effects generalise to different structures and tasks, and importantly, whether they also hold in the reverse direction (i.e., from linguistic to mathematical processing). In a questionnaire-based experiment, participants had to solve structurally left- or right-branching equations (e.g., 5 × 2 + 7 versus 5 + 2 × 7) and to provide sensicality ratings for structurally left- or right-branching adjective-noun-noun compounds (e.g., alien monster movie versus lengthy monster movie). In the first version of the experiment, the equations were used as primes and the linguistic expressions as targets (investigating structural priming from maths to language). In the second version, the order was reversed (language-to-maths priming). Both versions of the experiment showed clear structural priming effects, conceptually replicating and extending the findings from Scheepers et al. (2011). Most crucially, the observed bi-directionality of cross-domain structural priming strongly supports the notion of shared syntactic representations (or recursive procedures to generate and parse them) between arithmetic and language.
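The branching contrast in the stimuli can be made concrete: under standard operator precedence, 5 × 2 + 7 has a left-branching parse, (5 × 2) + 7, while 5 + 2 × 7 has a right-branching parse, 5 + (2 × 7). Representing the two parses as small trees makes the recursive structure that the abstract appeals to explicit (the tree encoding below is our own illustration, not the authors' formalism).

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}

def evaluate(node):
    """Recursively evaluate a parse tree given as (op, left, right) or a number."""
    if isinstance(node, (int, float)):
        return node
    op, left, right = node
    return OPS[op](evaluate(left), evaluate(right))

# Left-branching: the complex subtree is the left operand of the top node.
left_branching = ("+", ("*", 5, 2), 7)    # (5 * 2) + 7
# Right-branching: the complex subtree is the right operand.
right_branching = ("+", 5, ("*", 2, 7))   # 5 + (2 * 7)

print(evaluate(left_branching))   # 17
print(evaluate(right_branching))  # 19
```

The two expressions use the same operators and operands; only the shape of the tree differs, which is exactly the structural variable manipulated in the priming experiments.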
Extending the adverbial coverage of a NLP oriented resource for French
This paper presents work on extending the adverbial entries of LGLex, an
NLP-oriented syntactic resource for French. Adverbs were extracted from the
Lexicon-Grammar tables of both simple adverbs ending in -ment '-ly' (Molinier
and Levrier, 2000) and compound adverbs (Gross, 1986; 1990). This work relies
on the exploitation of the fine-grained linguistic information provided in
existing resources: various features encoded in the LG tables have not yet
been exploited. They describe the relations of deletion, permutation,
intensification and paraphrase that associate, on the one hand, simple and
compound adverbs and, on the other hand, different types of compound adverbs.
The resulting syntactic resource has been manually evaluated and is freely
available under the LGPL-LR license.
Comment: Proceedings of the 5th International Joint Conference on Natural
Language Processing (IJCNLP'11), Chiang Mai, Thailand (2011)
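The relational features the abstract mentions (deletion, permutation, intensification, paraphrase) can be pictured as attributes on a lexicon entry. The record below is a hypothetical sketch of such an entry for a -ment adverb; the field names and values are our own illustration, not the actual LGLex format.

```python
# Hypothetical lexicon entry (not the real LGLex schema): relational
# features carried over from a Lexicon-Grammar table for simple adverbs
# ending in -ment.

entry = {
    "lemma": "rapidement",                      # 'quickly'
    "pos": "ADV",
    "source_table": "simple adverbs in -ment",  # Molinier and Levrier (2000)
    "relations": {
        "deletable": True,                       # can be deleted without ungrammaticality
        "permutable": True,                      # can move within the sentence
        "intensifiable": True,                   # e.g. 'tres rapidement'
        "paraphrase": "d'une maniere rapide",    # link to a compound-adverb paraphrase
    },
}

print(entry["lemma"], sorted(entry["relations"]))
```

Encoding these relations explicitly is what allows a parser or generator to connect a simple adverb with its compound counterparts, which is the kind of exploitation the paper describes.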
Meta-Learning for Phonemic Annotation of Corpora
We apply rule induction, classifier combination and meta-learning (stacked
classifiers) to the problem of bootstrapping high-accuracy automatic
annotation of corpora with pronunciation information. The task we address in
this paper consists of generating phonemic representations reflecting the
Flemish and Dutch pronunciations of a word on the basis of its orthographic
representation (which in turn is based on the actual speech recordings). We
compare several possible approaches to the text-to-pronunciation mapping
task: memory-based learning, transformation-based learning, rule induction,
maximum-entropy modeling, combination of classifiers in stacked learning, and
stacking of meta-learners. We are interested both in optimal accuracy and in
obtaining insight into the linguistic regularities involved. As far as
accuracy is concerned, the already high accuracy of single classifiers (93%
for Celex and 86% for Fonilex at word level) is boosted significantly, with
additional error reductions of 31% and 38% respectively using combination of
classifiers, and a further 5% using combination of meta-learners, bringing
overall word-level accuracy to 96% for the Dutch variant and 92% for the
Flemish variant. We also show that the application of machine learning
methods indeed leads to increased insight into the linguistic regularities
determining the variation between the two pronunciation variants studied.
Comment: 8 pages
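The stacking idea above, feeding the predictions of several base classifiers to a meta-level classifier, can be sketched in a few lines. This is a toy illustration: the paper's base learners are memory-based, transformation-based, rule-induction and maximum-entropy classifiers, whereas the stand-ins below are simple frequency tables, and all of the training data shown is invented.

```python
from collections import Counter

def make_majority_classifier(training_pairs):
    """Base learner: predict the most frequent label seen for an input."""
    table = {}
    for x, y in training_pairs:
        table.setdefault(x, Counter())[y] += 1
    return lambda x: table[x].most_common(1)[0][0] if x in table else "?"

def stack(base_classifiers, meta_classifier):
    """Stacked learner: the tuple of base predictions becomes the feature
    vector for a meta-level classifier."""
    return lambda x: meta_classifier(tuple(c(x) for c in base_classifiers))

# Two toy base classifiers trained on (invented) grapheme-to-label pairs.
base1 = make_majority_classifier([("a", "A"), ("a", "A"), ("b", "B")])
base2 = make_majority_classifier([("a", "X"), ("b", "B")])

# Meta level: a lookup trained on (base-prediction tuple) -> gold label,
# falling back to the first base classifier when the tuple is unseen.
meta_table = {("A", "X"): "A", ("B", "B"): "B"}
meta = lambda preds: meta_table.get(preds, preds[0])

stacked = stack([base1, base2], meta)
print(stacked("a"))  # "A": the meta level overrules base2's error
print(stacked("b"))  # "B": the base classifiers already agree
```

The meta level adds value exactly where the base classifiers disagree, which is why classifier combination yields the error reductions reported in the abstract.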
Atlas.txt: Exploring Linguistic Grounding Techniques for Communicating Spatial Information to Blind Users
Peer reviewed postprint