#Bieber + #Blast = #BieberBlast: Early Prediction of Popular Hashtag Compounds

Author: Bagasheva A.
Caleffi P.-M.
Cassell J.
Cook P.
Croft W.
Cunha E.
Eisenstein J.
Giegerich H. J.
Hacken P.
Hong L.
Hu Y.
Lee C.-y.
Lerman K.
Lin Y.-R.
Lui M.
Léturgie A.
Medler D. A.
Milroy J.
Nguyen T.
Owoputi O.
Ritter A.
Weng L.
Yang J.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/10/2015

Compounding of natural language units is a very common phenomenon. In this paper, we show, for the first time, that Twitter hashtags, which could be considered correlates of such linguistic units, undergo compounding. We identify reasons for this compounding and propose a prediction model that can identify with 77.07% accuracy whether a pair of hashtags compounding in the near future (i.e., 2 months after compounding) will become popular. At longer times, T = 6 and 10 months, the accuracies are 77.52% and 79.13%, respectively. This technique has strong implications for trending hashtag recommendation, since newly formed hashtag compounds can be recommended early, even before the compounding has taken place. Further, humans can predict compounds with an overall accuracy of only 48.7% (treated as baseline). Notably, while humans can discriminate the relatively easier cases, the automatic framework is successful in classifying the relatively harder cases.

Comment: 14 pages, 4 figures, 9 tables, published in CSCW (Computer-Supported Cooperative Work and Social Computing) 2016, in Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW 2016)
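To make the notion of a hashtag compound concrete (as in the title's #Bieber + #Blast = #BieberBlast), the following is a minimal sketch that finds hashtags which are exact concatenations of two other hashtags in the same collection. It is only an illustration, not the prediction model described in the abstract; the function name `find_compounds` and the case-insensitive matching are assumptions made for this example.

```python
def find_compounds(hashtags):
    """Return (left, right, compound) triples where compound == left + right.

    Hypothetical helper for illustration only: matching is case-insensitive
    and ignores the leading '#'. It does not predict popularity.
    """
    # Normalise: drop the leading '#' and lowercase everything.
    tags = {t.lstrip("#").lower() for t in hashtags}
    compounds = []
    for tag in sorted(tags):
        # Try every split point and check that both halves are known hashtags.
        for i in range(1, len(tag)):
            left, right = tag[:i], tag[i:]
            if left in tags and right in tags:
                compounds.append((left, right, tag))
    return compounds


if __name__ == "__main__":
    print(find_compounds(["#Bieber", "#Blast", "#BieberBlast", "#news"]))
```

A real pipeline would additionally need timestamps and usage counts for each hashtag to decide whether a compound becomes popular; none of that is modelled here.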

arXiv.org e-Print Archive

Crossref

Investigating five key predictive text entry with combined distance and keystroke modelling

Author: CL James
IH Witten
Mark D. Dunlop
MD Dunlop
Michelle Montgomery Masters
PM Fitts
SA Brewster
SK Card
SM Katz
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/11/2008

This paper investigates text entry on mobile devices using only five keys. Primarily intended to support text entry on devices smaller than mobile phones, this method can also be used to maximise screen space on mobile phones. The reported combined Fitts' law and keystroke modelling predicts similar performance with bigram prediction using a five-key keypad as is currently achieved on standard mobile phones using unigram prediction. User studies reported here show similar user performance on five-key pads as found elsewhere for novice nine-key pad users.
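As a rough illustration of what a combined distance and keystroke model can look like, the sketch below adds a Fitts' law movement time (Shannon formulation, MT = a + b * log2(D / W + 1)) to a fixed per-keystroke time. The coefficients `a` and `b`, the `keystroke_time` value, and the function names are hypothetical placeholders, not values or definitions taken from the paper.

```python
import math


def fitts_movement_time(distance, width, a=0.0, b=0.1):
    """Fitts' law (Shannon formulation): MT = a + b * log2(D / W + 1).

    a and b are empirically fitted constants; the defaults here are
    placeholder assumptions, not measured values.
    """
    if distance < 0 or width <= 0:
        raise ValueError("distance must be >= 0 and width must be > 0")
    return a + b * math.log2(distance / width + 1)


def estimate_entry_time(key_distances, key_width, keystroke_time=0.2, a=0.0, b=0.1):
    """Sum movement time plus a fixed keystroke time over a key sequence.

    key_distances holds the distance travelled before each key press,
    in the same unit as key_width.
    """
    return sum(
        fitts_movement_time(d, key_width, a, b) + keystroke_time
        for d in key_distances
    )


if __name__ == "__main__":
    # Three key presses: no movement, then moves of 1 and 3 key widths.
    print(estimate_entry_time([0, 1, 3], key_width=1))
```

Comparing keypad layouts with such a model amounts to computing the distance sequence each layout requires for the same text and comparing the estimated totals.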

Crossref

University of Strathclyde Institutional Repository

Computational Sociolinguistics: A Survey

Author: de Jong Franciska
Doğruöz A. Seza
Nguyen Dong
Rosé Carolyn P.
Publication date: 01/01/2016

Language is a social phenomenon and variation is inherent to its social nature. Recently, there has been a surge of interest within the computational linguistics (CL) community in the social dimension of language. In this article we present a survey of the emerging field of "Computational Sociolinguistics" that reflects this increased interest. We aim to provide a comprehensive overview of CL research on sociolinguistic themes, featuring topics such as the relation between language and social identity, language use in social interaction, and multilingual communication. Moreover, we demonstrate the potential for synergy between the research communities involved by showing how the large-scale data-driven methods that are widely used in CL can complement existing sociolinguistic studies, and how sociolinguistics can inform and challenge the methods and assumptions employed in CL studies. We hope to convey the possible benefits of a closer collaboration between the two communities and conclude with a discussion of open challenges.

Comment: To appear in Computational Linguistics. Accepted for publication: 18th February, 201

arXiv.org e-Print Archive

Crossref

Ghent University Academic Bibliography

EUR Research Repository

University of Twente Research Information

Inference of the Russian drug community from one of the largest social networks in the Russian Federation

Author: Boukhanovsky A. V.
Dijkstra L. J.
Duijn P. A. C.
Sloot P. M. A.
Yakushev A. V.
Publication date: 13/05/2013

The criminal nature of narcotics complicates the direct assessment of a drug community, while a good understanding of the type of people drawn to or currently using drugs is vital for finding effective intervention strategies. For the Russian Federation this is of especially immediate concern, given the dramatic increase it has seen in drug abuse since the fall of the Soviet Union in the early nineties. Using unique data from the Russian social network 'LiveJournal', with over 39 million registered users worldwide, we were able for the first time to identify the on-line drug community by context-sensitive text mining of the users' blogs using a dictionary of known drug-related official and 'slang' terminology. By comparing the interests of the users that most actively spread information on narcotics over the network with the interests of the individuals outside the on-line drug community, we found that the 'average' drug user in the Russian Federation is mostly interested in topics such as Russian rock, non-traditional medicine, UFOs, Buddhism, yoga and the occult. We identify three distinct scale-free sub-networks of users which can be uniquely classified as being either 'infectious', 'susceptible' or 'immune'.

Comment: 12 pages, 11 figures

arXiv.org e-Print Archive

Crossref

International Migration, Integration and Social Cohesion online publications

Purported use and self-awareness of cognitive and metacognitive foreign language reading strategies in tertiary education in Mozambique

Author: Cabinda Manuel Joao Jose
Publication date: 01/01/2016

This paper explores the results of a Survey of Reading Strategies (SORS)-based questionnaire administered to 28 university student participants. The study is carried out in a post-colonial multilingual context, Mozambique. The main aim of the paper is to assess the degree of purported use and the participants' awareness of their own use of reading comprehension skills and strategies in a foreign language (English). The participants were tested for their reading comprehension using an IELTS comprehension test (Cabinda, 2013). The results revealed low reading comprehension levels. These results contrast with those from the SORS-based questionnaire (Cabinda, 2013), which revealed claims of use of a wide range of cognitive, metacognitive and supply strategies – aspects of high-level reading ability and text comprehension. Conclusions show that the participants used or claimed to chiefly use metacognitive and cognitive reading strategies equally, matching the behaviour of good readers, but they also reported a high degree of use of supply strategies to construe meaning from text, mainly code-switching, translation and cognates. The latter confirms results from studies by Jimenez et al. (1995, 1996) and Zhang & Wu (2009), yet does not conclusively show a correlation between the participants' degree of text comprehension and their effective use of reading skills and strategies to construe meaning. Further conclusions show that the reported high use of these L1 (Portuguese or other) related supply strategies (not used by English L1 readers) does not aid their reading comprehension.

Ghent University Academic Bibliography