Semantic Stability in Social Tagging Streams

Author: Huberman Bernardo A.
Singer Philipp
Strohmaier Markus
Wagner Claudia
Publication date: 01/01/2013

One potential disadvantage of social tagging systems is that, due to the lack of a centralized vocabulary, a crowd of users may never manage to reach a consensus on the description of resources (e.g., books, users or songs) on the Web. Yet previous research has provided interesting evidence that the tag distributions of resources may become semantically stable over time as more and more users tag them. At the same time, previous work has raised an array of new questions, such as: (i) How can we assess the semantic stability of social tagging systems in a robust and methodical way? (ii) Does semantic stabilization of tags vary across different social tagging systems? and, ultimately, (iii) What are the factors that can explain semantic stabilization in such systems? In this work we tackle these questions by (i) presenting a novel and robust method which overcomes a number of limitations in existing methods, (ii) empirically investigating semantic stabilization processes in a wide range of social tagging systems with distinct domains and properties, and (iii) detecting potential causes for semantic stabilization, specifically imitation behavior, shared background knowledge and intrinsic properties of natural language. Our results show that tagging streams which are generated by a combination of imitation dynamics and shared background knowledge exhibit faster and higher semantic stability than tagging streams which are generated via imitation dynamics or natural language streams alone.

arXiv.org e-Print Archive

Crossref

SSOAR - Social Science Open Access Repository

MAnnheim DOCument Server

Confounds and Consequences in Geotagged Twitter Data

Author: Eisenstein Jacob
Pavalanathan Umashanthi
Publication date: 01/01/2015

Twitter is often used in quantitative studies that identify geographically-preferred topics, writing styles, and entities. These studies rely on either GPS coordinates attached to individual messages, or on the user-supplied location field in each profile. In this paper, we compare these data acquisition techniques and quantify the biases that they introduce; we also measure their effects on linguistic analysis and text-based geolocation. GPS-tagging and self-reported locations yield measurably different corpora, and these linguistic differences are partially attributable to differences in dataset composition by age and gender. Using a latent variable model to induce age and gender, we show how these demographic variables interact with geography to affect language use. We also show that the accuracy of text-based geolocation varies with population demographics, giving the best results for men above the age of 40. Comment: final version for EMNLP 201

arXiv.org e-Print Archive

CiteSeerX

Crossref

Geo, audio, video, photo: How digital convergence in mobile devices facilitates participatory culture in libraries

Author: Abbott Wendy
Donaghey Jessie
Hare Jo
Hopkins Peta
Publication venue: 'Informa UK Limited'
Publication date: 13/12/2014

Bond University Research Portal

Crossref

E-liquid-related posts to Twitter in 2018: Thematic analysis.

Author: Allem Jon-Patrick
Cruz Tess Boley
Dharmapuri Likhit
Majmundar Anuja
Unger Jennifer B
Publication venue: eScholarship, University of California
Publication date: 01/12/2019

Introduction: E-liquid is the solution aerosolized by e-cigarette devices to produce vapor. Continuously evolving e-liquids, and corresponding devices, can affect user experiences associated with these products. Twitter conversations about e-liquids can capture salient behavioral, social, and communicative cues associated with e-liquids. We analyzed Twitter data to characterize key topics of conversation about e-liquids to inform surveillance and regulatory efforts. Methods: Twitter posts containing e-liquid-related terms ("e-liquid(s)," "e-juice(s)") were obtained from 1 January 2018 to 31 December 2018. Text classifiers were used to identify topics of the posts (n = 15,927). Results: The most prevalent topic was Promotional at 29.35%, followed by Flavors at 24.22% and Person Tagging at 21.47%. Juice Composition was next most prevalent at 17.61%, followed by Cannabis at 16.83% and Nicotine Health Risks at 6.39%. Quit Smoking was rare at 0.57%. Conclusion: These results suggest that flavors, cannabis, health risks of nicotine, and composition warrant consideration as targets in future surveillance, public policy, and interventions addressing the use of e-liquids. Twitter provides ample opportunity to influence the normalization, and uptake, of e-cigarette-related products among non-smokers and youth, unless regulatory restrictions and counter-messaging campaigns are developed to reduce this risk.

eScholarship - University of California