Sequences of purchases in credit card data reveal life styles in urban populations
Zipf-like distributions characterize a wide set of phenomena in physics,
biology, economics and social sciences. In human activities, Zipf-laws describe
for example the frequency with which words appear in a text or the purchase types
in shopping patterns. In the latter, the uneven distribution of transaction
types is bound up with the temporal sequences of purchases that individuals choose.
In this work, we define a framework using a text compression technique on the
sequences of credit card purchases to detect ubiquitous patterns of collective
behavior. Clustering the consumers by the similarity of their purchase sequences,
we detect five consumer groups. Remarkably, a post hoc check shows that individuals in each
group are also similar in their age, total expenditure, gender, and the
diversity of their social and mobility networks extracted from their mobile phone
records. By properly deconstructing transaction data with Zipf-like
distributions, this method uncovers sets of significant sequences that reveal
insights into collective human behavior.
Comment: 30 pages, 26 figure
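The Zipf-like distribution of purchase types mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the category names and counts are hypothetical, and `rank_frequency` and `zipf_exponent` are illustrative helpers that rank categories by frequency and estimate the exponent with an ordinary least-squares fit on log-log axes.

```python
import math
from collections import Counter


def rank_frequency(items):
    """Return (rank, count) pairs, with rank 1 for the most frequent item."""
    counts = sorted(Counter(items).values(), reverse=True)
    return list(enumerate(counts, start=1))


def zipf_exponent(pairs):
    """Estimate s in count ~ C / rank**s by least squares on log-log axes.

    Needs at least two distinct ranks; otherwise the slope is undefined.
    """
    xs = [math.log(rank) for rank, _ in pairs]
    ys = [math.log(count) for _, count in pairs]
    n = len(pairs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx  # negate the slope so a decaying law gives s > 0


# Hypothetical transaction categories, constructed so that count = 60 / rank.
purchases = (
    ["grocery"] * 60
    + ["restaurant"] * 30
    + ["fuel"] * 20
    + ["taxi"] * 15
    + ["toll"] * 12
)

pairs = rank_frequency(purchases)
exponent = zipf_exponent(pairs)
print(pairs)
print(round(exponent, 6))
```

Because the toy counts follow 60 / rank exactly, the fitted exponent comes out as 1 up to floating-point error. Real transaction data would be noisier, and a maximum-likelihood fit is generally preferred over log-log regression for serious estimation.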
Computational Sociolinguistics: A Survey
Language is a social phenomenon and variation is inherent to its social
nature. Recently, there has been a surge of interest within the computational
linguistics (CL) community in the social dimension of language. In this article
we present a survey of the emerging field of "Computational Sociolinguistics"
that reflects this increased interest. We aim to provide a comprehensive
overview of CL research on sociolinguistic themes, featuring topics such as the
relation between language and social identity, language use in social
interaction and multilingual communication. Moreover, we demonstrate the
potential for synergy between the research communities involved, by showing how
the large-scale data-driven methods that are widely used in CL can complement
existing sociolinguistic studies, and how sociolinguistics can inform and
challenge the methods and assumptions employed in CL studies. We hope to convey
the possible benefits of a closer collaboration between the two communities and
conclude with a discussion of open challenges.
Comment: To appear in Computational Linguistics. Accepted for publication:
18th February, 201
Hierarchical regression modeling for language research
I demonstrate the application of hierarchical regression modeling, a state-of-the-art technique for statistical inference, to language research. First, a stable sociolinguistic variable in Philadelphia (Labov, 2001) is reconsidered, with attention paid to the treatment of collinearities among socioeconomic predictors. I then demonstrate the use of hierarchical models to account for the random sampling of subjects and items in an experimental setting, using data from a study of word-learning in the face of tonal variation (Quam and Swingley, forthcoming). The results from these case studies demonstrate that modeling sampling from the population has empirical consequences
Basic tasks of sentiment analysis
Subjectivity detection is the task of identifying objective and subjective
sentences. Objective sentences are those which do not exhibit any sentiment.
It is therefore desirable for a sentiment analysis engine to find and separate out the
objective sentences before further analysis, e.g., polarity detection. In
subjective sentences, opinions can often be expressed on one or multiple
topics. Aspect extraction is a subtask of sentiment analysis that consists in
identifying opinion targets in opinionated text, i.e., in detecting the
specific aspects of a product or service the opinion holder is either praising
or complaining about
Symbol Emergence in Robotics: A Survey
Humans can learn the use of language through physical interaction with their
environment and semiotic communication with other people. It is very important
to obtain a computational understanding of how humans can form a symbol system
and obtain semiotic skills through their autonomous mental development.
Recently, many studies have been conducted on the construction of robotic
systems and machine-learning methods that can learn the use of language through
embodied multimodal interaction with their environment and other systems.
Understanding human social interactions and developing a robot that can
communicate smoothly with human users over the long term are crucially important goals, and both require an
understanding of the dynamics of symbol systems. The
embodied cognition and social interaction of participants gradually change a
symbol system in a constructive manner. In this paper, we introduce a field of
research called symbol emergence in robotics (SER). SER is a constructive
approach towards an emergent symbol system. The emergent symbol system is
socially self-organized through both semiotic communications and physical
interactions with autonomous cognitive developmental agents, i.e., humans and
developmental robots. Specifically, we describe some state-of-the-art research
topics concerning SER, e.g., multimodal categorization, word discovery, and a
double articulation analysis, that enable a robot to obtain words and their
embodied meanings from raw sensory-motor information, including visual
information, haptic information, auditory information, and acoustic speech
signals, in a totally unsupervised manner. Finally, we suggest future
directions of research in SER.
Comment: submitted to Advanced Robotic