From Frequency to Meaning: Vector Space Models of Semantics
Computers understand very little of the meaning of human language. This
profoundly limits our ability to give instructions to computers, the ability of
computers to explain their actions to us, and the ability of computers to
analyse and process text. Vector space models (VSMs) of semantics are beginning
to address these limits. This paper surveys the use of VSMs for semantic
processing of text. We organize the literature on VSMs according to the
structure of the matrix in a VSM. There are currently three broad classes of
VSMs, based on term-document, word-context, and pair-pattern matrices, yielding
three classes of applications. We survey a broad range of applications in these
three categories and we take a detailed look at a specific open source project
in each category. Our goal in this survey is to show the breadth of
applications of VSMs for semantics, to provide a new perspective on VSMs for
those who are already familiar with the area, and to provide pointers into the
literature for those who are less familiar with the field.
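As an illustration of the first class of VSM named in this abstract, the following is a minimal sketch of a term-document matrix with cosine similarity between document columns. The function names, the whitespace tokenizer, and the use of raw counts are assumptions made for this example; they are not taken from the survey, which also covers weighting and smoothing steps omitted here.

```python
import math
from collections import Counter


def term_document_matrix(docs):
    """Return (vocab, matrix), where matrix[i][j] counts term i in document j."""
    # Assumption: lowercase whitespace tokenization, raw frequency counts.
    counts = [Counter(doc.lower().split()) for doc in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for c in counts] for term in vocab]
    return vocab, matrix


def column(matrix, j):
    """Extract the vector for document j (one column of the matrix)."""
    return [row[j] for row in matrix]


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical toy corpus, for illustration only.
docs = ["the cat sat", "the cat ran", "stocks fell sharply"]
vocab, matrix = term_document_matrix(docs)
sim_01 = cosine(column(matrix, 0), column(matrix, 1))
sim_02 = cosine(column(matrix, 0), column(matrix, 2))
```

Documents that share terms get a higher cosine than documents with no terms in common, which is the basic intuition behind comparing columns of a term-document matrix.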
Finding Street Gang Members on Twitter
Most street gang members use Twitter to intimidate others, to present
outrageous images and statements to the world, and to share recent illegal
activities. Their tweets may thus be useful to law enforcement agencies to
discover clues about recent crimes or to anticipate ones that may occur.
Finding these posts, however, requires a method to discover gang member Twitter
profiles. This is a challenging task since gang members represent a very small
population of the 320 million Twitter users. This paper studies the problem of
automatically finding gang members on Twitter. It outlines a process to curate
one of the largest sets of verifiable gang member profiles that have ever been
studied. A review of these profiles establishes differences in the language,
images, YouTube links, and emojis gang members use compared to the rest of the
Twitter population. Features from this review are used to train a series of
supervised classifiers. Our classifier achieves a promising F1 score with a low
false positive rate.
Comment: 8 pages, 9 figures, 2 tables. Published as a full paper at the 2016
IEEE/ACM International Conference on Advances in Social Networks Analysis and
Mining (ASONAM 2016)
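For readers less familiar with the two metrics this abstract reports, here is a small sketch of how an F1 score and a false positive rate are computed from confusion-matrix counts. The function name and the counts used below are hypothetical; they are not results from the paper.

```python
def f1_and_fpr(tp, fp, fn, tn):
    """Return (F1 score, false positive rate) from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    # False positive rate: share of true negatives wrongly flagged as positive.
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return f1, fpr


# Hypothetical counts, for illustration only.
f1, fpr = f1_and_fpr(tp=8, fp=2, fn=2, tn=88)
```

When the positive class is rare, as with gang member profiles among all Twitter users, the false positive rate matters alongside F1 because even a small rate applied to a very large negative population yields many false alarms.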
Australasian Arachnology, Number 71, April 2005
Nearly 20 years after the first meeting of the Society in Tunanda in 1986 and
more than 10 years after the International Arachnological Congress in Brisbane
in 1993, there will be another ‘reunion’ of the Australasian Arachnological
Society. As part of the Combined Australian Entomological Society, Society of
Australian Systematic Biologists and Invertebrate Biodiversity and Conservation
Conference (Australian National University, Canberra) from 4-9 December 2005,
we are organizing a symposium, ‘Australasian Arachnology – Evolution, Ecology
and Conservation’. Currently, there are two sessions earmarked for this
symposium; however, the final format will be determined by the number of
participants. Please register your interest with the conference organisers. A
call for abstracts will be sent out in June (for details please check:
http://www.invertebrates2005.com).
Language in Our Time: An Empirical Analysis of Hashtags
Hashtags in online social networks have gained tremendous popularity during
the past five years. The resulting large quantity of data has provided a new
lens into modern society. Previously, researchers mainly relied on data collected
from Twitter to study either a certain type of hashtag or a certain property
of hashtags. In this paper, we perform the first large-scale empirical analysis
of hashtags shared on Instagram, the major platform for hashtag-sharing. We
study hashtags from three different dimensions including the temporal-spatial
dimension, the semantic dimension, and the social dimension. Extensive
experiments performed on three large-scale datasets with more than 7 million
hashtags in total provide a series of interesting observations. First, we show
that the temporal patterns of hashtags can be categorized into four different
clusters, and people tend to share fewer hashtags at certain places and more
hashtags at others. Second, we observe that a non-negligible proportion of
hashtags exhibit large semantic displacement. We demonstrate that hashtags that
are more uniformly shared among users, as quantified by the proposed hashtag
entropy, are less prone to semantic displacement. Finally, we propose a
bipartite graph embedding model to summarize users' hashtag profiles, and rely
on these profiles to perform friendship prediction. Evaluation results show
that our approach achieves effective prediction, with an AUC (area under the ROC
curve) above 0.8, which demonstrates the strong social signals carried by
hashtags.
Comment: WWW 201
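The abstract does not define its hashtag entropy. As a rough sketch of the idea of "how uniformly a hashtag is shared among users", the example below computes the Shannon entropy of the distribution of shares across users. This choice of formula, the function name, and the sample data are assumptions for illustration and may differ from the paper's actual definition.

```python
import math
from collections import Counter


def share_entropy(user_ids):
    """Shannon entropy (in bits) of how one hashtag's shares spread over users.

    user_ids holds one entry per share, naming the user who shared the hashtag.
    """
    counts = Counter(user_ids)
    total = sum(counts.values())
    if not total:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical share logs, for illustration only.
uniform = share_entropy(["a", "b", "c", "d"])   # four users, one share each
concentrated = share_entropy(["a", "a", "a", "a"])  # one user, four shares
```

Under this assumed measure, a hashtag shared evenly by many users scores high, while one dominated by a single user scores zero.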