    From Frequency to Meaning: Vector Space Models of Semantics

    Computers understand very little of the meaning of human language. This profoundly limits our ability to give instructions to computers, the ability of computers to explain their actions to us, and the ability of computers to analyse and process text. Vector space models (VSMs) of semantics are beginning to address these limits. This paper surveys the use of VSMs for semantic processing of text. We organize the literature on VSMs according to the structure of the matrix in a VSM. There are currently three broad classes of VSMs, based on term-document, word-context, and pair-pattern matrices, yielding three classes of applications. We survey a broad range of applications in these three categories and we take a detailed look at a specific open source project in each category. Our goal in this survey is to show the breadth of applications of VSMs for semantics, to provide a new perspective on VSMs for those who are already familiar with the area, and to provide pointers into the literature for those who are less familiar with the field.
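
As a rough illustration of the first matrix class named in this abstract, the sketch below builds a tiny term-document matrix from raw counts and compares documents by cosine similarity. The toy documents, the whitespace tokenizer, and the function names are assumptions made for this example; they are not taken from the survey, which covers weighting and smoothing schemes that are omitted here.

```python
import math
from collections import Counter


def term_document_matrix(docs):
    """Return (vocab, matrix), where matrix[i][j] is the count of term i in document j."""
    counts = [Counter(doc.lower().split()) for doc in docs]  # naive whitespace tokenizer (assumption)
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for c in counts] for term in vocab]
    return vocab, matrix


def column(matrix, j):
    """Extract the vector for document j (one column of the matrix)."""
    return [row[j] for row in matrix]


def cosine(u, v):
    """Cosine of the angle between two count vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical toy corpus, used only to exercise the functions above.
docs = [
    "the cat sat on the mat",
    "the cat lay on the rug",
    "stocks fell on monday",
]
vocab, matrix = term_document_matrix(docs)
sim_01 = cosine(column(matrix, 0), column(matrix, 1))
sim_02 = cosine(column(matrix, 0), column(matrix, 2))
print(sim_01, sim_02)
```

The two documents about the cat share most of their terms, so their column vectors point in similar directions and score higher than the unrelated third document.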

    Finding Street Gang Members on Twitter

    Most street gang members use Twitter to intimidate others, to present outrageous images and statements to the world, and to share recent illegal activities. Their tweets may thus be useful to law enforcement agencies to discover clues about recent crimes or to anticipate ones that may occur. Finding these posts, however, requires a method to discover gang member Twitter profiles. This is a challenging task since gang members represent a very small population of the 320 million Twitter users. This paper studies the problem of automatically finding gang members on Twitter. It outlines a process to curate one of the largest sets of verifiable gang member profiles that have ever been studied. A review of these profiles establishes differences in the language, images, YouTube links, and emojis gang members use compared to the rest of the Twitter population. Features from this review are used to train a series of supervised classifiers. Our classifier achieves a promising F1 score with a low false positive rate. Comment: 8 pages, 9 figures, 2 tables. Published as a full paper at the 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2016)

    Australasian Arachnology, Number 71, April 2005

    Nearly 20 years after the first meeting of the Society in Tunanda in 1986 and more than 10 years after the International Arachnological Congress in Brisbane in 1993, there will be another ‘reunion’ of the Australasian Arachnological Society. As part of the Combined Australian Entomological Society, Society of Australian Systematic Biologists and Invertebrate Biodiversity and Conservation Conference (Australian National University, Canberra) from 4-9 December 2005, we are organizing a symposium, ‘Australasian Arachnology – Evolution, Ecology and Conservation’. Currently, there are two sessions earmarked for this symposium; however, the final format will be determined by the number of participants. Please register your interest with the conference organisers. A call for abstracts will be sent out in June (for details please check: http://www.invertebrates2005.com)

    Language in Our Time: An Empirical Analysis of Hashtags

    Hashtags in online social networks have gained tremendous popularity during the past five years. The resulting large quantity of data has provided a new lens into modern society. Previously, researchers have mainly relied on data collected from Twitter to study either a certain type of hashtag or a certain property of hashtags. In this paper, we perform the first large-scale empirical analysis of hashtags shared on Instagram, the major platform for hashtag-sharing. We study hashtags along three dimensions: the temporal-spatial dimension, the semantic dimension, and the social dimension. Extensive experiments performed on three large-scale datasets with more than 7 million hashtags in total provide a series of interesting observations. First, we show that the temporal patterns of hashtags can be categorized into four different clusters, and that people tend to share fewer hashtags at certain places and more hashtags at others. Second, we observe that a non-negligible proportion of hashtags exhibit large semantic displacement. We demonstrate that hashtags that are more uniformly shared among users, as quantified by the proposed hashtag entropy, are less prone to semantic displacement. Finally, we propose a bipartite graph embedding model to summarize users' hashtag profiles, and rely on these profiles to perform friendship prediction. Evaluation results show that our approach achieves effective prediction, with an AUC (area under the ROC curve) above 0.8, which demonstrates the strong social signals carried by hashtags. Comment: WWW 201
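
The abstract does not give the formula for hashtag entropy. One plausible reading, shown here purely as an assumption, is the Shannon entropy of how a hashtag's uses are distributed across the users who shared it: a hashtag spread evenly over many users scores high, and one dominated by a single user scores low. The function name and the toy user IDs are hypothetical.

```python
import math
from collections import Counter


def hashtag_entropy(user_ids):
    """Shannon entropy (in bits) of a hashtag's usage distribution over users.

    `user_ids` holds one entry per use of the hashtag, naming the user who posted it.
    This definition is an assumption for illustration, not the paper's stated formula.
    """
    counts = Counter(user_ids)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical usage logs for two hashtags with four uses each.
uniform = hashtag_entropy(["u1", "u2", "u3", "u4"])  # shared evenly by four users
skewed = hashtag_entropy(["u1", "u1", "u1", "u2"])   # dominated by one user
print(uniform, skewed)
```

Under this reading, the evenly shared hashtag has the higher entropy, which matches the abstract's description of "more uniformly shared" hashtags.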