State of the art 2015: a literature review of social media intelligence capabilities for counter-terrorism
Overview
This paper reviews how information and insight can be drawn from open social media sources. It focuses on the specific research techniques that have emerged, the capabilities they provide, the possible insights they offer, and the ethical and legal questions they raise. These techniques are considered relevant and valuable in so far as they can help to maintain public safety by preventing terrorism, preparing for it, protecting the public from it and pursuing its perpetrators. The report also considers how far this can be achieved against the backdrop of radically changing technology and public attitudes towards surveillance. This is an updated version of a 2013 paper on the same subject, State of the Art. Since 2013, there have been significant changes in social media, in how it is used by terrorist groups, and in the methods being developed to make sense of it.
The paper is structured as follows:
Part 1 is an overview of social media use, focused on how it is used by groups of interest to those involved in counter-terrorism. This includes a new section on trends in social media platforms and a new section on Islamic State (IS).
Part 2 provides an introduction to the key approaches of social media intelligence (henceforth "SOCMINT") for counter-terrorism.
Part 3 sets out a series of SOCMINT techniques. For each technique, it considers the capabilities and insights offered, assesses the validity and reliability of the method, and explores how it might be applied to counter-terrorism work.
Part 4 outlines a number of important legal, ethical and practical considerations when undertaking SOCMINT work.
Creating extended gender labelled datasets of Twitter users
The gender of a Twitter user is not known a priori when analysing Twitter data, because user registration does not include gender information. This paper proposes an approach for creating extended gender-labelled datasets of Twitter users. The process starts by creating a smaller database of active Twitter users and manually labelling their gender. Features are then extracted from the unstructured information found in each user profile and used to build a gender classification model. The model is then applied to a larger dataset, providing automatic labels and corresponding confidence scores, which can be used to select the most accurately labelled users. The resulting databases can be further enriched with additional information extracted, for example, from the profile picture and from the user location. The proposed approach was successfully applied to English and Portuguese users, leading to two large datasets containing more than 57K labelled users each.
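The labelling pipeline described in this abstract (train on a small hand-labelled set, apply the model to a larger set, keep only confident predictions) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the Naive Bayes classifier, the whitespace tokenizer, the 0.8 threshold, the toy profile strings, and all function names are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Hypothetical feature extraction: lowercase whitespace tokens.
    return text.lower().split()


def train(labelled):
    """Fit a tiny multinomial Naive Bayes model on (profile_text, label) pairs."""
    token_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in labelled:
        label_counts[label] += 1
        for tok in tokenize(text):
            token_counts[label][tok] += 1
            vocab.add(tok)
    return token_counts, label_counts, vocab


def predict(model, text):
    """Return (best_label, confidence), confidence being the posterior probability."""
    token_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    log_scores = {}
    for label, n in label_counts.items():
        score = math.log(n / total)
        denom = sum(token_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # ignore tokens never seen in training
                score += math.log((token_counts[label][tok] + 1) / denom)
        log_scores[label] = score
    # Normalise log scores into probabilities in a numerically stable way.
    top = max(log_scores.values())
    exp_scores = {l: math.exp(s - top) for l, s in log_scores.items()}
    z = sum(exp_scores.values())
    best = max(exp_scores, key=exp_scores.get)
    return best, exp_scores[best] / z


def extend_labels(model, unlabelled, threshold=0.8):
    """Keep only automatic labels whose confidence reaches the threshold."""
    kept = []
    for text in unlabelled:
        label, conf = predict(model, text)
        if conf >= threshold:
            kept.append((text, label, conf))
    return kept


# Toy data, invented for illustration only.
seed = [
    ("maria proud mom", "F"),
    ("ana mother of two", "F"),
    ("john father and husband", "M"),
    ("pedro dad engineer", "M"),
]
unlabelled = ["maria mom of three", "john dad engineer", "coffee lover"]

model = train(seed)
extended = extend_labels(model, unlabelled)
for text, label, conf in extended:
    print(f"{text!r} -> {label} ({conf:.2f})")
```

The profile with no informative tokens ("coffee lover") receives an even posterior and is dropped, which mirrors the idea of using confidence scores to retain only the most reliably labelled users.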
Personality Based Recommendation System Using Social Media
Recommendation systems are a key reason for the success of most social media companies and e-commerce sites. Giving recommendations to users is an interesting and challenging task: it helps to generate revenue, to increase the number of users, and to reduce the time spent searching for a particular item. A recommendation system helps to build user interest and eventually increases the popularity of a site. The huge number of items (products, users, movies, songs, hotels, etc.) and their feature sets makes it hard to predict accurate items for the user. To generate recommendations, it is important to keep all historic data about the user as well as all information about the items. In this paper, the personality of the user is combined with the most popular recommendation techniques, collaborative filtering (CF) and content-based filtering (CB), and applied to the Amazon review data set. In the first module, the personality of the user is calculated by applying the Big Five model to the user's Twitter account. In the second module, collaborative filtering is used to generate recommendations based on the historic information of the user, whereas in the third module, content-based filtering is used to generate recommendations based on the feature set of the item. The Pearson correlation algorithm is applied in both modules and rankings are generated. Finally, the union of the two vector spaces is taken as the final recommendation.
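As a rough illustration of the collaborative filtering module mentioned in this abstract, the sketch below ranks unseen items for a user by Pearson correlation with other users' ratings. It covers only user-based CF; the personality and content-based modules are not shown. The rating dictionary, function names, and the choice to ignore non-positive similarities are assumptions for the example, not details taken from the paper.

```python
import math


def pearson(a, b):
    """Pearson correlation between two users over the items both have rated."""
    common = [item for item in a if item in b]
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den_a = math.sqrt(sum((a[i] - mean_a) ** 2 for i in common))
    den_b = math.sqrt(sum((b[i] - mean_b) ** 2 for i in common))
    if den_a == 0 or den_b == 0:
        return 0.0  # a constant rating vector has no defined correlation
    return num / (den_a * den_b)


def recommend(ratings, user, top_n=3):
    """Rank items the user has not rated by similarity-weighted average rating."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = pearson(ratings[user], other_ratings)
        if sim <= 0:
            continue  # assumption: only positively correlated users contribute
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
                weights[item] = weights.get(item, 0.0) + sim
    ranked = sorted(((scores[i] / weights[i], i) for i in scores), reverse=True)
    return [item for _, item in ranked[:top_n]]


# Toy ratings, invented for illustration only.
ratings = {
    "u1": {"a": 5, "b": 3, "c": 1},
    "u2": {"a": 5, "b": 3, "c": 1, "d": 4},
    "u3": {"a": 1, "b": 3, "c": 5, "e": 5},
}
print(recommend(ratings, "u1"))
```

Here `u2` agrees with `u1` on every shared item and `u3` disagrees, so only the item rated by `u2` is recommended.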
Analyzing Social and Stylometric Features to Identify Spear phishing Emails
Spear phishing is a complex targeted attack in which an attacker harvests
information about the victim prior to the attack. This information is then used
to create sophisticated, genuine-looking attack vectors, drawing the victim to
compromise confidential information. What makes spear phishing different, and
more powerful than normal phishing, is this contextual information about the
victim. Online social media services can be one such source for gathering vital
information about an individual. In this paper, we characterize and examine a
true positive dataset of spear phishing, spam, and normal phishing emails from
Symantec's enterprise email scanning service. We then present a model to detect
spear phishing emails sent to employees of 14 international organizations, by
using social features extracted from LinkedIn. Our dataset consists of 4,742
targeted attack emails sent to 2,434 victims, and 9,353 non-targeted attack
emails sent to 5,912 non-victims; and publicly available information from their
LinkedIn profiles. We applied various machine learning algorithms to this
labeled data, and achieved an overall maximum accuracy of 97.76% in identifying
spear phishing emails. We used a combination of social features from LinkedIn
profiles, and stylometric features extracted from email subjects, bodies, and
attachments. However, we achieved a slightly better accuracy of 98.28% without
the social features. Our analysis revealed that social features extracted from
LinkedIn do not help in identifying spear phishing emails. To the best of our
knowledge, this is one of the first attempts to make use of a combination of
stylometric features extracted from emails, and social features extracted from
an online social network to detect targeted spear phishing emails.
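To give a sense of what "stylometric features extracted from email subjects and bodies" might look like in practice, here is a minimal sketch. The abstract does not list the features the authors used, so every feature below (and the function name `stylometric_features`) is an illustrative assumption rather than the paper's feature set.

```python
import string


def stylometric_features(subject, body):
    """Compute a few simple, hypothetical style features for one email."""
    words = body.split()
    n_words = len(words)
    n_chars = len(body)
    return {
        "subject_length": len(subject),
        "word_count": n_words,
        "avg_word_length": sum(len(w) for w in words) / n_words if n_words else 0.0,
        "uppercase_ratio": sum(c.isupper() for c in body) / n_chars if n_chars else 0.0,
        "punctuation_ratio": sum(c in string.punctuation for c in body) / n_chars if n_chars else 0.0,
        # Share of distinct words, a crude measure of vocabulary richness.
        "vocabulary_richness": len({w.lower() for w in words}) / n_words if n_words else 0.0,
    }


features = stylometric_features("Invoice", "Please see attached.")
for name, value in features.items():
    print(name, value)
```

Feature dictionaries of this kind could then be fed to any standard classifier; the abstract reports only that various machine learning algorithms were applied.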
A survey of location inference techniques on Twitter
The increasing popularity of the social networking service Twitter has made it more involved in day-to-day communications, strengthening social relationships and information dissemination. Conversations on Twitter are now being explored as indicators within early warning systems to alert of imminent natural disasters such as earthquakes and to aid prompt emergency responses to crime. Producers are privileged to have limitless access to market perception from consumer comments on social media and microblogs. Targeted advertising can be made more effective based on user profile information such as demography, interests and location. While these applications have proven beneficial, the ability to effectively infer the location of Twitter users has even greater value. However, accurately identifying where a message originated or where its author is located remains a challenge, which drives research in this area. In this paper, we survey a range of techniques applied to infer the location of Twitter users, from inception to the state of the art. We find significant improvements over time in granularity and accuracy, with results driven by refinements to algorithms and the inclusion of more spatial features.
Detecting Portuguese and English Twitter users' gender
Existing social networking services provide means for people to communicate and express
their feelings in an easy way. Such user-generated content contains clues of users' behaviors and
preferences, as well as other metadata information that is now available for scientific research.
Twitter, in particular, has become a relevant source for social networking studies, mainly because:
it provides a simple way for users to express their feelings, ideas, and opinions; makes
the user generated content and associated metadata available to the community; and furthermore
provides easy-to-use web interfaces and application programming interfaces (API) to access
data. For many studies, the available information about a user is relevant. However, the gender
attribute is not provided when creating a Twitter account.
The main focus of this study is to infer the users' gender from other available information.
We propose a methodology for gender detection of Twitter users, using unstructured information
found on the Twitter profile, user-generated content, and later the user's profile picture.
In previous studies, one of the challenges presented was the labor-intensive task of manually
labelling datasets. In this study, we propose a method for creating extended labelled datasets in
a semi-automatic fashion. With the extended labelled datasets, we associate the users' textual
content with their gender and create gender models, based on the users' generated content and
profile information. We explore supervised and unsupervised classifiers and evaluate the results
in both Portuguese and English Twitter user datasets. We obtained an accuracy of 93.2% with
English users and an accuracy of 96.9% with Portuguese users. The proposed methodology of
our research is language independent, but our focus was given to Portuguese and English users.