Crowdsourcing Cybersecurity: Cyber Attack Detection using Social Media
Social media is often viewed as a sensor into various societal events such as
disease outbreaks, protests, and elections. We describe the use of social media
as a crowdsourced sensor to gain insight into ongoing cyber-attacks. Our
approach detects a broad range of cyber-attacks (e.g., distributed denial of
service (DDoS) attacks, data breaches, and account hijacking) in an
unsupervised manner using just a limited fixed set of seed event triggers. A
new query expansion strategy based on convolutional kernels and dependency
parses helps model reporting structure and aids in identifying key event
characteristics. Through a large-scale analysis over Twitter, we demonstrate
that our approach consistently identifies and encodes events, outperforming
existing methods.
Comment: 13 single column pages, 5 figures, submitted to KDD 201
Firsthand Opiates Abuse on Social Media: Monitoring Geospatial Patterns of Interest Through a Digital Cohort
In the last decade, drug overdose deaths reached staggering proportions in the
US. Beyond the raw yearly death count, which is worrisome in itself, an alarming
picture emerges from the steep acceleration of the death rate, which increased by 21%
from 2015 to 2016. While traditional public health surveillance suffers from
its own biases and limitations, digital epidemiology offers a new lens to
extract signals from Web and Social Media that might be complementary to
official statistics. In this paper we present a computational approach to
identify a digital cohort that might provide an updated and complementary view
on the opioid crisis. We introduce an information retrieval algorithm suitable
to identify relevant subspaces of discussion on social media, for mining data
from users showing explicit interest in discussions about opioid consumption in
Reddit. Moreover, despite the pseudonymous nature of the user base, almost 1.5
million users were geolocated at the US state level, with a distribution in good
agreement with the census population distribution. A measure of prevalence of
interest in opiate consumption has been estimated at the state level, producing
a novel indicator with information that is not entirely encoded in the standard
surveillance. Finally, we provide a domain-specific vocabulary
containing informal lexicon and street nomenclature extracted from user-generated
content that can be used by researchers and practitioners to implement novel
digital public health surveillance methodologies for supporting policy makers
in fighting the opioid epidemic.
Comment: Proceedings of the 2019 World Wide Web Conference (WWW '19)
A Topic Recommender for Journalists
The way in which people acquire information on events and form their own
opinion on them has changed dramatically with the advent of social media. For many
readers, the news gathered from online sources becomes an opportunity to share points
of view and information within micro-blogging platforms such as Twitter, mainly
aimed at satisfying their communication needs. Furthermore, the need to explore
news-related aspects in more depth stimulates a demand for additional information which is often
met through online encyclopedias, such as Wikipedia. This behaviour has also
influenced the way in which journalists write their articles, requiring a careful assessment
of what actually interests the readers. The goal of this paper is to present
a recommender system, What to Write and Why, capable of suggesting to a journalist,
for a given event, the aspects still uncovered in news articles on which the
readers focus their interest. The basic idea is to characterize an event according to
the echo it receives in online news sources and associate it with the corresponding
readers’ communicative and informative patterns, detected through the analysis of
Twitter and Wikipedia, respectively. Our methodology temporally aligns the results
of this analysis and recommends the concepts that emerge as topics of interest from
Twitter and Wikipedia, either not covered or poorly covered in the published news
articles
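
The core idea of recommending concepts that readers care about but that the news covers poorly can be illustrated with a minimal sketch. It is not the What to Write and Why system itself: the function name, the input dictionaries of concept counts, the `coverage_threshold` parameter, and the additive scoring are all assumptions made for illustration, and the temporal alignment step described in the abstract is omitted.

```python
def uncovered_concepts(news_counts, twitter_counts, wiki_counts,
                       top_k=3, coverage_threshold=0.05):
    """Rank concepts that readers discuss but news articles barely cover.

    All inputs are hypothetical dicts mapping concept -> raw count for one event.
    """
    def normalize(counts):
        # Turn raw counts into relative frequencies within one source.
        total = sum(counts.values())
        return {c: n / total for c, n in counts.items()} if total else {}

    news = normalize(news_counts)
    twitter = normalize(twitter_counts)
    wiki = normalize(wiki_counts)

    scores = {}
    for concept in set(twitter) | set(wiki):
        # Skip concepts the news already covers well (assumed threshold).
        if news.get(concept, 0.0) >= coverage_threshold:
            continue
        # Assumed score: combined reader interest across both platforms.
        scores[concept] = twitter.get(concept, 0.0) + wiki.get(concept, 0.0)

    # Highest interest first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda c: (-scores[c], c))[:top_k]


if __name__ == "__main__":
    news = {"earthquake": 8, "rescue": 2}
    twitter = {"earthquake": 5, "donations": 4, "rescue": 1}
    wiki = {"fault line": 3, "earthquake": 1}
    print(uncovered_concepts(news, twitter, wiki))
```

In this toy input, "earthquake" and "rescue" are filtered out because the news already covers them, leaving only the reader-driven concepts as candidates.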
Investigative Simulation: Towards Utilizing Graph Pattern Matching for Investigative Search
This paper proposes the use of graph pattern matching for investigative graph
search, which is the process of searching for and prioritizing persons of
interest who may exhibit part or all of a pattern of suspicious behaviors or
connections. While there are a variety of applications, our principal
motivation is to aid law enforcement in the detection of homegrown violent
extremists. We introduce investigative simulation, which consists of several
necessary extensions to the existing dual simulation graph pattern matching
scheme in order to make it appropriate for intelligence analysts and law
enforcement officials. Specifically, we impose a categorical label structure on
nodes consistent with the nature of indicators in investigations, as well as
prune or complete search results to ensure sensibility and usefulness of
partial matches to analysts. Lastly, we introduce a natural top-k ranking
scheme that can help analysts prioritize investigative efforts. We demonstrate
performance of investigative simulation on a large real-world dataset.
Comment: 8 pages, 6 figures. Paper to appear in the Fosint-SI 2016 conference
proceedings in conjunction with the 2016 IEEE/ACM International Conference on
Advances in Social Networks Analysis and Mining (ASONAM 2016)
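
For readers unfamiliar with the dual simulation scheme that the abstract builds on, the following is a minimal sketch of plain dual simulation only. It does not include the paper's extensions (categorical label structure, pruning or completion of partial matches, top-k ranking). The function name and the input representation (label dictionaries plus edge lists) are assumptions chosen for illustration.

```python
from collections import defaultdict


def dual_simulation(q_nodes, q_edges, g_nodes, g_edges):
    """Compute the maximum dual simulation relation of a pattern in a data graph.

    q_nodes / g_nodes: dict mapping node -> label (hypothetical representation).
    q_edges / g_edges: iterable of directed (source, target) pairs.
    Returns a dict mapping each pattern node to its set of matching data nodes;
    all sets are empty if the pattern has no match.
    """
    succ, pred = defaultdict(set), defaultdict(set)
    for a, b in g_edges:
        succ[a].add(b)
        pred[b].add(a)

    # Start from label-compatible candidates for every pattern node.
    sim = {u: {v for v, lab in g_nodes.items() if lab == q_lab}
           for u, q_lab in q_nodes.items()}

    changed = True
    while changed:
        changed = False
        for u, u2 in q_edges:
            # Child condition: a match of u needs a successor matching u2.
            keep = {v for v in sim[u] if succ[v] & sim[u2]}
            if keep != sim[u]:
                sim[u], changed = keep, True
            # Parent condition: a match of u2 needs a predecessor matching u.
            keep = {v for v in sim[u2] if pred[v] & sim[u]}
            if keep != sim[u2]:
                sim[u2], changed = keep, True

    # If any pattern node is unmatched, the pattern does not match at all.
    if any(not matches for matches in sim.values()):
        return {u: set() for u in q_nodes}
    return sim


if __name__ == "__main__":
    pattern_nodes = {"p": "A", "q": "B"}
    pattern_edges = [("p", "q")]
    graph_nodes = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c": "C"}
    graph_edges = [("a1", "b1"), ("c", "b2")]
    print(dual_simulation(pattern_nodes, pattern_edges, graph_nodes, graph_edges))
```

Here `a2` is dropped because it has no successor labeled B, and `b2` is dropped because its only predecessor is not labeled A; the second removal is what the parent condition of dual simulation adds over ordinary graph simulation.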
Excitable human dynamics driven by extrinsic events in massive communities
Using empirical data from a social media site (Twitter) and on trading
volumes of financial securities, we analyze the correlated human activity in
massive social organizations. The activity, typically excited by real-world
events and measured by the occurrence rate of international brand names and
trading volumes, is characterized by intermittent fluctuations with bursts of
high activity separated by quiescent periods. These fluctuations are broadly
distributed with an inverse cubic tail and have long-range temporal
correlations with a 1/f power spectrum. We describe the activity by a
stochastic point process and derive the distribution of activity levels from
the corresponding stochastic differential equation. The distribution and the
corresponding power spectrum are fully consistent with the empirical
observations.
Comment: 9 pages, 3 figures