39,014 research outputs found

    Crowdsourcing Cybersecurity: Cyber Attack Detection using Social Media

    Social media is often viewed as a sensor into various societal events such as disease outbreaks, protests, and elections. We describe the use of social media as a crowdsourced sensor to gain insight into ongoing cyber-attacks. Our approach detects a broad range of cyber-attacks (e.g., distributed denial of service (DDoS) attacks, data breaches, and account hijacking) in an unsupervised manner using just a limited fixed set of seed event triggers. A new query expansion strategy based on convolutional kernels and dependency parses helps model reporting structure and aids in identifying key event characteristics. Through a large-scale analysis over Twitter, we demonstrate that our approach consistently identifies and encodes events, outperforming existing methods. Comment: 13 single column pages, 5 figures, submitted to KDD 201

    Firsthand Opiates Abuse on Social Media: Monitoring Geospatial Patterns of Interest Through a Digital Cohort

    In the last decade, drug overdose deaths reached staggering proportions in the US. Besides the raw yearly death count, which is worrisome per se, an alarming picture emerges from the steep acceleration of this rate, which increased by 21% from 2015 to 2016. While traditional public health surveillance suffers from its own biases and limitations, digital epidemiology offers a new lens to extract signals from the Web and social media that might be complementary to official statistics. In this paper we present a computational approach to identify a digital cohort that might provide an updated and complementary view on the opioid crisis. We introduce an information retrieval algorithm suitable for identifying relevant subspaces of discussion on social media, and use it to mine data from users showing explicit interest in discussions about opioid consumption on Reddit. Moreover, despite the pseudonymous nature of the user base, almost 1.5 million users were geolocated at the US state level, in good agreement with the census population distribution. A measure of prevalence of interest in opiate consumption has been estimated at the state level, producing a novel indicator with information that is not entirely encoded in the standard surveillance. Finally, we provide a domain-specific vocabulary containing informal lexicon and street nomenclature extracted from user-generated content, which researchers and practitioners can use to implement novel digital public health surveillance methodologies for supporting policy makers in fighting the opioid epidemic. Comment: Proceedings of the 2019 World Wide Web Conference (WWW '19)

    A Topic Recommender for Journalists

    The way in which people acquire information on events and form their own opinion on them has changed dramatically with the advent of social media. For many readers, the news gathered from online sources becomes an opportunity to share points of view and information within micro-blogging platforms such as Twitter, mainly aimed at satisfying their communication needs. Furthermore, the need to explore aspects related to the news in greater depth stimulates a demand for additional information, which is often met through online encyclopedias such as Wikipedia. This behaviour has also influenced the way in which journalists write their articles, requiring a careful assessment of what actually interests the readers. The goal of this paper is to present a recommender system, What to Write and Why, capable of suggesting to a journalist, for a given event, the aspects still uncovered in news articles on which the readers focus their interest. The basic idea is to characterize an event according to the echo it receives in online news sources and associate it with the corresponding readers’ communicative and informative patterns, detected through the analysis of Twitter and Wikipedia, respectively. Our methodology temporally aligns the results of this analysis and recommends the concepts that emerge as topics of interest from Twitter and Wikipedia, either not covered or poorly covered in the published news articles.

    Investigative Simulation: Towards Utilizing Graph Pattern Matching for Investigative Search

    This paper proposes the use of graph pattern matching for investigative graph search, which is the process of searching for and prioritizing persons of interest who may exhibit part or all of a pattern of suspicious behaviors or connections. While there are a variety of applications, our principal motivation is to aid law enforcement in the detection of homegrown violent extremists. We introduce investigative simulation, which consists of several necessary extensions to the existing dual simulation graph pattern matching scheme in order to make it appropriate for intelligence analysts and law enforcement officials. Specifically, we impose a categorical label structure on nodes consistent with the nature of indicators in investigations, and we prune or complete search results to ensure that partial matches are sensible and useful to analysts. Lastly, we introduce a natural top-k ranking scheme that can help analysts prioritize investigative efforts. We demonstrate the performance of investigative simulation on a large real-world dataset. Comment: 8 pages, 6 figures. Paper to appear in the Fosint-SI 2016 conference proceedings in conjunction with the 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining ASONAM 201

    Excitable human dynamics driven by extrinsic events in massive communities

    Using empirical data from a social media site (Twitter) and on trading volumes of financial securities, we analyze the correlated human activity in massive social organizations. The activity, typically excited by real-world events and measured by the occurrence rate of international brand names and trading volumes, is characterized by intermittent fluctuations with bursts of high activity separated by quiescent periods. These fluctuations are broadly distributed with an inverse cubic tail and have long-range temporal correlations with a 1/f power spectrum. We describe the activity by a stochastic point process and derive the distribution of activity levels from the corresponding stochastic differential equation. The distribution and the corresponding power spectrum are fully consistent with the empirical observations. Comment: 9 pages, 3 figures