5,707 research outputs found
Extracting News Events from Microblogs
Twitter stream has become a large source of information for many people, but
the magnitude of tweets and the noisy nature of its content have made
harvesting the knowledge from Twitter a challenging task for researchers for a
long time. Aiming at overcoming some of the main challenges of extracting the
hidden information from tweet streams, this work proposes a new approach for
real-time detection of news events from the Twitter stream. We divide our
approach into three steps. The first step is to use a neural network or deep
learning to detect news-relevant tweets from the stream. The second step is to
apply a novel streaming data clustering algorithm to the detected news tweets
to form news events. The third and final step is to rank the detected events
based on the size of the event clusters and growth speed of the tweet
frequencies. We evaluate the proposed system on a large, publicly available
corpus of annotated news events from Twitter. As part of the evaluation, we
compare our approach with a related state-of-the-art solution. Overall, our
experiments and user-based evaluation show that our approach on detecting
current (real) news events delivers a state-of-the-art performance
Detecting Cyber Security Vulnerabilities through Reactive Programming
We propose a software architectural model, which uses reactive programming for collecting and filtering live tweets and interpreting their potential correlation to software vulnerabilities and exploits. We aim to investigate if we could discover the existence of exploits for disclosed vulnerabilities in Twitter data streams. Reactive programming is used for performing filtering and querying of tweet to find potential exploits. The result of processing Twitter data streams with reactive programming could be broadcasted, by pointing towards potential exploits, which might create a cyber-attack. They can also be entered as a new entry into existing overt or open source intelligence repositories
Malware in the Future? Forecasting of Analyst Detection of Cyber Events
There have been extensive efforts in government, academia, and industry to
anticipate, forecast, and mitigate cyber attacks. A common approach is
time-series forecasting of cyber attacks based on data from network telescopes,
honeypots, and automated intrusion detection/prevention systems. This research
has uncovered key insights such as systematicity in cyber attacks. Here, we
propose an alternate perspective of this problem by performing forecasting of
attacks that are analyst-detected and -verified occurrences of malware. We call
these instances of malware cyber event data. Specifically, our dataset was
analyst-detected incidents from a large operational Computer Security Service
Provider (CSSP) for the U.S. Department of Defense, which rarely relies only on
automated systems. Our data set consists of weekly counts of cyber events over
approximately seven years. Since all cyber events were validated by analysts,
our dataset is unlikely to have false positives which are often endemic in
other sources of data. Further, the higher-quality data could be used for a
number for resource allocation, estimation of security resources, and the
development of effective risk-management strategies. We used a Bayesian State
Space Model for forecasting and found that events one week ahead could be
predicted. To quantify bursts, we used a Markov model. Our findings of
systematicity in analyst-detected cyber attacks are consistent with previous
work using other sources. The advanced information provided by a forecast may
help with threat awareness by providing a probable value and range for future
cyber events one week ahead. Other potential applications for cyber event
forecasting include proactive allocation of resources and capabilities for
cyber defense (e.g., analyst staffing and sensor configuration) in CSSPs.
Enhanced threat awareness may improve cybersecurity.Comment: Revised version resubmitted to journa
Viewpoint Discovery and Understanding in Social Networks
The Web has evolved to a dominant platform where everyone has the opportunity
to express their opinions, to interact with other users, and to debate on
emerging events happening around the world. On the one hand, this has enabled
the presence of different viewpoints and opinions about a - usually
controversial - topic (like Brexit), but at the same time, it has led to
phenomena like media bias, echo chambers and filter bubbles, where users are
exposed to only one point of view on the same topic. Therefore, there is the
need for methods that are able to detect and explain the different viewpoints.
In this paper, we propose a graph partitioning method that exploits social
interactions to enable the discovery of different communities (representing
different viewpoints) discussing about a controversial topic in a social
network like Twitter. To explain the discovered viewpoints, we describe a
method, called Iterative Rank Difference (IRD), which allows detecting
descriptive terms that characterize the different viewpoints as well as
understanding how a specific term is related to a viewpoint (by detecting other
related descriptive terms). The results of an experimental evaluation showed
that our approach outperforms state-of-the-art methods on viewpoint discovery,
while a qualitative analysis of the proposed IRD method on three different
controversial topics showed that IRD provides comprehensive and deep
representations of the different viewpoints
Temporal Cross-Media Retrieval with Soft-Smoothing
Multimedia information have strong temporal correlations that shape the way
modalities co-occur over time. In this paper we study the dynamic nature of
multimedia and social-media information, where the temporal dimension emerges
as a strong source of evidence for learning the temporal correlations across
visual and textual modalities. So far, cross-media retrieval models, explored
the correlations between different modalities (e.g. text and image) to learn a
common subspace, in which semantically similar instances lie in the same
neighbourhood. Building on such knowledge, we propose a novel temporal
cross-media neural architecture, that departs from standard cross-media
methods, by explicitly accounting for the temporal dimension through temporal
subspace learning. The model is softly-constrained with temporal and
inter-modality constraints that guide the new subspace learning task by
favouring temporal correlations between semantically similar and temporally
close instances. Experiments on three distinct datasets show that accounting
for time turns out to be important for cross-media retrieval. Namely, the
proposed method outperforms a set of baselines on the task of temporal
cross-media retrieval, demonstrating its effectiveness for performing temporal
subspace learning.Comment: To appear in ACM MM 201
- …