Fast Data in the Era of Big Data: Twitter's Real-Time Related Query Suggestion Architecture
We present the architecture behind Twitter's real-time related query
suggestion and spelling correction service. Although these tasks have received
much attention in the web search literature, the Twitter context introduces a
real-time "twist": after significant breaking news events, we aim to provide
relevant results within minutes. This paper provides a case study illustrating
the challenges of real-time data processing in the era of "big data". We tell
the story of how our system was built twice: our first implementation was built
on a typical Hadoop-based analytics stack, but was later replaced because it
did not meet the latency requirements necessary to generate meaningful
real-time results. The second implementation, which is the system deployed in
production, is a custom in-memory processing engine specifically designed for
the task. This experience taught us that the current typical usage of Hadoop as
a "big data" platform, while great for experimentation, is not well suited to
low-latency processing, and points the way to future work on data analytics
platforms that can handle "big" as well as "fast" data.
Time Aware Knowledge Extraction for Microblog Summarization on Twitter
Microblogging services like Twitter and Facebook collect millions of pieces of
user-generated content every moment about trending news, ongoing events, and so
on. Nevertheless, it is very difficult to find information of interest
among the huge number of available posts, which are often noisy and redundant.
In general, social media analytics services have attracted increasing attention
from both research and industry. Specifically, the dynamic context of
microblogging requires managing not only the meaning of information but also the
evolution of knowledge over the timeline. This work defines the Time Aware
Knowledge Extraction (TAKE) methodology, which relies on a temporal
extension of Fuzzy Formal Concept Analysis. In particular, a microblog
summarization algorithm has been defined that filters the concepts organized by
TAKE in a time-dependent hierarchy. The algorithm addresses topic-based
summarization on Twitter. Besides considering the timing of the concepts,
another distinguishing feature of the proposed microblog summarization framework
is the possibility of obtaining a more or less detailed summary, according to the
user's needs, with good levels of quality and completeness, as highlighted in
the experimental results. Comment: 33 pages, 10 figures
Real-Time Context-Aware Microservice Architecture for Predictive Analytics and Smart Decision-Making
The impressive evolution of the Internet of Things and the great amount of data flowing through the systems provide us with an inspiring scenario for Big Data analytics and advantageous real-time context-aware predictions and smart decision-making. However, this requires a scalable system for continuous stream processing that can also make decisions and take actions based on the predictions performed. This paper proposes a scalable architecture that provides real-time context-aware actions based on predictive stream processing of data, as an evolution of a previously presented event-driven service-oriented architecture which already permitted the context-aware detection and notification of relevant data. For this purpose, we have defined and implemented a microservice-based architecture. As a result, our architecture has been enhanced twofold: on the one hand, it has been supplied with reliable predictions through the use of predictive analytics and complex event processing techniques, which permit the notification of relevant context-aware information ahead of time. On the other hand, it has been refactored towards a microservice architecture pattern, greatly improving its maintainability and evolution. The architecture's performance has been evaluated with an air quality case study.