1,831 research outputs found
Fast Data in the Era of Big Data: Twitter's Real-Time Related Query Suggestion Architecture
We present the architecture behind Twitter's real-time related query
suggestion and spelling correction service. Although these tasks have received
much attention in the web search literature, the Twitter context introduces a
real-time "twist": after significant breaking news events, we aim to provide
relevant results within minutes. This paper provides a case study illustrating
the challenges of real-time data processing in the era of "big data". We tell
the story of how our system was built twice: our first implementation was built
on a typical Hadoop-based analytics stack, but was later replaced because it
did not meet the latency requirements necessary to generate meaningful
real-time results. The second implementation, which is the system deployed in
production, is a custom in-memory processing engine specifically designed for
the task. This experience taught us that the current typical usage of Hadoop as
a "big data" platform, while great for experimentation, is not well suited to
low-latency processing, and points the way to future work on data analytics
platforms that can handle "big" as well as "fast" data
Big data analytics on container-orchestrated systems
Container-orchestration systems offer new possibilites to software architects seeking
to make their software systems more scalable and reliable. In the past, these systems
have been used to implement transactional software systems but, more recently, they
have been applied to other areas including big data analytics. To understand the advantages
and limitations such systems impose on software architects, I migrated an
existing big data analytics infrastructure from a software architecture that required
lots of work from its developers to deploy and maintain to the new software architecture
provided by container-orchestration systems. My results show that scalability
is increased, maintenance costs are reduced, and reliability is easier to achieve
Simplifying Big Data Analytics System with A Reference Architecture
The internet and pervasive technology like the Internet of Things (i.e. sensors and smart devices) have exponentially increased the scale of data collection and availability. This big data not only challenges the structure of existing enterprise analytics systems but also offer new opportunities to create new knowledge and competitive advantage. Businesses have been exploiting these opportunities by implementing and operating big data analytics capabilities. Social network companies such as Facebook, LinkedIn, Twitter and Video streaming company like Netflix have implemented big data analytics and subsequently published related literatures. However, these use cases did not provide a simplified and coherent big data analytics reference architecture as well as currently, there still remains limited reference architecture of big data analytics. This paper aims to simplify big data analytics by providing a reference architecture based on existing four use cases and subsequently verified the reference architecture with Amazon and Google analytics services
Security Posture: A Systematic Review of Cyber Threats and Proactive Security
In the last decade, several high-profile cyber threats have occurred with global impact and devastating consequences. The tools, techniques, and procedures used to prevent cyber threats from occurring fall under the category of proactive security. Proactive security methodologies, however, vary among professionals where differing tactics have proved situationally effective. To determine the most effective tactics for preventing exploitation of vulnerabilities, the author examines the attack vector of three incidents from the last five years in a systematic review format: the WannaCry incident, the 2020 SolarWinds SUNBURST exploit, and the recently discovered Log4j vulnerability. From the three cases and existing literature, the author determined that inventory management, auditing, and patching are essential proactive security measures which may have prevented the incidents altogether. Then, the author discusses obstacles inherent to these solutions, such as time, talent, and resource restrictions, and proposes the use of user-friendly, open-source tools as a solution. The author intends through this research to improve the security posture of the Internet by encouraging further research into proactive cyber threat intelligence measures and motivating business executives to prioritize cybersecurity
A reference architecture for big data systems
Over dozens of years, applying new IT technologies into organizations has always been a big concern for business. Big data certainly is a new concept exciting business. To be able to access more data and empower to analysis big data requires new big data platforms. However, there still remains limited reference architecture for big data systems. In this paper, based on existing reference architecture of big data systems, we propose new high level abstract reference architecture and related reference architecture notations, that better express the overall architecture. The new reference architecture is verified using one existing case and an additional new use case
When Things Matter: A Data-Centric View of the Internet of Things
With the recent advances in radio-frequency identification (RFID), low-cost
wireless sensor devices, and Web technologies, the Internet of Things (IoT)
approach has gained momentum in connecting everyday objects to the Internet and
facilitating machine-to-human and machine-to-machine communication with the
physical world. While IoT offers the capability to connect and integrate both
digital and physical entities, enabling a whole new class of applications and
services, several significant challenges need to be addressed before these
applications and services can be fully realized. A fundamental challenge
centers around managing IoT data, typically produced in dynamic and volatile
environments, which is not only extremely large in scale and volume, but also
noisy, and continuous. This article surveys the main techniques and
state-of-the-art research efforts in IoT from data-centric perspectives,
including data stream processing, data storage models, complex event
processing, and searching in IoT. Open research issues for IoT data management
are also discussed
- …