1,831 research outputs found

    Fast Data in the Era of Big Data: Twitter's Real-Time Related Query Suggestion Architecture

    Full text link
    We present the architecture behind Twitter's real-time related query suggestion and spelling correction service. Although these tasks have received much attention in the web search literature, the Twitter context introduces a real-time "twist": after significant breaking news events, we aim to provide relevant results within minutes. This paper provides a case study illustrating the challenges of real-time data processing in the era of "big data". We tell the story of how our system was built twice: our first implementation was built on a typical Hadoop-based analytics stack, but was later replaced because it did not meet the latency requirements necessary to generate meaningful real-time results. The second implementation, which is the system deployed in production, is a custom in-memory processing engine specifically designed for the task. This experience taught us that the current typical usage of Hadoop as a "big data" platform, while great for experimentation, is not well suited to low-latency processing, and points the way to future work on data analytics platforms that can handle "big" as well as "fast" data

    Big data analytics on container-orchestrated systems

    Get PDF
    Container-orchestration systems offer new possibilites to software architects seeking to make their software systems more scalable and reliable. In the past, these systems have been used to implement transactional software systems but, more recently, they have been applied to other areas including big data analytics. To understand the advantages and limitations such systems impose on software architects, I migrated an existing big data analytics infrastructure from a software architecture that required lots of work from its developers to deploy and maintain to the new software architecture provided by container-orchestration systems. My results show that scalability is increased, maintenance costs are reduced, and reliability is easier to achieve

    Simplifying Big Data Analytics System with A Reference Architecture

    Get PDF
    The internet and pervasive technology like the Internet of Things (i.e. sensors and smart devices) have exponentially increased the scale of data collection and availability. This big data not only challenges the structure of existing enterprise analytics systems but also offer new opportunities to create new knowledge and competitive advantage. Businesses have been exploiting these opportunities by implementing and operating big data analytics capabilities. Social network companies such as Facebook, LinkedIn, Twitter and Video streaming company like Netflix have implemented big data analytics and subsequently published related literatures. However, these use cases did not provide a simplified and coherent big data analytics reference architecture as well as currently, there still remains limited reference architecture of big data analytics. This paper aims to simplify big data analytics by providing a reference architecture based on existing four use cases and subsequently verified the reference architecture with Amazon and Google analytics services

    Security Posture: A Systematic Review of Cyber Threats and Proactive Security

    Get PDF
    In the last decade, several high-profile cyber threats have occurred with global impact and devastating consequences. The tools, techniques, and procedures used to prevent cyber threats from occurring fall under the category of proactive security. Proactive security methodologies, however, vary among professionals where differing tactics have proved situationally effective. To determine the most effective tactics for preventing exploitation of vulnerabilities, the author examines the attack vector of three incidents from the last five years in a systematic review format: the WannaCry incident, the 2020 SolarWinds SUNBURST exploit, and the recently discovered Log4j vulnerability. From the three cases and existing literature, the author determined that inventory management, auditing, and patching are essential proactive security measures which may have prevented the incidents altogether. Then, the author discusses obstacles inherent to these solutions, such as time, talent, and resource restrictions, and proposes the use of user-friendly, open-source tools as a solution. The author intends through this research to improve the security posture of the Internet by encouraging further research into proactive cyber threat intelligence measures and motivating business executives to prioritize cybersecurity

    A reference architecture for big data systems

    Get PDF
    Over dozens of years, applying new IT technologies into organizations has always been a big concern for business. Big data certainly is a new concept exciting business. To be able to access more data and empower to analysis big data requires new big data platforms. However, there still remains limited reference architecture for big data systems. In this paper, based on existing reference architecture of big data systems, we propose new high level abstract reference architecture and related reference architecture notations, that better express the overall architecture. The new reference architecture is verified using one existing case and an additional new use case

    When Things Matter: A Data-Centric View of the Internet of Things

    Full text link
    With the recent advances in radio-frequency identification (RFID), low-cost wireless sensor devices, and Web technologies, the Internet of Things (IoT) approach has gained momentum in connecting everyday objects to the Internet and facilitating machine-to-human and machine-to-machine communication with the physical world. While IoT offers the capability to connect and integrate both digital and physical entities, enabling a whole new class of applications and services, several significant challenges need to be addressed before these applications and services can be fully realized. A fundamental challenge centers around managing IoT data, typically produced in dynamic and volatile environments, which is not only extremely large in scale and volume, but also noisy, and continuous. This article surveys the main techniques and state-of-the-art research efforts in IoT from data-centric perspectives, including data stream processing, data storage models, complex event processing, and searching in IoT. Open research issues for IoT data management are also discussed
    • …
    corecore