
    Big Data Caching for Networking: Moving from Cloud to Edge

    In order to cope with the relentless data tsunami in 5G wireless networks, current approaches such as acquiring new spectrum, deploying more base stations (BSs) and increasing nodes in mobile packet core networks are becoming ineffective in terms of scalability, cost and flexibility. In this regard, context-aware 5G networks with edge/cloud computing and exploitation of big data analytics can yield significant gains for mobile operators. In this article, proactive content caching in 5G wireless networks is investigated and a big data-enabled architecture is proposed. In this practical architecture, a vast amount of data is harnessed for content popularity estimation, and strategic contents are cached at the BSs to achieve higher user satisfaction and backhaul offloading. To validate the proposed solution, we consider a real-world case study in which several hours of mobile data traffic is collected from a major telecom operator in Turkey and a big data-enabled analysis is carried out leveraging tools from machine learning. Based on the available information and storage capacity, numerical studies show that gains are achieved in terms of both user satisfaction and backhaul offloading. For example, in the case of 16 BSs with 30% of content ratings and 13 Gbyte of storage size (78% of total library size), proactive caching yields 100% user satisfaction and offloads 98% of the backhaul.
    Comment: accepted for publication in IEEE Communications Magazine, Special Issue on Communications, Caching, and Computing for Content-Centric Mobile Networks
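    A minimal sketch of the popularity-based proactive caching idea this abstract describes: estimate content popularity, then greedily fill a base station's cache within its storage budget. The greedy policy, names, and synthetic numbers below are illustrative assumptions, not the paper's exact algorithm.

    ```python
    # Hypothetical sketch: cache the most popular contents that fit in
    # the storage budget, then measure how much backhaul traffic is
    # offloaded by cache hits.

    def proactive_cache(popularity, sizes, capacity):
        """Greedily select contents to cache, most popular first.

        popularity: dict content_id -> estimated request rate
        sizes:      dict content_id -> size in GB
        capacity:   cache storage budget in GB
        """
        cached, used = [], 0.0
        for cid in sorted(popularity, key=popularity.get, reverse=True):
            if used + sizes[cid] <= capacity:
                cached.append(cid)
                used += sizes[cid]
        return cached

    # Example: a 13 GB cache over a small synthetic catalogue.
    popularity = {"a": 120, "b": 75, "c": 40, "d": 10}
    sizes = {"a": 5.0, "b": 6.0, "c": 4.0, "d": 2.0}
    cache = proactive_cache(popularity, sizes, capacity=13.0)
    hits = sum(popularity[c] for c in cache)
    print(cache, f"backhaul offload: {hits / sum(popularity.values()):.0%}")
    ```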

    Intelligent Management and Efficient Operation of Big Data

    This chapter details how Big Data can be used and implemented in networking and computing infrastructures. Specifically, it addresses three main aspects: the timely extraction of relevant knowledge from heterogeneous, and very often unstructured, large data sources; the enhancement of the performance of the processing and networking (cloud) infrastructures that are the most important foundational pillars of Big Data applications and services; and novel ways to efficiently manage network infrastructures with high-level composed policies for supporting the transmission of large amounts of data with distinct requirements (video vs. non-video). A case study involving an intelligent management solution to route data traffic with diverse requirements in a wide-area Internet Exchange Point is presented, discussed in the context of Big Data, and evaluated.
    Comment: In book Handbook of Research on Trends and Future Directions in Big Data and Web Intelligence, IGI Global, 201
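    A minimal sketch of the kind of high-level composed routing policy the chapter mentions (video vs. non-video traffic getting distinct treatment). The policy table, flow attributes, and path labels are illustrative assumptions only.

    ```python
    # Evaluate composed policies in order; the first matching predicate
    # decides which path the flow is routed over.

    POLICIES = [
        # (predicate, assigned path)
        (lambda flow: flow["app"] == "video", "low-latency-path"),
        (lambda flow: flow["bytes"] > 10**9,  "high-throughput-path"),
        (lambda flow: True,                   "best-effort-path"),
    ]

    def route(flow):
        """Return the first path whose policy predicate matches the flow."""
        for predicate, path in POLICIES:
            if predicate(flow):
                return path

    print(route({"app": "video", "bytes": 10**6}))     # low-latency-path
    print(route({"app": "bulk",  "bytes": 5 * 10**9})) # high-throughput-path
    ```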

    Security problems and challenges in a machine learning-based hybrid big data processing network systems

    Data sources that produce data continuously, at high volume and high velocity and with a large variety of data types, create Big Data and pose problems and challenges for the Machine Learning (ML) techniques that help extract, analyze, and visualize important information. To overcome these problems and challenges, we propose to make use of a hybrid networking model that consists of multiple components such as the Hadoop distributed file system (HDFS), a cloud storage system, a security module, and an ML unit. Processing Big Data in this networking environment with ML techniques requires user interaction and additional storage; hence, some artificial delay between the arrivals of data domains through external storage can help HDFS process the Big Data efficiently. To address this problem, we suggest using a public cloud for data storage, which will induce a meaningful time delay in the data while making use of its storage capability. However, the use of a public cloud will lead to security vulnerabilities in data transmission and storage. Therefore, we need a security algorithm that provides a flexible key-based encryption technique offering tradeoffs between time delay, security strength, and storage risk. In this paper we propose a model for using public cloud provider trust levels to select encryption types for data storage within a Big Data analytics network topology.
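    An illustrative sketch of the trust-level-to-encryption selection model this abstract proposes: less-trusted providers get stronger (and slower) encryption. The trust tiers, algorithm names, and key sizes are assumptions for illustration; the paper defines its own mapping.

    ```python
    # Map a cloud provider's trust level to an encryption choice,
    # trading processing delay against security strength.

    ENCRYPTION_BY_TRUST = {
        # trust level: (algorithm, key bits)
        "high":   ("AES", 128),
        "medium": ("AES", 256),
        "low":    ("AES-256 + per-block re-keying", 256),
    }

    def select_encryption(provider_trust):
        """Pick an encryption scheme for a given provider trust level."""
        try:
            return ENCRYPTION_BY_TRUST[provider_trust]
        except KeyError:
            raise ValueError(f"unknown trust level: {provider_trust!r}")

    print(select_encryption("medium"))  # ('AES', 256)
    ```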

    Foggy clouds and cloudy fogs: a real need for coordinated management of fog-to-cloud computing systems

    The recent advances in cloud services technology are fueling a plethora of information technology innovation, including networking, storage, and computing. Today, various flavors of IoT, cloud computing, and so-called fog computing have evolved; the latter concept refers to the capabilities of edge devices and users' clients to compute, store, and exchange data among each other and with the cloud. Although the rapid pace of this evolution was not easily foreseeable, today each piece of it facilitates and enables the deployment of what we commonly refer to as a smart scenario, including smart cities, smart transportation, and smart homes. As most current cloud, fog, and network services run simultaneously in each scenario, we observe that we are at the dawn of what may be the next big step in the cloud computing and networking evolution, whereby services might be executed at the network edge, both in parallel and in a coordinated fashion, supported by the unstoppable evolution of technology. As edge devices become richer in functionality and smarter, embedding capacities such as storage or processing, as well as new functionalities such as decision making, data collection, forwarding, and sharing, a real need is emerging for coordinated management of fog-to-cloud (F2C) computing systems. This article introduces a layered F2C architecture, its benefits and strengths, and the open research challenges it raises, making the case for the real need for coordinated management. Our architecture, the illustrative use case presented, and a comparative performance analysis, albeit conceptual, all clearly show the way forward toward a new IoT scenario with a set of existing and unforeseen services provided on highly distributed and dynamic compute, storage, and networking resources, bringing together heterogeneous and commodity edge devices, emerging fogs, and conventional clouds.
    Peer Reviewed. Postprint (author's final draft)
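    A conceptual sketch of coordinated resource selection in a layered F2C architecture like the one the article introduces: latency-sensitive tasks that fit on fog resources stay at the edge, everything else goes to the cloud. The layer attributes and thresholds are illustrative assumptions.

    ```python
    # Two-layer placement: prefer the fog for latency-sensitive tasks
    # it can actually host; fall back to the cloud otherwise.

    FOG   = {"layer": "fog",   "latency_ms": 5,  "free_cpu": 2}
    CLOUD = {"layer": "cloud", "latency_ms": 80, "free_cpu": 512}

    def place_task(task):
        """Return the layer a task should run on."""
        fits_latency = task["max_latency_ms"] >= FOG["latency_ms"]
        fits_capacity = task["cpu"] <= FOG["free_cpu"]
        return FOG["layer"] if fits_latency and fits_capacity else CLOUD["layer"]

    print(place_task({"cpu": 1,  "max_latency_ms": 10}))   # fog
    print(place_task({"cpu": 16, "max_latency_ms": 500}))  # cloud
    ```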

    Software-defined measurement to support programmable networking for SoyKB

    Campuses are increasingly adopting hybrid cloud architectures to support big data science applications that require "on-demand" resources, which are not always available locally on-site. Policies at the campus edge for handling multiple such applications competing for remote resources can cause bottlenecks across applications. To proactively avoid such bottlenecks, we investigate the benefits of integrating two complementary technology paradigms: software-defined measurement and programmable networking. The integration inherently allows flexible end-to-end application performance monitoring and dynamic control of big data application flows using: (a) software-defined networking for transit selection to remote sites, and (b) pertinent selection of local or remote compute resources. Using the Soybean Knowledge Base (SoyKB) as an exemplar application, we demonstrate the benefits of software-defined measurement in support of programmable networking. As part of our study methodology, we first profiled the original data flows within SoyKB's use of iPlant public cloud resources and identified bottleneck cases such as slow data transfer speeds, lack of performance information (e.g., cluster availability, job status, and network health), and inflexible control of hybrid cloud resources to address application-specific needs. The profiling study motivated us to propose a new hybrid cloud architecture for SoyKB workflows that utilizes end-to-end performance measurements to support a cost-optimized selection of sites for computation and effective traffic engineering at the campus edge. We validate our approach on a SoyKB workflow use case that we set up on a wide-area overlay network testbed implemented across two geographically distributed campuses. Our performance results show a notable performance improvement in SoyKB remote data transfer flows that utilize iRODS and TCP tuning mechanisms in the presence of cross-traffic big data flows. Additionally, we implement a SoyKB system that provides: (i) flexible at-a-glance workflow performance analytics to SoyKB researchers handling big data, and (ii) web service mechanisms for interfacing with popular dynamic resource management technologies such as OpenStack, HTCondor and Pegasus.
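    A simplified sketch of the measurement-driven site selection this abstract describes: use end-to-end measurements (queue wait, achievable throughput) to pick between local and remote compute sites. Site names, metrics, and the scoring function are illustrative assumptions, not the SoyKB system's actual interface.

    ```python
    # Pick the compute site that minimizes estimated transfer time plus
    # queue wait, given measured throughput to each site.

    SITES = [
        {"name": "local-cluster", "queue_wait_s": 3600, "throughput_mbps": 900},
        {"name": "iplant-remote", "queue_wait_s": 60,   "throughput_mbps": 300},
    ]

    def transfer_time_s(data_gb, throughput_mbps):
        return data_gb * 8000 / throughput_mbps  # GB -> megabits

    def select_site(data_gb):
        """Return the site with the lowest estimated completion cost."""
        return min(
            SITES,
            key=lambda s: transfer_time_s(data_gb, s["throughput_mbps"])
                          + s["queue_wait_s"],
        )["name"]

    print(select_site(data_gb=10))   # small job: remote wins on queue wait
    print(select_site(data_gb=500))  # big transfer: local wins on throughput
    ```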

    Call for Abstracts and Papers

    Conference topics include but are not limited to: Cybersecurity, Medical Informatics, Smart Systems, E-learning, Library Science, IT for Economic Development, E-Government, Technological Innovations, Digital Marketing, Business Analytics, Cloud Computing, E-business/commerce, Emerging Trends, Digital Accounting, Healthcare IT, IT Auditing/Forensics, Mobile Computing, Organizational Culture & Technology, Social Networking, and Big Data

    PIRE ExoGENI – ENVRI: Preparation for Big Data Science

    Big Data is a new field in both scientific research and the IT industry, focusing on collections of data sets which are so huge and complex that they create numerous difficulties not only in processing them but also in transferring and storing them. Big Data science tries to overcome these problems or optimize performance based on the "5V" concept: Volume, Variety, Velocity, Variability and Value. A Big Data infrastructure integrates advanced IT technologies such as cloud computing, databases, networking and HPC, providing scientists with all the functionality required for performing high-level research activities. The EU ENVRI project is an example of developing a Big Data infrastructure for environmental scientists, with a special focus on issues like architecture, metadata frameworks and data discovery. In Big Data infrastructures like ENVRI, aggregating huge amounts of data from different sources and transferring them between distributed locations are important processes in many experiments [5]. Efficient data transfer is thus a key service required in a big data infrastructure. At the same time, Software Defined Networking (SDN) is a promising new approach to networking. SDN decouples the control interface from network devices and allows high-level applications to manipulate network behavior. However, most existing high-level data transfer protocols treat the network as a black box and do not include control of network-level functionality. There is a scientific gap between Big Data science and Software Defined Networking, and until now no work has been done combining these two technologies. This gap drives our research in this project.
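    A conceptual sketch of bridging the gap described above: a data transfer application asks an SDN controller to provision a network path before moving a large data set, instead of treating the network as a black box. The controller URL and JSON schema below are hypothetical, not an existing controller's API.

    ```python
    # Request a bandwidth-guaranteed path from a (hypothetical) SDN
    # controller's northbound REST interface, then start the transfer.

    import json
    import urllib.request

    CONTROLLER = "http://sdn-controller.example.org/paths"  # assumed endpoint

    def request_path(src, dst, min_bandwidth_mbps):
        """Ask the controller to set up a path with a bandwidth floor."""
        body = json.dumps({
            "src": src,
            "dst": dst,
            "min_bandwidth_mbps": min_bandwidth_mbps,
        }).encode()
        req = urllib.request.Request(
            CONTROLLER, data=body,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)  # e.g. {"path_id": ..., "hops": [...]}

    # Provision the path first, then run the bulk transfer over it:
    # path = request_path("envri-site-a", "envri-site-b", 1000)
    ```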