74,874 research outputs found
Big Data Caching for Networking: Moving from Cloud to Edge
In order to cope with the relentless data tsunami in wireless networks,
current approaches such as acquiring new spectrum, deploying more base stations
(BSs) and increasing nodes in mobile packet core networks are becoming
ineffective in terms of scalability, cost and flexibility. In this regard,
context-aware 5G networks with edge/cloud computing and exploitation of
big data analytics can yield significant gains to mobile operators. In
this article, proactive content caching in 5G wireless networks is
investigated and a big data-enabled architecture is proposed. In this
practical architecture, a vast amount of data is harnessed for content popularity
estimation and strategic contents are cached at the BSs to achieve higher
users' satisfaction and backhaul offloading. To validate the proposed solution,
we consider a real-world case study where several hours of mobile data traffic
is collected from a major telecom operator in Turkey and a big data-enabled
analysis is carried out leveraging tools from machine learning. Based on the
available information and storage capacity, numerical studies show that several
gains are achieved both in terms of users' satisfaction and backhaul
offloading. For example, in the case of BSs with of content ratings
and Gbyte of storage size ( of total library size), proactive
caching yields of users' satisfaction and offloads of the
backhaul.
Comment: accepted for publication in IEEE Communications Magazine, Special
Issue on Communications, Caching, and Computing for Content-Centric Mobile
Network
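The caching idea in the abstract above can be illustrated with a minimal sketch. It assumes that plain request counts stand in for the machine-learning popularity estimation the article describes, and that a single base station caches a fixed number of items; the function names `top_k_cache` and `backhaul_offload` and the toy traces are hypothetical and are not taken from the article.

```python
from collections import Counter


def top_k_cache(requests, cache_size):
    """Pick the cache_size most requested content ids from a past trace.

    Assumption: raw request counts approximate content popularity.
    """
    counts = Counter(requests)
    return {content for content, _ in counts.most_common(cache_size)}


def backhaul_offload(requests, cache):
    """Return the fraction of requests that can be served from the cache."""
    if not requests:
        return 0.0
    hits = sum(1 for r in requests if r in cache)
    return hits / len(requests)


# Toy traces (hypothetical): estimate popularity on past requests,
# then measure how many later requests avoid the backhaul.
training = ["a", "b", "a", "c", "a", "b", "d"]
future = ["a", "b", "e", "a"]

cache = top_k_cache(training, cache_size=2)
offload = backhaul_offload(future, cache)
print(sorted(cache), offload)
```

Requests that hit the cache never traverse the backhaul, so the hit ratio doubles as a rough offloading measure in this simplified setting.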
Intelligent Management and Efficient Operation of Big Data
This chapter details how Big Data can be used and implemented in networking
and computing infrastructures. Specifically, it addresses three main aspects:
the timely extraction of relevant knowledge from heterogeneous, and very often
unstructured, large data sources; the enhancement of the performance of
processing and networking (cloud) infrastructures that are the most important
foundational pillars of Big Data applications or services; and novel ways to
efficiently manage network infrastructures with high-level composed policies
for supporting the transmission of large amounts of data with distinct
requisites (video vs. non-video). A case study involving an intelligent
management solution to route data traffic with diverse requirements in a wide
area Internet Exchange Point is presented, discussed in the context of Big
Data, and evaluated.
Comment: In book Handbook of Research on Trends and Future Directions in Big
Data and Web Intelligence, IGI Global, 201
Security problems and challenges in a machine learning-based hybrid big data processing network systems
A data source that continuously produces data in high volume, at high velocity, and in a large variety of data types creates Big Data, and poses problems and challenges for the Machine Learning (ML) techniques that help extract, analyze and visualize important information. To overcome these problems and challenges, we propose to make use of a hybrid networking model that consists of multiple components such as the Hadoop distributed file system (HDFS), a cloud storage system, a security module and an ML unit. Processing Big Data in this networking environment with ML techniques requires user interaction and additional storage; hence, some artificial delay between the arrivals of data domains through external storage can help HDFS to process the Big Data efficiently. To address this problem, we suggest using a public cloud for data storage, which will induce a meaningful time delay in the data while making use of its storage capability. However, the use of a public cloud will expose data transmission and storage to security vulnerabilities. Therefore, we need some form of security algorithm with a flexible key-based encryption technique that can provide tradeoffs between time delay, security strength and storage risks. In this paper we propose a model for using public cloud provider trust levels to select encryption types for data storage for use within a Big Data analytics network topology
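The last sentence of the abstract above, selecting an encryption type from a cloud provider's trust level, can be sketched as a simple threshold lookup. Everything concrete here is an assumption made for illustration: the trust score in [0, 1], the thresholds, the tier names and the key sizes are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncryptionChoice:
    name: str       # hypothetical tier label
    key_bits: int   # hypothetical key size; larger means stronger but slower


# Hypothetical policy: (minimum trust, choice), ordered from most to least
# trusted. Lower trust gets a stronger, and therefore slower, encryption tier.
POLICY = [
    (0.8, EncryptionChoice("light", 128)),
    (0.5, EncryptionChoice("standard", 192)),
    (0.0, EncryptionChoice("strong", 256)),
]


def select_encryption(trust):
    """Map a provider trust score in [0, 1] to an encryption tier."""
    if not 0.0 <= trust <= 1.0:
        raise ValueError("trust must be between 0 and 1")
    for threshold, choice in POLICY:
        if trust >= threshold:
            return choice
    raise AssertionError("unreachable: the last threshold is 0.0")


print(select_encryption(0.9))
print(select_encryption(0.3))
```

The ordering of the table reflects the tradeoff named in the abstract: a less trusted provider carries more storage risk, so it is paired with stronger encryption at the cost of more processing delay.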
Foggy clouds and cloudy fogs: a real need for coordinated management of fog-to-cloud computing systems
The recent advances in cloud services technology are fueling a plethora of information technology innovation, including networking, storage, and computing. Today, various flavors of IoT, cloud computing, and so-called fog computing have evolved, the last being a concept referring to the capabilities of edge devices and users' clients to compute, store, and exchange data among each other and with the cloud. Although the rapid pace of this evolution was not easily foreseeable, today each piece of it facilitates and enables the deployment of what we commonly refer to as a smart scenario, including smart cities, smart transportation, and smart homes. As most current cloud, fog, and network services run simultaneously in each scenario, we observe that we are at the dawn of what may be the next big step in the cloud computing and networking evolution, whereby services might be executed at the network edge, both in parallel and in a coordinated fashion, as well as supported by the unstoppable technology evolution. As edge devices become richer in functionality and smarter, embedding capacities such as storage or processing, as well as new functionalities, such as decision making, data collection, forwarding, and sharing, a real need is emerging for coordinated management of fog-to-cloud (F2C) computing systems. This article introduces a layered F2C architecture, its benefits and strengths, as well as the open research challenges that arise, making the case for the real need for their coordinated management. Our architecture, the illustrative use case presented, and a comparative performance analysis, albeit conceptual, all clearly show the way forward toward a new IoT scenario with a set of existing and unforeseen services provided on highly distributed and dynamic compute, storage, and networking resources, bringing together heterogeneous and commodity edge devices, emerging fogs, as well as conventional clouds.
Peer Reviewed. Postprint (author's final draft)
Software-defined measurement to support programmable networking for SoyKB
Campuses are increasingly adopting hybrid cloud architectures for supporting big data science applications that require "on-demand" resources, which are not always available locally on-site. Policies at the campus edge for handling multiple such applications competing for remote resources can cause bottlenecks across applications. To proactively avoid such bottlenecks, we investigate the benefits of integrating two complementary technology paradigms: software-defined measurement and programmable networking. The integration inherently allows flexible end-to-end application performance monitoring and dynamic control of big data application flows using: (a) software-defined networking for transit selection to remote sites, and (b) pertinent selection of local or remote compute resources. Using the Soybean Knowledge Base (SoyKB) as an exemplar application, we demonstrate the benefits of software-defined measurement to support programmable networking. As part of our study methodology, we first profiled the original data flows within SoyKB's use of iPlant public cloud resources, and identified bottleneck cases such as slow data transfer speeds, lack of performance information (e.g., cluster availability, job status and network health) and inflexible control of hybrid cloud resources to address application-specific needs. The profiling study motivated us to propose a new hybrid cloud architecture for SoyKB workflows that utilizes end-to-end performance measurements to support a cost-optimized selection of sites for computation and effective traffic engineering at the campus edge. We validate our approach for a SoyKB workflow use case that we set up on a wide-area overlay network testbed implementation across two geographically distributed campuses. Our performance results show a notable performance improvement in SoyKB remote data transfer flows that utilize iRODS and TCP tuning mechanisms in the presence of cross-traffic big data flows.
Additionally, we implement a SoyKB system that provides: (i) flexible workflow performance analytics at a glance to SoyKB researchers handling big data, and (ii) web service mechanisms for interfacing with popular dynamic resource management technologies such as OpenStack, HTCondor and Pegasus
Call for Abstracts and Papers
Conference topics include but are not limited to: Cybersecurity, Medical Informatics, Smart Systems, E-learning, Library Science, IT for Economic Development, E-Government, Technological Innovations, Digital Marketing, Business Analytics, Cloud Computing, E-business/commerce, Emerging Trends, Digital Accounting, Healthcare IT, IT Auditing/Forensics, Mobile Computing, Organizational Culture & Technology, Social Networking, and Big Data
PIRE ExoGENI – ENVRI: Preparation for Big Data Science
Big Data is a new field in both scientific research and the IT industry, focusing on collections of data sets that are so huge and complex that they create numerous difficulties not only in processing them but also in transferring and storing them. Big Data science tries to overcome problems or optimize performance based on the “5V” concept: Volume, Variety, Velocity, Variability and Value. A Big Data infrastructure integrates advanced IT technologies such as Cloud computing, databases, networks and HPC, providing scientists with all the functionality required for performing high-level research activities. The EU project ENVRI is an example of developing a Big Data infrastructure for environmental scientists, with a special focus on issues like architecture, metadata frameworks, data discovery etc.
In Big Data infrastructures like ENVRI, aggregating huge amounts of data from different sources and transferring them between distributed locations are important processes in many experiments [5]. Efficient data transfer is thus a key service required in a big data infrastructure.
At the same time, Software Defined Networking (SDN) is a promising new approach to networking. SDN decouples the control interface from network devices and allows high-level applications to manipulate network behavior. However, most of the existing high-level data transfer protocols treat the network as a black box and do not include control over network-level functionality.
There is a scientific gap between Big Data science and Software Defined Networking and, until now, no work has been done combining these two technologies. This gap motivates our research in this project