3,443 research outputs found
Big Data Caching for Networking: Moving from Cloud to Edge
In order to cope with the relentless data tsunami in wireless networks,
current approaches such as acquiring new spectrum, deploying more base stations
(BSs) and increasing nodes in mobile packet core networks are becoming
ineffective in terms of scalability, cost and flexibility. In this regard,
context-aware G networks with edge/cloud computing and exploitation of
\emph{big data} analytics can yield significant gains to mobile operators. In
this article, proactive content caching in G wireless networks is
investigated in which a big data-enabled architecture is proposed. In this
practical architecture, vast amount of data is harnessed for content popularity
estimation and strategic contents are cached at the BSs to achieve higher
users' satisfaction and backhaul offloading. To validate the proposed solution,
we consider a real-world case study where several hours of mobile data traffic
is collected from a major telecom operator in Turkey and a big data-enabled
analysis is carried out leveraging tools from machine learning. Based on the
available information and storage capacity, numerical studies show that several
gains are achieved both in terms of users' satisfaction and backhaul
offloading. For example, in the case of BSs with of content ratings
and Gbyte of storage size ( of total library size), proactive
caching yields of users' satisfaction and offloads of the
backhaul.Comment: accepted for publication in IEEE Communications Magazine, Special
Issue on Communications, Caching, and Computing for Content-Centric Mobile
Network
Sketch of Big Data Real-Time Analytics Model
Big Data has drawn huge attention from researchers in information sciences, decision makers in governments and enterprises. However, there is a lot of potential and highly useful value hidden in the huge volume of data. Data is the new oil, but unlike oil data can be refined further to create even more value. Therefore, a new scientific paradigm is born as data-intensive scientific discovery, also known as Big Data. The growth volume of real-time data requires new techniques and technologies to discover insight value. In this paper we introduce the Big Data real-time analytics model as a new technique. We discuss and compare several Big Data technologies for real-time processing along with various challenges and issues in adapting Big Data. Real-time Big Data analysis based on cloud computing approach is our future research direction
ALOJA: A benchmarking and predictive platform for big data performance analysis
The main goals of the ALOJA research project from BSC-MSR, are to explore and automate the characterization of cost-effectivenessof Big Data deployments. The development of the project over its first year, has resulted in a open source benchmarking platform, an online public repository of results with over 42,000 Hadoop job runs, and web-based analytic tools to gather insights about system's cost-performance1.
This article describes the evolution of the project's focus and research
lines from over a year of continuously benchmarking Hadoop under dif-
ferent configuration and deployments options, presents results, and dis
cusses the motivation both technical and market-based of such changes.
During this time, ALOJA's target has evolved from a previous low-level
profiling of Hadoop runtime, passing through extensive benchmarking
and evaluation of a large body of results via aggregation, to currently
leveraging Predictive Analytics (PA) techniques. Modeling benchmark
executions allow us to estimate the results of new or untested configu-
rations or hardware set-ups automatically, by learning techniques from
past observations saving in benchmarking time and costs.This work is partially supported the BSC-Microsoft Research Centre, the Span-
ish Ministry of Education (TIN2012-34557), the MINECO Severo Ochoa Research program (SEV-2011-0067) and the Generalitat de Catalunya (2014-SGR-1051).Peer ReviewedPostprint (author's final draft
ShenZhen transportation system (SZTS): a novel big data benchmark suite
Data analytics is at the core of the supply chain for both products and services in modern economies and societies. Big data workloads, however, are placing unprecedented demands on computing technologies, calling for a deep understanding and characterization of these emerging workloads. In this paper, we propose ShenZhen Transportation System (SZTS), a novel big data Hadoop benchmark suite comprised of real-life transportation analysis applications with real-life input data sets from Shenzhen in China. SZTS uniquely focuses on a specific and real-life application domain whereas other existing Hadoop benchmark suites, such as HiBench and CloudRank-D, consist of generic algorithms with synthetic inputs. We perform a cross-layer workload characterization at the microarchitecture level, the operating system (OS) level, and the job level, revealing unique characteristics of SZTS compared to existing Hadoop benchmarks as well as general-purpose multi-core PARSEC benchmarks. We also study the sensitivity of workload behavior with respect to input data size, and we propose a methodology for identifying representative input data sets
- âŠ