1,176 research outputs found
From Social Data Mining to Forecasting Socio-Economic Crisis
Socio-economic data mining has a great potential in terms of gaining a better
understanding of problems that our economy and society are facing, such as
financial instability, shortages of resources, or conflicts. Without
large-scale data mining, progress in these areas seems hard or impossible.
Therefore, a suitable, distributed data mining infrastructure and research
centers should be built in Europe. It also appears appropriate to build a
network of Crisis Observatories. They can be imagined as laboratories devoted
to the gathering and processing of enormous volumes of data on both natural
systems such as the Earth and its ecosystem, as well as on human
techno-socio-economic systems, so as to gain early warnings of impending
events. Reality mining provides the chance to adapt more quickly and more
accurately to changing situations. Further opportunities arise by individually
customized services, which however should be provided in a privacy-respecting
way. This requires the development of novel ICT (such as a self- organizing
Web), but most likely new legal regulations and suitable institutions as well.
As long as such regulations are lacking on a world-wide scale, it is in the
public interest that scientists explore what can be done with the huge data
available. Big data do have the potential to change or even threaten democratic
societies. The same applies to sudden and large-scale failures of ICT systems.
Therefore, dealing with data must be done with a large degree of responsibility
and care. Self-interests of individuals, companies or institutions have limits,
where the public interest is affected, and public interest is not a sufficient
justification to violate human rights of individuals. Privacy is a high good,
as confidentiality is, and damaging it would have serious side effects for
society.Comment: 65 pages, 1 figure, Visioneer White Paper, see
http://www.visioneer.ethz.c
Challenges for MapReduce in Big Data
In the Big Data community, MapReduce has been seen as one of the key enabling approaches for meeting continuously increasing demands on computing resources imposed by massive data sets. The reason for this is the high scalability of the MapReduce paradigm which allows for massively parallel and distributed execution over a large number of computing nodes. This paper identifies MapReduce issues and challenges in handling Big Data with the objective of providing an overview of the field, facilitating better planning and management of Big Data projects, and identifying opportunities for future research in this field. The identified challenges are grouped into four main categories corresponding to Big Data tasks types: data storage (relational databases and NoSQL stores), Big Data analytics (machine learning and interactive analytics), online processing, and security and privacy. Moreover, current efforts aimed at improving and extending MapReduce to address identified challenges are presented. Consequently, by identifying issues and challenges MapReduce faces when handling Big Data, this study encourages future Big Data research
Big Data and the Internet of Things
Advances in sensing and computing capabilities are making it possible to
embed increasing computing power in small devices. This has enabled the sensing
devices not just to passively capture data at very high resolution but also to
take sophisticated actions in response. Combined with advances in
communication, this is resulting in an ecosystem of highly interconnected
devices referred to as the Internet of Things - IoT. In conjunction, the
advances in machine learning have allowed building models on this ever
increasing amounts of data. Consequently, devices all the way from heavy assets
such as aircraft engines to wearables such as health monitors can all now not
only generate massive amounts of data but can draw back on aggregate analytics
to "improve" their performance over time. Big data analytics has been
identified as a key enabler for the IoT. In this chapter, we discuss various
avenues of the IoT where big data analytics either is already making a
significant impact or is on the cusp of doing so. We also discuss social
implications and areas of concern.Comment: 33 pages. draft of upcoming book chapter in Japkowicz and Stefanowski
(eds.) Big Data Analysis: New algorithms for a new society, Springer Series
on Studies in Big Data, to appea
Internet of Things-aided Smart Grid: Technologies, Architectures, Applications, Prototypes, and Future Research Directions
Traditional power grids are being transformed into Smart Grids (SGs) to
address the issues in existing power system due to uni-directional information
flow, energy wastage, growing energy demand, reliability and security. SGs
offer bi-directional energy flow between service providers and consumers,
involving power generation, transmission, distribution and utilization systems.
SGs employ various devices for the monitoring, analysis and control of the
grid, deployed at power plants, distribution centers and in consumers' premises
in a very large number. Hence, an SG requires connectivity, automation and the
tracking of such devices. This is achieved with the help of Internet of Things
(IoT). IoT helps SG systems to support various network functions throughout the
generation, transmission, distribution and consumption of energy by
incorporating IoT devices (such as sensors, actuators and smart meters), as
well as by providing the connectivity, automation and tracking for such
devices. In this paper, we provide a comprehensive survey on IoT-aided SG
systems, which includes the existing architectures, applications and prototypes
of IoT-aided SG systems. This survey also highlights the open issues,
challenges and future research directions for IoT-aided SG systems
A Strategic Roadmap for Maximizing Big Data Return
Big Data has turned out to be one of the popular expressions in IT the last couple of years. In the current digital period, according to the huge improvement occurring in the web and online world innovations, we are facing a gigantic volume of information. The size of data has expanded significantly with the appearance of today's innovation in numerous segments, for example, assembling, business, and science. Types of information have been changed from structured data-driven databases to data including documents, images, audio, video, and social media contents referred to as unstructured data or Big Data. Consequently, most of the organizations try to invest in the big data technology aiming to get value from their investment. However, the organizations face a challenge to determine their requirements and then the technology that suits their businesses. Different technologies are provided by variety of vendors, each of them can be used, and there is no methodology helping them for choosing and making a right decision. Therefore, the objective of this paper is to construct a roadmap for helping the organizations determine their needs and selecting a suitable technology and applying this conducted proposed roadmap practically on two companies
Big Data and Large-scale Data Analytics: Efficiency of Sustainable Scalability and Security of Centralized Clouds and Edge Deployment Architectures
One of the significant shifts of the next-generation computing technologies will certainly be in
the development of Big Data (BD) deployment architectures. Apache Hadoop, the BD
landmark, evolved as a widely deployed BD operating system. Its new features include
federation structure and many associated frameworks, which provide Hadoop 3.x with the
maturity to serve different markets. This dissertation addresses two leading issues involved in
exploiting BD and large-scale data analytics realm using the Hadoop platform. Namely,
(i)Scalability that directly affects the system performance and overall throughput using
portable Docker containers. (ii) Security that spread the adoption of data protection practices
among practitioners using access controls. An Enhanced Mapreduce Environment (EME),
OPportunistic and Elastic Resource Allocation (OPERA) scheduler, BD Federation Access Broker
(BDFAB), and a Secure Intelligent Transportation System (SITS) of multi-tiers architecture for
data streaming to the cloud computing are the main contribution of this thesis study
- …