Big Data Computing Using Cloud-Based Technologies, Challenges and Future Perspectives
The excessive amounts of data generated by devices and Internet-based sources
on a regular basis constitute big data. This data can be processed and
analyzed to develop useful applications for specific domains. Several
mathematical and data analytics techniques have found use in this sphere. This
has given rise to the development of computing models and tools for big data
computing. However, the storage and processing requirements are overwhelming
for traditional systems and technologies. Therefore, there is a need for
infrastructures that can adjust the storage and processing capability in
accordance with the changing data dimensions. Cloud Computing serves as a
potential solution to this problem. However, big data computing in the cloud
has its own set of challenges and research issues. This chapter surveys the big
data concept, discusses the mathematical and data analytics techniques that can
be used for big data and gives a taxonomy of the existing tools, frameworks and
platforms available for different big data computing models. Besides this, it
also evaluates the viability of cloud-based big data computing, examines
existing challenges and opportunities, and provides future research directions
in this field.
A Comparative Taxonomy and Survey of Public Cloud Infrastructure Vendors
An increasing number of technology enterprises are adopting cloud-native
architectures to offer their web-based products, by moving away from
privately-owned data-centers and relying exclusively on cloud service
providers. As a result, cloud vendors have lately increased, along with the
estimated annual revenue they share. However, in the process of selecting a
provider's cloud service over the competition, we observe a lack of universal
common ground in terms of terminology, functionality of services and billing
models. This is an important gap especially under the new reality of the
industry where each cloud provider has moved towards its own service taxonomy,
while the number of specialized services has grown exponentially. This work
discusses cloud services offered by four dominant, in terms of their current
market share, cloud vendors. We provide a taxonomy of their services and
sub-services that designates major service families namely computing, storage,
databases, analytics, data pipelines, machine learning, and networking. The aim
of such clustering is to indicate similarities, common design approaches and
functional differences of the offered services. The outcomes are essential both
for individual researchers, and bigger enterprises in their attempt to identify
the set of cloud services that will fully meet their needs without
compromises. While we acknowledge the fact that this is a dynamic industry,
where new services arise constantly, and old ones experience important updates,
this study paints a solid image of the current offerings and gives prominence
to the directions that cloud service providers are following.
Analytics for the Internet of Things: A Survey
The Internet of Things (IoT) envisions a world-wide, interconnected network
of smart physical entities. These physical entities generate a large amount of
data in operation and as the IoT gains momentum in terms of deployment, the
combined scale of those data seems destined to continue to grow. Increasingly,
applications for the IoT involve analytics. Data analytics is the process of
deriving knowledge from data, generating value like actionable insights from
them. This article reviews work in the IoT and big data analytics from the
perspective of their utility in creating efficient, effective and innovative
applications and services for a wide spectrum of domains. We review the broad
vision for the IoT as it is shaped in various communities, examine the
application of data analytics across IoT domains, provide a categorisation of
analytic approaches and propose a layered taxonomy from IoT data to analytics.
This taxonomy provides us with insights on the appropriateness of analytical
techniques, which in turn shapes a survey of enabling technology and
infrastructure for IoT analytics. Finally, we look at some tradeoffs for
analytics in the IoT that can shape future research.
The Role of Big Data Analytics in Industrial Internet of Things
Big data production in industrial Internet of Things (IIoT) is evident due to
the massive deployment of sensors and Internet of Things (IoT) devices.
However, big data processing is challenging due to limited computational,
networking and storage resources at IoT device-end. Big data analytics (BDA) is
expected to provide operational- and customer-level intelligence in IIoT
systems. Although numerous studies on IIoT and BDA exist, only a few studies
have explored the convergence of the two paradigms. In this study, we
investigate the recent BDA technologies, algorithms and techniques that can
lead to the development of intelligent IIoT systems. We devise a taxonomy by
classifying and categorising the literature on the basis of important
parameters (e.g. data sources, analytics tools, analytics techniques,
requirements, industrial analytics applications and analytics types). We
present the frameworks and case studies of the various enterprises that have
benefited from BDA. We also enumerate the considerable opportunities introduced
by BDA in IIoT. We also identify and discuss the key challenges that remain
to be addressed as future research directions.
Computational Strategies for Scalable Genomics Analysis.
The revolution in next-generation DNA sequencing technologies is leading to explosive data growth in genomics, posing a significant challenge to the computing infrastructure and software algorithms for genomics analysis. Various big data technologies have been explored to scale up/out current bioinformatics solutions to mine the big genomics data. In this review, we survey some of these exciting developments in the applications of parallel distributed computing and special hardware to genomics. We comment on the pros and cons of each strategy in the context of ease of development, robustness, scalability, and efficiency. Although this review is written for an audience from the genomics and bioinformatics fields, it may also be informative for a computer science audience interested in genomics applications.
Role of Apache Software Foundation in Big Data Projects
With the increase in the amount of Big Data being generated each year, the tools and
technologies developed and used for the purpose of storing, processing and
analyzing Big Data have also improved. Open-Source software has been an
important factor in the success and innovation in the field of Big Data while
Apache Software Foundation (ASF) has played a crucial role in this success and
innovation by providing a number of state-of-the-art projects, free and open to
the public. ASF has classified its projects into different categories. In this
report, projects listed under the Big Data category are analyzed in depth and
discussed with reference to one of the seven sub-categories defined. Our
investigation has shown that many of the Apache Big Data projects are
autonomous but some are built based on other Apache projects and some work in
conjunction with other projects to improve and ease development in the Big Data
space.
ECHO: An Adaptive Orchestration Platform for Hybrid Dataflows across Cloud and Edge
The Internet of Things (IoT) is offering unprecedented observational data
that are used for managing Smart City utilities. Edge and Fog gateway devices
are an integral part of IoT deployments to acquire real-time data and enact
controls. Recently, Edge computing has been emerging as a first-class paradigm to
complement Cloud-centric analytics. But a key limitation is the lack of a
platform-as-a-service for applications spanning Edge and Cloud. Here, we
propose ECHO, an orchestration platform for dataflows across distributed
resources. ECHO's hybrid dataflow composition can operate on diverse data
models -- streams, micro-batches and files, and interface with native runtime
engines like TensorFlow and Storm to execute them. It manages the application's
lifecycle, including container-based deployment and a registry for state
management. ECHO can schedule the dataflow on different Edge, Fog and Cloud
resources, and also perform dynamic task migration between resources. We
validate the ECHO platform for executing video analytics and sensor streams for
Smart Traffic and Smart Utility applications on Raspberry Pi, NVidia TX1, ARM64
and Azure Cloud VM resources, and present our results. Comment: 17 pages, 5 figures, 2 tables, submitted to ICSOC-201
A deep learning based solution for construction equipment detection: from development to deployment
This paper aims at providing researchers and engineering professionals with a
practical and comprehensive deep learning based solution to detect construction
equipment from the very first step of its development to the last one which is
deployment. This paper focuses on the last step of deployment. The first phase
of solution development involved data preparation, model selection, model
training, and model evaluation. The second phase of the study comprises
model optimization, application specific embedded system selection, and
economic analysis. Several embedded systems were proposed and compared. The
review of the results confirms superior real-time performance of the solutions
with accuracy consistently above 90%. The current study validates the
practicality of deep learning based object detection solutions for construction
scenarios. Moreover, the detailed knowledge presented in this study can be
employed for several purposes, such as safety monitoring, productivity
assessments, and managerial decisions. Comment: 17 pages, 16 figures, 6 table
A Berkeley View of Systems Challenges for AI
With the increasing commoditization of computer vision, speech recognition
and machine translation systems and the widespread deployment of learning-based
back-end technologies such as digital advertising and intelligent
infrastructures, AI (Artificial Intelligence) has moved from research labs to
production. These changes have been made possible by unprecedented levels of
data and computation, by methodological advances in machine learning, by
innovations in systems software and architectures, and by the broad
accessibility of these technologies.
The next generation of AI systems promises to accelerate these developments
and increasingly impact our lives via frequent interactions and making (often
mission-critical) decisions on our behalf, often in highly personalized
contexts. Realizing this promise, however, raises daunting challenges. In
particular, we need AI systems that make timely and safe decisions in
unpredictable environments, that are robust against sophisticated adversaries,
and that can process ever increasing amounts of data across organizations and
individuals without compromising confidentiality. These challenges will be
exacerbated by the end of Moore's Law, which will constrain the amount of
data these technologies can store and process. In this paper, we propose
several open research directions in systems, architectures, and security that
can address these challenges and help unlock AI's potential to improve lives
and society. Comment: Berkeley Technical Repor
Attacking Machine Learning models as part of a cyber kill chain
Machine learning is gaining popularity in the network security domain as many
more network-enabled devices get connected, as malicious activities become
stealthier, and as new technologies like Software Defined Networking emerge.
Compromising a machine learning model is a desirable goal for attackers. In fact, spammers have
been quite successful at getting through machine learning enabled spam filters for
years. While previous work has addressed adversarial machine learning,
none has considered it within a defense-in-depth environment, in which
correct classification alone may not be good enough. For the first time, this
paper proposes a cyber kill-chain for attacking machine learning models
together with a proof of concept. The intention is to provide a high-level
attack model that inspires more secure processes in
research/design/implementation of machine learning based security solutions. Comment: 8 page