1,120 research outputs found

An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics

Author: A Abouzeid
A Matsunaga
A McKenna
B Cottingham
B Langmead
B Langmead
B Langmead
BD O'Connor
C Lam
C Sansom
D Henschen
F Chang
G Sadasivam
J Dean
J Dean
J Lin
J Qui
J Venner
K Heafield
L Stein
M Baker
M Gaggero
M Isard
M Ngazimbi
M Schatz
MC Schatz
RC Gentleman
Ronald C Taylor
S Canon
S Coghlan
S Das
S Ghemawat
S Leo
T White
X Qui
Publication venue: BioMed Central
Publication date: 01/01/2011

Crossref

Springer - Publisher Connector

PubMed Central

HyperLoom possibilities for executing scientific workflows on the cloud

Author: Ashby Thomas J.
Böhm Stanislav
Chupakhin Vladimir
Cima Vojtěch
Dvorský Jiří
Martinovič Jan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

We have developed HyperLoom, a platform for defining and executing scientific workflows in large-scale HPC systems. The computational tasks in such workflows often have non-trivial dependency patterns, unknown execution times and unknown sizes of generated outputs. HyperLoom executes workflows efficiently, respecting task requirements and cluster resources, regardless of the shape or size of the workflow. Although HPC infrastructures provide unbeatable performance, they may be unavailable or too expensive, especially for small to medium workloads. Moreover, for some workloads, because HPC resource allocation policies are not very flexible, the system's energy efficiency may not be optimal at some stages of the execution. In contrast, current public cloud providers such as Amazon, Google or Exoscale give users a comfortable and elastic way of deploying, scaling and disposing of a virtualized cluster of almost any size. In this paper, we describe HyperLoom virtualization and evaluate its performance in a virtualized environment using workflows of various shapes and sizes. Finally, we discuss HyperLoom's potential for expansion to cloud environments.

Crossref

DSpace at VSB Technical University of Ostrava

BetterLife 2.0: large-scale social intelligence reasoning on cloud

Author: Hu H
Wang CL
Wang Y
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2010

This paper presents the design of the BetterLife 2.0 framework, which facilitates the implementation of large-scale social intelligence applications in a cloud environment. We argue that more and more mobile social applications in pervasive computing need to be implemented this way, with a large volume of user-generated activity on social networking websites. We adopted the Case-based Reasoning (CBR) technique to provide logical reasoning and outline design considerations for porting a typical CBR framework, jCOLIBRI2, to the cloud using Hadoop's various services (HDFS, HBase). These services allow efficient case base management (e.g. case insertion) and distribution of computationally intensive jobs, speeding up the reasoning process by more than 5 times. With the scalability of MapReduce, we can improve the recommendation service with social network analysis that needs to handle millions of users' social activities. © 2010 IEEE. The 2nd IEEE International Conference on Cloud Computing Technology and Science (CloudCom 2010), Indianapolis, IN., 30 November-3 December 2010. In Proceedings of the 2nd CloudCom, 2010, p. 529-53

HKU Scholars Hub

Big Data and the Internet of Things

Author: A Baaziz
A Kleiner
ED Feigelson
MA Waller
S Boyd
S Vandermerwe
Z Zhou
Publication date: 24/03/2015

Advances in sensing and computing capabilities are making it possible to embed increasing computing power in small devices. This has enabled sensing devices not just to passively capture data at very high resolution but also to take sophisticated actions in response. Combined with advances in communication, this is resulting in an ecosystem of highly interconnected devices referred to as the Internet of Things (IoT). In conjunction, advances in machine learning have allowed models to be built on these ever-increasing amounts of data. Consequently, devices ranging from heavy assets such as aircraft engines to wearables such as health monitors can now not only generate massive amounts of data but also draw on aggregate analytics to "improve" their performance over time. Big data analytics has been identified as a key enabler for the IoT. In this chapter, we discuss various avenues of the IoT where big data analytics either is already making a significant impact or is on the cusp of doing so. We also discuss social implications and areas of concern. Comment: 33 pages. Draft of upcoming book chapter in Japkowicz and Stefanowski (eds.) Big Data Analysis: New algorithms for a new society, Springer Series on Studies in Big Data, to appea

arXiv.org e-Print Archive

Crossref

From Social Data Mining to Forecasting Socio-Economic Crisis

Author: A. Barabasi
A. Barabasi
A. Diekmann
A. Halevy
A. Monreale
A. Szalay
A. Vespignani
B. Kluger
B.-C. Chen
B.B. Mandelbrot
B.C. Chen
C. Cattuto
C. Doctorow
C. Lynch
D. Helbing
D. Helbing
D. Helbing
D. Helbing
D. Helbing
D. Helbing
D. Lazer
E. Ostrom
E. Ravenstein
E.F. Fama
E.J. Candes
G. Sugihara
G. Ziegler
G.K. Zipf
I. Foster
Ioannidis P.A. John
J. Danãelsson
J. Krumm
J.H. Fowler
K.P. Smith
L. Molgedey
L. Odling-Smee
M. Atzori
M. Scheffer
M.J. Salganik
M.M. Gaber
N. Eagle
N.A. Christakis
N.A. Christakis
N.A. Christakis
N.F. Johnson
P. Bajaria
R.K. Merton
S. Balietti
S. Nelson
S.E. Asch
S.H. Muggleton
S.V. Buldyrev
W.S. Bainbridge
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/05/2011

Socio-economic data mining has great potential for gaining a better understanding of problems that our economy and society are facing, such as financial instability, shortages of resources, or conflicts. Without large-scale data mining, progress in these areas seems hard or impossible. Therefore, a suitable, distributed data mining infrastructure and research centers should be built in Europe. It also appears appropriate to build a network of Crisis Observatories. They can be imagined as laboratories devoted to the gathering and processing of enormous volumes of data on both natural systems such as the Earth and its ecosystem, as well as on human techno-socio-economic systems, so as to gain early warnings of impending events. Reality mining provides the chance to adapt more quickly and more accurately to changing situations. Further opportunities arise from individually customized services, which, however, should be provided in a privacy-respecting way. This requires the development of novel ICT (such as a self-organizing Web), but most likely new legal regulations and suitable institutions as well. As long as such regulations are lacking on a world-wide scale, it is in the public interest that scientists explore what can be done with the huge data available. Big data do have the potential to change or even threaten democratic societies. The same applies to sudden and large-scale failures of ICT systems. Therefore, dealing with data must be done with a large degree of responsibility and care. The self-interests of individuals, companies or institutions have limits where the public interest is affected, and public interest is not a sufficient justification to violate the human rights of individuals. Privacy, like confidentiality, is a high good, and damaging it would have serious side effects for society. Comment: 65 pages, 1 figure, Visioneer White Paper, see http://www.visioneer.ethz.c

arXiv.org e-Print Archive

Repository for Publications and Research Data

Crossref

EDP Sciences OAI-PMH repository (1.2.0)