Efficient HTTP based I/O on very large datasets for high performance computing with the libdavix library

Author: C Coarfa
DC Ster van der
F Furano
I Antcheva
I Vukotic
J Heidemann
J Martin
JC Anderson
K Jackson
R Battle
R Buyya
T White
Publication date: 15/10/2014

Remote data access for data analysis in high performance computing is commonly done with specialized data access protocols and storage systems. These protocols are highly optimized for high throughput on very large datasets, multi-streams, high availability, low latency and efficient parallel I/O. The purpose of this paper is to describe how we have adapted a generic protocol, the Hypertext Transfer Protocol (HTTP), to make it a competitive alternative for high performance I/O and data analysis applications in a global computing grid: the Worldwide LHC Computing Grid. In this work, we first analyze the design differences between the HTTP protocol and the most common high performance I/O protocols, pointing out the main performance weaknesses of HTTP. Then, we describe in detail how we solved these issues. Our solutions have been implemented in a toolkit called davix, available through several recent Linux distributions. Finally, we describe the results of our benchmarks where we compare the performance of davix against an HPC-specific protocol for a data analysis use case. Comment: Presented at: Very Large Data Bases (VLDB) 2014, Hangzho

arXiv.org e-Print Archive

Crossref

Recommended from our members

ISOGA: Integrated Services Optical Grid Architecture for Emerging E-Science Collaborative Applications

Author: Yu Oliver
Publication venue: University of Illinois at Chicago, Chicago, IL
Publication date: 28/11/2008

This final report describes the accomplishments of the ISOGA (Integrated Services Optical Grid Architecture) project. ISOGA enables efficient deployment of existing and emerging collaborative grid applications with increasingly diverse multimedia communication requirements over a wide-area multi-domain optical network grid, and gives collaborating scientists fast retrieval and seamless browsing of distributed scientific multimedia datasets over a wide-area optical network grid. The project focuses on research and development in the following areas: polymorphic optical network control planes to enable multiple switching and communication services simultaneously; an intelligent optical grid user-network interface to enable user-centric network control and monitoring; and a seamless optical grid dataset browsing interface to enable fast retrieval of local and remote datasets for visualization and manipulation.

UNT Digital Library

Logical infrastructure composition layer, the GEYSERS holistic approach for infrastructure virtualisation

Author: Anhalt Fabienne
Buysse Jens
De Leenheer Marc
Demchenko Yuri
Develder Chris
Ferrer Riera Jordi
Figuerola Sergi
Garcia-Espin Joan A
Ghijsen Mattijs
Soudan Sébastien
Publication venue: Ghent University, Department of Information technology
Publication date: 01/01/2012

Ghent University Academic Bibliography

Performance improvement of an optical network providing services based on multicast

Author: Barth Dominique
Cohen Johanne
Reinhard Vincent
Tomasik Joanna
Weisser Marc-Antoine
Publication date: 02/05/2011

Operators of networks covering large areas are confronted with demands from some of their customers who are virtual service providers. These providers may call for a connectivity service that meets the specific needs of their services, for instance a multicast transmission with allocated bandwidth. On the other hand, network operators want to make a profit by trading a connectivity service of the requested quality to their customers while limiting their infrastructure investments (or not investing at all). We focus on circuit switching optical networks and work on repetitive multicast demands whose source and destinations are known a priori by the operator, who may therefore have corresponding trees "ready to be allocated" and adapt the network infrastructure to these recurrent transmissions. This adjustment consists in placing the available branching routers in selected nodes of a predefined tree. The branching nodes are opto-electronic nodes which are able to duplicate data and retransmit it in several directions. These nodes are, however, more expensive and more energy consuming than transparent ones. In this paper we are interested in choosing the nodes of a multicast tree where the limited number of branching routers should be located in order to minimize the amount of required bandwidth. After formally stating the problem we solve it by proposing a polynomial algorithm whose optimality we prove. We perform exhaustive computations to show the operator gain obtained by using our algorithm. These computations are made for different methods of multicast tree construction. We conclude by giving dimensioning guidelines and outlining our further work. Comment: 16 pages, 13 figures, extended version from Conference ISCIS 201

arXiv.org e-Print Archive

Comparative Analysis of Cloud Simulators and Authentication Techniques in Cloud Computing

Author: Ashima Mehta
S. N. Panda
Publication venue: 'Chitkara University Publications'
Publication date: 28/12/2016

Cloud computing is the provision of computer hardware and software resources over the internet, so that anyone connected to the internet can access them as a service in a seamless way. As we move more and more towards the application of this newly emerging technology, it is essential to study, evaluate and analyze the performance, security and other related problems that might be encountered in cloud computing. Since it is not practicable to examine the behavior of the cloud on such problems directly using real hardware and software resources, due to the high costs, modeling and simulation have become essential tools for dealing with these issues. In this paper, we review, analyse and compare features of the existing cloud computing simulators and various location-based authentication and simulation tools.

Crossref

Journal on Today's Ideas - Tomorrow's Technologies

A survey of general-purpose experiment management tools for distributed systems

Author: Buchert Tomasz
Nussbaum Lucas
Richard Olivier
Ruiz Cristian
Publication venue: 'Elsevier BV'
Publication date: 01/01/2015

In the field of large-scale distributed systems, experimentation is particularly difficult. The studied systems are complex, often nondeterministic and unreliable, software is plagued with bugs, and the experiment workflows are unclear and hard to reproduce. These obstacles led many independent researchers to design tools to control their experiments, boost productivity and improve the quality of scientific results. Despite much research in the domain of distributed systems experiment management, the current fragmentation of efforts calls for a general analysis. We therefore propose to build a framework to uncover missing functionality of these tools, enable meaningful comparisons between them and find recommendations for future improvements and research. The contribution of this paper is twofold. First, we provide an extensive list of features offered by general-purpose experiment management tools dedicated to distributed systems research on real platforms. We then use it to assess existing solutions and compare them, outlining possible future paths for improvements.

Crossref

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

ISOGA: Integrated Services Optical Grid Architecture for Emerging E-Science Collaborative Applications

Publication venue: 'Office of Scientific and Technical Information (OSTI)'

Crossref

Throughput Optimal On-Line Algorithms for Advanced Resource Reservation in Ultra High-Speed Networks

Author: Cohen Reuven
Fazlollahi Niloofar
Starobinski David
Publication date: 02/11/2007

Advanced channel reservation is emerging as an important feature of ultra high-speed networks requiring the transfer of large files. Applications include scientific data transfers and database backup. In this paper, we present two new on-line algorithms for advanced reservation, called BatchAll and BatchLim, that are guaranteed to achieve optimal throughput performance, based on multi-commodity flow arguments. Both algorithms are shown to have polynomial-time complexity and provable bounds on the maximum delay for 1+epsilon bandwidth augmented networks. The BatchLim algorithm returns the completion time of a connection immediately as a request is placed, but at the expense of a slightly looser competitive ratio than that of BatchAll. We also present a simple approach that limits the number of parallel paths used by the algorithms while provably bounding the maximum reduction factor in the transmission throughput. We show that, although the number of different paths can be exponentially large, the actual number of paths needed to approximate the flow is quite small and proportional to the number of edges in the network. Simulations for a number of topologies show that, in practice, 3 to 5 parallel paths are sufficient to achieve close to optimal performance. The performance of the competitive algorithms is also compared to a greedy benchmark, both through analysis and simulation. Comment: 9 pages, 8 figure

arXiv.org e-Print Archive

CiteSeerX

Data Avenue: Remote Storage Resource Management in WS-PGRADE/gUSE

Author: Farkas Zoltán
Hajnal Ákos
Kacsuk Péter
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2014

Crossref

SZTAKI Publication Repository