70 research outputs found

    High Performance Data Transfer in Grid Environment Using GridFTP over InfiniBand

    Methods and design issues for next generation network-aware applications

    Networks are becoming an essential component of modern cyberinfrastructure, and this work describes methods of designing distributed applications for high-speed networks to improve application scalability, performance and capabilities. As the amount of data generated by scientific applications continues to grow, applications must be designed to use parallel, distributed resources and high-speed networks in order to handle and process it. For scalable application design, developers should move away from the current component-based approach and instead implement an integrated, non-layered architecture in which applications can use specialized low-level interfaces. The main focus of this research is interactive, collaborative visualization of large datasets. This work describes how a visualization application can be improved by using distributed resources and high-speed network links to interactively visualize tens of gigabytes of data and handle terabyte datasets while maintaining high quality. The application supports interactive frame rates, high resolution and collaborative visualization, and sustains remote I/O bandwidths of several Gbps (up to 30 times faster than local I/O). Motivated by the distributed visualization application, this work also investigates remote data access systems. Because wide-area networks may have high latency, the remote I/O system uses an architecture that effectively hides latency. Five remote data access architectures are analyzed, and the results show that an architecture combining bulk and pipeline processing is the best solution for high-throughput remote data access. The resulting system, which also supports high-speed transport protocols and configurable remote operations, is up to 400 times faster than a comparable existing remote data access system. Transport protocols are compared to determine which can best utilize high-speed network connections, concluding that a rate-based protocol is the best solution, being 8 times faster than standard TCP. An HD-based remote teaching application experiment is conducted, illustrating the potential of network-aware applications in a production environment. Future research areas are presented, with emphasis on network-aware optimization, execution and deployment scenarios.
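    The key claim above is that keeping several remote requests in flight (bulk plus pipeline processing) hides wide-area latency. The abstract contains no code, so the following is only a minimal, illustrative Python sketch of that general idea under assumed numbers: it simulates a latency-dominated block server and compares strictly serial reads with reads that overlap several outstanding requests. The names fetch_block, PIPELINE_DEPTH, LATENCY_S and BLOCK_COUNT are hypothetical and are not taken from the described system.

        # Illustrative sketch (not from the paper): hide wide-area latency by keeping
        # several block requests in flight instead of issuing them strictly one by one.
        import time
        from concurrent.futures import ThreadPoolExecutor

        LATENCY_S = 0.05      # assumed per-request WAN latency
        BLOCK_COUNT = 40      # number of blocks to fetch
        PIPELINE_DEPTH = 8    # hypothetical number of outstanding requests

        def fetch_block(block_id: int) -> bytes:
            """Simulate a remote block read dominated by network latency."""
            time.sleep(LATENCY_S)
            return bytes(1024)  # placeholder payload

        def serial_read() -> float:
            start = time.perf_counter()
            for block_id in range(BLOCK_COUNT):
                fetch_block(block_id)  # each request waits for the previous reply
            return time.perf_counter() - start

        def pipelined_read() -> float:
            start = time.perf_counter()
            with ThreadPoolExecutor(max_workers=PIPELINE_DEPTH) as pool:
                list(pool.map(fetch_block, range(BLOCK_COUNT)))  # requests overlap in flight
            return time.perf_counter() - start

        if __name__ == "__main__":
            print(f"serial:    {serial_read():.2f} s")
            print(f"pipelined: {pipelined_read():.2f} s  ({PIPELINE_DEPTH} requests in flight)")

    With these assumed numbers the pipelined variant finishes roughly PIPELINE_DEPTH times sooner; the bulk-plus-pipeline architecture described in the abstract exploits the same effect at much larger scale.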

    Data issues at the Euro-Mediterranean Centre for Climate Change

    Climate change research is increasingly becoming a data-intensive and data-oriented scientific activity. Petabytes of climate data and large collections of datasets are continuously produced, delivered, accessed and processed by scientists and researchers at multiple sites at an international level. This work presents the Euro-Mediterranean Centre for Climate Change (CMCC) initiative, discussing data and metadata issues and addressing both architectural and infrastructural aspects of the adopted grid-enabled solution. A complete overview of the grid services deployed at the Centre is presented, as well as the client-side support (CMCC data portal and monitoring dashboard).

    Progress in Multi-Disciplinary Data Life Cycle Management

    Modern science is most often driven by data. Improvements in state-of-the-art technologies and methods in many scientific disciplines lead not only to increasing data rates, but also to the need to improve or even completely overhaul their data life cycle management. Communities usually face two kinds of challenges: generic ones, such as federated authorization and authentication infrastructures and data preservation, and ones that are specific to their community and their respective data life cycle. In practice, the specific requirements often hinder the use of generic tools and methods. The German Helmholtz Association project "Large-Scale Data Management and Analysis" (LSDMA) addresses both challenges: its five Data Life Cycle Labs (DLCLs) closely collaborate with communities in joint research and development to optimize the communities' data life cycle management, while its Data Services Integration Team (DSIT) provides generic data tools and services. We present the most recent developments and results from the DLCLs, covering communities ranging from heavy-ion physics and photon science to high-throughput microscopy, as well as from DSIT.

    Optimization of Data Transfer for Grid Using GridFTP

    A grid is a highly distributed, heterogeneous environment whose components interoperate towards a common goal. As such, a grid's performance is highly dependent on the speed of data transfer between the systems of which it consists. The most commonly used, "de facto" standard for data transfer on grid systems is GridFTP [7]. While working on the CRO-GRID Infrastructure [3] project, we faced low data transfer speeds when transferring small files. In this paper we present a program we have developed which uses GridFTP and speeds up the transfer of large numbers of small or medium-sized files. We also present empirical results of performance measurements obtained while using the implemented system.
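    The abstract does not describe how the program works, so the following is only a hedged Python sketch of one common way to mitigate the problem it targets, namely per-file connection and protocol overhead when moving many small files: bundle them into a single archive and transfer that archive in one GridFTP session via globus-url-copy. The directory, archive name and destination URL are hypothetical, and this is not necessarily the approach taken by the CRO-GRID tool.

        # Illustrative sketch (not the paper's tool): reduce per-file GridFTP overhead
        # by bundling many small files into one tar archive and moving it in one transfer.
        import subprocess
        import tarfile
        from pathlib import Path

        def bundle(src_dir: str, archive_path: str) -> None:
            """Pack every file under src_dir into a single tar archive."""
            with tarfile.open(archive_path, "w") as tar:
                for path in Path(src_dir).rglob("*"):
                    if path.is_file():
                        tar.add(path, arcname=path.relative_to(src_dir))

        def transfer(archive_path: str, dest_url: str) -> None:
            """Hand the single archive to globus-url-copy (assumed to be installed)."""
            subprocess.run(
                ["globus-url-copy", f"file://{Path(archive_path).resolve()}", dest_url],
                check=True,
            )

        if __name__ == "__main__":
            # Hypothetical paths and destination URL, purely for illustration.
            bundle("./input_files", "./payload.tar")
            transfer("./payload.tar", "gsiftp://remote.example.org/data/payload.tar")

    GridFTP itself also supports parallel streams within a transfer, which tools addressing the small-file problem may exploit instead of, or in addition to, bundling.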