Running the Sloan Digital Sky Survey data archive server

Authors: Neilsen Eric H., Jr.; Stoughton Chris
Publication venue: Fermi National Accelerator Laboratory
Publication date: 01/11/2006

The Sloan Digital Sky Survey (SDSS) Data Archive Server (DAS) provides public access to over 12 TB of data in 17 million files produced by the SDSS data reduction pipeline. Many tasks that seem trivial when serving smaller, less complex data sets present challenges when serving data of this volume and technical complexity. The included output files should be chosen to support as much science as possible from publicly released data, and only publicly released data. Users must have the resources needed to read and interpret the data correctly. Server administrators must generate new data releases at regular intervals, monitor usage, quickly recover from hardware failures, and monitor the data served by the DAS both for contents and corruption. We discuss these challenges, describe tools we use to administer and support the DAS, and discuss future development plans.

UNT Digital Library

Towards Multi-site Metadata Management for Geographically Distributed Cloud Workflows

Authors: Antoniu Gabriel; Costan Alexandru; Pineda-Morales Luis
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/09/2015

With their globally distributed datacenters, clouds now provide an opportunity to run complex large-scale applications on dynamically provisioned, networked and federated infrastructures. However, there is a lack of tools supporting data-intensive applications across geographically distributed sites. For instance, scientific workflows which handle many small files can easily saturate state-of-the-art distributed filesystems based on centralized metadata servers (e.g. HDFS, PVFS). In this paper, we explore several alternative design strategies to efficiently support the execution of existing workflow engines across multi-site clouds, by reducing the cost of metadata operations. These strategies leverage workflow semantics in a 2-level metadata partitioning hierarchy that combines distribution and replication. The system was validated on the Microsoft Azure cloud across 4 EU and US datacenters. The experiments were conducted on 128 nodes using synthetic benchmarks and real-life applications. We observe as much as 28% gain in execution time for a parallel, geo-distributed real-world application (Montage) and up to 50% for a metadata-intensive synthetic benchmark, compared to a baseline centralized configuration.

HAL-CentraleSupelec

Crossref

INRIA a CCSD electronic archive server

Hal-Diderot

HAL-Rennes 1

Large Science Databases – Are Cloud Services Ready for Them?

Publication venue: Hindawi Limited
Publication date: 01/01/2011

Crossref

Supercomputing Frontiers

Publication venue: Springer Science and Business Media LLC
Publication date: 13/07/2022

This open access book constitutes the refereed proceedings of the 7th Asian Supercomputing Conference, SCFA 2022, which took place in Singapore in March 2022. The 8 full papers presented in this book were carefully reviewed and selected from 21 submissions. They cover a range of topics including file systems, memory hierarchy, HPC cloud platform, container image configuration workflow, large-scale applications, and scheduling.

Directory of Open Access Books (DOAB)

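Electrónica de control de un mini-robot para el posicionamiento micrométrico de una fibra óptica en el plano focal de un telescopio [Control electronics of a mini-robot for micrometric positioning of an optical fiber in the focal plane of a telescope]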

Author: Fahim Fernández Nasib
Publication date: 01/01/2013

Biblos-e Archivo