Proceedings of the First PhD Symposium on Sustainable Ultrascale Computing Systems (NESUS PhD 2016)
Proceedings of the First PhD Symposium on Sustainable Ultrascale Computing Systems (NESUS PhD 2016), Timisoara, Romania, February 8-11, 2016. The PhD Symposium was a very good opportunity for young researchers to share information and knowledge, to present their current research, and to discuss topics with other students in order to look for synergies and common research topics. The idea was very successful and the assessment made by the PhD students was very positive. It also helped to achieve one of the major goals of the NESUS Action: to establish an open European research network targeting sustainable solutions for ultrascale computing, aiming at cross-fertilization among HPC, large-scale distributed systems, and big data management and training; bringing together disparate researchers working across different areas; and providing a meeting ground for researchers in these separate areas to exchange ideas, identify synergies, and pursue common activities in research topics such as sustainable software solutions (applications and system software stack), data management, energy efficiency, and resilience. European Cooperation in Science and Technology (COST).
A multi-tier cached I/O architecture for massively parallel supercomputers
Recent advances in storage technologies and high-performance interconnects have made it possible in recent years to build increasingly powerful storage systems that serve thousands of nodes. The majority of storage systems of clusters and supercomputers from the Top 500 list are managed by one of three scalable parallel file systems: GPFS, PVFS, and Lustre. Most large-scale scientific parallel applications are written in the Message Passing Interface (MPI), which has become the de facto standard for scalable distributed-memory machines. One part of the MPI standard is related to I/O and has among its main goals the portability and efficiency of file system accesses. All of the above-mentioned parallel file systems may also be accessed through the MPI-IO interface.
The I/O access patterns of scientific parallel applications often consist of accesses to a large number of small, non-contiguous pieces of data. For small file accesses, performance is dominated by the latency of network transfers and disks. Parallel scientific applications produce interleaved file access patterns with high interprocess spatial locality at the I/O nodes. Additionally, scientific applications exhibit repetitive behaviour when a loop or a function with loops issues I/O requests. When I/O access patterns are repetitive, caching and prefetching can effectively mask their access latency. These characteristics of the access patterns have motivated several researchers to propose parallel I/O optimizations at both the library and file system levels. However, these optimizations are not always integrated across the different layers of the system.
In this dissertation we propose a novel generic parallel I/O architecture for clusters and supercomputers. Our design is aimed at large-scale parallel architectures with thousands of compute nodes. Besides acting as middleware for existing parallel file systems, our architecture provides on-line virtualization of storage resources. Another objective of this thesis is to factor out the common parallel I/O functionality of clusters and supercomputers into generic modules, in order to facilitate porting of scientific applications across these platforms. Our solution is based on a multi-tier cache architecture, collective I/O, and asynchronous data-staging strategies that hide the latency of data transfers between cache tiers. The thesis aims to reduce the file access latency perceived by data-intensive parallel scientific applications through multi-layer asynchronous data transfers. To accomplish this objective, our techniques leverage multi-core architectures by overlapping computation with communication and I/O in parallel threads. Prototypes of our solutions have been deployed on both clusters and Blue Gene supercomputers. Performance evaluation shows that the combination of collective strategies with overlapping of computation, communication, and I/O can bring a substantial performance benefit for access patterns common to parallel scientific applications.
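To make the idea of hiding I/O latency behind computation concrete, the following is a minimal sketch using non-blocking MPI-IO (mpi4py). It illustrates the general latency-hiding technique the thesis builds on, not the multi-tier cache middleware itself; the file name, buffer size, and the stand-in computation are illustrative assumptions.

```python
# Sketch: overlapping computation with asynchronous MPI-IO (mpi4py).
# Illustrates the general technique of hiding write latency behind
# computation; it is NOT the thesis's multi-tier cache middleware.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

block = np.full(1 << 20, rank, dtype=np.float64)   # this rank's output block
offset = rank * block.nbytes                       # contiguous per-rank layout

fh = MPI.File.Open(comm, "checkpoint.dat",
                   MPI.MODE_CREATE | MPI.MODE_WRONLY)

# Start a non-blocking write, compute while the I/O is in flight,
# then wait for completion before reusing the buffer.
request = fh.Iwrite_at(offset, block)
partial = np.sin(block).sum()                      # stand-in for real computation
request.Wait()

fh.Close()
if rank == 0:
    print("overlapped compute result:", partial)
```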
Work in progress about enhancing the programmability and energy efficiency of storage in HPC and cloud environments
Proceedings of the First PhD Symposium on Sustainable Ultrascale Computing Systems (NESUS PhD 2016), Timisoara, Romania, February 8-11, 2016. We present the work in progress for the PhD thesis titled “Enhancing the programmability and energy efficiency of storage in HPC and cloud environments”. In this thesis, we focus on studying and optimizing data movement across the different layers of the operating system’s I/O stack. We study power consumption during I/O-intensive workloads using sophisticated software and hardware instrumentation, collecting time-series data from the internal ATX power lines that feed every system component, together with several run-time operating system metrics. Data exploration and analysis reveal, for each I/O access pattern, various power and performance regimes. These regimes show how power is used by the system as data moves through the I/O stack. We use this knowledge to build I/O power models that are able to predict power consumption for different I/O workloads, and to optimize the CPU device driver that manages performance states, obtaining significant power savings (over 30%). Finally, we develop new mechanisms and abstractions that allow co-located virtual machines to share data with each other more efficiently. Our virtualized data sharing solution reduces data movement among virtual domains, leading to energy savings and I/O performance improvements. European Cooperation in Science and Technology (COST).
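As an illustration of the kind of I/O power model described above, the sketch below fits a simple linear model of power draw as a function of I/O throughput and CPU utilization. The feature set, the synthetic samples, and the function name are assumptions made for the example; the thesis works from real ATX power-line measurements and operating system metrics.

```python
# Sketch: a minimal linear I/O power model, assuming power is roughly
# affine in read/write throughput and CPU utilization. The samples
# below are synthetic and purely illustrative.
import numpy as np

# Hypothetical training samples: [read_MBps, write_MBps, cpu_util] -> watts
features = np.array([[120.0,  10.0, 0.35],
                     [400.0,  80.0, 0.60],
                     [ 20.0, 350.0, 0.45],
                     [  5.0,   5.0, 0.10]])
measured_watts = np.array([48.0, 71.0, 66.0, 31.0])

# Fit P = p_idle + a*read + b*write + c*cpu by least squares.
X = np.hstack([np.ones((features.shape[0], 1)), features])
coeffs, *_ = np.linalg.lstsq(X, measured_watts, rcond=None)

def predict_power(read_mbps, write_mbps, cpu_util):
    """Predict power draw (watts) for an I/O workload."""
    return float(coeffs @ np.array([1.0, read_mbps, write_mbps, cpu_util]))

print(predict_power(200.0, 50.0, 0.5))
```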
A generic I/O architecture for data-intensive applications based on in-memory distributed cache
Proceedings of the First PhD Symposium on Sustainable Ultrascale Computing Systems (NESUS PhD 2016), Timisoara, Romania, February 8-11, 2016. The evolution of scientific computing towards data-intensive applications and the increasing heterogeneity of computing resources are exposing new challenges in the requirements of the I/O layer. We propose a generic I/O architecture for data-intensive applications based on in-memory distributed caching. This solution leverages the evolution of network capacities and the price drop in memory to improve I/O performance for I/O-bound applications, and it is adaptable to existing high-performance scenarios. We have shown the potential improvements of this approach. European Cooperation in Science and Technology (COST). This work is partially supported by the EU under the COST Program Action IC1305: Network for Sustainable Ultrascale Computing (NESUS). This work is partially supported by the grant TIN2013-41350-P, Scalable Data Management Techniques for High-End Computing Systems, from the Spanish Ministry of Economy and Competitiveness.
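The core idea of an in-memory distributed cache for I/O blocks can be sketched as follows: blocks are hashed across cache nodes and misses fall through to a slower backing store. The class and method names are illustrative assumptions, not the architecture proposed in the paper.

```python
# Sketch: blocks hashed across in-memory cache nodes; misses are served
# by a slower backend. Names are illustrative, not the proposed system.
from hashlib import blake2b

class InMemoryBlockCache:
    def __init__(self, num_nodes, backend_read):
        self.nodes = [dict() for _ in range(num_nodes)]  # one dict per cache node
        self.backend_read = backend_read                 # callable: block_id -> bytes

    def _node_for(self, block_id):
        digest = blake2b(block_id.encode(), digest_size=4).digest()
        return self.nodes[int.from_bytes(digest, "big") % len(self.nodes)]

    def read(self, block_id):
        node = self._node_for(block_id)
        if block_id not in node:                 # miss: fetch from backend, cache it
            node[block_id] = self.backend_read(block_id)
        return node[block_id]

    def write(self, block_id, data):
        self._node_for(block_id)[block_id] = data  # write-back to backend not shown

# Usage: wrap a slow backend (simulated here) with the cache.
cache = InMemoryBlockCache(4, backend_read=lambda bid: b"\x00" * 4096)
block = cache.read("file.dat:0")
```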
High-level programming for heterogeneous and hierarchical parallel systems
High-Level Heterogeneous and Hierarchical Parallel Systems (HLPGPU) aims to bring together researchers and practitioners to present new results and ongoing work on those aspects of high-level programming relevant, or specific, to general-purpose computing on graphics processing units (GPGPUs) and new architectures. The 2016 HLPGPU symposium was an event co-located with the HiPEAC conference in Prague, Czech Republic. HLPGPU is targeted at high-level parallel techniques, including programming models, libraries and languages, algorithmic skeletons, refactoring tools and techniques for parallel patterns, tools and systems to aid parallel programming, heterogeneous computing, timing analysis, and statistical performance models. Postprint. Peer reviewed.
Proceedings of the Second International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2015) Krakow, Poland
Proceedings of: Second International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2015). Krakow (Poland), September 10-11, 2015
New directions in mobile, hybrid, and heterogeneous clouds for cyberinfrastructures
With the increasing availability of mobile devices and of data generated by end-users, scientific instruments, and simulations, solving many of our most important scientific and engineering problems requires innovative technical solutions. These solutions should provide the whole chain to process data and services from the mobile users to the cloud infrastructure, which must also integrate heterogeneous clouds to provide availability, scalability, and data privacy. This special issue presents the results of research works showing advances in mobile, hybrid, and heterogeneous clouds for modern cyberinfrastructures.
Exposing data locality in HPC-based systems by using the HDFS backend
This work was partially supported by the project “CABAHLA-CM: Convergencia Big data-Hpc: de los sensores a las Aplicaciones” (S2018/TCS4423) from the Madrid Regional Government; by the grant “New Data Intensive Computing Methods for High-End and Edge Computing Platforms (DECIDE)”, Ref. PID2019-107858GB-I00; and by the European Union’s Horizon 2020 research and innovation programme under grant agreement No 801091, project “ASPIDE: Exascale programming models for extreme data processing”.
Boosting analyses in the life sciences via clusters, grids and clouds
In the last 20 years, computational methods have become an important part of developing emerging technologies in the fields of bioinformatics and biomedicine. Those methods rely heavily on large-scale computational resources, as they need to manage terabytes or petabytes of data with large-scale structural and functional relationships, TFlops or PFlops of computing power for simulating highly complex models, or many-task processes and workflows for processing and analyzing data. This special issue contains papers showing existing solutions and the latest developments in the Life Sciences and Computing Sciences to collaboratively explore new ideas and approaches to successfully apply distributed IT systems in translational research, clinical intervention, and decision-making. (C) 2016 Published by Elsevier B.V.
Proceedings of the First International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2014): Porto, Portugal
Proceedings of: First International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2014). Porto (Portugal), August 27-28, 2014
- …