
    Remote sensing big data computing: challenges and opportunities

    As we have entered an era of high-resolution Earth observation, remote sensing (RS) data are undergoing explosive growth. This proliferation also increases the complexity of RS data, such as their diversity and higher dimensionality, and RS data are therefore regarded as RS "Big Data". Fortunately, we are witnessing a technological leapfrogging. In this paper, we give a brief overview of Big Data and data-intensive problems, including the analysis of RS Big Data, the challenges Big Data poses, and current techniques and work for processing RS Big Data.

    Award ER25750: Coordinated Infrastructure for Fault Tolerance Systems Indiana University Final Report

    The main purpose of the Coordinated Infrastructure for Fault Tolerance in Systems initiative has been to conduct research with the goal of providing end-to-end fault tolerance on a system-wide basis for applications and other system software. While fault tolerance has been an integral part of most high-performance computing (HPC) system software developed over the past decade, it has been treated mostly as a collection of isolated stovepipes. Visibility of and response to faults have typically been limited to the particular hardware and software subsystems in which they are initially observed. Little fault information is shared across subsystems, allowing little flexibility or control on a system-wide basis and making it practically impossible to provide cohesive end-to-end fault tolerance in support of scientific applications. As an example, consider faults such as communication link failures that can be seen by a network library but are not directly visible to the job scheduler, or faults related to node failures that can be detected by system monitoring software but are not inherently visible to the resource manager. If information about such faults could be shared by the network libraries or monitoring software, then other system software, such as a resource manager or job scheduler, could ensure that failed nodes or failed network links were excluded from further job allocations and that further diagnosis could be performed. As a founding member and one of the lead developers of the Open MPI project, we have focused our efforts over the course of this project on making Open MPI more robust to failures by supporting various fault tolerance techniques, and by using fault information exchange and coordination between MPI and the HPC system software stack: from the application, numeric libraries, and programming language runtime to other common system components such as job schedulers, resource managers, and monitoring tools.
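
    The cross-subsystem fault-sharing scenario described above can be illustrated with a small sketch. The following Python is a minimal, hypothetical event-bus example; the class and method names are invented for illustration and do not correspond to Open MPI or any CIFTS API. It shows a network library publishing a link-failure event that a job scheduler consumes to exclude the affected node from future allocations.

```python
from collections import defaultdict

class FaultEventBus:
    """Hypothetical system-wide bus that lets subsystems share fault events."""
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._subscribers[event_type].append(handler)

    def publish(self, event_type, **details):
        for handler in self._subscribers[event_type]:
            handler(details)

class JobScheduler:
    """Consumes fault events so failed nodes are excluded from new allocations."""
    def __init__(self, nodes):
        self.healthy_nodes = set(nodes)

    def on_link_failure(self, event):
        self.healthy_nodes.discard(event["node"])
        print(f"scheduler: excluding {event['node']}, "
              f"{len(self.healthy_nodes)} nodes remain")

bus = FaultEventBus()
scheduler = JobScheduler(["n0", "n1", "n2", "n3"])
bus.subscribe("link_failure", scheduler.on_link_failure)

# A network library detecting a dead link publishes the fault instead of
# keeping it in its own stovepipe; the scheduler reacts immediately.
bus.publish("link_failure", node="n2", reason="timeout")
```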

    A differentiated proposal of three dimension i/o performance characterization model focusing on storage environments

    The I/O bottleneck remains a central issue in high-performance environments. Cloud computing, high-performance computing (HPC), and big data environments share many underlying difficulties in delivering data at the rate requested by high-performance applications, which increases the possibility of bottlenecks in the application feeding process caused by hardware devices located in the storage system layer. In recent years, many researchers have proposed solutions to improve the I/O architecture using different approaches: some take advantage of hardware devices, while others focus on sophisticated software. However, due to the complexity of high-performance environments, creating solutions that improve I/O performance in both software and hardware is challenging and offers researchers many opportunities. Classifying these improvements along different dimensions allows researchers to understand how they have been built over the years and how the field progresses. It also allows future efforts to be directed to research topics that have developed at a lower rate, balancing the overall development process. This research presents a three-dimensional characterization model for classifying research on I/O performance improvements for large-scale storage computing facilities. The model can also be used as a guideline framework to summarize research, providing an overview of the current scenario. We used the proposed model to perform a systematic literature mapping covering ten years of research on I/O performance improvements in storage environments. This study classified hundreds of distinct works, identifying which hardware, software, and storage systems received the most attention over the years, which elements were proposed most often, and where these elements were evaluated. To justify the importance of this model and of solutions targeting I/O performance improvements, we evaluated a subset of these improvements using a real and complete experimentation environment, Grid5000. Analyses over different scenarios using a synthetic I/O benchmark demonstrate how throughput and latency behave when performing different I/O operations with distinct storage technologies and approaches.
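
    As a rough illustration of the kind of synthetic benchmark the abstract mentions, the sketch below times sequential writes of fixed-size blocks and reports throughput and per-operation latency. It is a minimal stand-in for tools such as IOR or fio, not the benchmark used in the study; the file name, block size, and block count are arbitrary.

```python
import os
import time

def write_benchmark(path, block_size=1 << 20, blocks=64):
    """Sequentially write `blocks` blocks of `block_size` bytes and
    report throughput (MiB/s) and mean per-write latency (ms)."""
    payload = os.urandom(block_size)
    latencies = []
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
    try:
        for _ in range(blocks):
            t0 = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)          # force the data down to the storage device
            latencies.append(time.perf_counter() - t0)
    finally:
        os.close(fd)
        os.unlink(path)
    total = sum(latencies)
    mib = block_size * blocks / (1 << 20)
    print(f"throughput:         {mib / total:8.1f} MiB/s")
    print(f"mean write latency: {1000 * total / blocks:8.2f} ms")

write_benchmark("bench.tmp")
```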

    Adaptive Parallelism for Coupled, Multithreaded Message-Passing Programs

    Hybrid parallel programming models that combine message passing (MP) and shared-memory multithreading (MT) are becoming more popular, especially with applications requiring higher degrees of parallelism and scalability. Consequently, coupled parallel programs, those built via the integration of independently developed and optimized software libraries linked into a single application, increasingly comprise message-passing libraries with differing preferred degrees of threading, resulting in thread-level heterogeneity. Retroactively matching threading levels between independently developed and maintained libraries is difficult, and the challenge is exacerbated because contemporary middleware services provide only static scheduling policies over entire program executions, necessitating suboptimal (over-subscribed or under-subscribed) configurations. In coupled applications, a poorly configured component can lead to overall poor application performance, suboptimal resource utilization, and increased time-to-solution, so it is critical that each library executes in a manner consistent with its design and tuning for a particular system architecture and workload. There is therefore a need for techniques that address dynamic, conflicting configurations in coupled multithreaded message-passing (MT-MP) programs. Our thesis is that we can achieve significant performance improvements over static under-subscribed approaches through reconfigurable execution environments that consider compute-phase parallelization strategies along with both hardware and software characteristics. In this work, we present new ways to structure, execute, and analyze coupled MT-MP programs. Our study begins with an examination of contemporary approaches used to accommodate thread-level heterogeneity in coupled MT-MP programs. Here we identify potential inefficiencies in how these programs are structured and executed in the high-performance computing domain. We then present and evaluate a novel approach for accommodating thread-level heterogeneity. Our approach enables full utilization of all available compute resources throughout an application's execution by providing programmable facilities with modest overheads to dynamically reconfigure runtime environments for compute phases with differing threading factors and affinities. Our performance results show that for a majority of the tested scientific workloads our approach and corresponding open-source reference implementation render speedups greater than 50% over the static under-subscribed baseline. Motivated by our examination of reconfigurable execution environments and their memory overhead, we also study the memory attribution problem: the inability to predict or evaluate during runtime where the available memory is used across the software stack comprising the application, reusable software libraries, and supporting runtime infrastructure. Specifically, dynamic adaptation requires runtime intervention, which by its nature introduces additional runtime and memory overhead. To better understand the latter, we propose and evaluate a new way to quantify component-level memory usage from unmodified binaries dynamically linked to a message-passing communication library. Our experimental results show that our approach and corresponding implementation accurately measure memory resource usage as a function of time, scale, communication workload, and software or hardware system architecture, clearly distinguishing between application and communication library usage at a per-process level.
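
    A minimal sketch of the per-phase reconfiguration idea follows; it is not the authors' implementation. Each compute phase declares its preferred threading factor, and the runtime tears down and rebuilds its thread pool between phases instead of keeping one static, under-subscribed configuration for the whole run. The phase names and thread counts are invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def run_phase(name, threads, work_items, fn):
    """Run one compute phase with its own preferred threading factor.
    Rebuilding the pool per phase mimics dynamic runtime reconfiguration."""
    print(f"phase {name}: threading factor {threads}")
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(fn, work_items))

# Two coupled "libraries" with conflicting preferred degrees of threading:
# a bandwidth-bound solver that wants few threads and a compute-bound
# analysis kernel that wants many. One static configuration would leave
# one of them over- or under-subscribed; reconfiguring between phases
# lets each run with the threading level it was tuned for.
solver_out = run_phase("solver", threads=4, work_items=range(8),
                       fn=lambda i: i * i)
analysis_out = run_phase("analysis", threads=16, work_items=solver_out,
                         fn=lambda v: v + 1)
print(analysis_out)
```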

    Hardware Support for Efficient Packet Processing

    Scalability is the key ingredient for further increasing the performance of today's supercomputers. As other approaches like frequency scaling reach their limits, parallelization is the only feasible way to further improve performance, and to parallelize such systems further, the time required for communication needs to be kept as small as possible. In the first part of this thesis, ways to reduce the latency incurred in packet-based interconnection networks are analyzed, and several new architectural solutions are proposed to address these issues. These solutions have been tested and proven in a field-programmable gate array (FPGA) environment. In addition, a hardware (HW) structure is presented that enables low-latency packet processing for financial markets. The second part, and the main contribution of this thesis, is the newly designed crossbar architecture. It introduces a novel way to integrate multicast capability into a crossbar design. Furthermore, an efficient implementation of adaptive routing to reduce congestion vulnerability in packet-based interconnection networks is shown. The low latency of the design is demonstrated through simulation, and its scalability is proven with synthesis results. The third part concentrates on the improvements and modifications made to EXTOLL, a high-performance interconnection network specifically designed for low-latency and high-throughput applications. Contributions include modules enabling an efficient integration of multiple host interfaces as well as the integration of the on-chip interconnect. Additionally, some of the existing functionality has been revised and improved to achieve better performance and lower latency. Micro-benchmark results are presented to underline the contribution of these modifications.
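
    To illustrate the adaptive-routing idea in software terms (a conceptual sketch only; the thesis implements this in hardware, and the queue model here is invented), the function below picks, among the output ports that make forward progress toward a packet's destination, the one with the shortest current queue, steering traffic away from congested links.

```python
def adaptive_route(candidate_ports, queue_depth):
    """Pick the admissible output port with the least congestion.
    `candidate_ports`: ports that make forward progress to the destination.
    `queue_depth`: current occupancy of each output queue, e.g. in flits."""
    return min(candidate_ports, key=lambda p: queue_depth[p])

# Example: ports 1 and 3 both reach the destination; port 1 is congested,
# so the adaptive router chooses port 3. A deterministic router would
# always pick the same port regardless of load.
queues = {0: 2, 1: 9, 2: 0, 3: 1}
print(adaptive_route([1, 3], queues))  # -> 3
```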

    A Hardware Verification Methodology for an Interconnection Network with fast Process Synchronization

    Shrinking process node sizes allow the integration of more and more functionality into a single chip design. At the same time, the mask costs to manufacture a new chip increase steadily. The industry can absorb this cost increase by selling more chips. Moreover, innovative new chip designs carry a higher risk, so the industry changes only small parts of a chip design between generations to minimize that risk. New, innovative chip designs can therefore only be realized by research institutes, which do not face the cost restrictions and market pressure that the industry does. One such innovative research project is EXTOLL, developed by the Computer Architecture Group of the University of Heidelberg. It is a new interconnection network for High Performance Computing that targets the problems of commercially available interconnection networks. EXTOLL is optimized for high bandwidth, low latency, and a high message rate. Low latency and a high message rate in particular are becoming more important for modern interconnection networks: as networks grow, the same computational problem is distributed across more nodes, which leads to lower data granularity and more, smaller messages that the interconnection network has to transport. This thesis addresses the problem of smaller messages in the interconnection network. It develops a new network protocol optimized for small messages, reducing the protocol overhead required to send them. Furthermore, growing network sizes introduce a reliability problem, which the developed protocol also addresses. The smaller data granularity likewise increases the need for efficient barrier synchronization. Such a hardware barrier synchronization is developed in this thesis, using a new approach that integrates the barrier functionality into the interconnection network. The mask costs of manufacturing an ASIC make it difficult for a research institute to build one: a research institute cannot afford a re-spin, so there is pressure to get it right the first time. One approach to avoiding a re-spin is functional verification prior to submission. A complete and comprehensive verification methodology is developed for the EXTOLL interconnection network. Thanks to the structured approach, the functional verification can be realized with limited resources in a short time frame. Additionally, the developed verification methodology supports different target technologies for the design with very little overhead.
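
    The in-network barrier idea can be sketched in software as a combining tree: each switch counts arrivals from the processes or switches below it and forwards a single notification upward, so synchronization cost grows with tree depth rather than node count. This is a conceptual model only, not the thesis's hardware design; the tree shape and counts below are invented.

```python
class BarrierSwitch:
    """One node of a combining tree modelling an in-network barrier.
    A switch forwards a single 'arrived' message upward only after all
    of its expected arrivals have been combined."""
    def __init__(self, expected, parent=None):
        self.expected = expected  # arrivals to combine before forwarding
        self.arrived = 0
        self.parent = parent

    def arrive(self):
        self.arrived += 1
        if self.arrived == self.expected:
            if self.parent is not None:
                self.parent.arrive()  # combine into one upward message
            else:
                print("root: all arrived, release broadcast downward")

root = BarrierSwitch(expected=2)
left = BarrierSwitch(expected=2, parent=root)
right = BarrierSwitch(expected=2, parent=root)

# Four processes arrive, two under each leaf switch; the root sees only
# two combined messages instead of four.
for switch in (left, right, left, right):
    switch.arrive()
```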

    Coordinated Fault-Tolerance for High-Performance Computing Final Project Report


    Exploiting Performance Counters to Predict and Improve Energy Performance of HPC Systems

    Hardware monitoring through performance counters is available on almost all modern processors. Although these counters were originally designed for performance tuning, they have also been used to evaluate power consumption. We propose two approaches for modelling and understanding the behaviour of high-performance computing (HPC) systems relying on hardware monitoring counters. We evaluate the effectiveness of our system modelling approach with two target objectives: optimising the energy usage of HPC systems and predicting the energy consumption of HPC applications. Although hardware monitoring counters are used for modelling the system, other methods, including partial phase recognition and cross-platform energy prediction, are used for energy optimisation and prediction. Experimental results for energy prediction demonstrate that we can accurately predict the peak energy consumption of an application on a target platform, whereas results for energy optimisation indicate that, with no a priori knowledge of the workloads sharing the platform, we can save up to 24% of the overall HPC system's energy consumption under benchmarks and real-life workloads.
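
    A common way to turn counters into a power estimate, and one plausible reading of the modelling step described here (the abstract does not specify the exact model, and the counters, weights, and data below are invented), is a linear model fitted by least squares over sampled counter rates.

```python
import numpy as np

# Each row: performance-counter rates sampled over one interval, e.g.
# [instructions/s, last-level-cache misses/s, memory accesses/s] (scaled).
# Hypothetical training data; a real model would sample counters while
# measuring node power with an external meter.
X = np.array([[1.0, 0.2, 0.10],
              [2.0, 0.3, 0.40],
              [3.5, 0.9, 0.70],
              [0.5, 0.1, 0.05]])
watts = np.array([60.0, 85.0, 130.0, 45.0])

# Fit power ~ w0 + w . counter_rates by ordinary least squares.
A = np.hstack([np.ones((len(X), 1)), X])
coef, *_ = np.linalg.lstsq(A, watts, rcond=None)

def estimate_power(counter_rates):
    """Predict instantaneous power draw from one counter-rate sample."""
    return coef[0] + coef[1:] @ np.asarray(counter_rates)

print(f"estimated power: {estimate_power([2.5, 0.5, 0.5]):.1f} W")
```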