
    Editorial for FGCS Special issue on “Time-critical Applications on Software-defined Infrastructures”

    Performance requirements in many applications can often be modelled as constraints related to time, for example, the time span of data processing for disaster early warning [1], latency in live event broadcasting [2], and jitter during audio/video conferences [3]. These time constraints are typically treated either in an “as fast as possible” manner, as for latency-sensitive high-performance computing or communication tasks, or in a “timeliness” manner, where tasks have to finish within a given window, as in real-time systems, following the classification in [4]. To meet the required time constraints, one has to carefully analyse them, engineer and integrate the system components, and optimise the scheduling of computing and communication tasks. The development of a time-critical application is thus time-consuming and costly. Over the past decades, the infrastructure technologies for computing, storage and networking have made tremendous progress. Besides the capacity and performance of physical devices, virtualisation technologies offer effective resource management and isolation at different levels, such as Java Virtual Machines at the application level, Docker containers at the operating-system level, and Virtual Machines at the whole-system level. Moreover, network embedding [5] and software-defined networking [6] provide network-level virtualisation and control that enable a new infrastructure paradigm in which resources can be virtualised, isolated, and dynamically customised based on application needs. Software-defined infrastructures, including Cloud, Fog, Edge, software-defined networking and network function virtualisation, are emerging as new environments for distributed applications with time-critical requirements, but they also pose challenges in effectively utilising the advanced infrastructure features during system engineering and dynamic control. This special issue on “time-critical applications and software-defined infrastructures” focuses on practical aspects of the design, development, customisation and performance-oriented operation of such applications for Clouds and other distributed environments.
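    As a minimal illustration of the two classes of time constraints described above (the function names, thresholds and example values below are hypothetical, not taken from the cited works), a “timeliness” requirement can be checked against a fixed window, while an “as fast as possible” requirement is better treated as a soft latency target used to drive re-provisioning:

    ```python
    # Illustrative sketch only (hypothetical names and thresholds): the two classes
    # of time constraints discussed in the editorial, expressed as simple checks.

    def meets_deadline(elapsed_s: float, window_s: float) -> bool:
        """'Timeliness' constraint: the task must complete within a fixed window."""
        return elapsed_s <= window_s

    def latency_regressed(elapsed_s: float, target_s: float, tolerance: float = 0.1) -> bool:
        """'As fast as possible' constraint: flag when latency drifts above a soft
        target by more than a relative tolerance, so the software-defined
        infrastructure can react (e.g. scale out or re-place a VM/container)."""
        return elapsed_s > target_s * (1.0 + tolerance)

    # Example: a disaster early-warning pipeline with a 2-second processing window.
    print(meets_deadline(1.4, window_s=2.0))        # True: within the window
    print(latency_regressed(0.9, target_s=0.5))     # True: latency has drifted
    ```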

    HPC Platform for Railway Safety-Critical Functionalities Based on Artificial Intelligence

    The automation of railroad operations is a rapidly growing industry. In 2023, a new European standard for automated Grade of Automation (GoA) 2 driving over the European Train Control System (ETCS) is anticipated. Meanwhile, railway stakeholders are already planning their research initiatives for driverless and unattended autonomous driving systems. As a result, the industry is particularly active in research on perception technologies based on Computer Vision (CV) and Artificial Intelligence (AI), with outstanding results at the application level. However, executing high-performance, safety-critical applications on embedded systems and in real time is a challenge. There are not many commercially available solutions, since High-Performance Computing (HPC) platforms are typically seen as being beyond the scope of safety-critical systems. This work proposes a novel safety-critical, high-performance computing platform for executing CV- and AI-enhanced technology used in the automatic accurate stopping and safe passenger transfer railway functionalities. By design, the platform enables process separation, redundant execution, and transparent hardware acceleration, which makes it compatible with the majority of widely used AI inference methodologies, model architectures, and model formats. The proposed technology increases the portability of railway applications to embedded systems, isolates crucial operations, and manages system resources effectively and securely. The approach presented in this work is being developed as a specific railway use case for autonomous train operation within the SELENE European research project, which has received funding from the Research and Innovation Action (RIA) under grant agreement No. 871467.
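    As a rough sketch of the process-separation and redundant-execution ideas mentioned above (the toy model, tolerance and comparison rule are assumptions for illustration, not the SELENE platform design), the same inference can be run in two isolated processes and the result accepted only when the replicas agree:

    ```python
    # Hedged sketch of redundant inference execution in separate processes
    # (hypothetical toy model and tolerance; not the actual SELENE platform).
    from multiprocessing import Pool

    def run_inference(frame):
        # Placeholder for an accelerated CV/AI model (ONNX/TensorRT or similar in practice).
        return sum(frame) / len(frame)          # toy "model": mean pixel value

    def redundant_inference(frame, tolerance=1e-6):
        with Pool(processes=2) as pool:                       # process separation
            a, b = pool.map(run_inference, [frame, frame])    # redundant execution
        if abs(a - b) > tolerance:
            raise RuntimeError("replica disagreement: fall back to a safe state")
        return a

    if __name__ == "__main__":
        print(redundant_inference([0.1, 0.2, 0.3]))           # ~0.2
    ```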

    Performance Analysis of Noise Subspace-based Narrowband Direction-of-Arrival (DOA) Estimation Algorithms on CPU and GPU

    High-performance computing of array signal processing problems is a critical task, as real-time system performance is required in many applications. Noise subspace-based Direction-of-Arrival (DOA) estimation algorithms are popular in the literature since they provide higher angular resolution and greater robustness. In this study, we investigate various optimization strategies for high-performance DOA estimation on the GPU and comparatively analyze alternative implementations (MATLAB, C/C++ and CUDA). Experiments show that a speedup of up to 3.1x can be achieved on the GPU compared to the baseline multi-threaded CPU implementation. The source code is publicly available at the following link: https://github.com/erayhamza/NssDOACud
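    For orientation, the core of a noise subspace-based estimator such as MUSIC consists of a covariance estimate, an eigendecomposition, and a spectrum search over candidate angles. The NumPy sketch below is a generic textbook version (a uniform linear array with half-wavelength spacing is assumed); it is not the authors' optimized C/C++ or CUDA code from the linked repository:

    ```python
    # Generic MUSIC pseudo-spectrum sketch in NumPy (assumed uniform linear array,
    # half-wavelength spacing); not the paper's optimized implementation.
    import numpy as np

    def music_spectrum(X, num_sources, angles_deg):
        """X: (num_sensors, num_snapshots) complex-valued array data."""
        M = X.shape[0]
        R = X @ X.conj().T / X.shape[1]              # sample covariance matrix
        eigvals, eigvecs = np.linalg.eigh(R)         # eigenvalues in ascending order
        En = eigvecs[:, : M - num_sources]           # noise subspace
        spectrum = []
        for theta in np.deg2rad(angles_deg):
            a = np.exp(1j * np.pi * np.arange(M) * np.sin(theta))  # steering vector
            denom = np.linalg.norm(En.conj().T @ a) ** 2
            spectrum.append(1.0 / denom)             # peaks indicate likely DOAs
        return np.array(spectrum)

    # Example: scan -90..90 degrees with 8 sensors and 2 assumed sources.
    X = (np.random.randn(8, 200) + 1j * np.random.randn(8, 200)) / np.sqrt(2)
    P = music_spectrum(X, num_sources=2, angles_deg=np.arange(-90, 91))
    ```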

    Early Performance Prediction Methodology for Many-Cores on Chip Based Applications

    Modern high-performance computing applications such as personal computing, gaming, and numerical simulations require application-specific integrated circuits (ASICs) that comprise many cores. Performance for these applications depends mainly on the latency of the interconnects that transfer data between the cores among which the application's tasks are distributed. Time-to-market is a critical consideration when designing ASICs for these applications. Therefore, to reduce design cycle time, it is essential to predict system performance accurately at an early stage of design. With process technology in the nanometer era, physical phenomena such as crosstalk and reflection on the propagating signal have a direct impact on performance, and incorporating these effects provides a better early-stage performance estimate. This work presents a methodology for better performance prediction at an early stage of design, achieved by mapping a system specification to a circuit-level netlist description. At the system level, SystemVerilog descriptions are employed to simplify the description and enable efficient simulation, and queueing-theory-based bounded queue models are applied to model performance at this abstraction. At the circuit level, behavioral Input/Output Buffer Information Specification (IBIS) models can be used to analyze the effects of these physical phenomena on on-chip signal integrity and hence on performance. For behavioral circuit-level performance simulation with IBIS models, a netlist consisting of interacting cores and a communication link must be described. Two new netlists, IBIS-ISS and IBIS-AMI-ISS, are introduced for this purpose. The cores are represented by a macromodel automatically generated from IBIS models by a developed tool, and the generated macromodels are employed in the new netlists. The early performance prediction methodology maps a system specification to an instance of these netlists to provide a better performance estimate at an early stage of design. The methodology is scalable to nanometer process technologies and can be reused across different designs.
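    As a minimal illustration of the queueing-theory-based bounded queue models mentioned above (generic M/M/1/K formulas with hypothetical parameters; the actual methodology maps system specifications to SystemVerilog and IBIS-based netlists rather than using this closed form directly):

    ```python
    # Illustrative bounded-queue (M/M/1/K) latency estimate for an on-chip link
    # (hypothetical parameters; not the thesis's netlist-based flow).

    def mm1k_latency(arrival_rate, service_rate, capacity):
        rho = arrival_rate / service_rate
        if abs(rho - 1.0) < 1e-12:
            raise ValueError("rho == 1 requires the special-case formulas")
        p_block = (rho ** capacity) * (1 - rho) / (1 - rho ** (capacity + 1))
        mean_in_system = (rho / (1 - rho)
                          - (capacity + 1) * rho ** (capacity + 1) / (1 - rho ** (capacity + 1)))
        effective_arrivals = arrival_rate * (1 - p_block)
        mean_latency = mean_in_system / effective_arrivals   # Little's law
        return mean_latency, p_block

    # Example: 0.8 flits/cycle offered load, 1 flit/cycle service, 16-entry buffer.
    latency, blocking = mm1k_latency(0.8, 1.0, 16)
    print(f"mean latency ~ {latency:.2f} cycles, blocking prob ~ {blocking:.4f}")
    ```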

    The Road towards Predictable Automotive High-Performance Platforms

    Due to the trends of centralizing the E/E architecture and new computing-intensive applications, high-performance hardware platforms are currently finding their way into automotive systems. However, the SoCs currently available on the market have significant weaknesses when it comes to providing predictable performance for time-critical applications. The main reason for this is that these platforms are optimized for average-case performance. This shortcoming represents one major risk in the development of current and future automotive systems. In this paper we describe how high performance and predictability could (and should) be reconciled in future HW/SW platforms. We believe that this goal can only be reached through close collaboration between system suppliers, IP providers, semiconductor companies, and OS/hypervisor vendors. Furthermore, academic input will be needed to solve the remaining challenges and to further improve initial solutions.

    Data Replication-Based Scheduling in Cloud Computing Environment

    High-performance computing and vast storage are two key factors required for executing data-intensive applications. In comparison with traditional distributed systems such as data grids, cloud computing provides these factors in a more affordable, scalable and elastic platform. Furthermore, access to data files is critical for such applications: data access can become a bottleneck for the whole cloud workflow system and dramatically decrease its performance. Job scheduling and data replication are two important techniques that can enhance the performance of data-intensive applications, and it is sensible to integrate them into one framework towards a single objective. In this paper, we integrate data replication and job scheduling with the aim of reducing response time by reducing data access time in a cloud computing environment; we call this approach data replication-based scheduling (DRBS). Simulation results show the effectiveness of our algorithm in comparison with well-known algorithms such as random and round-robin.
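    A minimal sketch of the idea of combining replication and scheduling (the data structures, threshold and load metric below are hypothetical, not the published DRBS algorithm): place a job on a lightly loaded node that already holds a replica of its input when possible, and create a new replica for files that are repeatedly accessed remotely:

    ```python
    # Hedged sketch of replication-aware job scheduling (hypothetical structures
    # and threshold; not the exact DRBS algorithm from the paper).
    from collections import defaultdict

    replica_map = {"f1": {"node1", "node3"}, "f2": {"node2"}}   # file -> nodes holding a replica
    node_load = defaultdict(int)                                # node -> queued jobs
    remote_access = defaultdict(int)                            # file -> remote reads so far

    def schedule(job_id, input_file, nodes, overload=5):
        holders = replica_map.get(input_file, set())
        # Prefer a node that already holds the input, unless all holders are overloaded.
        local = [n for n in holders if node_load[n] < overload]
        if local:
            best = min(local, key=lambda n: node_load[n])
        else:
            best = min(nodes, key=lambda n: node_load[n])
            remote_access[input_file] += 1                      # data must be moved
            if remote_access[input_file] > 3:                   # hypothetical threshold
                replica_map.setdefault(input_file, set()).add(best)   # create new replica
        node_load[best] += 1
        return best

    print(schedule("j1", "f2", ["node1", "node2", "node3"]))    # -> node2 (local replica)
    ```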

    A Performance/Cost Evaluation for a GPU-Based Drug Discovery Application on Volunteer Computing

    Bioinformatics is an interdisciplinary research field that develops tools for the analysis of large biological databases, and thus the use of high-performance computing (HPC) platforms is mandatory for the generation of useful biological knowledge. The latest generation of graphics processing units (GPUs) has democratized the use of HPC, as they push desktop computers to cluster-level performance. Many applications within this field have been developed to leverage these powerful and low-cost architectures. However, these applications still need to scale to larger GPU-based systems to enable remarkable advances in healthcare, drug discovery, genome research, etc. The inclusion of GPUs in HPC systems exacerbates power and temperature issues, increasing the total cost of ownership (TCO). This paper explores the benefits of volunteer computing for scaling bioinformatics applications as an alternative to owning large GPU-based local infrastructures. As a benchmark, we use a GPU-based drug discovery application called BINDSURF whose computational requirements go beyond a single desktop machine. Volunteer computing is presented as a cheap and valid HPC option for those bioinformatics applications that need to process huge amounts of data and for which response time is not a critical factor.

    A User Friendly Phase Detection Methodology for HPC Systems' Analysis

    A wide array of today's high-performance computing (HPC) applications exhibit recurring behaviours, or execution phases, throughout their run time. Accurate detection of program phases allows the system to be reconfigured for a better power/performance trade-off, and can reduce program simulation time by identifying regions of code whose performance is critical to the entire program. Program phases are also reflected in the different behaviours the system goes through, or system phases, which can be used as an alternative means of program phase detection for users lacking expertise. In this paper, we present execution vector based (EV-based) phase detection, an on-line methodology for detecting phases in the behaviour of an HPC system and determining execution points that correspond to these phases. We also present a methodology for defining a small set of EVs representative of the system's behaviour over a fixed period of time, and show that EV-based phase detection identifies recurring phases. Our methodology is illustrated with benchmarks and a real-life application.
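    The sketch below illustrates the general flavour of vector-based phase detection (the counters, normalisation and threshold are assumptions for illustration; the paper's execution vector definition and clustering may differ): sample a vector of system metrics per interval and report a new phase when consecutive vectors diverge:

    ```python
    # Hedged sketch of execution-vector-style phase detection (hypothetical
    # counters and similarity threshold; not the paper's exact EV methodology).
    import numpy as np

    def detect_phases(samples, threshold=0.3):
        """samples: (num_intervals, num_counters) array of per-interval counter values."""
        # Normalize each interval's counters so vectors are comparable across load levels.
        vectors = samples / np.maximum(samples.sum(axis=1, keepdims=True), 1e-12)
        phases, current = [0], 0
        for prev, curr in zip(vectors[:-1], vectors[1:]):
            if np.abs(curr - prev).sum() > threshold:   # Manhattan distance between vectors
                current += 1                            # a new phase starts here
            phases.append(current)
        return phases

    # Example: 3 compute-heavy intervals followed by 3 memory-heavy intervals.
    samples = np.array([[9, 1], [9, 1], [8, 2], [2, 8], [1, 9], [1, 9]], float)
    print(detect_phases(samples))   # [0, 0, 0, 1, 1, 1]
    ```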

    Energy-Efficient Data-Intensive Computing with MapReduce

    Power and energy consumption are critical constraints in data center design and operation. In data centers, MapReduce data-intensive applications demand significant resources and energy. Recognizing the importance and urgency of optimizing the energy usage of MapReduce applications, this work aims to provide instrumental tools to measure and evaluate MapReduce energy efficiency, together with techniques to conserve energy without impacting performance. Energy conservation for data-intensive computing requires enabling technology that provides detailed, systemic energy information and identifies energy bottlenecks in the underlying system hardware and software. To address this need, we present eTune, a fine-grained, scalable energy profiling framework for data-intensive computing on large-scale distributed systems. eTune leverages performance monitoring counters (PMCs) on modern computer components and statistically builds power-performance correlation models. Using the learned models, eTune augments direct measurement with a software-based power estimator that runs on compute nodes and reports power at multiple levels, including node, core, memory, and disk, with high accuracy. Data-intensive computing differs from traditional high-performance computing in that most execution time is spent moving data between storage devices, nodes, and components. Since data movements are potential performance and energy bottlenecks, we propose an analysis framework with methods and metrics for evaluating and characterizing the costly built-in MapReduce data movements. The revealed energy characteristics of data movement can be exploited in system design and resource allocation to improve the energy efficiency of data-intensive computing. Finally, we present an optimization technique that targets inefficient built-in MapReduce data movements to conserve energy without impacting performance. The technique allocates the optimal number of compute nodes to an application and dynamically schedules processor frequency during its execution based on data movement characteristics. Experimental results show significant energy savings, though the improvements depend on both workload characteristics and the resource allocation and dynamic voltage and frequency scaling policies. As data volume doubles every two years and more data centers are put into production, energy consumption is expected to grow further. We expect these studies to provide direction and insight for building more energy-efficient data-intensive systems and applications, and the tools and techniques to be adopted by other researchers in their energy efficiency studies.
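    As a minimal sketch of the PMC-based power-performance correlation modelling described above (synthetic counter rates and coefficients, clearly not eTune's real training data or counter set): fit a linear model from counter rates to measured node power, then reuse it as a software-based power estimator:

    ```python
    # Hedged sketch of a PMC-based power model in the spirit of eTune (synthetic
    # training data and hypothetical counters; not the actual framework).
    import numpy as np

    rng = np.random.default_rng(0)

    # Training phase: per-interval counter rates (instructions/s, LLC misses/s,
    # disk bytes/s) alongside directly measured node power in watts.
    counters = rng.uniform(size=(200, 3)) * [2e9, 5e6, 1e8]
    true_weights = np.array([3e-8, 4e-6, 1e-7])                  # synthetic ground truth
    measured_power = 55.0 + counters @ true_weights + rng.normal(0, 0.5, 200)

    # Least-squares fit of: power = idle + w . counter_rates
    A = np.hstack([np.ones((200, 1)), counters])
    coeffs, *_ = np.linalg.lstsq(A, measured_power, rcond=None)

    def estimate_power(counter_rates):
        """Software-based power estimator for nodes without direct power meters."""
        return coeffs[0] + counter_rates @ coeffs[1:]

    print(estimate_power(np.array([1.5e9, 2e6, 5e7])))   # estimated watts for this interval
    ```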