120 research outputs found

    Two-Layer Error Control Codes Combining Rectangular and Hamming Product Codes for Cache Error

    We propose a novel two-layer error control code that efficiently combines the error detection capability of rectangular codes with the error correction capability of Hamming product codes, in order to increase cache error resilience for many-core systems while maintaining low power, area and latency overhead. Because rectangular codes have low latency and overhead while Hamming product codes have high error control capability, the two-layer error control codes employ simple rectangular codes for each cache line to detect cache errors, and load the extra Hamming product code check bits only when an error is detected, thus enabling reliable large-scale cache operations. Analysis and experiments are conducted to evaluate the cache fault-tolerant capability of various existing solutions and the proposed approach. The results show that the proposed approach can significantly increase Mean-Error-To-Failure (METF) and Mean-Time-To-Failure (MTTF) by up to 2.8×, reduce storage overhead by over 57%, and increase instructions per cycle (IPC) by up to 7%, compared to complex four-way 4EC5ED; and it increases METF and MTTF by up to 133×, reduces storage overhead by over 11%, and achieves a similar IPC compared to simple eight-way single-error correcting double-error detecting (SECDED). The cost of the proposed approach is no more than 4% external memory access overhead.
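
The detect-first, correct-on-demand flow described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' design: the rectangular code is modelled as plain row and column parity over a small bit block, and `load_strong_checks` is a hypothetical callback standing in for fetching the extra Hamming product code check bits, which are not implemented here.

```python
from typing import Callable, List, Tuple

Bits = List[List[int]]
Parity = Tuple[List[int], List[int]]


def rectangular_parity(block: Bits) -> Parity:
    """Return (row parity, column parity) for a 2-D block of bits."""
    row_parity = [sum(row) % 2 for row in block]
    col_parity = [sum(col) % 2 for col in zip(*block)]
    return row_parity, col_parity


def detect_error(block: Bits, stored: Parity) -> bool:
    """True if the recomputed parity disagrees with the stored parity."""
    return rectangular_parity(block) != stored


def read_line(block: Bits, stored: Parity,
              load_strong_checks: Callable[[], None]) -> str:
    """First layer: cheap detection on every read.

    The second layer (assumed, not implemented) is consulted only when
    the first layer reports a mismatch.
    """
    if not detect_error(block, stored):
        return "clean"
    load_strong_checks()  # hypothetical: fetch the stronger check bits
    return "needs_correction"
```

In the common error-free case only the cheap parity comparison runs, which is the property the abstract relies on to keep latency and overhead low.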

    A Method for Describing the Topological Structure of Computing Clusters Based on Subgraph Product Operations

    The topological structure of communication networks in supercomputers becomes more complicated as installations grow in size and complexity. Many methods exist for describing it, but such descriptions are cumbersome, which makes them difficult to manipulate. The article proposes an approach to describing the communication environment of a supercomputer in which the communication network is described as a construction kit whose elements are typical topological structures often found in various computing systems. For this purpose, a language for describing the topological structure has been developed, based on the operation of subgraph products. The language is conceptually similar in its principles to the NetML and OMNeT++ languages. Special attention is paid to exceptions in the regularity of networks of real supercomputers; special constructions have been introduced into the language to make such exceptions describable. To support work with the language, a library has been developed in the C programming language, along with a Python3 wrapper over it that can be used to visualize the graphs the language describes. The expressive power of the language has been demonstrated by describing the computing clusters Tianhe-2A, AI Bridging Cloud Infrastructure and Lomonosov-2. The method has been tested and compared with GraphViz DOT, showing a multiple-fold reduction in the volume of the record required to store the topology for some of the major Top500 systems.
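
To make the idea of composing a network from typical building blocks concrete, here is a small sketch using the Cartesian product of graphs, one standard graph product. The abstract does not say which product operations the language uses or what its syntax looks like, so the function names `ring` and `cartesian_product` are illustrative assumptions and not part of the article's library.

```python
from itertools import product


def ring(n):
    """Edge list of an n-node ring on nodes 0..n-1 (assumes n >= 3)."""
    return [(i, (i + 1) % n) for i in range(n)]


def cartesian_product(nodes_a, edges_a, nodes_b, edges_b):
    """Cartesian product of two undirected graphs.

    Nodes are pairs (a, b). Two pairs are adjacent when they agree in one
    coordinate and are adjacent in the other.
    """
    nodes = set(product(nodes_a, nodes_b))
    edges = set()
    for u, v in edges_a:  # one copy of every A-edge per node of B
        for b in nodes_b:
            edges.add(frozenset(((u, b), (v, b))))
    for u, v in edges_b:  # one copy of every B-edge per node of A
        for a in nodes_a:
            edges.add(frozenset(((a, u), (a, v))))
    return nodes, edges


# Two short ring descriptions expand into a 4 x 3 two-dimensional torus.
torus_nodes, torus_edges = cartesian_product(range(4), ring(4), range(3), ring(3))
```

The compact description (two rings and one product) grows into a 12-node, 24-edge torus, which hints at why a product-based notation can be much shorter than an explicit edge list.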

    Configurable Version Management Hardware Transactional Memory for Multi-processor Platform

    Programming shared-memory multi-processor platforms efficiently is difficult because lock-based synchronization limits efficiency. Transactional memory (TM) is a promising approach to creating an abstraction layer for multi-threaded programming. However, the performance of TM is application-specific. In general, the configuration of a TM is divided into version management and conflict management. Each scheme has its strengths and weaknesses depending on the executing application. Previous TM implementations for embedded systems were built on a fixed version management configuration, which results in significant performance loss when transaction behaviour changes. In this paper, we propose a hardware transactional memory (HTM) with interchangeable version management. Random requests at different contention levels are used to verify the performance of the proposed TM. The proposed architecture is targeted at embedded applications and is area-efficient compared to current implementations that apply cache coherence protocols.
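
The two common version management schemes, eager (write in place and keep an undo log) and lazy (buffer writes until commit), can be contrasted in a short software model. This is only an illustration of the general concept: the abstract does not describe the proposed hardware, and the `Transaction` class and its `eager` flag are hypothetical names. Conflict management is left out entirely.

```python
class Transaction:
    """Toy single-threaded model of switchable version management."""

    def __init__(self, memory, eager):
        self.memory = memory      # dict: address -> value, pre-initialised
        self.eager = eager        # True: eager versioning, False: lazy
        self.undo_log = {}        # eager: old values to restore on abort
        self.write_buffer = {}    # lazy: new values applied on commit

    def write(self, addr, value):
        if self.eager:
            # Remember the first old value, then update memory in place.
            self.undo_log.setdefault(addr, self.memory[addr])
            self.memory[addr] = value
        else:
            self.write_buffer[addr] = value

    def read(self, addr):
        if not self.eager and addr in self.write_buffer:
            return self.write_buffer[addr]
        return self.memory[addr]

    def commit(self):
        if not self.eager:
            self.memory.update(self.write_buffer)
        self.undo_log.clear()
        self.write_buffer.clear()

    def abort(self):
        if self.eager:
            self.memory.update(self.undo_log)  # roll back in-place writes
        self.undo_log.clear()
        self.write_buffer.clear()
```

Eager versioning makes commits cheap and aborts expensive, while lazy versioning does the opposite, which is one reason a fixed choice can suit some workloads poorly.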

    On the use of many-core Marvell ThunderX2 processor for HPC workloads

    Marvell’s ThunderX2 has been the first Arm-based processor with deployments in large-scale HPC production systems, challenging the dominance that x86 processors have had in the last decades. While x86 processors and their software stack have been characterized in detail, the behavior of their Arm counterparts is not well known, limiting their adoption. This work methodically characterizes the performance and power efficiency of the ThunderX2 running different HPC workloads compiled with two state-of-the-art compilers, GCC and Arm HPC Compiler. We study the maturity of the available compilers and find that the Arm HPC Compiler is able to apply additional optimizations, resulting in better performance than GCC. In addition, we compare both performance and power with respect to an Intel Skylake processor. Despite the faster single-thread performance of Skylake, ThunderX2 is able to match its performance on multi-threaded workloads due to its superior memory bandwidth. However, the power efficiency of ThunderX2 is far from matching Skylake-based processors when AVX512 extensions are used.

    A Survey of Green Networking Research

    Reduction of unnecessary energy consumption is becoming a major concern in wired networking, because of the potential economic benefits and of its expected environmental impact. These issues, usually referred to as "green networking", relate to embedding energy-awareness in the design, in the devices and in the protocols of networks. In this work, we first formulate a more precise definition of the "green" attribute. We furthermore identify a few paradigms that are the key enablers of energy-aware networking research. We then overview the current state of the art and provide a taxonomy of the relevant work, with a special focus on wired networking. At a high level, we identify four branches of green networking research that stem from different observations on the root causes of energy waste, namely (i) Adaptive Link Rate, (ii) Interface proxying, (iii) Energy-aware infrastructures and (iv) Energy-aware applications. In this work, we not only explore specific proposals pertaining to each of the above branches, but also offer a perspective for research. Comment: Index Terms: Green Networking; Wired Networks; Adaptive Link Rate; Interface Proxying; Energy-aware Infrastructures; Energy-aware Applications. 18 pages, 6 figures, 2 tables

    Multi-time-scale features for accurate respiratory sound classification

    The COVID-19 pandemic has amplified the urgency of developments in computer-assisted medicine and, in particular, the need for automated tools supporting the clinical diagnosis and assessment of respiratory symptoms. This need was already clear to the scientific community, which launched an international challenge in 2017 at the International Conference on Biomedical Health Informatics (ICBHI) for the implementation of accurate algorithms for the classification of respiratory sounds. In this work, we present a framework for respiratory sound classification based on two different kinds of features: (i) short-term features, which summarize sound properties on a time scale of tenths of a second, and (ii) long-term features, which assess sound properties on a time scale of seconds. Using the publicly available dataset provided by ICBHI, we cross-validated the classification performance of a neural network model over 6895 respiratory cycles and 126 subjects. The proposed model reached an accuracy of 85% ± 3% and a precision of 80% ± 8%, which compare well with the body of literature. The robustness of the predictions was assessed by comparison with state-of-the-art machine learning tools, such as the support vector machine, Random Forest and deep neural networks. The model presented here is therefore suitable for large-scale applications and for adoption in clinical practice. Finally, an interesting observation is that both short-term and long-term features are necessary for accurate classification, which could be the subject of future studies related to their clinical interpretation.
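
As a rough illustration of what describing one recording at two time scales can look like, the sketch below summarizes a signal over windows of a tenth of a second and over windows of one second. The abstract does not list the actual features, so RMS energy is used purely as a stand-in, and the function names and window lengths are assumptions.

```python
import math


def frame_rms(signal, sample_rate, window_s):
    """RMS energy of consecutive non-overlapping windows of window_s seconds."""
    size = max(1, int(sample_rate * window_s))
    frames = [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]
    return [math.sqrt(sum(x * x for x in frame) / len(frame)) for frame in frames]


def multi_scale_features(signal, sample_rate, short_s=0.1, long_s=1.0):
    """Describe the same signal at a short and a long time scale."""
    return {
        "short_term": frame_rms(signal, sample_rate, short_s),
        "long_term": frame_rms(signal, sample_rate, long_s),
    }
```

A classifier would then receive both feature sets, which mirrors the abstract's observation that neither scale alone is sufficient.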

    Understanding Concurrency Vulnerabilities in Linux Kernel

    While there is a large body of work on analyzing concurrency-related software bugs and developing techniques for detecting and patching them, little attention has been given to concurrency-related security vulnerabilities. The two are different in that not all bugs are vulnerabilities: for a bug to be exploitable, there needs to be a way for attackers to trigger its execution and cause damage, e.g., by revealing sensitive data or running malicious code. To fill the gap, we conduct the first empirical study of concurrency vulnerabilities reported in the Linux operating system in the past ten years. We focus on analyzing the confirmed vulnerabilities archived in the Common Vulnerabilities and Exposures (CVE) database, which are then categorized into different groups based on bug types, exploit patterns, and patch strategies adopted by developers. We use code snippets to illustrate individual vulnerability types and patch strategies. We also use statistics to illustrate the entire landscape, including the percentage of each vulnerability type. We hope to shed some light on the problem, e.g., concurrency vulnerabilities continue to pose a serious threat to system security, and it is difficult even for kernel developers to analyze and patch them. Therefore, more efforts are needed to develop tools and techniques for analyzing and patching these vulnerabilities. Comment: It was finished in Oct 201

    A dynamic scheduler for balancing HPC applications

    Load imbalance causes significant performance degradation in High Performance Computing applications. In our previous work we showed that load imbalance can be alleviated by modern MT processors that provide mechanisms for controlling the allocation of the processor's internal resources. In that work, we applied static, hand-tuned resource allocations to balance HPC applications, providing improvements for benchmarks and real applications. In this paper we propose a dynamic process scheduler for the Linux kernel that automatically and transparently balances HPC applications according to their behavior. We tested our new scheduler on an IBM POWER5 machine, which provides a software-controlled prioritization mechanism that allows us to bias the processor resource allocation. Our experiments show that the scheduler reduces the imbalance of HPC applications, achieving results similar to the ones obtained by hand-tuning the applications (up to 16%). Moreover, our solution reduces the application's execution time through the combined effect of load balancing and highly responsive scheduling.