
    Static analysis of SEU effects on software applications

    Control flow errors have been widely addressed in the literature as a possible threat to the dependability of computer systems, and many clever techniques have been proposed to detect and tolerate them. Nevertheless, it has never been discussed whether the overheads introduced by many of these techniques are justified by a reasonable probability of incurring control flow errors. This paper presents a static executable-code analysis methodology able to compute, depending on the target microprocessor platform, the upper-bound probability that a given application incurs a control flow error.
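    The abstract does not give the bounding formula, but as an illustration of the kind of quantity such a methodology estimates, here is a minimal back-of-the-envelope sketch in Python. It assumes a per-bit single-event-upset rate, a count of control-flow-critical bits (program counter, status flags, branch-target fields resident in memory), and the application's execution time; all of these names and numbers are hypothetical and are not taken from the paper.

    import math

    def cfe_upper_bound(lambda_bit: float, critical_bits: int, exec_time_s: float) -> float:
        """Upper bound on P(control flow error) for one run of the application.

        Each critical bit is modelled as an independent Poisson upset process with
        rate lambda_bit (upsets/bit/s); any upset of a critical bit is pessimistically
        counted as a control flow error, which is what makes this an upper bound.
        """
        expected_upsets = lambda_bit * critical_bits * exec_time_s
        return 1.0 - math.exp(-expected_upsets)

    # Hypothetical numbers: 1e-10 upsets/bit/s, 4 KiB of control-flow-critical state, a 10 ms run.
    print(cfe_upper_bound(1e-10, 4 * 1024 * 8, 0.010))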

    A Detailed Analysis of Contemporary ARM and x86 Architectures

    RISC vs. CISC wars raged in the 1980s, when chip area and processor design complexity were the primary constraints and desktops and servers exclusively dominated the computing landscape. Today, energy and power are the primary design constraints and the computing landscape is significantly different: growth in tablets and smartphones running ARM (a RISC ISA) is surpassing that of desktops and laptops running x86 (a CISC ISA). Further, the traditionally low-power ARM ISA is entering the high-performance server market, while the traditionally high-performance x86 ISA is entering the mobile low-power device market. Thus, the question of whether the ISA plays an intrinsic role in performance or energy efficiency is becoming important, and we seek to answer this question through a detailed measurement-based study on real hardware running real applications. We analyze measurements on the ARM Cortex-A8 and Cortex-A9 and Intel Atom and Sandy Bridge i7 microprocessors over workloads spanning mobile, desktop, and server computing. Our methodical investigation demonstrates the role of the ISA in modern microprocessors' performance and energy efficiency. We find that ARM and x86 processors are simply engineering design points optimized for different levels of performance, and there is nothing fundamentally more energy-efficient in one ISA class or the other. Whether the ISA is RISC or CISC seems irrelevant.

    First Experiences with the AVR ATmega32 Microcontroller

    In the fall of 2013, the Electrical Engineering department at ** institution omitted ** reconfigured its microcontroller instructional laboratory to use the AVR ATmega32 microcontroller from Atmel for the first time. This change was prompted by pressure from students, who displayed considerable interest in the AVR family of processors due to its use in popular Arduino microcomputer systems. In response to this expressed interest, the microcontroller lab was redesigned around the new processor. Atmel's ATmega32 is an 8-bit RISC processor based on a Harvard architecture. This new hardware replaced systems using Freescale's S12 processor, which is a 16-bit CISC processor based on a Princeton architecture. Thus, the change was a fundamental shift on at least three axes of computer characteristics. Was it a gain or a loss on each of those axes? Experience this year with teaching the lab using the ATmega32 has been generally positive, and this paper reports that experience in making the switch from the S12 to the ATmega32. Despite the fundamental architectural differences between the former processor (S12) and the new processor (ATmega32), there are many similarities between them as well. The hardware features of the ATmega32 mimic the S12's features almost exactly, although the ATmega32 generally provides less of each resource, such as memory, I/O ports, and timing features. Many of the resources of the S12 went unused and wasted in past lab exercises. The scaled-back ATmega32, with essentially the same features in smaller quantities, is a better match for the instructional needs of this lab. Students feel they are getting a more complete exposure to the ATmega32, since nearly all the resources of the processor have been used in lab exercises, whereas many of the resources of the S12 were ignored in lab assignments because they exceeded the needs of the lab. This paper addresses the adaptations that were made in the structure and pedagogy of the microcontroller course to implement the change in processors from the S12 to the ATmega32. It documents some of the troubles and surprises encountered along the way, and should provide guidance for others contemplating such a change in their microcontroller labs.

    Automatic synthesis of application-specific processors

    Thesis (D. Tech. (Engineering: Electrical)) -- Central University of Technology, Free State, 2012. This thesis describes a method for the automatic generation of application-specific processors. The thesis was organized into three separate but interrelated studies, which together provide: a justification for the method used, a theory that supports the method, and a software application that realizes the method. The first study looked at how modern-day microprocessors utilize their hardware resources and proposed a metric, called core density, for measuring the utilization rate. The core density is a function of the microprocessor's instruction set and the application scheduled to run on that microprocessor. This study concluded that modern-day microprocessors use their resources very inefficiently and proposed the use of subset processors to execute the same applications more efficiently. The second study sought to provide a theoretical framework for the use of subset processors by developing a generic formal model of computer architecture. To demonstrate the model's versatility, it was used to describe a number of computer architecture components and entire computing systems. The third study describes the development of a set of software tools that enable the automatic generation of application-specific processors. The FiT toolkit automatically generates a unique Hardware Description Language (HDL) description of a processor based on an application binary file and a parameterizable template of a generic microprocessor. Area-optimized and performance-optimized custom soft processors were generated using the FiT toolkit, and the utilization of the hardware resources by the custom soft processors was characterized. The FiT toolkit was combined with an ANSI C compiler and a third-party tool for programming field-programmable gate arrays (FPGAs) to create an unconstrained C-to-silicon compiler.
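    The abstract defines core density only as a function of the instruction set and the application scheduled on it. As a purely illustrative sketch, one plausible reading of such a utilization metric is the fraction of the ISA that an application actually exercises; the Python below implements that reading with made-up mnemonics and is not the thesis's exact formula.

    from collections import Counter

    def core_density(isa_mnemonics: set, executed_mnemonics: list) -> float:
        """Fraction of the instruction set exercised by the application (0.0 to 1.0)."""
        used = set(executed_mnemonics) & isa_mnemonics
        return len(used) / len(isa_mnemonics)

    isa = {"add", "sub", "mul", "div", "ld", "st", "beq", "bne", "jal", "jr"}
    trace = ["ld", "add", "add", "st", "beq", "ld", "add", "st", "jr"]
    print(f"core density: {core_density(isa, trace):.2f}")  # 0.50: half the ISA is never used
    print(Counter(trace).most_common(3))                    # hottest opcodes, candidates to keep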

    Empirical and Statistical Application Modeling Using On-Chip Performance Monitors

    To analyze the performance of applications and architectures, both programmers and architects desire formal methods to explain anomalous behavior. To this end, we present various methods that utilize non-intrusive, performance-monitoring hardware only recently available on microprocessors to provide further explanations of observed behavior. All the methods attempt to characterize and explain the instruction-level parallelism achieved by codes on different architectures. We also present a prototype tool automating the analysis process to exploit the advantages of the empirical and statistical methods proposed. The empirical, statistical and hybrid methods are discussed and explained, with case study results provided. These methods add to the wealth of tools available to programmers and architects for understanding the performance of scientific applications. Specifically, the models and tools presented provide new methods for evaluating and categorizing application performance. The empirical memory model serves to quantify the hierarchical memory performance of applications by inferring the latencies incurred by codes after the effects of latency-hiding techniques are realized. The instruction-level model and its extensions model on-chip performance analytically, giving insight into inherent performance bottlenecks in superscalar architectures. The statistical model and its hybrid extension provide other ways of categorizing codes via their statistical variations. The PTERA performance tool automates the use of performance counters for these methods across platforms, making the modeling process easier still. These unique methods provide alternatives to performance modeling and categorization not previously available, in an attempt to utilize the inherent modeling capabilities of performance monitors on commodity processors for scientific applications.
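    As a flavour of what counter-derived modeling looks like (the paper's empirical and statistical models are considerably more elaborate), the sketch below estimates instructions per cycle and an effective, post-latency-hiding stall cost per last-level-cache miss from raw counter readings. The counter names and numbers are hypothetical.

    from dataclasses import dataclass

    @dataclass
    class CounterSample:
        cycles: int            # total core cycles
        instructions: int      # retired instructions
        mem_stall_cycles: int  # cycles stalled on the memory hierarchy
        llc_misses: int        # last-level-cache misses

    def ipc(s: CounterSample) -> float:
        return s.instructions / s.cycles

    def exposed_latency_per_miss(s: CounterSample) -> float:
        # Stall cycles left over after out-of-order execution and prefetching
        # have hidden what they can, attributed per LLC miss.
        return s.mem_stall_cycles / max(s.llc_misses, 1)

    sample = CounterSample(cycles=1_000_000, instructions=1_800_000,
                           mem_stall_cycles=250_000, llc_misses=2_000)
    print(f"IPC = {ipc(sample):.2f}")
    print(f"exposed latency per miss ~ {exposed_latency_per_miss(sample):.0f} cycles")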

    Energy Efficiency in High Throughput Computing: Tools, Techniques and Experiments

    The volume of data to process and store in high throughput computing (HTC) and scientific computing continues to increase many-fold every year. Consequently, the energy consumption of data centers and similar facilities is raising economic and environmental concerns. Thus, it is of paramount importance to improve energy efficiency in such environments. This thesis focuses on understanding how to improve energy efficiency in scientific computing and HTC. For this purpose we conducted research on tools and techniques to measure power consumption. We also conducted experiments to understand whether low-energy processing architectures are suitable for HTC and compared the energy efficiency of ARM and Intel architectures under authentic scientific workloads. Finally, we used the results to develop an algorithm that schedules tasks among ARM and Intel machines in a dynamic electricity pricing market in order to optimally lower the overall electricity bill. Our contributions are threefold: the results of the study indicate that ARM has potential for use in scientific computing and HTC from an energy efficiency perspective; we also outline a set of tools and techniques to accurately measure energy consumption at the different levels of the computing systems; in addition, the developed scheduling algorithm shows potential savings in the electricity bill when applied to heterogeneous data centers working under a dynamic electricity pricing market.
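    The scheduling algorithm itself is not described in the abstract; as a minimal sketch of the decision it has to make, the Python below picks, for a given task, the machine whose measured energy use costs least at the current electricity price. The machine profiles, power figures and price are invented for illustration.

    from dataclasses import dataclass

    @dataclass
    class MachineProfile:
        name: str
        runtime_s: float    # measured runtime of the task on this machine
        avg_power_w: float  # measured average power draw while running it

    def cost(m: MachineProfile, price_per_kwh: float) -> float:
        energy_kwh = m.avg_power_w * m.runtime_s / 3_600_000  # W*s -> kWh
        return energy_kwh * price_per_kwh

    def pick_machine(profiles, price_per_kwh: float) -> MachineProfile:
        return min(profiles, key=lambda m: cost(m, price_per_kwh))

    arm = MachineProfile("ARM node", runtime_s=540.0, avg_power_w=9.0)
    x86 = MachineProfile("Intel node", runtime_s=300.0, avg_power_w=65.0)
    print("schedule on:", pick_machine([arm, x86], price_per_kwh=0.30).name)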

    Space Station Freedom data management system growth and evolution report

    The Information Sciences Division at the NASA Ames Research Center has completed a 6-month study of portions of the Space Station Freedom Data Management System (DMS). The study looked at the present capabilities and future growth potential of the DMS, and the results are documented in this report. Issues were raised and discussed with the appropriate Johnson Space Center (JSC) management and Work Package-2 contractor organizations. Areas requiring additional study have been identified, and suggestions for long-term upgrades have been proposed. This activity has allowed the Ames personnel to develop a rapport with the JSC civil service and contractor teams that permits an independent check-and-balance function for the DMS.

    ARM Wrestling with Big Data: A Study of Commodity ARM64 Server for Big Data Workloads

    ARM processors have dominated the mobile device market in the last decade due to their favorable computing to energy ratio. In this age of Cloud data centers and Big Data analytics, the focus is increasingly on power efficient processing, rather than just high throughput computing. ARM's first commodity server-grade processor is the recent AMD A1100-series processor, based on a 64-bit ARM Cortex A57 architecture. In this paper, we study the performance and energy efficiency of a server based on this ARM64 CPU, relative to a comparable server running an AMD Opteron 3300-series x64 CPU, for Big Data workloads. Specifically, we study these for Intel's HiBench suite of web, query and machine learning benchmarks on Apache Hadoop v2.7 in a pseudo-distributed setup, for data sizes up to 20 GB files, 5M web pages and 500M tuples. Our results show that the ARM64 server's runtime performance is comparable to the x64 server for integer-based workloads like Sort and Hive queries, and only lags behind for floating-point intensive benchmarks like PageRank, when they do not exploit data parallelism adequately. We also see that the ARM64 server takes one-third the energy, and has an Energy Delay Product (EDP) that is 50-71% lower than the x64 server. These results hold promise for ARM64 data centers hosting Big Data workloads to reduce their operational costs, while opening up opportunities for further analysis. Comment: Accepted for publication in the Proceedings of the 24th IEEE International Conference on High Performance Computing, Data, and Analytics (HiPC), 201
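    A quick sanity check of how the reported figures relate: since EDP is energy multiplied by delay, a server using roughly one-third the energy shows a 50-71% lower EDP whenever its runtime falls between roughly 0.87x and 1.5x of the reference runtime. The Python below is illustrative arithmetic only, not data from the paper.

    def edp_reduction(energy_ratio: float, delay_ratio: float) -> float:
        # Fractional EDP reduction of machine A vs machine B,
        # given energy_A/energy_B and delay_A/delay_B.
        return 1.0 - energy_ratio * delay_ratio

    for delay_ratio in (0.87, 1.0, 1.5):
        print(f"delay x{delay_ratio}: EDP lower by {edp_reduction(1/3, delay_ratio):.0%}")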

    Synthesis of application specific processor architectures for ultra-low energy consumption

    In this paper we suggest that further energy savings can be achieved by a new approach to the synthesis of embedded processor cores, where the architecture is tailored to the algorithms that the core executes. In the context of embedded processor synthesis, both single-core and many-core, the types of algorithms and the demands on execution efficiency are usually known at chip design time. This knowledge can be utilised at the design stage to synthesise architectures optimised for energy consumption. Firstly, we present an overview of both traditional energy-saving techniques and new developments in architectural approaches to energy-efficient processing. Secondly, we propose the picoMIPS architecture, which serves as an architectural template for energy-efficient synthesis. As a case study, we show how the picoMIPS architecture can be tailored to energy-efficient execution of the DCT algorithm.
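    The DCT is the kind of fixed, known-at-design-time kernel the abstract refers to: a small, static mix of multiplies and additions, which is what allows a synthesised core to omit every instruction the kernel never needs. For reference, a plain 1-D DCT-II over an 8-sample block is sketched below in Python (floating point for clarity; a picoMIPS-style core would more likely use fixed-point multiply-accumulate).

    import math

    def dct_ii(x):
        # Unnormalised 1-D DCT-II: X[k] = sum_i x[i] * cos(pi/N * (i + 0.5) * k)
        n = len(x)
        return [sum(x[i] * math.cos(math.pi / n * (i + 0.5) * k) for i in range(n))
                for k in range(n)]

    block = [52, 55, 61, 66, 70, 61, 64, 73]
    print([round(c, 2) for c in dct_ii(block)])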