25 research outputs found

    Low-Overhead Dynamic Instruction Mix Generation using Hybrid Basic Block Profiling

    Dynamic instruction mixes form an important part of the toolkits of performance tuners, compiler writers, and CPU architects. Instruction mixes are traditionally generated using software instrumentation, an accurate yet slow method that is normally limited to user-mode code. We present a new method for generating instruction mixes using the Performance Monitoring Unit (PMU) of the CPU. It has very low overhead, extends coverage to kernel-mode execution, and causes only a very modest decrease in accuracy compared to software instrumentation. To achieve this level of accuracy, we develop a new PMU-based data collection method, Hybrid Basic Block Profiling (HBBP). HBBP uses simple machine learning techniques to choose, on a per-basic-block basis, between data from two conventional sampling methods, Event Based Sampling (EBS) and Last Branch Records (LBR). We implement a profiling tool based on HBBP and report on experiments with the industry-standard SPEC CPU2006 suite as well as with two large-scale scientific codes. On the tested benchmarks we observe a runtime improvement of up to 76x over software instrumentation, reducing wait times from hours to minutes, while instruction attribution errors average 2.1%. The results indicate that HBBP provides a favorable tradeoff between accuracy and speed, making it a suitable candidate for use in production environments.
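    As a rough illustration of the per-basic-block selection that HBBP performs, the sketch below keeps, for each basic block, whichever of the EBS- and LBR-derived execution counts a simple hand-written rule prefers. The features, weights, and data structures are invented for illustration and do not reproduce the paper's actual model or tool interface.

```python
# A minimal, illustrative sketch of the per-basic-block selection idea behind
# HBBP: for every basic block we hold two execution-count estimates, one from
# Event Based Sampling (EBS) and one from Last Branch Records (LBR), plus a few
# block features, and a simple rule decides which estimate to keep.
# All names and the scoring rule are hypothetical placeholders.

from dataclasses import dataclass

@dataclass
class BasicBlockProfile:
    address: int          # start address of the basic block
    ebs_count: float      # execution count estimated from EBS samples
    lbr_count: float      # execution count estimated from LBR records
    num_instructions: int # static size of the block
    ends_in_branch: bool  # whether the block terminates in a branch

def prefer_lbr_score(bb: BasicBlockProfile) -> float:
    """Toy linear score: positive means the LBR estimate is preferred.

    Short blocks ending in a branch are well covered by LBR; long,
    fall-through blocks tend to be better served by EBS samples.
    (Illustrative weights only, not trained on real data.)
    """
    score = 0.0
    score += 1.0 if bb.ends_in_branch else -1.0
    score += 0.5 if bb.num_instructions <= 8 else -0.5
    return score

def hybrid_count(bb: BasicBlockProfile) -> float:
    """Pick one of the two estimates per basic block (the HBBP idea)."""
    return bb.lbr_count if prefer_lbr_score(bb) > 0 else bb.ebs_count

blocks = [
    BasicBlockProfile(0x401000, ebs_count=9_800, lbr_count=10_050,
                      num_instructions=6, ends_in_branch=True),
    BasicBlockProfile(0x401040, ebs_count=52_000, lbr_count=47_500,
                      num_instructions=24, ends_in_branch=False),
]
for bb in blocks:
    print(hex(bb.address), round(hybrid_count(bb)))
```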

    Establishing a base of trust with performance counters for enterprise workloads

    Understanding the performance of large, complex enterprise-class applications is an important yet nontrivial task. Methods using hardware performance counters, such as profiling through event-based sampling, are often favored over instrumentation for analyzing such large codes, but rarely provide good accuracy at the instruction level. This work evaluates the accuracy of multiple event-based sampling techniques and quantifies the impact of a range of improvements suggested in recent years. The evaluation is performed on instances of three modern CPU architectures, using designated kernels and full applications. We conclude that precisely distributed events considerably improve accuracy, with further improvements possible when using Last Branch Records. We also present practical recommendations for hardware architects, tool developers, and performance engineers, aimed at improving the quality of results.
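    One common way to quantify instruction-level accuracy is to compare a sampled per-instruction profile against an instrumentation-based reference. The sketch below uses the total-variation distance between the two normalized profiles; this metric and the example counts are assumptions chosen for illustration, since the abstract does not specify the paper's exact error definition.

```python
# Compare a PMU-based profile against an instrumentation-based reference by
# normalising both per-instruction count vectors and reporting half the L1
# distance between them (total-variation distance, a value between 0 and 1).

def attribution_error(reference: dict[int, float], sampled: dict[int, float]) -> float:
    """Total-variation distance between two per-instruction profiles."""
    ref_total = sum(reference.values())
    smp_total = sum(sampled.values())
    addresses = set(reference) | set(sampled)
    return 0.5 * sum(
        abs(reference.get(a, 0.0) / ref_total - sampled.get(a, 0.0) / smp_total)
        for a in addresses
    )

# Instrumentation-based "ground truth" vs. an event-based-sampling profile
# for three instruction addresses (made-up numbers).
reference = {0x4005a0: 1_000_000, 0x4005a4: 250_000, 0x4005a8: 750_000}
sampled   = {0x4005a0:     9_700, 0x4005a4:   3_100, 0x4005a8:   7_200}

print(f"attribution error: {attribution_error(reference, sampled):.1%}")
```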

    Renewable energy in copper production: A review on systems design and methodological approaches

    Renewable energy systems are now accepted to be mandatory for climate change mitigation. These systems require a higher material supply than conventional ones; in particular, they require more copper. The production of this metal, however, is intensive in energy consumption and emissions. Renewable energy systems must therefore be used to improve the environmental performance of copper production. We cover the current state of research and develop recommendations for the design of renewable energy systems for copper production. To complement our analysis, we also consider studies from other industries and regional energy systems. We provide six recommendations for future modeling: (a) current energy demand models for copper production are overly simplistic and need to be enhanced for planning with high levels of renewable technologies; (b) multi-vector systems (electricity, heat, and fuels) need to be explicitly modeled to capture the readily available flexibility of the system; (c) copper production takes place in arid regions where water supply is energy-intensive, so water management should be integrated into the overall design of the energy system; (d) there is operational flexibility in existing copper plants, which needs to be better understood and assessed; (e) the design of future copper mines should adapt to the dynamics of available renewable energy sources; and (f) life cycle impacts of the components of the system need to be explicitly minimized in the optimization models. Researchers and decision-makers from the copper and energy sectors will benefit from this comprehensive review and these recommendations. We hope it will accelerate the deployment of renewables, particularly in the copper industry.
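    As a toy illustration of recommendations (b) and (f), the sketch below sets up a multi-vector (electricity plus heat) supply problem in which life-cycle emissions are weighted into the objective alongside cost. All demands, prices, emission factors, and the CO2 weighting are invented numbers, and the model is far simpler than anything a real copper-plant study would use.

```python
# A minimal linear program: cover the electricity and heat demand of a copper
# plant from PV, grid electricity, solar-thermal heat, and gas-fired heat,
# minimising cost plus a life-cycle CO2 penalty. All figures are assumed.

from scipy.optimize import linprog

# Decision variables (annual energy, MWh):
#   x[0] PV electricity, x[1] grid electricity,
#   x[2] solar-thermal heat, x[3] gas-fired heat
cost  = [35.0, 90.0, 25.0, 60.0]    # USD/MWh (assumed)
co2   = [0.04, 0.40, 0.02, 0.20]    # t CO2-eq/MWh, life-cycle (assumed)
w_co2 = 80.0                        # USD per t CO2-eq weighting (assumed)

objective = [c + w_co2 * e for c, e in zip(cost, co2)]

elec_demand, heat_demand = 500_000.0, 300_000.0  # MWh/yr (assumed)
pv_limit = 350_000.0                             # MWh/yr of installable PV

# linprog expects "A_ub @ x <= b_ub", so demand-covering constraints are negated.
A_ub = [
    [-1, -1,  0,  0],   # PV + grid electricity must cover electricity demand
    [ 0,  0, -1, -1],   # solar + gas heat must cover heat demand
    [ 1,  0,  0,  0],   # PV generation limited by installable capacity
]
b_ub = [-elec_demand, -heat_demand, pv_limit]

res = linprog(objective, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 4)
print("optimal mix (MWh):", [round(v) for v in res.x])
print("objective (USD, incl. CO2 penalty):", round(res.fun))
```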

    Two warm Neptunes transiting HIP 9618 revealed by TESS and Cheops

    HIP 9618 (HD 12572, TOI-1471, TIC 306263608) is a bright (G = 9.0 mag) solar analogue. TESS photometry revealed the star to have two candidate planets with radii of 3.9 ± 0.044 R⊕ (HIP 9618 b) and 3.343 ± 0.039 R⊕ (HIP 9618 c). While the 20.77291 d period of HIP 9618 b was measured unambiguously, HIP 9618 c showed only two transits separated by a 680-d gap in the time series, leaving many possibilities for the period. To solve this issue, CHEOPS performed targeted photometry of period aliases to attempt to recover the true period of planet c, and successfully determined the true period to be 52.56349 d. High-resolution spectroscopy with HARPS-N, SOPHIE, and CAFE revealed a mass of 10.0 ± 3.1 M⊕ for HIP 9618 b, which, according to our interior structure models, corresponds to a 6.8 ± 1.4 per cent gas fraction. HIP 9618 c appears to have a lower mass than HIP 9618 b, for which only a 3-sigma upper limit could be placed; its long period (P > 50 d) opens the door for the atmospheric characterization of warm (Teq < 750 K) sub-Neptunes.
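    The period-alias problem admits a short back-of-the-envelope calculation: two transits separated by a gap of roughly 680 days are consistent with any period of gap/n for integer n, and targeted photometry at the predicted transit times of each alias can eliminate them one by one. The gap value and the minimum-period cut in the sketch below are assumptions for illustration.

```python
# Enumerate the period aliases allowed by two transits separated by ~680 days.
# Each alias P_n = gap / n would need a dedicated observation at its predicted
# transit time to be confirmed or ruled out (the strategy used with CHEOPS).

gap_days = 680.0     # approximate separation between the two TESS transits
min_period = 20.0    # shortest alias still worth checking (assumed)

aliases = []
n = 1
while gap_days / n >= min_period:
    aliases.append(gap_days / n)
    n += 1

print("candidate periods (d):", ", ".join(f"{p:.2f}" for p in aliases))
# The reported true period, 52.56349 d, lies near the gap/13 alias
# (the quoted 680-d gap is only approximate).
```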

    The Molecular Identification of Organic Compounds in the Atmosphere: State of the Art and Challenges


    Hierarchical cycle accounting: a new method for application performance tuning

    To address the growing difficulty of performance debugging on modern processors with increasingly complex micro-architectures, we present Hierarchical Cycle Accounting (HCA), a structured, hierarchical, architecture-agnostic methodology for the identification of performance issues in workloads running on these modern processors. HCA reports to the user the cost of a number of execution components, such as load latency, memory bandwidth, instruction starvation, and branch misprediction. A critical novel feature of HCA is that all cost components are presented in the same unit, core pipeline cycles. Their relative importance can therefore be compared directly. These cost components are furthermore presented in a hierarchical fashion, with architecture-agnostic components at the top levels of the hierarchy and architecture-specific components at the bottom. This hierarchical structure is useful in guiding the performance debugging effort to the places where it can be the most effective. For a given architecture, the cost components are computed based on the observation of architecture-specific events, typically provided by a performance monitoring unit (PMU), and using a set of formulas to attribute a certain cost in cycles to each event. The selection of what PMU events to use, their validation, and the derivation of the formulas are done offline by an architecture expert, thereby freeing the non-expert from the burdensome and error-prone task of directly interpreting PMU data. We have implemented the HCA methodology in Gooda, a publicly available tool. We describe the application of Gooda to the analysis of several workloads in wide use, showing how HCA's features facilitated performance debugging for these applications. We also describe the discovery of relevant bugs in Intel hardware and the Linux kernel as a result of using HCA.
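    To make the idea of a cycle-denominated hierarchy concrete, the sketch below attributes cycles to a few cost components from hypothetical PMU event counts using invented per-event penalties. The event names, formulas, and hierarchy are placeholders and do not reproduce Gooda's actual accounting or any real PMU's event set.

```python
# Every cost component is expressed in core pipeline cycles, derived from PMU
# event counts via simple formulas, and arranged from architecture-agnostic
# categories down to more specific causes. All numbers below are invented.

pmu = {                      # hypothetical event counts from one profiling run
    "cycles": 1_000_000_000,
    "llc_miss_loads": 2_000_000,
    "branch_mispredicts": 5_000_000,
    "frontend_empty_cycles": 80_000_000,
}

# Per-event cycle penalties an architecture expert would derive offline (made up).
LLC_MISS_PENALTY = 180
BRANCH_MISS_PENALTY = 15

hierarchy = {
    "stalled cycles": {
        "memory bound": {
            "load latency (LLC misses)": pmu["llc_miss_loads"] * LLC_MISS_PENALTY,
        },
        "front-end bound": {
            "instruction starvation": pmu["frontend_empty_cycles"],
        },
        "bad speculation": {
            "branch misprediction": pmu["branch_mispredicts"] * BRANCH_MISS_PENALTY,
        },
    },
}

def print_costs(node, total_cycles, indent=0):
    """Print each component in cycles and as a share of total cycles."""
    for name, value in node.items():
        if isinstance(value, dict):
            print("  " * indent + name)
            print_costs(value, total_cycles, indent + 1)
        else:
            print("  " * indent + f"{name}: {value:,} cycles "
                  f"({value / total_cycles:.1%} of total)")

print_costs(hierarchy, pmu["cycles"])
```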

    Gibt es „Dauerformen“ bei Escherichia coli?


    Beobachtungen über den Einfluß von Frost und Trockenheit auf Bodenmikroorganismen


    Gibt es eine Gelbform von Bacillus mycoides Flügge 1886?
