92 research outputs found

    PARSNIP: Performant Architecture for Race Safety with No Impact on Precision

    Get PDF
    Data race detection is a useful dynamic analysis for multithreaded programs and a key building block for record-and-replay, for enforcing strong consistency models, and for detecting concurrency bugs. Existing software race detectors are precise but slow, and hardware support for precise data race detection relies on assumptions like type safety that many programs violate in practice. We propose PARSNIP, a fully precise hardware-supported data race detector. PARSNIP exploits new insights into the redundancy of race detection metadata to reduce storage overheads. PARSNIP also adopts new race detection metadata encodings that accelerate the common case while preserving soundness and completeness. When bounded hardware resources are exhausted, PARSNIP falls back to a software race detector to preserve correctness. PARSNIP does not assume that target programs are type safe, and is thus suitable for race detection on arbitrary code. Our evaluation of PARSNIP on several PARSEC benchmarks shows that performance overheads range from negligible to 2.6x, with an average overhead of just 1.5x. Moreover, PARSNIP outperforms the state-of-the-art Radish hardware race detector by 4.6x.
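    The compact-metadata idea is easiest to see in a software analogue. Below is a minimal Python sketch assuming a FastTrack-style encoding, where the last write to a location is stored as a single (thread, clock) epoch instead of a full vector clock; PARSNIP's actual hardware encodings and fallback machinery differ, and all names here are illustrative.

```python
# Sketch: epoch-encoded write metadata for race checking (illustrative only).
from dataclasses import dataclass, field

@dataclass
class VectorClock:
    clocks: dict = field(default_factory=dict)  # thread id -> logical clock

    def get(self, tid: int) -> int:
        return self.clocks.get(tid, 0)

@dataclass
class ShadowWord:
    # Common case: the last write is one compact (thread, clock) epoch
    # rather than a full vector clock, which keeps the metadata small.
    write_tid: int = -1
    write_clock: int = 0

def check_write(shadow: ShadowWord, tid: int, vc: VectorClock) -> None:
    """Flag a race unless the previous write happens-before this write."""
    if shadow.write_tid not in (-1, tid):
        # Epoch (t, c) happens-before us iff c <= our clock entry for t.
        if shadow.write_clock > vc.get(shadow.write_tid):
            raise RuntimeError(
                f"data race: T{shadow.write_tid}'s write is unordered "
                f"with T{tid}'s write")
    shadow.write_tid, shadow.write_clock = tid, vc.get(tid)

# T0 writes at clock 5; T1 then writes without any synchronization that
# covers T0's epoch, so the checker reports a race.
w = ShadowWord()
check_write(w, tid=0, vc=VectorClock({0: 5}))
try:
    check_write(w, tid=1, vc=VectorClock({1: 3}))
except RuntimeError as e:
    print(e)
```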

    HardBound: Architectural Support for Spatial Safety of the C Programming Language

    Get PDF
    The C programming language is at least as well known for its absence of spatial memory safety guarantees (i.e., lack of bounds checking) as it is for its high performance. C's unchecked pointer arithmetic and array indexing allow simple programming mistakes to lead to erroneous executions, silent data corruption, and security vulnerabilities. Many prior proposals have tackled enforcing spatial safety in C programs by checking pointer and array accesses. However, existing software-only proposals have significant drawbacks that may prevent wide adoption, including unacceptably high runtime overheads, lack of completeness, incompatible pointer representations, or the need for non-trivial changes to existing C source code and compiler infrastructure. Inspired by the promise of these software-only approaches, this paper proposes a hardware bounded-pointer architectural primitive that supports cooperative hardware/software enforcement of spatial memory safety for C programs. This bounded pointer is a new hardware primitive datatype for pointers that leaves the standard C pointer representation intact, but augments it with bounds information maintained separately and invisibly by the hardware. The bounds are initialized by software and are then propagated and enforced transparently by the hardware, which automatically checks a pointer's bounds before it is dereferenced. One mode of use requires instrumenting only malloc, which enables enforcement of per-allocation spatial safety for heap-allocated objects in existing binaries. When combined with simple intra-procedural compiler instrumentation, hardware bounded pointers enable a low-overhead approach for enforcing complete spatial memory safety in unmodified C programs.
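    The mechanism the abstract describes can be sketched directly: the pointer value keeps its ordinary representation while base/bound metadata travels alongside it, propagates through pointer arithmetic, and is consulted only at dereference time. The Python below is our own illustration under those assumptions (BoundedPtr, bounded_malloc, and load are hypothetical names, not the paper's hardware interface).

```python
# Sketch: a bounded pointer whose hidden metadata is checked on dereference.
from dataclasses import dataclass

memory = [0] * 1024  # flat "address space" for the sketch

@dataclass(frozen=True)
class BoundedPtr:
    addr: int   # the ordinary pointer value (representation unchanged)
    base: int   # hidden metadata: start of the allocation
    bound: int  # hidden metadata: one past the end of the allocation

    def __add__(self, offset: int) -> "BoundedPtr":
        # Pointer arithmetic propagates bounds but is not itself checked,
        # matching C semantics: only the dereference is checked.
        return BoundedPtr(self.addr + offset, self.base, self.bound)

def bounded_malloc(base: int, size: int) -> BoundedPtr:
    """Stand-in for an instrumented malloc seeding per-allocation bounds."""
    return BoundedPtr(base, base, base + size)

def load(p: BoundedPtr) -> int:
    """Checked dereference: trap if addr lies outside [base, bound)."""
    if not (p.base <= p.addr < p.bound):
        raise MemoryError(f"spatial safety violation at address {p.addr:#x}")
    return memory[p.addr]

buf = bounded_malloc(0x100, 8)
load(buf + 7)          # last valid element: fine
try:
    load(buf + 8)      # one past the end: trapped instead of corrupting
except MemoryError as e:
    print(e)
```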

    ORCA: Ordering-free Regions for Consistency and Atomicity

    Get PDF
    Writing correct synchronization is one of the main difficulties of multithreaded programming. Incorrect synchronization causes many subtle concurrency errors such as data races and atomicity violations. Previous work has proposed stronger memory consistency models to rule out certain classes of concurrency bugs. However, these approaches are limited by a program’s original (and possibly incorrect) synchronization. In this work, we provide stronger guarantees than previous memory consistency models by punctuating atomicity only at ordering constructs like barriers, but not at lock operations. We describe the Ordering-free Regions for Consistency and Atomicity (ORCA) system, which enforces atomicity at the granularity of ordering-free regions (OFRs). While many atomicity violations occur at finer granularity, in an empirical study of many large multithreaded workloads we find no examples of code that requires atomicity coarser than OFRs. Thus, we believe OFRs are a conservative approximation of the atomicity requirements of many programs. ORCA assists programmers by throwing an exception when OFR atomicity is threatened and, in exception-free executions, by guaranteeing that all OFRs execute atomically. In our evaluation, we show that ORCA automatically prevents real concurrency bugs. A user study of ORCA demonstrates that synchronizing a program with ORCA is easier than using a data race detector. We evaluate modest hardware support that allows ORCA to run with just 18% slowdown on average over pthreads, with very similar scalability.
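    A toy monitor makes the OFR guarantee concrete. In this hedged Python sketch, a thread's region ends only at an ordering construct (a barrier), never at a lock, and a conflict between two concurrently active regions raises an exception, loosely mirroring how ORCA signals threatened atomicity; the data structures and conflict rule are our simplifications, not ORCA's hardware mechanism.

```python
# Sketch: conflicts between concurrently active ordering-free regions.
class OFRMonitor:
    def __init__(self):
        self.accesses = {}  # location -> list of (thread id, is_write)

    def barrier(self, tid: int) -> None:
        # Only ordering constructs end a thread's region (lock operations
        # do not). Accesses from the finished region can no longer threaten
        # atomicity, so they are dropped.
        for loc in self.accesses:
            self.accesses[loc] = [a for a in self.accesses[loc] if a[0] != tid]

    def access(self, tid: int, loc: str, is_write: bool) -> None:
        # A write that overlaps another active region's access to the same
        # location would break region atomicity: raise an exception.
        for other_tid, other_write in self.accesses.get(loc, []):
            if other_tid != tid and (is_write or other_write):
                raise RuntimeError(
                    f"OFR atomicity threatened at {loc}: "
                    f"T{tid} vs T{other_tid}")
        self.accesses.setdefault(loc, []).append((tid, is_write))

m = OFRMonitor()
m.access(0, "x", is_write=True)
m.barrier(0)                     # T0's region ends at the barrier
m.access(1, "x", is_write=True)  # fine: T0's access retired with its region
```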

    LASER: Light, Accurate Sharing dEtection and Repair

    Get PDF
    Contention for shared memory, in the forms of true sharing and false sharing, is a challenging performance bug to discover and to repair. Understanding cache contention requires global knowledge of the program's actual sharing behavior, and contention can even arise invisibly due to the opaque decisions of the memory allocator. Previous schemes have focused only on false sharing, and impose significant performance penalties or require non-trivial alterations to the operating system or runtime environment. This paper presents the Light, Accurate Sharing dEtection and Repair (LASER) system, which leverages new performance counter capabilities available on Intel's Haswell architecture that identify the source of expensive cache coherence events. Using records of these events generated by the hardware, we build a system for online contention detection and repair that operates with low performance overhead and does not require any invasive program, compiler, or operating system changes. Our experiments show that LASER imposes just 2% average runtime overhead on the Phoenix, Parsec, and Splash2x benchmarks. LASER can automatically improve the performance of programs by up to 19% on commodity hardware.
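    The detection half can be sketched as a small pipeline over hardware event records. The Python below assumes a stream of sampled addresses from expensive coherence events, roughly the kind of record Haswell's counters can deliver, and buckets them by 64-byte cache line to rank contention hotspots; the threshold, window, and function names are illustrative choices of ours, not LASER's implementation.

```python
# Sketch: bucket sampled coherence-event addresses by cache line.
from collections import Counter

CACHE_LINE = 64  # bytes

def contended_lines(event_addresses, window_secs, threshold_per_sec=1000):
    """Return cache lines whose coherence-event rate exceeds a threshold."""
    counts = Counter(addr // CACHE_LINE for addr in event_addresses)
    return {line: n / window_secs for line, n in counts.items()
            if n / window_secs >= threshold_per_sec}

# Synthetic records clustered on one hot 64-byte line: a repair candidate,
# e.g. for padding (false sharing) or access restructuring (true sharing).
events = [0x7f001000 + (i % 8) for i in range(5000)]
print(contended_lines(events, window_secs=1.0))
```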

    Technical Report: Anytime Computation and Control for Autonomous Systems

    Get PDF
    The correct and timely completion of the sensing and action loop is of utmost importance in safety-critical autonomous systems. Crucial to the performance of this feedback control loop are the computation time and accuracy of the estimator, which produces the state estimates used by the controller. These state estimators, especially those used for localization, often rely on computationally expensive perception algorithms like visual object tracking. Because on-board computers on autonomous robots are computationally limited, the computation time of a perception-based estimation algorithm can at times be high enough to result in poor control performance. In this work, we develop a framework for the co-design of anytime estimation and robust control algorithms that takes into account computation delays and estimation inaccuracies. This is achieved by constructing a perception-based anytime estimator from an off-the-shelf perception-based estimation algorithm; in the process we obtain a trade-off curve for its computation time versus estimation error. This information is used in the design of a robust predictive control algorithm that, at run time, decides a contract for the estimator (i.e., the estimator's mode of operation) while trying to achieve its control objectives at a reduced computation energy cost. In cases where the estimation delay could degrade control performance, we provide an optimal way for the controller to use this trade-off curve to reduce estimation delay at the cost of higher inaccuracy, all the while guaranteeing that control objectives are robustly satisfied. Through experiments on a hexrotor platform running a visual odometry algorithm for state estimation, we show how our method yields up to a 10% improvement in control performance while saving 5-6% in computation energy compared to a method that does not leverage the co-design.
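    The run-time contract decision lends itself to a small sketch. The Python below assumes a hypothetical trade-off curve of (computation time, estimation error, energy) triples and a stand-in feasibility test for the robust controller; every number and the selection rule are our assumptions, shown only to make the co-design loop concrete.

```python
# Sketch: pick the cheapest estimator contract the controller can tolerate.

# Hypothetical trade-off curve: (compute time s, estimation error m, energy J)
CONTRACTS = [
    (0.02, 0.30, 1.0),   # fast, inaccurate, cheap
    (0.05, 0.12, 2.2),
    (0.10, 0.05, 4.5),   # slow, accurate, expensive
]

def robustly_feasible(delay, error, max_delay=0.08, max_error=0.25):
    """Stand-in for the robust feasibility check: both the estimation delay
    and the estimation error must fit within the controller's margins."""
    return delay <= max_delay and error <= max_error

def pick_contract(contracts):
    """Choose the feasible contract with the lowest computation energy; if
    none is feasible, fall back to the fastest contract, trading accuracy
    for reduced delay as the abstract describes."""
    feasible = [c for c in contracts if robustly_feasible(c[0], c[1])]
    if not feasible:
        return min(contracts, key=lambda c: c[0])
    return min(feasible, key=lambda c: c[2])

print(pick_contract(CONTRACTS))  # -> (0.05, 0.12, 2.2)
```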

    Scholarship in Review 86(1)

    Get PDF
    Scholarship in Review was a magazine highlighting research and scholarly activities at Central Washington University, published by the Office of Graduate Studies and Research.

    GPUDet: A Deterministic GPU Architecture

    No full text
    Research Interests: My main research interests are in the fields of computer architecture and programming languages. I'm interested in using language and hardware innovations to provide better support for parallel programming.

    Modelling water quality in tropical water distribution systems

    Get PDF
    At present, water treatment and distribution is of high priority to ensure that communities have access to safe and affordable drinking water. To achieve the desired outcomes for a drinking water distribution system, every aspect of the system must be designed to suit the water quality and drainage characteristics, and the nature and conditions of the infrastructure. Because of these design requirements, it can be considerably difficult to implement goals relating to the provision of safe drinking water.

    This study aimed to quantify key water quality parameters, such as chlorine concentration, flow velocity, pH, and biofilm growth, and to evaluate the influence of these parameters on corrosion and iron release in a tropical drinking water distribution system.

    The approach of this study was split into three components:

    • Conduct a pilot study on a distribution system in the tropics and assemble, calibrate and validate a network model based on this water distribution system;
    • Design, construct and install an economical Biofilm Corrosion Reactor (BCR) to allow monitoring and evaluation of water quality parameters found within the distribution system;
    • Perform accelerated corrosion tests to investigate the effect of water quality parameters on corrosion rate.

    The outcomes of these study components provided knowledge and understanding of the performance of the drinking water distribution system. This understanding, combined with the relationships observed during this research, allowed the development of corrosion management strategies relevant to the Ingham Water Supply Scheme.

    In general, the following relationships were observed during this research and were consistent with the literature:

    • Corrosion rate increased significantly at chlorine concentrations above 1 mg/L;
    • During the initial phase of corrosion, higher velocity resulted in a higher corrosion rate;
    • During the final phase of corrosion, lower velocity resulted in a higher corrosion rate;
    • Corrosion rate increased with a decrease in pH;
    • Higher velocity resulted in increased iron release and therefore an increased corrosion rate; and
    • Corrosion rate increased with an increase in microorganisms/biofilm.

    Overall, the combination of the simulated network model for the Ingham Water Supply Scheme and the results and relationships observed during this research led to the development of the Corrosion Hotspot Tool for Hinchinbrook Shire Council. The aim of the tool is to improve the management of the distribution system, both operationally and financially; it also indicated some key aspects to monitor with regard to corrosion, such as chlorine residual levels.

    Furthermore, the research techniques and outcomes can be replicated with other water supply authorities to assist in developing management strategies to improve drinking water quality and the operation of water treatment and distribution systems.
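    To make the observed relationships concrete, here is a highly simplified, hypothetical scoring sketch in the spirit of the Corrosion Hotspot Tool; the thesis does not publish its formula, so every weight and threshold below is an assumption of ours, and the phase-dependent velocity effects are omitted.

```python
# Sketch: rank network nodes by a toy corrosion-risk score built from the
# reported relationships (all weights and thresholds are assumptions).

def corrosion_risk(chlorine_mg_l: float, ph: float, biofilm: bool) -> float:
    """Dimensionless risk score for one network node (illustrative)."""
    score = 0.0
    if chlorine_mg_l > 1.0:        # corrosion rose above 1 mg/L chlorine
        score += 2.0 * (chlorine_mg_l - 1.0)
    score += max(0.0, 7.5 - ph)    # lower pH -> higher corrosion rate
    if biofilm:
        score += 1.0               # biofilm/microorganisms increase corrosion
    return score

# Hypothetical nodes: (chlorine mg/L, pH, biofilm present)
nodes = {"A": (1.4, 7.0, True), "B": (0.6, 7.6, False)}
print(sorted(nodes, key=lambda n: corrosion_risk(*nodes[n]), reverse=True))
```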