5,729 research outputs found

    Software trace cache

    Get PDF
    We explore the use of compiler optimizations, which optimize the layout of instructions in memory. The target is to enable the code to make better use of the underlying hardware resources regardless of the specific details of the processor/architecture in order to increase fetch performance. The Software Trace Cache (STC) is a code layout algorithm with a broader target than previous layout optimizations. We target not only an improvement in the instruction cache hit rate, but also an increase in the effective fetch width of the fetch engine. The STC algorithm organizes basic blocks into chains trying to make sequentially executed basic blocks reside in consecutive memory positions, then maps the basic block chains in memory to minimize conflict misses in the important sections of the program. We evaluate and analyze in detail the impact of the STC, and code layout optimizations in general, on the three main aspects of fetch performance; the instruction cache hit rate, the effective fetch width, and the branch prediction accuracy. Our results show that layout optimized, codes have some special characteristics that make them more amenable for high-performance instruction fetch. They have a very high rate of not-taken branches and execute long chains of sequential instructions; also, they make very effective use of instruction cache lines, mapping only useful instructions which will execute close in time, increasing both spatial and temporal locality.Peer ReviewedPostprint (published version

    Communication Centric Design in Complex Automotive Embedded Systems

    Get PDF
    Automotive embedded applications like the engine management system are composed of multiple functional components that are tightly coupled via numerous communication dependencies and intensive data sharing, while also having real-time requirements. In order to cope with complexity, especially in multi-core settings, various communication mechanisms are used to ensure data consistency and temporal determinism along functional cause-effect chains. However, existing timing analysis methods generally only support very basic communication models that need to be extended to handle the analysis of industry grade problems which involve more complex communication semantics. In this work, we give an overview of communication semantics used in the automotive industry and the different constraints to be considered in the design process. We also propose a method for model transformation to increase the expressiveness of current timing analysis methods enabling them to work with more complex communication semantics. We demonstrate this transformation approach for concrete implementations of two communication semantics, namely, implicit and LET communication. We discuss the impact on end-to-end latencies and communication overheads based on a full blown engine management system

    Instant restore after a media failure

    Full text link
    Media failures usually leave database systems unavailable for several hours until recovery is complete, especially in applications with large devices and high transaction volume. Previous work introduced a technique called single-pass restore, which increases restore bandwidth and thus substantially decreases time to repair. Instant restore goes further as it permits read/write access to any data on a device undergoing restore--even data not yet restored--by restoring individual data segments on demand. Thus, the restore process is guided primarily by the needs of applications, and the observed mean time to repair is effectively reduced from several hours to a few seconds. This paper presents an implementation and evaluation of instant restore. The technique is incrementally implemented on a system starting with the traditional ARIES design for logging and recovery. Experiments show that the transaction latency perceived after a media failure can be cut down to less than a second and that the overhead imposed by the technique on normal processing is minimal. The net effect is that a few "nines" of availability are added to the system using simple and low-overhead software techniques

    ExplainIt! -- A declarative root-cause analysis engine for time series data (extended version)

    Full text link
    We present ExplainIt!, a declarative, unsupervised root-cause analysis engine that uses time series monitoring data from large complex systems such as data centres. ExplainIt! empowers operators to succinctly specify a large number of causal hypotheses to search for causes of interesting events. ExplainIt! then ranks these hypotheses, reducing the number of causal dependencies from hundreds of thousands to a handful for human understanding. We show how a declarative language, such as SQL, can be effective in declaratively enumerating hypotheses that probe the structure of an unknown probabilistic graphical causal model of the underlying system. Our thesis is that databases are in a unique position to enable users to rapidly explore the possible causal mechanisms in data collected from diverse sources. We empirically demonstrate how ExplainIt! had helped us resolve over 30 performance issues in a commercial product since late 2014, of which we discuss a few cases in detail.Comment: SIGMOD Industry Track 201

    High-Performance In-Memory OLTP via Coroutine-to-Transaction

    Get PDF
    Data stalls are a major overhead in main-memory database engines due to the use of pointer-rich data structures. Lightweight coroutines ease the implementation of software prefetching to hide data stalls by overlapping computation and asynchronous data prefetching. Prior solutions, however, mainly focused on (1) individual components and operations and (2) intra-transaction batching that requires interface changes, breaking backward compatibility. It was not clear how they apply to a full database engine and how much end-to-end benefit they bring under various workloads. This thesis presents CoroBase, a main-memory database engine that tackles these challenges with a new coroutine-to-transaction paradigm. Coroutine-to-transaction models transactions as coroutines and thus enables inter-transaction batching, avoiding application changes but retaining the benefits of prefetching. We show that on a 48-core server, CoroBase can perform close to 2× better for read-intensive workloads and remain competitive for workloads that inherently do not benefit from software prefetching

    Exploiting programmable architectures for WiFi/ZigBee inter-technology cooperation

    Get PDF
    The increasing complexity of wireless standards has shown that protocols cannot be designed once for all possible deployments, especially when unpredictable and mutating interference situations are present due to the coexistence of heterogeneous technologies. As such, flexibility and (re)programmability of wireless devices is crucial in the emerging scenarios of technology proliferation and unpredictable interference conditions. In this paper, we focus on the possibility to improve coexistence performance of WiFi and ZigBee networks by exploiting novel programmable architectures of wireless devices able to support run-time modifications of medium access operations. Differently from software-defined radio (SDR) platforms, in which every function is programmed from scratch, our programmable architectures are based on a clear decoupling between elementary commands (hard-coded into the devices) and programmable protocol logic (injected into the devices) according to which the commands execution is scheduled. Our contribution is two-fold: first, we designed and implemented a cross-technology time division multiple access (TDMA) scheme devised to provide a global synchronization signal and allocate alternating channel intervals to WiFi and ZigBee programmable nodes; second, we used the OMF control framework to define an interference detection and adaptation strategy that in principle could work in independent and autonomous networks. Experimental results prove the benefits of the envisioned solution
    corecore