11,045 research outputs found
Static analysis of energy consumption for LLVM IR programs
Energy models can be constructed by characterizing the energy consumed by
executing each instruction in a processor's instruction set. This can be used
to determine how much energy is required to execute a sequence of assembly
instructions, without the need to instrument or measure hardware.
However, statically analyzing low-level program structures is hard, and the
gap between the high-level program structure and the low-level energy models
needs to be bridged. We have developed techniques for performing a static
analysis on the intermediate compiler representations of a program.
Specifically, we target LLVM IR, a representation used by modern compilers,
including Clang. Using these techniques we can automatically infer an estimate
of the energy consumed when running a function under different platforms, using
different compilers.
One of the challenges in doing so is that of determining an energy cost of
executing LLVM IR program segments, for which we have developed two different
approaches. When this information is used in conjunction with our analysis, we
are able to infer energy formulae that characterize the energy consumption for
a particular program. This approach can be applied to any languages targeting
the LLVM toolchain, including C and XC or architectures such as ARM Cortex-M or
XMOS xCORE, with a focus towards embedded platforms. Our techniques are
validated on these platforms by comparing the static analysis results to the
physical measurements taken from the hardware. Static energy consumption
estimation enables energy-aware software development, without requiring
hardware knowledge
Symbolic Partial-Order Execution for Testing Multi-Threaded Programs
We describe a technique for systematic testing of multi-threaded programs. We
combine Quasi-Optimal Partial-Order Reduction, a state-of-the-art technique
that tackles path explosion due to interleaving non-determinism, with symbolic
execution to handle data non-determinism. Our technique iteratively and
exhaustively finds all executions of the program. It represents program
executions using partial orders and finds the next execution using an
underlying unfolding semantics. We avoid the exploration of redundant program
traces using cutoff events. We implemented our technique as an extension of
KLEE and evaluated it on a set of large multi-threaded C programs. Our
experiments found several previously undiscovered bugs and undefined behaviors
in memcached and GNU sort, showing that the new method is capable of finding
bugs in industrial-size benchmarks.Comment: Extended version of a paper presented at CAV'2
Recommended from our members
Computing infrastructure issues in distributed communications systems : a survey of operating system transport system architectures
The performance of distributed applications (such as file transfer, remote login, tele-conferencing, full-motion video, and scientific visualization) is influenced by several factors that interact in complex ways. In particular, application performance is significantly affected both by communication infrastructure factors and computing infrastructure factors. Several communication infrastructure factors include channel speed, bit-error rate, and congestion at intermediate switching nodes. Computing infrastructure factors include (among other things) both protocol processing activities (such as connection management, flow control, error detection, and retransmission) and general operating system factors (such as memory latency, CPU speed, interrupt and context switching overhead, process architecture, and message buffering). Due to a several orders of magnitude increase in network channel speed and an increase in application diversity, performance bottlenecks are shifting from the network factors to the transport system factors.This paper defines an abstraction called an "Operating System Transport System Architecture" (OSTSA) that is used to classify the major components and services in the computing infrastructure. End-to-end network protocols such as TCP, TP4, VMTP, XTP, and Delta-t typically run on general-purpose computers, where they utilize various operating system resources such as processors, virtual memory, and network controllers. The OSTSA provides services that integrate these resources to support distributed applications running on local and wide area networks.A taxonomy is presented to evaluate OSTSAs in terms of their support for protocol processing activities. We use this taxonomy to compare and contrast five general-purpose commercial and experimental operating systems including System V UNIX, BSD UNIX, the x-kernel, Choices, and Xinu
Formally based semi-automatic implementation of an open security protocol
International audienceThis paper presents an experiment in which an implementation of the client side of the SSH Transport Layer Protocol (SSH-TLP) was semi-automatically derived according to a model-driven development paradigm that leverages formal methods in order to obtain high correctness assurance. The approach used in the experiment starts with the formalization of the protocol at an abstract level. This model is then formally proved to fulfill the desired secrecy and authentication properties by using the ProVerif prover. Finally, a sound Java implementation is semi-automatically derived from the verified model using an enhanced version of the Spi2Java framework. The resulting implementation correctly interoperates with third party servers, and its execution time is comparable with that of other manually developed Java SSH-TLP client implementations. This case study demonstrates that the adopted model-driven approach is viable even for a real security protocol, despite the complexity of the models needed in order to achieve an interoperable implementation
Energy Transparency for Deeply Embedded Programs
Energy transparency is a concept that makes a program's energy consumption
visible, from hardware up to software, through the different system layers.
Such transparency can enable energy optimizations at each layer and between
layers, and help both programmers and operating systems make energy-aware
decisions. In this paper, we focus on deeply embedded devices, typically used
for Internet of Things (IoT) applications, and demonstrate how to enable energy
transparency through existing Static Resource Analysis (SRA) techniques and a
new target-agnostic profiling technique, without hardware energy measurements.
Our novel mapping technique enables software energy consumption estimations at
a higher level than the Instruction Set Architecture (ISA), namely the LLVM
Intermediate Representation (IR) level, and therefore introduces energy
transparency directly to the LLVM optimizer. We apply our energy estimation
techniques to a comprehensive set of benchmarks, including single- and also
multi-threaded embedded programs from two commonly used concurrency patterns,
task farms and pipelines. Using SRA, our LLVM IR results demonstrate a high
accuracy with a deviation in the range of 1% from the ISA SRA. Our profiling
technique captures the actual energy consumption at the LLVM IR level with an
average error of 3%.Comment: 33 pages, 7 figures. arXiv admin note: substantial text overlap with
arXiv:1510.0709
Automatic Software Repair: a Bibliography
This article presents a survey on automatic software repair. Automatic
software repair consists of automatically finding a solution to software bugs
without human intervention. This article considers all kinds of repairs. First,
it discusses behavioral repair where test suites, contracts, models, and
crashing inputs are taken as oracle. Second, it discusses state repair, also
known as runtime repair or runtime recovery, with techniques such as checkpoint
and restart, reconfiguration, and invariant restoration. The uniqueness of this
article is that it spans the research communities that contribute to this body
of knowledge: software engineering, dependability, operating systems,
programming languages, and security. It provides a novel and structured
overview of the diversity of bug oracles and repair operators used in the
literature
SICStus MT - A Multithreaded Execution Environment for SICStus Prolog
The development of intelligent software agents and other
complex applications which continuously interact with their
environments has been one of the reasons why explicit concurrency has
become a necessity in a modern Prolog system today. Such applications
need to perform several tasks which may be very different with respect
to how they are implemented in Prolog. Performing these tasks
simultaneously is very tedious without language support.
This paper describes the design, implementation and evaluation of a
prototype multithreaded execution environment for SICStus Prolog. The
threads are dynamically managed using a small and compact set of
Prolog primitives implemented in a portable way, requiring almost no
support from the underlying operating system
Massively Parallel Computation Using Graphics Processors with Application to Optimal Experimentation in Dynamic Control
The rapid increase in the performance of graphics hardware, coupled with recent improvements in its programmability has lead to its adoption in many non-graphics applications, including wide variety of scientific computing fields. At the same time, a number of important dynamic optimal policy problems in economics are athirst of computing power to help overcome dual curses of complexity and dimensionality. We investigate if computational economics may benefit from new tools on a case study of imperfect information dynamic programming problem with learning and experimentation trade-off that is, a choice between controlling the policy target and learning system parameters. Specifically, we use a model of active learning and control of linear autoregression with unknown slope that appeared in a variety of macroeconomic policy and other contexts. The endogeneity of posterior beliefs makes the problem difficult in that the value function need not be convex and policy function need not be continuous. This complication makes the problem a suitable target for massively-parallel computation using graphics processors. Our findings are cautiously optimistic in that new tools let us easily achieve a factor of 15 performance gain relative to an implementation targeting single-core processors and thus establish a better reference point on the computational speed vs. coding complexity trade-off frontier. While further gains and wider applicability may lie behind steep learning barrier, we argue that the future of many computations belong to parallel algorithms anyway.Graphics Processing Units, CUDA programming, Dynamic programming, Learning, Experimentation
- …