36,202 research outputs found
Automated Verification of Specifications with Typestates and Access Permissions
We propose an approach to formally verify Plural specifications of concurrent programs based on access permissions and typestates, by model-checking automatically generated abstract state-machines. Our approach captures all possible relevant behaviors of abstract concurrent programs implementing the specification. We describe the formal methodology employed in our technique and provide an example as proof of concept for the state-machine construction rules. We implemented the fully automated algorithm to generate and verify models as a freely available plug-in of the Plural tool, called Pulse. We tested Pulse on the full specification of a Multi Threaded Task Server commercial application and showed that this approach scales well and is efficient in finding errors in specifications that could not be previously detected with the Data Flow Analysis (DFA) capabilities of Plural
A GPU-accelerated Branch-and-Bound Algorithm for the Flow-Shop Scheduling Problem
Branch-and-Bound (B&B) algorithms are time intensive tree-based exploration
methods for solving to optimality combinatorial optimization problems. In this
paper, we investigate the use of GPU computing as a major complementary way to
speed up those methods. The focus is put on the bounding mechanism of B&B
algorithms, which is the most time consuming part of their exploration process.
We propose a parallel B&B algorithm based on a GPU-accelerated bounding model.
The proposed approach concentrate on optimizing data access management to
further improve the performance of the bounding mechanism which uses large and
intermediate data sets that do not completely fit in GPU memory. Extensive
experiments of the contribution have been carried out on well known FSP
benchmarks using an Nvidia Tesla C2050 GPU card. We compared the obtained
performances to a single and a multithreaded CPU-based execution. Accelerations
up to x100 are achieved for large problem instances
Complexity Information Flow in a Multi-threaded Imperative Language
We propose a type system to analyze the time consumed by multi-threaded
imperative programs with a shared global memory, which delineates a class of
safe multi-threaded programs. We demonstrate that a safe multi-threaded program
runs in polynomial time if (i) it is strongly terminating wrt a
non-deterministic scheduling policy or (ii) it terminates wrt a deterministic
and quiet scheduling policy. As a consequence, we also characterize the set of
polynomial time functions. The type system presented is based on the
fundamental notion of data tiering, which is central in implicit computational
complexity. It regulates the information flow in a computation. This aspect is
interesting in that the type system bears a resemblance to typed based
information flow analysis and notions of non-interference. As far as we know,
this is the first characterization by a type system of polynomial time
multi-threaded programs
Static analysis of energy consumption for LLVM IR programs
Energy models can be constructed by characterizing the energy consumed by
executing each instruction in a processor's instruction set. This can be used
to determine how much energy is required to execute a sequence of assembly
instructions, without the need to instrument or measure hardware.
However, statically analyzing low-level program structures is hard, and the
gap between the high-level program structure and the low-level energy models
needs to be bridged. We have developed techniques for performing a static
analysis on the intermediate compiler representations of a program.
Specifically, we target LLVM IR, a representation used by modern compilers,
including Clang. Using these techniques we can automatically infer an estimate
of the energy consumed when running a function under different platforms, using
different compilers.
One of the challenges in doing so is that of determining an energy cost of
executing LLVM IR program segments, for which we have developed two different
approaches. When this information is used in conjunction with our analysis, we
are able to infer energy formulae that characterize the energy consumption for
a particular program. This approach can be applied to any languages targeting
the LLVM toolchain, including C and XC or architectures such as ARM Cortex-M or
XMOS xCORE, with a focus towards embedded platforms. Our techniques are
validated on these platforms by comparing the static analysis results to the
physical measurements taken from the hardware. Static energy consumption
estimation enables energy-aware software development, without requiring
hardware knowledge
Software Model Checking with Explicit Scheduler and Symbolic Threads
In many practical application domains, the software is organized into a set
of threads, whose activation is exclusive and controlled by a cooperative
scheduling policy: threads execute, without any interruption, until they either
terminate or yield the control explicitly to the scheduler. The formal
verification of such software poses significant challenges. On the one side,
each thread may have infinite state space, and might call for abstraction. On
the other side, the scheduling policy is often important for correctness, and
an approach based on abstracting the scheduler may result in loss of precision
and false positives. Unfortunately, the translation of the problem into a
purely sequential software model checking problem turns out to be highly
inefficient for the available technologies. We propose a software model
checking technique that exploits the intrinsic structure of these programs.
Each thread is translated into a separate sequential program and explored
symbolically with lazy abstraction, while the overall verification is
orchestrated by the direct execution of the scheduler. The approach is
optimized by filtering the exploration of the scheduler with the integration of
partial-order reduction. The technique, called ESST (Explicit Scheduler,
Symbolic Threads) has been implemented and experimentally evaluated on a
significant set of benchmarks. The results demonstrate that ESST technique is
way more effective than software model checking applied to the sequentialized
programs, and that partial-order reduction can lead to further performance
improvements.Comment: 40 pages, 10 figures, accepted for publication in journal of logical
methods in computer scienc
SInC: An accurate and fast error-model based simulator for SNPs, Indels and CNVs coupled with a read generator for short-read sequence data
We report SInC (SNV, Indel and CNV) simulator and read generator, an
open-source tool capable of simulating biological variants taking into account
a platform-specific error model. SInC is capable of simulating and generating
single- and paired-end reads with user-defined insert size with high efficiency
compared to the other existing tools. SInC, due to its multi-threaded
capability during read generation, has a low time footprint. SInC is currently
optimised to work in limited infrastructure setup and can efficiently exploit
the commonly used quad-core desktop architecture to simulate short sequence
reads with deep coverage for large genomes. Sinc can be downloaded from
https://sourceforge.net/projects/sincsimulator/
BarrierPoint: sampled simulation of multi-threaded applications
Sampling is a well-known technique to speed up architectural simulation of long-running workloads while maintaining accurate performance predictions. A number of sampling techniques have recently been developed that extend well- known single-threaded techniques to allow sampled simulation of multi-threaded applications. Unfortunately, prior work is limited to non-synchronizing applications (e.g., server throughput workloads); requires the functional simulation of the entire application using a detailed cache hierarchy which limits the overall simulation speedup potential; leads to different units of work across different processor architectures which complicates performance analysis; or, requires massive machine resources to achieve reasonable simulation speedups. In this work, we propose BarrierPoint, a sampling methodology to accelerate simulation by leveraging globally synchronizing barriers in multi-threaded applications. BarrierPoint collects microarchitecture-independent code and data signatures to determine the most representative inter-barrier regions, called barrierpoints. BarrierPoint estimates total application execution time (and other performance metrics of interest) through detailed simulation of these barrierpoints only, leading to substantial simulation speedups. Barrierpoints can be simulated in parallel, use fewer simulation resources, and define fixed units of work to be used in performance comparisons across processor architectures. Our evaluation of BarrierPoint using NPB and Parsec benchmarks reports average simulation speedups of 24.7x (and up to 866.6x) with an average simulation error of 0.9% and 2.9% at most. On average, BarrierPoint reduces the number of simulation machine resources needed by 78x
- …