Parallel symbolic state-space exploration is difficult, but what is the alternative?
State-space exploration is an essential step in many modeling and analysis
problems. Its goal is to find the states reachable from the initial state of a
given discrete-state model. The state space can then be used to answer important
questions, e.g., "Is there a dead state?" and "Can N become negative?", or as a
starting point for sophisticated investigations expressed in temporal logic.
Unfortunately, the state space is often so large that ordinary explicit data
structures and sequential algorithms cannot cope, prompting the exploration of
(1) parallel approaches using multiple processors, from simple workstation
networks to shared-memory supercomputers, to satisfy large memory and runtime
requirements and (2) symbolic approaches using decision diagrams to encode the
large structured sets and relations manipulated during state-space generation.
Both approaches have merits and limitations. Parallel explicit state-space
generation is challenging, but almost linear speedup can be achieved; however,
the analysis is ultimately limited by the memory and processors available.
Symbolic methods are a heuristic that can efficiently encode many, but not all,
functions over a structured and exponentially large domain; here the pitfalls
are subtler: their performance varies widely depending on the class of decision
diagram chosen, the state variable order, and obscure algorithmic parameters.
As symbolic approaches are often much more efficient than explicit ones for
many practical models, we argue for the need to parallelize symbolic
state-space generation algorithms, so that we can realize the advantage of both
approaches. This is a challenging endeavor, as the most efficient symbolic
algorithm, Saturation, is inherently sequential. We conclude by discussing
challenges, efforts, and promising directions toward this goal.
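To make the contrast concrete, the explicit approach the abstract describes can be sketched as a breadth-first reachability search; the toy counter model, the `successors` interface, and the bound of 3 below are illustrative assumptions, not part of the original work:

```python
from collections import deque

def reachable_states(initial, successors):
    """Explicit breadth-first state-space exploration:
    collect every state reachable from `initial`."""
    seen = {initial}
    frontier = deque([initial])
    while frontier:
        state = frontier.popleft()
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen

# Toy model: a counter N in {0..3} that may increment or decrement.
def step(n):
    return [m for m in (n - 1, n + 1) if 0 <= m <= 3]

states = reachable_states(0, step)
assert states == {0, 1, 2, 3}   # every counter value is reachable
assert min(states) >= 0         # "Can N become negative?" -- no
```

Symbolic methods replace the explicit `seen` set with a decision diagram encoding of the state set, which is what makes them hard to parallelize but often far more memory-efficient.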
A Short Counterexample Property for Safety and Liveness Verification of Fault-tolerant Distributed Algorithms
Distributed algorithms have many mission-critical applications ranging from
embedded systems and replicated databases to cloud computing. Due to
asynchronous communication, process faults, or network failures, these
algorithms are difficult to design and verify. Many algorithms achieve fault
tolerance by using threshold guards that, for instance, ensure that a process
waits until it has received an acknowledgment from a majority of its peers.
Consequently, domain-specific languages for fault-tolerant distributed systems
offer language support for threshold guards.
We introduce an automated method for model checking of safety and liveness of
threshold-guarded distributed algorithms in systems where the number of
processes and the fraction of faulty processes are parameters. Our method is
based on a short counterexample property: if a distributed algorithm violates a
temporal specification (in a fragment of LTL), then there is a counterexample
whose length is bounded and independent of the parameters. We prove this
property by (i) characterizing executions depending on the structure of the
temporal formula, and (ii) using commutativity of transitions to accelerate and
shorten executions. We extended the ByMC toolset (Byzantine Model Checker) with
our technique, and verified liveness and safety of 10 prominent fault-tolerant
distributed algorithms, most of which were out of reach for existing
techniques.
Comment: 16 pages, 11-page appendix
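The threshold-guard pattern described above can be sketched in a few lines; the function name and the specific majority threshold are illustrative assumptions, and real algorithms also use thresholds such as n - f or 2f + 1 depending on the fault model:

```python
def threshold_guard(acks_received, n):
    """A majority threshold guard: the process may proceed once it has
    received acknowledgments from more than half of the n processes.
    (Illustrative sketch; not a specific algorithm from the paper.)"""
    return acks_received > n // 2

# With n = 7 processes, the guard opens at the 4th acknowledgment.
assert not threshold_guard(3, 7)
assert threshold_guard(4, 7)
```

Parameterized verification must show correctness for every n and every admissible number of faulty processes, which is why a parameter-independent bound on counterexample length is so useful.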
Specifying and reasoning about concurrent systems in logic
A Scalable Parallel Architecture with FPGA-Based Network Processor for Scientific Computing
This thesis discusses the design and implementation of an FPGA-based
Network Processor for scientific computing, such as Lattice Quantum Chromodynamics
(LQCD) and fluid-dynamics applications based on Lattice Boltzmann
Methods (LBM). State-of-the-art programs in these (and similar)
application areas exhibit a large degree of parallelism that can be easily
exploited on massively parallel systems, provided the underlying communication
network offers not only high bandwidth but also low latency.
I have designed in detail, built, and tested in hardware, firmware, and
software an implementation of a Network Processor (NWP), tailored for the most
recent families of multi-core processors. The implementation has been developed
on an FPGA device to easily interface the NWP logic with the CPU
I/O sub-system.
In this work I have assessed several ways to move data between the main
memory of the CPU and the I/O sub-system to exploit high data throughput
and low latency, enabling the use of “Programmed Input Output” (PIO),
“Direct Memory Access” (DMA), and “Write Combining” memory settings.
On the software side, I developed and tested a device driver for the Linux
operating system to access the NWP device, as well as a system library to
efficiently access the network device from user-applications.
This thesis demonstrates the feasibility of a network infrastructure that
saturates the maximum bandwidth of the I/O sub-systems available on recent
CPUs, and reduces communication latencies to values very close to those
needed by the processor to move data across the chip boundary.
Fifty years of Hoare's Logic
We present a history of Hoare's logic.
Comment: 79 pages. To appear in Formal Aspects of Computing