17,657 research outputs found

    Using Flow Specifications of Parameterized Cache Coherence Protocols for Verifying Deadlock Freedom

    We consider the problem of verifying deadlock freedom for symmetric cache coherence protocols. In particular, we focus on a specific form of deadlock which is useful for the cache coherence protocol domain and consistent with the internal definition of deadlock in the Murphi model checker: we refer to this deadlock as a system-wide deadlock (s-deadlock). In s-deadlock, the entire system gets blocked and is unable to make any transition. Cache coherence protocols consist of N symmetric cache agents, where N is an unbounded parameter; thus the verification of s-deadlock freedom is naturally a parameterized verification problem. Parametrized verification techniques work by using sound abstractions to reduce the unbounded model to a bounded model. Efficient abstractions which work well for industrial scale protocols typically bound the model by replacing the state of most of the agents by an abstract environment, while keeping just one or two agents as is. However, leveraging such efficient abstractions becomes a challenge for s-deadlock: a violation of s-deadlock is a state in which the transitions of all of the unbounded number of agents cannot occur and so a simple abstraction like the one above will not preserve this violation. In this work we address this challenge by presenting a technique which leverages high-level information about the protocols, in the form of message sequence diagrams referred to as flows, for constructing invariants that are collectively stronger than s-deadlock. Efficient abstractions can be constructed to verify these invariants. We successfully verify the German and Flash protocols using our technique.

    Design tradeoffs for simplicity and efficient verification in the Execution Migration Machine

    As transistor technology continues to scale, the architecture community has experienced exponential growth in design complexity and significantly increasing implementation and verification costs. Moreover, Moore's law has led to a ubiquitous trend of an increasing number of cores on a single chip. Often, these large-core-count chips provide a shared memory abstraction via directories and coherence protocols, which have become notoriously error-prone and difficult to verify because of subtle data races and state space explosion. Although a very simple hardware shared memory implementation can be achieved by simply not allowing ad-hoc data replication and relying on remote accesses for remotely cached data (i.e., requiring no directories or coherence protocols), such remote-access-based directoryless architectures cannot take advantage of any data locality, and therefore suffer in both performance and energy. Our recently taped-out 110-core shared-memory processor, the Execution Migration Machine (EM²), establishes a new design point. On the one hand, EM² supports shared memory but does not automatically replicate data, and thus preserves the simplicity of directoryless architectures. On the other hand, it significantly improves performance and energy over remote-access-only designs by exploiting data locality at remote cores via fast hardware-level thread migration. In this paper, we describe the design choices made in the EM² chip as well as our choice of design methodology, and discuss how they combine to achieve design simplicity and verification efficiency. Even though EM² is a fairly large design (110 cores using a total of 357 million transistors), the entire chip design and implementation process (RTL, verification, physical design, tapeout) took only 18 man-months.

    Interfering trajectories in experimental quantum-enhanced stochastic simulation

    Simulations of stochastic processes play an important role in the quantitative sciences, enabling the characterisation of complex systems. Recent work has established a quantum advantage in stochastic simulation, leading to quantum devices that execute a simulation using less memory than possible by classical means. To realise this advantage it is essential that the memory register remains coherent, and coherently interacts with the processor, allowing the simulator to operate over many time steps. Here we report a multi-time-step experimental simulation of a stochastic process using less memory than the classical limit. A key feature of the photonic quantum information processor is that it creates a quantum superposition of all possible future trajectories that the system can evolve into. This superposition allows us to introduce, and demonstrate, the idea of comparing statistical futures of two classical processes via quantum interference. We demonstrate interference of two 16-dimensional quantum states, representing statistical futures of our process, with a visibility of 0.96 ± 0.02. Comment: 9 pages, 5 figure

    Reversibility in Massive Concurrent Systems

    Reversing a (forward) computation history means undoing the history. In concurrent systems, undoing the history is not performed in a deterministic way but in a causally consistent fashion, where states that are reached during a backward computation are states that could have been reached during the computation history by just performing independent actions in a different order. Comment: Presented at MeCBIC 201

    Predicate Abstraction with Indexed Predicates

    Predicate abstraction provides a powerful tool for verifying properties of infinite-state systems using a combination of a decision procedure for a subset of first-order logic and symbolic methods originally developed for finite-state model checking. We consider models containing first-order state variables, where the system state includes mutable functions and predicates. Such a model can describe systems containing arbitrarily large memories, buffers, and arrays of identical processes. We describe a form of predicate abstraction that constructs a formula over a set of universally quantified variables to describe invariant properties of the first-order state variables. We provide a formal justification of the soundness of our approach and describe how it has been used to verify several hardware and software designs, including a directory-based cache coherence protocol. Comment: 27 pages, 4 figures, 1 table, short version appeared in International Conference on Verification, Model Checking and Abstract Interpretation (VMCAI'04), LNCS 2937, pages = 267--28

    Entanglement of spin waves among four quantum memories

    Quantum networks are composed of quantum nodes that interact coherently by way of quantum channels and open a broad frontier of scientific opportunities. For example, a quantum network can serve as a 'web' for connecting quantum processors for computation and communication, as well as a 'simulator' for enabling investigations of quantum critical phenomena arising from interactions among the nodes mediated by the channels. The physical realization of quantum networks generically requires dynamical systems capable of generating and storing entangled states among multiple quantum memories, and of efficiently transferring stored entanglement into quantum channels for distribution across the network. While such capabilities have been demonstrated for diverse bipartite systems (i.e., N = 2 quantum systems), entangled states with N > 2 have heretofore not been achieved for quantum interconnects that coherently 'clock' multipartite entanglement stored in quantum memories to quantum channels. Here, we demonstrate high-fidelity measurement-induced entanglement stored in four atomic memories; user-controlled, coherent transfer of atomic entanglement to four photonic quantum channels; and the characterization of the full quadripartite entanglement by way of quantum uncertainty relations. Our work thereby provides an important tool for the distribution of multipartite entanglement across quantum networks. Comment: 4 figure