369 research outputs found

    SIMD based multicore processor for image and video processing

    Get PDF
    制度:新 ; 報告番号:甲3602号 ; 学位の種類:博士(工学) ; 授与年月日:2012/3/15 ; 早大学位記番号:新595

    Formal methods for functional verification of cache-coherent systems-on-chip

    Get PDF
    State-of-the-art System-on-Chip (SoC) architectures integrate many different components, such as processors, accelerators, memories, and I/O blocks. Some of those components, but not all, may have caches. Because the effort of validation with simulation-based techniques, currently used in industry, grows exponentially with the complexity of the SoC, this thesis investigates the use of formal verification techniques in this context. More precisely, we use the CADP toolbox to develop and validate a generic formal model of a heterogeneous cache-coherent SoC compliant with the recent AMBA 4 ACE specification proposed by ARM. We use a constraint-oriented specification style to model the general requirements of the specification. We verify system properties on both the constrained and unconstrained model to detect the cache coherency corner cases. We take advantage of the parametrization of the proposed model to produce a comprehensive set of counterexamples of non-satisfied properties in the unconstrained model. The results of formal verification are then used to improve the industrial simulation-based verification techniques in two aspects. On the one hand, we suggest using the formal model to assess the sanity of an interface verification unit. On the other hand, in order to generate clever semi-directed test cases from temporal logic properties, we propose a two-step approach. One step consists in generating system-level abstract test cases using model-based testing tools of the CADP toolbox. The other step consists in refining those tests into interface-level concrete test cases that can be executed at RTL level with a commercial Coverage-Directed Test Generation tool. We found that our approach helps in the transition between interface-level and system-level verification, facilitates the validation of system-level properties, and enables early detection of bugs in both the SoC and the commercial test-bench.Les architectures des systèmes sur puce (System-on-Chip, SoC) actuelles intègrent de nombreux composants différents tels que les processeurs, les accélérateurs, les mémoires et les blocs d'entrée/sortie, certains pouvant contenir des caches. Vu que l'effort de validation basée sur la simulation, actuellement utilisée dans l'industrie, croît de façon exponentielle avec la complexité des SoCs, nous nous intéressons à des techniques de vérification formelle. Nous utilisons la boîte à outils CADP pour développer et valider un modèle formel d'un SoC générique conforme à la spécification AMBA 4 ACE récemment proposée par ARM dans le but de mettre en œuvre la cohérence de cache au niveau système. Nous utilisons une spécification orientée contraintes pour modéliser les exigences générales de cette spécification. Les propriétés du système sont vérifié à la fois sur le modèle avec contraintes et le modèle sans contraintes pour détecter les cas intéressants pour la cohérence de cache. La paramétrisation du modèle proposé a permis de produire l'ensemble complet des contre-exemples qui ne satisfont pas une certaine propriété dans le modèle non contraint. Notre approche améliore les techniques industrielles de vérification basées sur la simulation en deux aspects. D'une part, nous suggérons l'utilisation du modèle formel pour évaluer la bonne construction d'une unité de vérification d'interface. D'autre part, dans l'objectif de générer des cas de test semi-dirigés intelligents à partir des propriétés de logique temporelle, nous proposons une approche en deux étapes. La première étape consiste à générer des cas de tests abstraits au niveau système en utilisant des outils de test basé sur modèle de la boîte à outils CADP. La seconde étape consiste à affiner ces tests en cas de tests concrets au niveau de l'interface qui peuvent être exécutés en RTL grâce aux services d'un outil commercial de génération de tests dirigés par les mesures de couverture. Nous avons constaté que notre approche participe dans la transition entre la vérification du niveau interface, classiquement pratiquée dans l'industrie du matériel, et la vérification au niveau système. Notre approche facilite aussi la validation des propriétés globales du système, et permet une détection précoce des bugs, tant dans le SoC que dans les bancs de test commerciales

    Formal Verification of a MESI-based Cache Implementation

    Get PDF
    Cache coherency is crucial to multi-core systems with a shared memory programming model. Coherency protocols have been formally verified at the architectural level with relative ease. However, several subtle issues creep into the hardware realization of cache in a multi-processor environment. The assumption, made in the abstract model, that state transitions are atomic, is invalid for the HDL implementation. Each transition is composed of many concurrent multi-core operations. As a result, even with a blocking bus, several transient states come into existence. Most modern processors optimize communication with a split-transaction bus, this results in further transient states and race conditions. Therefore, the design and verification of cache coherency is increasingly complex and challenging. Simulation techniques are insufficient to ensure memory consistency and the absence of deadlock, livelock, and starvation. At best, it is tediously complex and time consuming to reach confidence in functionality with simulation. Formal methods are ideally suited to identify the numerous race conditions and subtle failures. In this study, we perform formal property verification on the RTL of a multi-core level-1 cache design based on snooping MESI protocol. We demonstrate full-proof verification of the coherence module in JasperGold using complexity reduction techniques through parameterization. We verify that the assumptions needed to constrain inputs of the stand-alone cache coherence module are satisfied as valid assertions in the instantiation environment. We compare results obtained from formal property verification against a state-of-the-art UVM environment. We highlight the benefits of a synergistic collaboration between simulation and formal techniques. We present formal analysis as a generic toolkit with numerous usage models in the digital design process

    Scalably Verifiable Cache Coherence

    Get PDF
    <p>The correctness of a cache coherence protocol is crucial to the system since a subtle bug in the protocol may lead to disastrous consequences. However, the verification of a cache coherence protocol is never an easy task due to the complexity of the protocol. Moreover, as more and more cores are compressed into a single chip, there is an urge for the cache coherence protocol to have higher performance, lower power consumption, and less storage overhead. People perform various optimizations to meet these goals, which unfortunately, further exacerbate the verification problem. The current situation is that there are no efficient and universal methods for verifying a realistic cache coherence protocol for a many-core system. </p><p>We, as architects, believe that we can alleviate the verification problem by changing the traditional design paradigm. We suggest taking verifiability as a first-class design constraint, just as we do with other traditional metrics, such as performance, power consumption, and area overhead. To do this, we need to incorporate verification effort in the early design stage of a cache coherence protocol and make wise design decisions regarding the verifiability. Such a protocol will be amenable to verification and easier to be verified in a later stage. Specifically, we propose two methods in this thesis for designing scalably verifiable cache coherence protocols. </p><p>The first method is Fractal Coherence, targeting verifiable hierarchical protocols. Fractal Coherence leverages the fractal idea to design a cache coherence protocol. The self-similarity of the fractal enables the inductive verification of the protocol. Such a verification process is independent of the number of nodes and thus is scalable. We also design example protocols to show that Fractal Coherence protocols can attain comparable performance compared to a traditional snooping or directory protocol. </p><p>As a system scales hierarchically, Fractal Coherence can perfectly solve the verification problem of the implemented cache coherence protocol. However, Fractal Coherence cannot help if the system scales horizontally. Therefore, we propose the second method, PVCoherence, targeting verifiable flat protocols. PVCoherence is based on parametric verification, a widely used method for verifying the coherence of a flat protocol with infinite number of nodes. PVCoherence captures the fundamental requirements and limitations of parametric verification and proposes a set of guidelines for designing cache coherence protocols that are compatible with parametric verification. As long as designers follow these guidelines, their protocols can be easily verified. </p><p>We further show that Fractal Coherence and PVCoherence can also facilitate the verification of memory consistency, another extremely challenging problem. One piece of previous work proves that the verification of memory consistency can be decomposed into three steps. The most complex and non-scalable step is the verification of the cache coherence protocol. If we design the protocol following the design methodology of Fractal Coherence or PVCoherence, we can easily verify the cache coherence protocol and overcome the biggest obstacle in the verification of memory consistency. </p><p>As system expands and cache coherence protocols get more complex, the verification problem of the protocol becomes more prominent. We believe it is time to reconsider the traditional design flow in which verification is totally separated from the design stage. We show that by incorporating the verifiability in the early design stage and designing protocols to be scalably verifiable in the first place, we can greatly reduce the burden of verification. Meanwhile, we perform various experiments and show that we do not lose benefits in performance as well as in other metrics when we obtain the correctness guarantee.</p>Dissertatio

    Programming Languages and Systems

    Get PDF
    This open access book constitutes the proceedings of the 29th European Symposium on Programming, ESOP 2020, which was planned to take place in Dublin, Ireland, in April 2020, as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2020. The actual ETAPS 2020 meeting was postponed due to the Corona pandemic. The papers deal with fundamental issues in the specification, design, analysis, and implementation of programming languages and systems

    Studying and Analysing Transactional Memory Using Interval Temporal Logic and AnaTempura

    Get PDF
    Transactional memory (TM) is a promising lock-free synchronisation technique which offers a high-level abstract parallel programming model for future chip multiprocessor (CMP) systems. Moreover, it adapts the well-established popular paradigm of transactions and thus provides a general and flexible way to allow programs to read and modify disparate memory locations atomically as a single operation. In this thesis, we propose a general framework for validating a TM design, starting from a formal specification into a hardware implementation, with its underpinning theory and refinement. A methodology in this work starts with a high-level and executable specification model for an abstract TM with verification for various correctness conditions of concurrent transactions. This model is constructed within a flexible transition framework that allows verifying correctness of a TM system with animation. Then, we present a formal executable specification for a chip-dual single-cycle MIPS processor with a cache coherence protocol and integrate the provable TM system. Finally, we transform the dual processors with the TM from a high-level description into a Hardware Description Language (VHDL), using some proposed refinement and restriction rules. Interval Temporal Logic (ITL) and its programming language subset AnaTempura are used to build, execute and test the model, since they together provide a powerful framework supporting logical reasoning about time intervals as well as programming and simulation

    On a Multiprocessor Computer Farm for Online Physics Data Processing

    Get PDF
    The topic of this thesis is the design-phase performance evaluation of a large multiprocessor (MP) computer farm intended for the on-line data processing of the Compact Muon Solenoid (CMS) experiment. CMS is a high energy Physics experiment, planned to operate at CERN (Geneva, Switzerland) during the year 2005. The CMS computer farm is consisting of 1,000 MP computer systems and a 1,000 X 1,000 communications switch. The followed approach to the farm performance evaluation is through simulation studies and evaluation of small prototype systems building blocks of the farm. For the purposes of the simulation studies, we have developed a discrete-event, event-driven simulator that is capable to describe the high-level architecture of the farm and give estimates of the farm's performance. The simulator is designed in a modular way to facilitate the development of various modules that model the behavior of the farm building blocks in the desired level of detail. With the aid of this simulator, we make a particular study on the scheduling of the nodes of the farm, showing that a preemptive scheduling can increase farm's throughput. We have developed a prototype setup of a farm node an event filter unit. The setup consists of a high performance MP system (the farm node) connected to a second computer system (used to emulate the data sources) through an ATM network. The performance issues of interfacing a network interface controller (NIC) to the application running in the farm node, are explored. It is shown with the aid of this setup, that the switch-to-farm interface (SFI) a device used to put together the incoming data fragments into a single entity can be entirely avoided by emulating its function in software. We show that in order to meet the required event assembly performance in the filter node inputs, the development effort has to concentrate on the NIC hardware, software and its interface to the application, rather than building a custom designed device specialized to perform the task of event assembly. Finally, the farm scaling issues are investigated. Our aim is to obtain an "operational region" inside the farm configuration space, when the various networking speeds are taken into account. Analytically obtained results that have been confirmed with the above mentioned simulator, are discussed. We present also results showing the influence 8 of the inherent to the farm parameters (like the algorithm rejection factor) on the requirements for the farm building blocks (sustained I/O bandwidth) of the inherent to the farm parameters (like the algorithm rejection factor) on the requirements for the farm building blocks (sustained I/O bandwidth)

    McVerSi: A Test Generation Framework for Fast Memory Consistency Verification in Simulation

    Get PDF

    Design and implementation of a multi-purpose cluster system NIU

    Get PDF
    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1999.Includes bibliographical references (p. 209-221).by Boon Seong Ang.Ph.D
    corecore