
    A Framework for Formal Verification of DRAM Controllers

    The large number of recent JEDEC DRAM standard releases and their growing feature sets make it difficult for designers to rapidly upgrade memory controller IPs to each new standard. Hardware verification in particular is challenging, because standards such as DDR5, LPDDR5 or HBM3 have far more complex protocols than their predecessors. With traditional simulation-based verification it is laborious to guarantee coverage of all possible states, especially for control-flow-rich memory controllers, which directly impacts time-to-market. A promising alternative is formal verification, because it can ensure protocol compliance based on mathematical proofs. However, no fully automated verification process for memory controllers has yet been presented in the state of the art, which means there is still a potential risk of human error. In this paper we present a framework that automatically generates SystemVerilog Assertions for a DRAM protocol. In addition, we show how the framework can be used efficiently for different tasks of memory controller development. Comment: ACM/IEEE International Symposium on Memory Systems (MEMSYS 2022).
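
    As a rough illustration of the generation step, the sketch below renders declarative timing rules as SystemVerilog concurrent assertions. This is a minimal sketch: the rule format, the signal names (clk, rst_n, act_cmd, rd_cmd, pre_cmd) and the assertion template are hypothetical stand-ins, not the paper's actual framework, which derives its assertions from a complete DRAM protocol description.

        # Hypothetical sketch: render declarative DRAM timing rules as
        # SystemVerilog concurrent assertions. Rule names follow JEDEC
        # timing parameters; signal names and the template are invented.
        from dataclasses import dataclass

        @dataclass
        class TimingRule:
            name: str        # JEDEC timing parameter, e.g. "tRCD"
            first: str       # signal pulsed by the first command
            second: str      # signal pulsed by the dependent command
            min_cycles: int  # minimum separation in clock cycles

        def to_sva(rule: TimingRule) -> str:
            """Render one timing rule as a SystemVerilog assertion string."""
            return (
                f"property p_{rule.name};\n"
                f"  @(posedge clk) disable iff (!rst_n)\n"
                f"  {rule.first} |-> !{rule.second}[*{rule.min_cycles}];\n"
                f"endproperty\n"
                f"a_{rule.name}: assert property (p_{rule.name});"
            )

        rules = [TimingRule("tRCD", "act_cmd", "rd_cmd", 18),
                 TimingRule("tRP", "pre_cmd", "act_cmd", 18)]
        print("\n\n".join(to_sva(r) for r in rules))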

    A Petri Net Model for Evaluating Packet Buffering Strategies in a Network Processor

    TokenPasser: A petri net specification tool

    In computer program design it is essential to know how effective different design options are at improving performance and dependability. This paper describes a CAD tool for distributed hierarchical Petri nets. After a brief review of Petri nets, Petri net languages and Petri net transducers, and descriptions of several current Petri net tools, the specification and design of the TokenPasser tool are presented. TokenPasser is a tool for designing distributed hierarchical systems based on Petri nets. A case study for an intelligent robotic system is conducted: a coordination structure with one dispatcher controlling three coordinators is built to model a proposed robotic assembly system. The system is implemented using TokenPasser, and the results are analyzed to allow an assessment of the tool.
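
    As background for what such a tool manipulates, here is a minimal Petri net interpreter in Python showing the token-passing semantics. The dispatcher/coordinator place names echo the case study's coordination structure, but the data layout is purely illustrative and is not TokenPasser's actual model.

        # Minimal marked Petri net: places hold tokens, a transition fires by
        # consuming a token from each input place and producing one in each
        # output place. Names below are illustrative only.
        from typing import Dict, List, Tuple

        Marking = Dict[str, int]                  # place name -> token count
        Transition = Tuple[List[str], List[str]]  # (input places, output places)

        def enabled(t: Transition, m: Marking) -> bool:
            """A transition is enabled if every input place holds a token."""
            inputs, _ = t
            return all(m.get(p, 0) >= 1 for p in inputs)

        def fire(t: Transition, m: Marking) -> Marking:
            """Fire an enabled transition, returning the successor marking."""
            inputs, outputs = t
            m = dict(m)
            for p in inputs:
                m[p] -= 1
            for p in outputs:
                m[p] = m.get(p, 0) + 1
            return m

        # One dispatcher handing a task to one of three coordinators.
        dispatch = (["dispatcher"], ["coordinator1"])
        m0 = {"dispatcher": 1}
        if enabled(dispatch, m0):
            print(fire(dispatch, m0))  # {'dispatcher': 0, 'coordinator1': 1}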

    A network-based system for assessment and management of infrastructure interdependency

    Critical infrastructures (CIs) provide services that are essential to both the economy and the well-being of nations and their citizens. Over the years CIs have become more complex and interconnected, and they are interdependent in various ways: logically, functionally, and geographically. This interconnection results in a very complex and dynamic system, which increases their vulnerability to failures: when one infrastructure experiences failures, it can rapidly generate a cascade or domino effect that impacts the other infrastructures. Identifying, understanding and modeling infrastructure interdependency is therefore a new field of research that deals with the interrelationships between critical infrastructure sectors for disaster management. In the present research project, an integrated network-based analysis system with a user-friendly graphical user interface (GUI), called FCEPN (Fragility Curve and Extended Petri Net analysis), was developed for risk analysis of complex critical infrastructure systems and their component interdependencies. The approach combines: 1) Fragility Curve analysis of the vulnerability of the infrastructure, based on predefined "damage states" due to particular "hazards"; 2) Extended Petri Net analysis of the infrastructure system interdependency to determine the possible failure states and risk values. Two types of Extended Petri Net, the Stochastic Petri Net and the Fuzzy Petri Net, are discussed in this study. The FCEPN system was evaluated using the Bluestone Dam in West Virginia and the Huai River Watershed in China as case studies. The evaluation results suggest that the FCEPN system provides a useful approach for analyzing dam system design, the potential and actual vulnerability of dam networks to flood-related impacts, the performance and reliability of existing dam systems, and appropriate maintenance and inspection work.
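
    A minimal sketch of the fragility-curve step, assuming the commonly used lognormal form for the probability of reaching a damage state at a given hazard intensity; the median capacity and dispersion values below are invented and are not parameters from the FCEPN case studies.

        # Fragility curve: P(damage state reached | hazard intensity),
        # modelled here with the common lognormal-CDF form. All numbers
        # are illustrative, not calibrated to the Bluestone Dam study.
        import math

        def fragility(intensity: float, median: float, beta: float) -> float:
            """Lognormal CDF with median capacity `median`, dispersion `beta`."""
            z = math.log(intensity / median) / beta
            return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

        # e.g. chance that a 9.0 m flood stage drives a dam component into
        # its "moderate damage" state, given median capacity 10.0 m.
        print(f"P(damage) = {fragility(9.0, median=10.0, beta=0.4):.3f}")

    In the FCEPN pipeline, a probability like this would then seed the Stochastic or Fuzzy Petri Net stage that propagates failures through the interdependent infrastructure network.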

    Analysis and design of switched-capacitor DC-DC converters with discrete event models

    Ph.D. thesis. Switched-capacitor DC-DC converters (SCDDCs) play a critical role in low-power integrated systems. The analysis and design processes of an SCDDC affect the performance and power efficiency of the whole system. Conventionally, researchers carry out analysis and design by viewing SCDDCs as analogue circuits. Analogue attributes of an SCDDC, such as the charge flow current or the equivalent output impedance, have been studied in considerable detail for performance enhancement. However, most existing work pays less attention to the analysis of discrete events (e.g. digital signal transitions) and the relationships between them in SCDDCs, even though these also affect performance. Certain negative effects in SCDDCs, such as leakage current, are introduced by unhealthy discrete states. For example, MOS devices in an SCDDC can conduct undesirably under certain combinations of signals, resulting in reversion losses (a type of leakage in SCDDCs). Existing work uses only verbal reasoning and waveform descriptions when studying these discrete events, which may cause confusion and results in an informal design process consisting of intuitive design backed up merely by validation based on natural-language discussion and simulation. There is therefore a need for formalised methods to describe and analyse these discrete events, which may facilitate systematic design techniques. This thesis presents a new method of analysing and designing SCDDCs using discrete event models. Discrete event models such as Petri nets and Signal Transition Graphs (STGs) are commonly used in asynchronous circuit design to formally describe and analyse the relationships between discrete transitions. Modelling SCDDCs with discrete event models provides a formal way to describe the relations between discrete transitions in SCDDCs. These models can be used for analysis, verification and even design guidance, and the rich set of existing analysis methods and tools for discrete event models can be applied to SCDDCs, potentially improving their analysis and design flow. Moreover, since Petri nets and STGs are generally used to analyse and design asynchronous circuits, modelling and designing SCDDCs with STG models may additionally allow positive features of asynchronous circuits (e.g. the absence of clock skew) to be incorporated in SCDDCs. In this thesis, the relations between discrete events in SCDDCs are formally described with SC-STG (an extended STG targeting multi-voltage systems, to which SCDDCs belong), which avoids the potential confusion of natural-language and waveform descriptions. The concurrency and causality relations described in the SC-STG model are then extended to Petri nets, with which the presence of reversion losses can be formally determined and verified. Finally, based on the STG and Petri net models, a new design method for reversion-loss-free SCDDCs is proposed. In SCDDCs designed with the new method, reversion losses are entirely removed by introducing asynchronous controls, synthesised with the help of the software synthesis toolkit "Workcraft". To demonstrate the analysis capabilities of the method, several cross-coupled voltage doublers (a type of SCDDC) are analysed and studied with discrete event models as examples. To demonstrate the design capabilities, a new reversion-loss-free cross-coupled voltage doubler is designed. The cross-coupled voltage doubler is widely used in low-power integrated systems such as flash memories, LCD drivers and wireless energy-harvesting systems. The proposed modelling method can potentially be used in both research and industry across those application areas to enable a formal and highly efficient design process.
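
    The core of the reversion-loss argument lends itself to a tiny illustration: enumerate the possible orderings of two gate-signal transitions and flag any reachable state in which both switches on a node conduct at once. The signal names are hypothetical and the model is far simpler than the thesis's SC-STG models, which formally encode exactly these ordering (causality and concurrency) constraints.

        # Toy reachability check for reversion losses: if the discharge switch
        # turns on before the charge switch has turned off, both conduct at
        # once and a leakage (reversion) path exists. Names are invented.
        from itertools import permutations

        def reversion(state: dict) -> bool:
            """Both switches conducting simultaneously form a leakage path."""
            return state["charge_sw"] and state["discharge_sw"]

        transitions = [("charge_sw", False), ("discharge_sw", True)]
        for order in permutations(transitions):
            state = {"charge_sw": True, "discharge_sw": False}
            trace = [dict(state)]
            for signal, value in order:
                state[signal] = value
                trace.append(dict(state))
            if any(reversion(s) for s in trace):
                print("reversion loss reachable via order:",
                      [sig for sig, _ in order])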

    Performability modelling of homogenous and heterogeneous multiserver systems with breakdowns and repairs

    This thesis presents analytical modelling of homogeneous multi-server systems with reconfiguration and rebooting delays, heterogeneous multi-server systems with one main server and several identical servers, and farm-paradigm multi-server systems. The thesis also includes a number of related research works, such as fast performability evaluation models of open networks of nodes with repairs and finite queuing capacities, multi-server systems with deferred repairs, and two-stage tandem networks with failures, repairs and multiple servers at the second stage. Applications of these to the popular Beowulf cluster systems and to memory servers are also presented. Existing techniques for the performance evaluation of multi-server systems are investigated and analysed in detail, covering pure performance modelling techniques, pure availability models, and performability models. First, the existing approaches to pure performance modelling are critically analysed, with discussion of their merits and demerits, and the relevant terminology is defined and explained. Since pure performance models tend to be too optimistic and pure availability models too conservative, performability models are used for the evaluation of multi-server systems. Fault-tolerant multi-server systems can continue service despite certain failures: if a failure does not occur at a critical point (such as the breakdown of the head processor of a farm-paradigm system), the system continues serving in a degraded mode of operation. In such systems, reconfiguration and/or rebooting delays are expected while a processor is being mapped out of the system. These delay stages are taken into account, in addition to failures and repairs, in the exact performability models that are developed. Two-dimensional Markov state-space representations of the systems are used for performability modelling. Following a critical analysis of the existing solution techniques, the Spectral Expansion method is chosen for the solution of the models developed. Open queuing networks are also considered: to evaluate their performability, existing modelling approaches are extended and validated by simulation for the performability analysis of multistage open networks with finite queuing capacities, and the accuracy of two extended modelling approaches is compared for open networks with various queuing capacities. Deferred repair strategies are becoming popular because of the cost reductions they can provide; the effects of using deferred repairs are analysed, and performability models are provided for homogeneous multi-server systems and highly available farm-paradigm multi-server systems. Because one of the random variables is used to represent the number of jobs in one of the queues, analytical models for the performance evaluation of two-stage tandem networks suffer from numerical cumbersomeness. Existing approaches to modelling these systems are in effect pure performance models, since breakdowns and repairs cannot be considered. One way of modelling these systems would be to split one of the random variables to represent both the operative and non-operative states of the server in one dimension; however, this gives rise to a state-explosion problem that severely limits the maximum queue capacity that can be handled. To overcome this problem, a new approach is presented for modelling two-stage tandem networks in three dimensions, together with an approximate solution for such systems. This approach is a novel contribution towards alleviating the state-space explosion problem for large and/or complex systems. When two-stage tandem networks with feedback are modelled using this approach, the operative states can be handled independently, which makes it possible to consider multiple operative states at the second stage. The analytical models presented can be used with various parameters and are extensible to systems with similar architectures. The developed three-dimensional approach can handle two-stage tandem networks with various characteristics for performability measures. All the approaches presented give accurate results, and numerical solutions are presented for all models developed. Where the solution presented is not exact, simulations are performed to validate the accuracy of the results obtained.
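
    As a minimal sketch of the kind of model involved, the fragment below builds the generator matrix of a two-dimensional continuous-time Markov chain for a single server with Poisson arrivals, breakdowns and repairs, and solves it directly for an effective-throughput performability measure. All rates are invented and the queue is truncated for the direct solve; the Spectral Expansion method adopted in the thesis targets the same class of models with unbounded queues.

        # Two-dimensional CTMC over (jobs in queue, server operative?).
        # Solved here by brute force; rates are illustrative only.
        import numpy as np

        K = 10                # queue capacity (truncation for the direct solve)
        lam, mu = 0.8, 1.0    # arrival and service rates
        xi, eta = 0.01, 0.1   # breakdown and repair rates

        states = [(j, up) for j in range(K + 1) for up in (0, 1)]
        idx = {s: i for i, s in enumerate(states)}
        Q = np.zeros((len(states), len(states)))
        for (j, up), i in idx.items():
            if j < K:
                Q[i, idx[(j + 1, up)]] += lam       # arrival
            if up and j > 0:
                Q[i, idx[(j - 1, up)]] += mu        # service only when operative
            Q[i, idx[(j, 1 - up)]] += xi if up else eta  # breakdown / repair
            Q[i, i] = -Q[i].sum()                   # generator diagonal

        # Steady state: solve pi Q = 0 subject to sum(pi) = 1.
        A = np.vstack([Q.T, np.ones(len(states))])
        b = np.zeros(len(states) + 1)
        b[-1] = 1.0
        pi = np.linalg.lstsq(A, b, rcond=None)[0]

        throughput = sum(pi[idx[(j, 1)]] * mu for j in range(1, K + 1))
        print(f"effective throughput: {throughput:.4f}")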

    Strategies for Optimising DRAM Repair

    Dynamic Random Access Memories (DRAMs) are large, complex devices, prone to defects during manufacture. Yield is improved by providing redundant structures used to repair these defects. This redundancy is often implemented as excess memory capacity plus programmable address logic that allows faulty cells within the memory array to be replaced. As the capacity of DRAM devices has increased, so has the complexity of their redundant structures, introducing increasingly complex restrictions and interdependencies on the use of this redundant capacity. Currently, the redundancy analysis algorithms that solve the problem of optimally allocating this redundant capacity must be manually customised for each new device; compromises made to reduce this complexity, together with human error, reduce the efficacy of these algorithms. This thesis develops a methodology for automating the customisation of redundancy analysis algorithms. Included are: a modelling language describing the redundant structures (including the restrictions and interdependencies placed upon their use); algorithms that manipulate this model to generate redundancy analysis algorithms; and methods for translating those algorithms into executable code. Finally, these concepts are used to develop a prototype software tool capable of generating redundancy analysis algorithms customised for a specified device.
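
    One building block that generated redundancy analysis algorithms typically start from is the classic must-repair rule: a row holding more faulty cells than the remaining spare columns can only be fixed by a spare row, and symmetrically for columns. The sketch below applies this rule to an invented fault map; real devices add exactly the restrictions and interdependencies that the thesis's modelling language is designed to capture.

        # Must-repair analysis over a fault bitmap. `faults` is a set of
        # (row, col) coordinates; spare counts are invented for illustration.
        from collections import Counter

        def must_repair(faults, spare_rows, spare_cols):
            """Return the rows and columns that are forced to use a spare."""
            forced_rows, forced_cols = set(), set()
            changed = True
            while changed:
                changed = False
                live = {(r, c) for r, c in faults
                        if r not in forced_rows and c not in forced_cols}
                rows = Counter(r for r, _ in live)
                cols = Counter(c for _, c in live)
                for r, n in rows.items():
                    if n > spare_cols - len(forced_cols):
                        forced_rows.add(r)
                        changed = True
                for c, n in cols.items():
                    if n > spare_rows - len(forced_rows):
                        forced_cols.add(c)
                        changed = True
            return forced_rows, forced_cols

        faults = {(0, 1), (0, 3), (0, 7), (2, 3), (5, 3), (6, 3)}
        print(must_repair(faults, spare_rows=2, spare_cols=2))  # ({0}, {3})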

    Revisiting Resource Utilization in The Internet: Architectural Considerations and Challenges

    The Internet has been a success story for many years. Recently, researchers have started to address new questions that challenge the effectiveness of the Internet architecture in the face of new demands, e.g. overwhelming traffic growth and the push for lower latency. Various proposals, ranging from new application-level protocols to new network stacks, are emerging to help the Internet keep up with demand. In this dissertation we look at several proposals for improving speed and resource utilization in the Internet. We first discuss improving resource utilization in the current Internet through minor changes, such as adjusting various parameters in TCP. We then discuss a more radical form of resource utilization that combines the network with the available storage. Combining these two resources, which have traditionally been considered separately, could open up many new opportunities for speed improvement. We discuss relaxing the barrier between storage and network in the context of Information Centric Networking (ICN), which is itself an alternative proposal to the current TCP/IP-style Internet. With the help of ICN, we propose different forms of in-network caching below the application layer. We argue that, although useful, these new models of utilizing network resources come with their own challenges; in particular, we discuss the resource management and privacy challenges that ICN introduces in general and within our proposed solutions in particular. The lack of end-host bindings and the existence of network-routable data names in different data chunks make congestion control, reliability, and privacy in ICN rather different from TCP/IP. We discuss some of these differences and propose solutions that can help address each issue in our particular form of ICN-based mechanisms.
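
    A minimal sketch of one such in-network caching element: a Content Store acting as an LRU cache keyed by routable data names, serving chunks below the application layer. The hierarchical chunk names are hypothetical, and real ICN architectures (e.g. NDN) differ in naming and replacement details.

        # In-network Content Store: an LRU cache keyed by data names.
        from collections import OrderedDict

        class ContentStore:
            def __init__(self, capacity: int):
                self.capacity = capacity
                self.chunks = OrderedDict()  # name -> chunk bytes, LRU order

            def get(self, name: str):
                """Hit: serve the chunk locally and refresh its LRU position.
                Miss: return None (the request would be forwarded upstream)."""
                if name in self.chunks:
                    self.chunks.move_to_end(name)
                    return self.chunks[name]
                return None

            def put(self, name: str, data: bytes):
                """Cache a passing data chunk, evicting the least recently used."""
                self.chunks[name] = data
                self.chunks.move_to_end(name)
                if len(self.chunks) > self.capacity:
                    self.chunks.popitem(last=False)

        cs = ContentStore(capacity=2)
        cs.put("/video/intro/chunk0", b"...")
        cs.put("/video/intro/chunk1", b"...")
        cs.get("/video/intro/chunk0")          # chunk0 becomes most recent
        cs.put("/video/intro/chunk2", b"...")  # evicts chunk1, the LRU entry
        print(list(cs.chunks))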

    Formal Configuration of Fault-Tolerant Systems

    Bit flips are known to be a source of strange system behavior, failures, and crashes. They can cause dramatic financial loss, security breaches, or even harm to human life. Caused by energized particles arising from, e.g., cosmic rays or heat, they are hardly avoidable. As transistors become smaller and smaller, modern hardware becomes more and more prone to bit flips. This has generated considerable scientific interest, and many techniques have been developed to make systems more resilient against bit flips. Fault-tolerance techniques detect and react to bit flips or their effects. Before being used, they typically need to be configured for the particular system they shall protect, the grade of resilience to be achieved, and the environment. State-of-the-art configuration approaches carry a high risk of being imprecise, of being affected by undesired side effects, and of yielding questionable resilience measures. In this thesis we advocate the use of formal methods for resilience configuration, point out advantages, and investigate difficulties. We study two example systems equipped with fault-tolerance techniques, and we apply parametric variants of probabilistic model checking to obtain optimal configurations for pre-defined resilience criteria. Probabilistic model checking is an automated formal method that operates on Markov models, i.e., state-based models with probabilistic transitions, where costs or rewards can be assigned to states and transitions. It can be used to compute, e.g., the probability of a failure, the conditional probability of detecting an error given a bit-flip occurrence, or the overhead that arises from error detection and correction. Parametric variants of probabilistic model checking allow parameters in the transition probabilities and in the costs and rewards: instead of computing values for probabilities and overhead, they compute rational functions, which can then be analyzed for optimality. The fault-tolerant systems considered are inspired by the work of project partners. The first system is an inter-process communication protocol as used in the Fiasco.OC microkernel, in which the communication structures provided by the kernel are protected against bit flips by a fault-tolerance technique. The second system is inspired by the redo-based fault-tolerance technique HAFT. This technique protects an application against bit flips by partitioning the application's instruction flow into transactions, adding redundancy, and redoing single transactions when an error is detected. Driven by these examples, we study the challenges of using probabilistic model checking for fault-tolerance configuration and present solutions. We show that small transition probabilities, as they arise in error models, can be a cause of previously known accuracy issues when using numeric solvers in probabilistic model checking, and we argue that the use of non-iterative methods is an acceptable alternative. We discuss the usability of the rational functions for finding optimal configurations, and show that for relatively short rational functions the use of mathematical methods is appropriate. The redo-based fault-tolerance model suffers from the well-known state-explosion problem. We present a new technique, counter-based factorization, that tackles this problem for system models that do not scale because of a counter, as is the case for this fault-tolerance model. The technique exploits the chain-like structure that arises from the counter, splits the model into several parts, and computes local characteristics (in terms of rational functions) for these parts. These local characteristics can then be combined to retrieve global resilience and overhead measures. The rational functions retrieved for the redo-based fault-tolerance model are huge: for small model instances they already exceed one gigabyte in size. We therefore cannot apply precise mathematical methods to these functions. Instead, we use the short, matrix-based representation that arises from factorization to evaluate the functions point-wise. Using this approach, we systematically explore the design space of the redo-based fault-tolerance model and retrieve sweet-spot configurations.
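
    A toy version of the point-wise evaluation idea, using a deliberately simplified redo model rather than the thesis's HAFT-inspired Markov model: a transaction of n instructions is re-executed until a run completes without a detected error, so the expected overhead is a rational function of the per-instruction fault probability p that can be swept numerically across the design space.

        # Point-wise design-space sweep of a redo-based scheme. For fixed n,
        # a run is clean with probability (1-p)**n, runs are geometric, and
        # E[runs] = 1 / (1-p)**n -- a rational function of p.
        def expected_runs(p: float, n: int) -> float:
            """Expected executions until a transaction of n instructions
            completes without a detected error (fault probability p each)."""
            return 1.0 / (1.0 - p) ** n

        def overhead(p: float, n: int) -> float:
            """Relative time overhead versus a single fault-free run."""
            return expected_runs(p, n) - 1.0

        # Larger transactions amortise instrumentation but redo more work on
        # error; sweeping n point-wise locates the sweet spot for a given p.
        for n in (10, 100, 1000):
            print(f"n={n:5d}  overhead at p=1e-4: {overhead(1e-4, n):.4%}")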