26 research outputs found

    Design and Verification of a DFI-AXI DDR4 Memory PHY Bridge Suitable for FPGA Based RTL Emulation and Prototyping

    Get PDF
    System on chip (SoC) designers today are emphasizing on a process which can ensure robust silicon at the first tape-out. Given the complexity of modern SoC chips, there is compelling need to have suitable run time software, such at the Linux kernel and necessary drivers available once prototype silicon is available. Emulation and FPGA prototyping systems are exemplary platforms to run the tests for designs, are naturally efficient and perform well, and enable early software development. While useful, one needs to keep in mind that emulation and FPGA prototyping systems do not run at full silicon speed. In fact, the SoC target ported to the FPGA might achieve a clock speed less than 10 MHz. While still very useful for testing and software development, this low operating speed creates challenges for connecting to external devices such as DDR SDRAM. In this paper, the DDR-PHY INTERFACE (DFI) to Advanced eXtensible Interface (AXI) Bridge is designed to support a DDR4 memory sub-system design. This bridge module is developed based on the DDR PHY Interface version 5.0 specification, and once implemented in an FPGA, it transfers command information and data between the SoC DDR Memory controller being prototypes, across the AXI bus to an FPGA specific memory controller connected to a DDR SDRAM or other physical memory external to the FPGA. This bridge module enables multi-communication with the design under test (DUT) with a synthesizable SCE-MI based infrastructure between the bridge and logic simulator. SCE-MI provides a direct mechanism to inject the specific traffic, and monitor performance of the DFI-AXI DDR4 Memory PHY Bridge. Both Emulation and FPGA prototyping platforms can use this design and its testbench

    Low-power CMOS digital-pixel Imagers for high-speed uncooled PbSe IR applications

    Get PDF
    This PhD dissertation describes the research and development of a new low-cost medium wavelength infrared MWIR monolithic imager technology for high-speed uncooled industrial applications. It takes the baton on the latest technological advances in the field of vapour phase deposition (VPD) PbSe-based medium wavelength IR (MWIR) detection accomplished by the industrial partner NIT S.L., adding fundamental knowledge on the investigation of novel VLSI analog and mixed-signal design techniques at circuit and system levels for the development of the readout integrated device attached to the detector. The work supports on the hypothesis that, by the use of the preceding design techniques, current standard inexpensive CMOS technologies fulfill all operational requirements of the VPD PbSe detector in terms of connectivity, reliability, functionality and scalability to integrate the device. The resulting monolithic PbSe-CMOS camera must consume very low power, operate at kHz frequencies, exhibit good uniformity and fit the CMOS read-out active pixels in the compact pitch of the focal plane, all while addressing the particular characteristics of the MWIR detector: high dark-to-signal ratios, large input parasitic capacitance values and remarkable mismatching in PbSe integration. In order to achieve these demands, this thesis proposes null inter-pixel crosstalk vision sensor architectures based on a digital-only focal plane array (FPA) of configurable pixel sensors. Each digital pixel sensor (DPS) cell is equipped with fast communication modules, self-biasing, offset cancellation, analog-to-digital converter (ADC) and fixed pattern noise (FPN) correction. In-pixel power consumption is minimized by the use of comprehensive MOSFET subthreshold operation. The main aim is to potentiate the integration of PbSe-based infra-red (IR)-image sensing technologies so as to widen its use, not only in distinct scenarios, but also at different stages of PbSe-CMOS integration maturity. For this purpose, we posit to investigate a comprehensive set of functional blocks distributed in two parallel approaches: • Frame-based “Smart” MWIR imaging based on new DPS circuit topologies with gain and offset FPN correction capabilities. This research line exploits the detector pitch to offer fully-digital programmability at pixel level and complete functionality with input parasitic capacitance compensation and internal frame memory. • Frame-free “Compact”-pitch MWIR vision based on a novel DPS lossless analog integrator and configurable temporal difference, combined with asynchronous communication protocols inside the focal plane. This strategy is conceived to allow extensive pitch compaction and readout speed increase by the suppression of in-pixel digital filtering, and the use of dynamic bandwidth allocation in each pixel of the FPA. In order make the electrical validation of first prototypes independent of the expensive PbSe deposition processes at wafer level, investigation is extended as well to the development of affordable sensor emulation strategies and integrated test platforms specifically oriented to image read-out integrated circuits. DPS cells, imagers and test chips have been fabricated and characterized in standard 0.15μm 1P6M, 0.35μm 2P4M and 2.5μm 2P1M CMOS technologies, all as part of research projects with industrial partnership. The research has led to the first high-speed uncooled frame-based IR quantum imager monolithically fabricated in a standard VLSI CMOS technology, and has given rise to the Tachyon series [1], a new line of commercial IR cameras used in real-time industrial, environmental and transportation control systems. The frame-free architectures investigated in this work represent a firm step forward to push further pixel pitch and system bandwidth up to the limits imposed by the evolving PbSe detector in future generations of the device.La present tesi doctoral descriu la recerca i el desenvolupament d'una nova tecnologia monolítica d'imatgeria infraroja de longitud d'ona mitja (MWIR), no refrigerada i de baix cost, per a usos industrials d'alta velocitat. El treball pren el relleu dels últims avenços assolits pel soci industrial NIT S.L. en el camp dels detectors MWIR de PbSe depositats en fase vapor (VPD), afegint-hi coneixement fonamental en la investigació de noves tècniques de disseny de circuits VLSI analògics i mixtes pel desenvolupament del dispositiu integrat de lectura unit al detector pixelat. Es parteix de la hipòtesi que, mitjançant l'ús de les esmentades tècniques de disseny, les tecnologies CMOS estàndard satisfan tots els requeriments operacionals del detector VPD PbSe respecte a connectivitat, fiabilitat, funcionalitat i escalabilitat per integrar de forma econòmica el dispositiu. La càmera PbSe-CMOS resultant ha de consumir molt baixa potència, operar a freqüències de kHz, exhibir bona uniformitat, i encabir els píxels actius CMOS de lectura en el pitch compacte del pla focal de la imatge, tot atenent a les particulars característiques del detector: altes relacions de corrent d'obscuritat a senyal, elevats valors de capacitat paràsita a l'entrada i dispersions importants en el procés de fabricació. Amb la finalitat de complir amb els requisits previs, es proposen arquitectures de sensors de visió de molt baix acoblament interpíxel basades en l'ús d'una matriu de pla focal (FPA) de píxels actius exclusivament digitals. Cada píxel sensor digital (DPS) està equipat amb mòduls de comunicació d'alta velocitat, autopolarització, cancel·lació de l'offset, conversió analògica-digital (ADC) i correcció del soroll de patró fixe (FPN). El consum en cada cel·la es minimitza fent un ús exhaustiu del MOSFET operant en subllindar. L'objectiu últim és potenciar la integració de les tecnologies de sensat d'imatge infraroja (IR) basades en PbSe per expandir-ne el seu ús, no només a diferents escenaris, sinó també en diferents estadis de maduresa de la integració PbSe-CMOS. En aquest sentit, es proposa investigar un conjunt complet de blocs funcionals distribuïts en dos enfocs paral·lels: - Dispositius d'imatgeria MWIR "Smart" basats en frames utilitzant noves topologies de circuit DPS amb correcció de l'FPN en guany i offset. Aquesta línia de recerca exprimeix el pitch del detector per oferir una programabilitat completament digital a nivell de píxel i plena funcionalitat amb compensació de la capacitat paràsita d'entrada i memòria interna de fotograma. - Dispositius de visió MWIR "Compact"-pitch "frame-free" en base a un novedós esquema d'integració analògica en el DPS i diferenciació temporal configurable, combinats amb protocols de comunicació asíncrons dins del pla focal. Aquesta estratègia es concep per permetre una alta compactació del pitch i un increment de la velocitat de lectura, mitjançant la supressió del filtrat digital intern i l'assignació dinàmica de l'ample de banda a cada píxel de l'FPA. Per tal d'independitzar la validació elèctrica dels primers prototips respecte a costosos processos de deposició del PbSe sensor a nivell d'oblia, la recerca s'amplia també al desenvolupament de noves estratègies d'emulació del detector d'IR i plataformes de test integrades especialment orientades a circuits integrats de lectura d'imatge. Cel·les DPS, dispositius d'imatge i xips de test s'han fabricat i caracteritzat, respectivament, en tecnologies CMOS estàndard 0.15 micres 1P6M, 0.35 micres 2P4M i 2.5 micres 2P1M, tots dins el marc de projectes de recerca amb socis industrials. Aquest treball ha conduït a la fabricació del primer dispositiu quàntic d'imatgeria IR d'alta velocitat, no refrigerat, basat en frames, i monolíticament fabricat en tecnologia VLSI CMOS estàndard, i ha donat lloc a Tachyon, una nova línia de càmeres IR comercials emprades en sistemes de control industrial, mediambiental i de transport en temps real.Postprint (published version

    Erreichen von Performance in Netzwerken-On-Chip für Echtzeitsysteme

    Get PDF
    In many new applications, such as in automatic driving, high performance requirements have reached safety critical real-time systems. Consequently, Networks-on-Chip (NoCs) must efficiently host new sets of highly dynamic workloads e.g., high resolution sensor fusion and data processing, autonomous decision’s making combined with machine learning. The static platform management, as used in current safety critical systems, is no more sufficient to provide the needed level of service. A dynamic platform management could meet the challenge, but it usually suffers from a lack of predictability and the simplicity necessary for certification of safety and real-time properties. In this work, we propose a novel, global and dynamic arbitration for NoCs with real-time QoS requirements. The mechanism decouples the admission control from arbitration in routers thereby simplifying a dynamic adaptation and real-time analysis. Consequently, the proposed solution allows the deployment of a sophisticated contract-based QoS provisioning without introducing complicated and hard to maintain schemes, known from the frequently applied static arbiters. The presented work introduces an overlay network to synchronize transmissions using arbitration units called Resource Managers (RMs), which allows global and work-conserving scheduling. The description of resource allocation strategies is supplemented by protocol design and verification methodology bringing adaptive control to NoC communication in setups with different QoS requirements and traffic classes. For doing that, a formal worst-case timing analysis for the mechanism has been proposed which demonstrates that this solution not only exposes higher performance in simulation but, even more importantly, consistently reaches smaller formally guaranteed worst-case latencies than other strategies for realistic levels of system's utilization. The approach is not limited to a specific network architecture or topology as the mechanism does not require modifications of routers and therefore can be used together with the majority of existing manycore systems. Indeed, the evaluation followed using the generic performance optimized router designs, as well as two systems-on-chip focused on real-time deployments. The results confirmed that the proposed approach proves to exhibit significantly higher average performance in simulation and execution.In vielen neuen sicherheitskritische Anwendungen, wie z.B. dem automatisierten Fahren, werden große Anforderungen an die Leistung von Echtzeitsysteme gestellt. Daher müssen Networks-on-Chip (NoCs) neue, hochdynamische Workloads wie z.B. hochauflösende Sensorfusion und Datenverarbeitung oder autonome Entscheidungsfindung kombiniert mit maschineller Lernen, effizient auf einem System unterbringen. Die Steuerung der zugrunde liegenden NoC-Architektur, muss die Systemsicherheit vor Fehlern, resultierend aus dem dynamischen Verhalten des Systems schützen und gleichzeitig die geforderte Performance bereitstellen. In dieser Arbeit schlagen wir eine neuartige, globale und dynamische Steuerung für NoCs mit Echtzeit QoS Anforderungen vor. Das Schema entkoppelt die Zutrittskontrolle von der Arbitrierung in Routern. Hierdurch wird eine dynamische Anpassung ermöglicht und die Echtzeitanalyse vereinfacht. Der Einsatz einer ausgefeilten vertragsbasierten Ressourcen-Zuweisung wird so ermöglicht, ohne komplexe und schwer wartbare Mechanismen, welche bereits aus dem statischen Plattformmanagement bekannt sind einzuführen. Diese Arbeit stellt ein übergelagertes Netzwerk vor, welches Übertragungen mit Hilfe von Arbitrierungseinheiten, den so genannten Resource Managern (RMs), synchronisiert. Dieses überlagerte Netzwerk ermöglicht eine globale und lasterhaltende Steuerung. Die Beschreibung verschiedener Ressourcenzuweisungstrategien wird ergänzt durch ein Protokolldesign und Methoden zur Verifikation der adaptiven NoC Steuerung mit unterschiedlichen QoS Anforderungen und Verkehrsklassen. Hierfür wird eine formale Worst Case Timing Analyse präsentiert, welche das vorgestellte Verfahren abbildet. Die Resultate bestätitgen, dass die präsentierte Lösung nicht nur eine höhere Performance in der Simulation bietet, sondern auch formal kleinere Worst-Case Latenzen für realistische Systemauslastungen als andere Strategien garantiert. Der vorgestellte Ansatz ist nicht auf eine bestimmte Netzwerkarchitektur oder Topologie beschränkt, da der Mechanismus keine Änderungen an den unterliegenden Routern erfordert und kann daher zusammen mit bestehenden Manycore-Systemen eingesetzt werden. Die Evaluierung erfolgte auf Basis eines leistungsoptimierten Router-Designs sowie zwei auf Echtzeit-Anwendungen fokusierten Platformen. Die Ergebnisse bestätigten, dass der vorgeschlagene Ansatz im Durchschnitt eine deutlich höhere Leistung in der Simulation und Ausführung liefert

    Optimierung der Energie und Power getriebenen Architekturexploration für Multicore und heterogenes System on Chip

    Get PDF
    The contribution of this work builds on top of the established virtual prototype platforms to improve both SoC design quality and productivity. Initially, an automatic system-level power estimation framework was developed to address the critical issue of early power estimation in SoC design. The estimation framework models the static and dynamic power consumption of the hardware components. These models are created from the normalized values of the basic design components of SoC, obtained through one-time power simulation of RTL hardware models. The framework allows dynamic technology node reconfiguration for power estimation models. Its instantaneous power reporting aids the detection of possible hotspot early into the design process. Adding this additional data in conjunction with a steadily growing design space of complex heterogeneous SoC, finding the right parameter configuration is a challenging and laborious task for a system-level designer. This work addresses this bottleneck by optimizing the design space exploration (DSE) process for MPSoC design. An automatic DSE framework for virtual platforms (VPs) was developed which is flexible and allows the selection optimal parameter configuration without pre-existing knowledge. To reduce exploration time, the framework is equipped with several multi-objective optimization techniques based on simulated annealing and a genetic algorithm. Lastly, to aid HW/SW partitioning at system-level, a flexible and automated workflow (SW2TLM) is presented. It allows the designer to explore various possible partitioning scenarios without going into depth of the hardware architecture complexity and software integration. The framework generates system-level hardware accelerators from corresponding functionality encoded in the software code and integrates them into the VP. Power consumption and time speedups of acceleration is reported to the designer, which further increases the quality and productivity of the development process towards the final architecture. The presented tools are evaluated using a state-of-the-art VP for a range of single and multi-core applications. Viewing the energy delay product, a reduction in exploration time was recorded at approximately 62% (worst case), maintaining optimal parameter accuracy of 90% compared to previous techniques. While the SW2TLM further increases the exploration versatility by combining modern high-level synthesis with system-level architectural exploration.Der Beitrag dieser Arbeit baut auf dem etablierten Konzept der virtuellen Prototyp (VP) Plattformen auf, um die Qualität und die Produktivität des Entwurfsprozesses zu verbessern. Zunächst wurde ein automatisches System-Level-Framework entwickelt, um Verlustleistungsabschätzung für SoC-Designs in einer deutlich früheren Entwicklungsphase zu ermöglichen. Hierfür werden statischen und dynamischen Energieverbrauchsanteile individueller Hardwareelemente durch ein abstraktes Modell ausgedrückt. Das Framework ermöglicht eine dynamische Anpassung des Technologieknotens sowie die Integration neuer Leistungsmodelle für Drittanbieterkomponenten. Die kontinuierliche Erfassung der Energieverbrauchseigenschaften und ihre grafische Darstellung Benutzeroberfläche unterstützt zusätzlich die frühzeitige Identifikation möglicher Hotspots. Durch die Bereitstellung zusätzlicher Daten, in Verbindung mit einem stetig wachsenden Entwurfsraum komplexer SoCs, ist die Identifikation der richtigen Parameterkonfiguration eine zeitintensive Aufgabe. Die vorgelegten Konzepte erlauben eine gesteigerte Automatisierung des Explorationsprozesses. Techniken der mehrdimensionalen Optimierung, basierend auf Simulated Annealing und genetischer Algorithmen erlauben die Identifikation von geeigneten Konfigurationen ohne vorheriges Wissen oder Erfahrungswerte Schließlich wurde zur Unterstützung der HW/SW -Partitionierung auf System-Ebene ein flexibler und automatisierter Workflow entwickelt. Er ermöglicht es dem Designer verschiedene mögliche Partitionierungsszenarien zu untersuchen, ohne sich in die Komplexität der Hardwarearchitektur und der Softwareintegration zu vertiefen. Das Framework erzeugt abstrakte Beschleunigermodelle aus entsprechenden Softwarefunktionen und integriert sie nahtlos in den ausführbare VP. Detaillierte Daten zum Energieverbrauch, Beschleunigungsfaktor und Kommunikationsoverhead der Partitionierung werden erfasst und dem Designer zur Verfügung gestellt, was die Qualität und Produktivität des weiter erhöht. Die vorgestellten Tools werden mit einer modernen VP für verschiedene SW-Anwendungen evaluiert. Bei Betrachtung des Energieverzögerungsprodukts wurde eine Verringerung der Explorationszeit um mehr als 62% bei 90% Parametergenauigkeit festgestell. Darauf aufbauend, erleichtert die automatisierte Untersuchung verschiedener HW/SW Partitionierungen die Entwicklung heterogener Architekturen durch die Kombination moderner HLS mit Architektur-Exploration auf der Systemebene

    A new approach to the development and maintenance of industrial sequence logic

    Get PDF
    This thesis is concerned with sequence logic as found in industrial control systems, with the focus being on process and manufacturing control systems. At its core is the assertion that there is a need for a better approach to the development of industrial sequence logic to satisfy the life-cycle requirements, and that many of the ingredients required to deliver such an approach are now available. The needs are discussed by considering the business case for automation and deficiencies with traditional approaches. A set of requirements is then derived for an integrated development environment to address the business needs throughout the control system life-cycle. The strengths and weaknesses of relevant control system technology and standards are reviewed and their bias towards implementation described. Mathematical models, graphical methods and software tools are then assessed with respect to the requirements for an integrated development environment. A solution to the requirements, called Synect is then introduced. Synect combines a methodology using familiar graphical notations with Petri net modelling supported by a set of software tools. Its key features are justified with reference to the requirements. A set of case studies forms the basis of an evaluation against business needs by comparing the Synect methodology with current approaches. The industrial relevance and exploitation are then briefly described. The thesis ends with a review of the key conclusions along with contributions to knowledge and suggestions for further research

    Third International Symposium on Space Mission Operations and Ground Data Systems, part 2

    Get PDF
    Under the theme of 'Opportunities in Ground Data Systems for High Efficiency Operations of Space Missions,' the SpaceOps '94 symposium included presentations of more than 150 technical papers spanning five topic areas: Mission Management, Operations, Data Management, System Development, and Systems Engineering. The symposium papers focus on improvements in the efficiency, effectiveness, and quality of data acquisition, ground systems, and mission operations. New technology, methods, and human systems are discussed. Accomplishments are also reported in the application of information systems to improve data retrieval, reporting, and archiving; the management of human factors; the use of telescience and teleoperations; and the design and implementation of logistics support for mission operations. This volume covers expert systems, systems development tools and approaches, and systems engineering issues

    A hardware-software codesign framework for cellular computing

    Get PDF
    Until recently, the ever-increasing demand of computing power has been met on one hand by increasing the operating frequency of processors and on the other hand by designing architectures capable of exploiting parallelism at the instruction level through hardware mechanisms such as super-scalar execution. However, both these approaches seem to have reached a plateau, mainly due to issues related to design complexity and cost-effectiveness. To face the stabilization of performance of single-threaded processors, the current trend in processor design seems to favor a switch to coarser-grain parallelization, typically at the thread level. In other words, high computational power is achieved not only by a single, very fast and very complex processor, but through the parallel operation of several processors, each executing a different thread. Extrapolating this trend to take into account the vast amount of on-chip hardware resources that will be available in the next few decades (either through further shrinkage of silicon fabrication processes or by the introduction of molecular-scale devices), together with the predicted features of such devices (e.g., the impossibility of global synchronization or higher failure rates), it seems reasonable to foretell that current design techniques will not be able to cope with the requirements of next-generation electronic devices and that novel design tools and programming methods will have to be devised. A tempting source of inspiration to solve the problems implied by a massively parallel organization and inherently error-prone substrates is biology. In fact, living beings possess characteristics, such as robustness to damage and self-organization, which were shown in previous research as interesting to be implemented in hardware. For instance, it was possible to realize relatively simple systems, such as a self-repairing watch. Overall, these bio-inspired approaches seem very promising but their interest for a wider audience is problematic because their heavily hardware-oriented designs lack some of the flexibility achievable with a general purpose processor. In the context of this thesis, we will introduce a processor-grade processing element at the heart of a bio-inspired hardware system. This processor, based on a single-instruction, features some key properties that allow it to maintain the versatility required by the implementation of bio-inspired mechanisms and to realize general computation. We will also demonstrate that the flexibility of such a processor enables it to be evolved so it can be tailored to different types of applications. In the second half of this thesis, we will analyze how the implementation of a large number of these processors can be used on a hardware platform to explore various bio-inspired mechanisms. Based on an extensible platform of many FPGAs, configured as a networked structure of processors, the hardware part of this computing framework is backed by an open library of software components that provides primitives for efficient inter-processor communication and distributed computation. We will show that this dual software–hardware approach allows a very quick exploration of different ways to solve computational problems using bio-inspired techniques. In addition, we also show that the flexibility of our approach allows it to exploit replication as a solution to issues that concern standard embedded applications

    Understanding Quantum Technologies 2022

    Full text link
    Understanding Quantum Technologies 2022 is a creative-commons ebook that provides a unique 360 degrees overview of quantum technologies from science and technology to geopolitical and societal issues. It covers quantum physics history, quantum physics 101, gate-based quantum computing, quantum computing engineering (including quantum error corrections and quantum computing energetics), quantum computing hardware (all qubit types, including quantum annealing and quantum simulation paradigms, history, science, research, implementation and vendors), quantum enabling technologies (cryogenics, control electronics, photonics, components fabs, raw materials), quantum computing algorithms, software development tools and use cases, unconventional computing (potential alternatives to quantum and classical computing), quantum telecommunications and cryptography, quantum sensing, quantum technologies around the world, quantum technologies societal impact and even quantum fake sciences. The main audience are computer science engineers, developers and IT specialists as well as quantum scientists and students who want to acquire a global view of how quantum technologies work, and particularly quantum computing. This version is an extensive update to the 2021 edition published in October 2021.Comment: 1132 pages, 920 figures, Letter forma
    corecore