Search CORE

175 research outputs found

Distributed clock generator for synchronous SoC using ADPLL network

Author: Akre Jean-Michel
Anceau François
Billoint Olivier
Colinet Eric
Galayko Dimitri
Javidan Mohammad
Juillard Jérôme
Korniienko Anton
Scorletti Gérard
Shan Chuan
Zianbetov Eldar
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 22/09/2013

This paper presents a novel architecture for on-chip clock generation that employs a network of oscillators synchronized by distributed all-digital PLLs (ADPLLs). The implemented prototype has 16 clocking domains operating synchronously in a frequency range of 1.1-2.4 GHz. The synchronization error between neighboring clock domains is less than 60 ps. The fully digital architecture of the clock generator offers flexibility and efficient synchronization control suitable for use in synchronous SoCs.

FPGA implementation of reconfigurable ADPLL network for distributed clock generation

Author: Anceau François
Colinet Eric
Feruglio Sylvain
Galayko Dimitri
Javidan Mohammad
Juillard Jérôme
Romain Olivier
Shan Chuan
Terosiet Mehdi
Zianbetov Eldar
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 12/12/2011

This paper presents an FPGA platform for the design and study of networks of coupled All-Digital Phase-Locked Loops (ADPLLs), intended for clock generation in large synchronous Systems on Chip (SoCs). An implementation of a programmable and reconfigurable 4×4 ADPLL network is described. The paper emphasizes the differences between FPGA-based and ASIC-based implementations of such a system, in particular the implementation of the digitally controlled oscillators and the phase-frequency detector. The FPGA-implemented network allows studying complex phenomena related to coupled ADPLL operation and exploring stability issues and nonlinear behavior. A dynamic setup mechanism is proposed for the network, allowing selection of the desired synchronized state. Experimental results demonstrate the global synchronization of the network and its performance for different configurations.

HAL-CentraleSupelec

HAL-CEA

HAL-Rennes 1

A Design Approach for Networks of Self-Sampled All-Digital Phase-Locked Loops

Author: Akré Jean-Michel
Colinet Eric
Galayko Dimitri
Javidan Mohammad
Juillard Jérôme
Korniienko Anton
Zianbetov Eldar
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/08/2011

This paper addresses the stability and performance analysis of N-node Cartesian networks of self-sampled all-digital phase-locked loops. It can be demonstrated that, under certain conditions (such as the values of the filter coefficients), both global and local synchronization can be obtained. Our approach to finding the optimal conditions consists in analyzing a corresponding linear average system of the Cartesian network rather than constructing a piecewise-linear system, which is extremely difficult to analyze. The constructed corresponding system takes into account the nonlinearity of the network, especially the self-sampling property. It is then analyzed with linear performance criteria such as the modulus margin to guarantee robust stability of the Cartesian network. The reliability of our approach is confirmed by transient simulations of networks of different sizes.

Rapid SoC Design: On Architectures, Methodologies and Frameworks

Author: Ajayi Adetutu
Publication date: 01/01/2020

Modern applications like machine learning, autonomous vehicles, and 5G networking require an order of magnitude boost in processing capability. For several decades, chip designers have relied on Moore’s Law (the doubling of transistor count every two years) to deliver improved performance, higher energy efficiency, and an increase in transistor density. With the end of Dennard scaling and a slowdown in Moore’s Law, system architects have developed several techniques to deliver the performance and power improvements we have come to expect. More recently, chip designers have turned towards heterogeneous systems comprising more specialized processing units to buttress the traditional processing units. These specialized units improve the overall performance, power, and area (PPA) metrics across a wide variety of workloads and applications. While the GPU serves as a classical example, accelerators for machine learning, approximate computing, graph processing, and database applications have become commonplace. This has led to an exponential growth in the variety (and count) of these compute units found in modern embedded and high-performance computing platforms. The various techniques adopted to combat the slowing of Moore’s Law directly translate to an increase in complexity for modern systems-on-chip (SoCs). This increase in complexity in turn leads to an increase in design effort and validation time for hardware and the accompanying software stacks. This is further aggravated by fabrication challenges (photolithography, tooling, and yield) faced at advanced technology nodes (below 28 nm). The inherent complexity of modern SoCs translates into increased costs and time-to-market delays. This holds true across the spectrum, from mobile/handheld processors to high-performance data-center appliances. This dissertation presents several techniques to address the challenges of rapidly birthing complex SoCs.
The first part of this dissertation focuses on foundations and architectures that aid in rapid SoC design. It presents a variety of architectural techniques that were developed and leveraged to rapidly construct complex SoCs at advanced process nodes. The next part of the dissertation focuses on the gap between a completed design model (in RTL form) and its physical manifestation (a GDS file that will be sent to the foundry for fabrication). It presents methodologies and a workflow for rapidly walking a design through to completion at arbitrary technology nodes. It also presents progress on creating tools and a flow that is entirely dependent on open-source tools. The last part presents a framework that not only speeds up the integration of a hardware accelerator into an SoC ecosystem, but emphasizes software adoption and usability.
PhD, Electrical and Computer Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/168119/1/ajayi_1.pd

Deep Blue Documents at the University of Michigan


Heterogeneous Integration on Silicon-Interconnect Fabric using fine-pitch interconnects (≤10 µm)

Author: Jangam SivaChandra
Publication venue: eScholarship, University of California
Publication date: 01/01/2020

Today, the ever-growing data-bandwidth demand is pushing the boundaries of the traditional printed circuit board (PCB) based integration schemes. Moreover, with the apparent saturation of semiconductor scaling, commonly called Moore's law, system scaling warrants a paradigm shift in packaging technologies, assembly techniques, and integration methodologies. In this work, a superior alternative to PCBs called the Silicon-Interconnect Fabric (Si-IF) is investigated. The Si-IF is a silicon-based, package-less, fine-pitch, highly scalable, heterogeneous integration platform for wafer-scale systems. In this technology, unpackaged dielets are assembled on the Si-IF at small inter-dielet spacings (≤100 µm) using fine-pitch (≤10 µm) die-to-substrate interconnects. A novel assembly process using a solder-less direct metal-metal (gold-gold and copper-copper) thermal compression bonding was developed. Using this process, sub-10 µm pitch interconnects with a low specific contact resistance of ≤0.7 Ω-µm² were successfully demonstrated. Because of the tightly packed Si-IF assembly, the communication links between the neighboring dies are short (≤500 µm) with low loss (≤2 dB), comparable to on-chip connections. Consequently, simple buffers can transfer data between dies using a Simple Universal Parallel intERface for chips (SuperCHIPS) at low latency (<30 ps), low energy per bit (≤0.03 pJ/b), and high data-rates (up to 10 Gbps/link), corresponding to an aggregate bandwidth up to 8 Tbps/mm. The benefits of the SuperCHIPS protocol were experimentally demonstrated to provide 5-90X higher data-bandwidth, 8-30X lower latency, and 5-40X lower energy per bit compared to existing integration schemes. This dissertation addresses the assembly technology and communication protocols of the Si-IF technology.

eScholarship - University of California

Implementation of Bus-Based and NoC-Based MP3 Decoders on FPGA

Author: Kasisomayajula Subramanyam Venkata
Publication venue: North Dakota State University
Publication date: 01/01/2011

Modern System-on-Chip (SoC) designs are growing in size and in the number of Processing Elements (PEs) for various general-purpose tasks. The emergence of the Field Programmable Gate Array (FPGA) has eased the limitations faced by Application Specific Integrated Circuit (ASIC) design. FPGAs have a shorter time-to-market and are well suited to prototyping because of the design flexibility they offer, which is the key feature of FPGA technology. Technology advancements have introduced reconfiguration concepts that further increase the flexibility of FPGA designs. One method to improve an SoC's performance is to adopt a sophisticated communication medium between PEs to achieve high throughput. Bus architecture has been improved to meet the requirements of high-performance SoCs; however, its inherently poor scalability limits further enhancement. The Network-on-Chip (NoC) design paradigm has emerged to overcome the scalability limitations of point-to-point and bus communication. This thesis presents an investigation of NoC-based versus bus-based implementation of an SoC. An MP3 decoder has been selected as the application to be implemented on the proposed design. The final design demonstrates that the NoC-based MP3 decoder achieves a 14% higher clock frequency and real-time operation, decoding an MP3 frame in 10% less time on average than the bus-based MP3 decoder.

NDSU Libraries Institutional Repository

Memory hierarchy and data communication in heterogeneous reconfigurable SoCs

Author: Vitkovskiy Arseniy <1979>
Publication venue: Alma Mater Studiorum - Università di Bologna
Publication date: 08/04/2008

The miniaturization race in the hardware industry, which aims at continuously increasing transistor density on a die, no longer brings corresponding improvements in application performance. One of the most promising alternatives is to exploit the heterogeneous nature of common applications in hardware. Supported by reconfigurable computation, which has already proved its efficiency in accelerating data-intensive applications, this concept promises a breakthrough in contemporary technology development. Memory organization in such heterogeneous reconfigurable architectures becomes critical. Two primary aspects introduce a sophisticated trade-off. On the one hand, a memory subsystem should provide a well-organized distributed data structure and guarantee the required data bandwidth. On the other hand, it should hide the heterogeneous hardware structure from the end user in order to support feasible high-level programmability of the system. This thesis explores heterogeneous reconfigurable hardware architectures and presents possible solutions to cope with the problem of memory organization and data structure. Using the MORPHEUS heterogeneous platform as an example, the discussion follows the complete design cycle, from decision making and justification to hardware realization. Particular emphasis is placed on methods to support high system performance, meet application requirements, and provide a user-friendly programmer interface. As a result, the research introduces a complete heterogeneous platform enhanced with a hierarchical memory organization, which copes with its task by separating computation from communication, providing reconfigurable engines with computation and configuration data, and unifying heterogeneous computational devices using local storage buffers.
It is distinguished from related solutions by its distributed data-flow organization, specifically engineered mechanisms to operate on data in local domains, a communication infrastructure based on Network-on-Chip, and thorough methods to prevent computation and communication stalls. In addition, a novel advanced technique to accelerate memory access was developed and implemented.

AMS Tesi di Dottorato

Physical Fault Injection and Side-Channel Attacks on Mobile Devices: A Comprehensive Analysis

Author: Aboulkassimi Driss
Gaine Clement
Heckmann Thibaut
Markantonakis Konstantinos
Naccache David
Shepherd Carlton
Van Heijningen Nico
Publication venue: 'Elsevier BV'
Publication date: 17/09/2021

Today's mobile devices contain densely packaged system-on-chips (SoCs) with multi-core, high-frequency CPUs and complex pipelines. In parallel, sophisticated SoC-assisted security mechanisms have become commonplace for protecting device data, such as trusted execution environments, full-disk and file-based encryption. Both advancements have dramatically complicated the use of conventional physical attacks, requiring the development of specialised attacks. In this survey, we consolidate recent developments in physical fault injections and side-channel attacks on modern mobile devices. In total, we comprehensively survey over 50 fault injection and side-channel attack papers published between 2009 and 2021. We evaluate the prevailing methods, compare existing attacks using a common set of criteria, identify several challenges and shortcomings, and suggest future directions of research.

arXiv.org e-Print Archive

Royal Holloway - Pure

INRIA a CCSD electronic archive server

Strategies towards high performance (high-resolution/linearity) time-to-digital converters on field-programmable gate arrays

Author: Xie Wujun

Time-correlated single-photon counting (TCSPC) technology has become popular in scientific research and industrial applications, such as high-energy physics, bio-sensing, non-invasive health monitoring, and 3D imaging. Because of the increasing demand for high-precision time measurements, time-to-digital converters (TDCs) have attracted attention since the 1970s. As a fully digital solution, TDCs are portable and have great potential for multichannel applications compared to bulky and expensive time-to-amplitude converters (TACs). A TDC can be implemented in ASIC and FPGA devices. Due to their low cost, flexibility, and short development cycle, FPGA-TDCs have become promising. Starting with a literature review, three original FPGA-TDCs with outstanding performance are introduced. The first design is the first efficient wave union (WU) based TDC implemented in Xilinx UltraScale (20 nm) FPGAs with a bubble-free sub-TDL structure. Combined with other existing methods, the resolution is further enhanced to 1.23 ps. The second TDC has been designed for LiDAR applications, especially in driverless vehicles. Using the proposed new calibration method, the resolution is adjustable (50, 80, and 100 ps), and the linearity is exceptionally high (INL pk-pk and INL pk-pk are lower than 0.05 LSB). Meanwhile, a software tool with a graphical user interface (GUI) has been open-sourced to predict TDCs’ performance. In the third TDC, an onboard automatic calibration (AC) function has been realized by exploiting Xilinx ZYNQ SoC architectures. The test results show the robustness of the proposed method. Without the need for manual calibration, the AC function enables FPGA-TDCs to be applied in commercial products where mass production is required.

STAX (Strathclyde Repository)