277 research outputs found

    All-digital self-adaptive PVTA variation aware clock generation system for DFS

    Get PDF
    An all-digital self-adaptive clock generation system capable of adapt the clock frequency to compensate the effects of PVTA variations on the IC propagation delay and satisfy an externally set propagation length condition is presented. The design uses time-to-digital converters (TDCs) to measure the propagation length and a variable length ring oscillator (VLRO) to synthesize the clock signal. The VLRO naturally adapts its frequency to the PVTA variations suffered by its logic gates while the TDCs are used to track these variations across the chip and modify the VLRO length in order to adapt the clock frequency to them. The system measurements, for a 45nm FPGA, show that it adapts the VLRO length, and therefore the clock frequency, to satisfy the propagation length condition. Measurements also prove the system capabilities to act as a dynamic frequency scaling clock source since the propagation length condition value act as a frequency selection input and a strong linear relation between the input value and the resultant clock period is present.Peer ReviewedPostprint (author’s final draft

    Asynchronous techniques for system-on-chip design

    Get PDF
    SoC design will require asynchronous techniques as the large parameter variations across the chip will make it impossible to control delays in clock networks and other global signals efficiently. Initially, SoCs will be globally asynchronous and locally synchronous (GALS). But the complexity of the numerous asynchronous/synchronous interfaces required in a GALS will eventually lead to entirely asynchronous solutions. This paper introduces the main design principles, methods, and building blocks for asynchronous VLSI systems, with an emphasis on communication and synchronization. Asynchronous circuits with the only delay assumption of isochronic forks are called quasi-delay-insensitive (QDI). QDI is used in the paper as the basis for asynchronous logic. The paper discusses asynchronous handshake protocols for communication and the notion of validity/neutrality tests, and completion tree. Basic building blocks for sequencing, storage, function evaluation, and buses are described, and two alternative methods for the implementation of an arbitrary computation are explained. Issues of arbitration, and synchronization play an important role in complex distributed systems and especially in GALS. The two main asynchronous/synchronous interfaces needed in GALS-one based on synchronizer, the other on stoppable clock-are described and analyzed

    Timing Measurement Platform for Arbitrary Black-Box Circuits Based on Transition Probability

    No full text

    Delay Measurements and Self Characterisation on FPGAs

    No full text
    This thesis examines new timing measurement methods for self delay characterisation of Field-Programmable Gate Arrays (FPGAs) components and delay measurement of complex circuits on FPGAs. Two novel measurement techniques based on analysis of a circuit's output failure rate and transition probability is proposed for accurate, precise and efficient measurement of propagation delays. The transition probability based method is especially attractive, since it requires no modifications in the circuit-under-test and requires little hardware resources, making it an ideal method for physical delay analysis of FPGA circuits. The relentless advancements in process technology has led to smaller and denser transistors in integrated circuits. While FPGA users benefit from this in terms of increased hardware resources for more complex designs, the actual productivity with FPGA in terms of timing performance (operating frequency, latency and throughput) has lagged behind the potential improvements from the improved technology due to delay variability in FPGA components and the inaccuracy of timing models used in FPGA timing analysis. The ability to measure delay of any arbitrary circuit on FPGA offers many opportunities for on-chip characterisation and physical timing analysis, allowing delay variability to be accurately tracked and variation-aware optimisations to be developed, reducing the productivity gap observed in today's FPGA designs. The measurement techniques are developed into complete self measurement and characterisation platforms in this thesis, demonstrating their practical uses in actual FPGA hardware for cross-chip delay characterisation and accurate delay measurement of both complex combinatorial and sequential circuits, further reinforcing their positions in solving the delay variability problem in FPGAs

    Metastability-Containing Circuits

    Get PDF
    In digital circuits, metastability can cause deteriorated signals that neither are logical 0 or logical 1, breaking the abstraction of Boolean logic. Unfortunately, any way of reading a signal from an unsynchronized clock domain or performing an analog-to-digital conversion incurs the risk of a metastable upset; no digital circuit can deterministically avoid, resolve, or detect metastability (Marino, 1981). Synchronizers, the only traditional countermeasure, exponentially decrease the odds of maintained metastability over time. Trading synchronization delay for an increased probability to resolve metastability to logical 0 or 1, they do not guarantee success. We propose a fundamentally different approach: It is possible to contain metastability by fine-grained logical masking so that it cannot infect the entire circuit. This technique guarantees a limited degree of metastability in---and uncertainty about---the output. At the heart of our approach lies a time- and value-discrete model for metastability in synchronous clocked digital circuits. Metastability is propagated in a worst-case fashion, allowing to derive deterministic guarantees, without and unlike synchronizers. The proposed model permits positive results and passes the test of reproducing Marino's impossibility results. We fully classify which functions can be computed by circuits with standard registers. Regarding masking registers, we show that they become computationally strictly more powerful with each clock cycle, resulting in a non-trivial hierarchy of computable functions

    A Low-Power DSP Architecture for a Fully Implantable Cochlear Implant System-on-a-Chip.

    Full text link
    The National Science Foundation Wireless Integrated Microsystems (WIMS) Engineering Research Center at the University of Michigan developed Systems-on-a-Chip to achieve biomedical implant and environmental monitoring functionality in low-milliwatt power consumption and 1-2 cm3 volume. The focus of this work is implantable electronics for cochlear implants (CIs), surgically implanted devices that utilize existing nerve connections between the brain and inner-ear in cases where degradation of the sensory hair cells in the cochlea has occurred. In the absence of functioning hair cells, a CI processes sound information and stimulates the nderlying nerve cells with currents from implanted electrodes, enabling the patient to understand speech. As the brain of the WIMS CI, the WIMS microcontroller unit (MCU) delivers the communication, signal processing, and storage capabilities required to satisfy the aggressive goals set forth. The 16-bit MCU implements a custom instruction set architecture focusing on power-efficient execution by providing separate data and address register windows, multi-word arithmetic, eight addressing modes, and interrupt and subroutine support. Along with 32KB of on-chip SRAM, a low-power 512-byte scratchpad memory is utilized by the WIMS custom compiler to obtain an average of 18% energy savings across benchmarks. A synthesizable dynamic frequency scaling circuit allows the chip to select a precision on-chip LC or ring oscillator, and perform clock scaling to minimize power dissipation; it provides glitch-free, software-controlled frequency shifting in 100ns, and dissipates only 480μW. A highly flexible and expandable 16-channel Continuous Interleaved Sampling Digital Signal Processor (DSP) is included as an MCU peripheral component. Modes are included to process data, stimulate through electrodes, and allow experimental stimulation or processing. The entire WIMS MCU occupies 9.18mm2 and consumes only 1.79mW from 1.2V in DSP mode. This is the lowest reported consumption for a cochlear DSP. Design methodologies were analyzed and a new top-down design flow is presented that encourages hardware and software co-design as well as cross-domain verification early in the design process. An O(n) technique for energy-per-instruction estimations both pre- and post-silicon is presented that achieves less than 4% error across benchmarks. This dissertation advances low-power system design while providing an improvement in hearing recovery devices.Ph.D.Electrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/91488/1/emarsman_1.pd

    Hardware design of cryptographic algorithms for low-cost RFID tags

    Get PDF
    Mención Internacional en el título de doctorRadio Frequency Identification (RFID) is a wireless technology for automatic identification that has experienced a notable growth in the last years. RFID is an important part of the new trend named Internet of Things (IoT), which describes a near future where all the objects are connected to the Internet and can interact between them. The massive deployment of RFID technology depends on device costs and dependability. In order to make these systems dependable, security needs to be added to RFID implementations, as RF communications can be accessed by an attacker who could extract or manipulate private information from the objects. On the other hand, reduced costs usually imply resource-constrained environments. Due to these resource limitations necessary to low-cost implementations, typical cryptographic primitives cannot be used to secure low-cost RFID systems. A new concept emerged due to this necessity, Lightweight Cryptography. This term was used for the first time in 2003 by Vajda et al. and research on this topic has been done widely in the last decade. Several proposals oriented to low-cost RFID systems have been reported in the literature. Many of these proposals do not tackle in a realistic way the multiple restrictions required by the technology or the specifications imposed by the different standards that have arose for these technologies. The objective of this thesis is to contribute in the field of lightweight cryptography oriented to low-cost RFID tags from the microelectronics point of view. First, a study about the implementation of lightweight cryptographic primitives is presented . Specifically, the area used in the implementation, which is one of the most important requirements of the technology as it is directly related to the cost. After this analysis, a footprint area estimator of lightweight algorithms has been developed. This estimator calculates an upper-bound of the area used in the implementation. This estimator will help in making some choices at the algorithmic level, even for designers without hardware design skills. Second, two pseudo-random number generators have been proposed. Pseudorandom number generators are essential cryptographic blocks in RFID systems. According to the most extended RFID standard, EPC Class-1 Gen-2, it is mandatory to include a generator in RFID tags. Several architectures for the two proposed generators have been presented in this thesis and they have been integrated in two authentication protocols, and the main metrics (area, throughput and power consumption) have been analysed. Finally, the topic of True Random Number Generators is studied. These generators are also very important in secure RFID, and are currently a trending research line. A novel generator, presented by Cherkaoui et al., has been evaluated under different attack scenarios. A new true random number generator based on coherent sampling and suitable for low-cost RFID systems has been proposed.La tecnología de Identificación por Radio Frecuencia, más conocida por sus siglas en inglés RFID, se ha convertido en una de las tecnologías de autoidentificación más importantes dentro de la nueva corriente de identificación conocida como Internet de las Cosas (IoT). Esta nueva tendencia describe un futuro donde todos los objetos están conectados a internet y son capaces de identificarse ante otros objetos. La implantación masiva de los sistemas RFID está hoy en día limitada por el coste de los dispositivos y la fiabilidad. Para que este tipo de sistemas sea fiable, es necesario añadir seguridad a las implementaciones RFID, ya que las comunicaciones por radio frecuencia pueden ser fácilmente atacadas y la información sobre objetos comprometida. Por otro lado, para que todos los objetos estén conectados es necesario que el coste de la tecnología de identificación sea muy reducido, lo que significa una gran limitación de recursos en diferentes ámbitos. Dada la limitación de recursos necesaria en implementaciones de bajo coste, las primitivas criptográficas típicas no pueden ser usadas para dotar de seguridad a un sistema RFID de bajo coste. El concepto de primitiva criptográfica ligera fue introducido por primera vez 2003 por Vajda et al. y ha sido desarrollado ampliamente en los últimos años, dando como resultados una serie de algoritmos criptográficos ligeros adecuados para su uso en tecnología RFID de bajo coste. El principal problema de muchos de los algoritmos presentados es que no abordan de forma realista las múltiples limitaciones de la tecnología. El objetivo de esta tesis es el de contribuir en el campo de la criptografía ligera orientada a etiquetas RFID de bajo coste desde el punto de vista de la microelectrónica. En primer lugar se presenta un estudio de la implementación de las primitivas criptográficas ligeras más utilizadas, concretamente analizando el área ocupado por dichas primitivas, ya que es uno de los parámetros críticos considerados a la hora de incluir dichas primitivas criptográficas en los dispositivos RFID de bajo coste. Tras el análisis de estas primitivas se ha desarrollado un estimador de área para algoritmos criptográficos ultraligeros que trata de dar una cota superior del área total ocupada por el algoritmo (incluyendo registros y lógica de control). Este estimador permite al diseñador, en etapas tempranas del diseño y sin tener ningún conocimiento sobre implementaciones, saber si el algoritmo está dentro de los límites de área mpuestos por la tecnología RFID. También se proponen 2 generadores de números pseudo-aleatorios. Estos generadores son uno de los bloques criptográficos más importantes en un sistema RFID. El estándar RFID más extendido entre la industria, EPC Class-1 Gen-2, establece el uso obligatorio de dicho tipo de generadores en las etiquetas RFID. Los generadores propuestos han sido implementados e integrados en 2 protocolos de comunicación orientados a RFID, obteniendo buenos resultados en las principales características del sistema. Por último, se ha estudiado el tema de los generadores de números aleatorios. Este tipo de generadores son frecuentemente usados en seguridad RFID. Actualmente esta línea de investigación es muy popular. En esta tesis, se ha evaluado la seguridad de un novedoso TRNG, presentado por Cherkaoui et al., frente ataques típicos considerados en la literatura. Además, se ha presentado un nuevo TRNG de bajo coste basado en la técnica de muestreo por pares.Programa Oficial de Doctorado en Ingeniería Eléctrica, Electrónica y AutomáticaPresidente: Teresa Riesgo Alcaide.- Secretario: Emilio Olías Ruiz.- Vocal: Giorgio di Natal

    物理複製不能関数における安全性の評価と向上に関する研究

    Get PDF
    In this thesis, we focus on Physically Unclonable Functions (PUFs), which are expected as one of the most promising cryptographic primitives for secure chip authentication. Generally, PUFbased authentication is achieved by two approaches: (A) using a PUF itself, which has multiple challenge (input) and response (output) pairs, or (B) using a cryptographic function, the secret key of which is generated from a PUF with a single challenge-response pair (CRP). We contribute to:(1) evaluate the security of Approach (A), and (2) improve the security of Approach (B). (1) Arbiter-based PUFs were the most feasible type of PUFs, which was used to construct Approach (A). However, Arbiter-based PUFs have a vulnerability; if an attacker knows some CRPs, she/he can predict the remaining unknown CRPs with high probability. Bistable Ring PUF (BR-PUF) was proposed as an alternative, but has not been evaluated by third parties. In this thesis, in order to construct Approach (A) securely, we evaluate the difficulty of predicting responses of a BR-PUF experimentally. As a result, the same responses are frequently generated for two challenges with small Hamming distance. Also, particular bits of challenges have a great impact on the responses. In conclusion, BR-PUFs are not suitable for achieving Approach (A)securely. In future work, we should discuss an alternative PUF suitable for secure Approach (A).(2) In order to achieve Approach (B) securely, a secret key ? generated from a PUF response?should have high entropy. We propose a novel method of extracting high entropy from PUF responses. The core idea is to effectively utilize the information on the proportion of ‘1’s including in repeatedly-measured PUF responses. We evaluate its effectiveness by fabricated test chips. As a result, the extracted entropy is about 1.72 times as large as that without the proposed method.Finally, we organize newly gained knowledge in this thesis, and discuss a new application of PUF-based technologies.電気通信大学201

    Alternative Timing in Digital Logic

    Get PDF
    Abstract For many decades using a system clock has been the go-to method of timing circuits. CPUs in particular have been at least partially defined by the speed of their clock. As technology moves forward, this is proving more and more problematic. At first, clock rates increased as transistor sized reduced. Now, transistor sizes still go down while clock rates remain stable. As a result, the focus has shifted to trying to do more with each cycle. A greater emphasis has been placed on efficiency, because less power draw in each cycle means either less battery drain for mobile devices or more things that can be done within power limitations for circuits with a less transient power supply. To that end, I propose that alternative timing schemes have as yet untapped potential and warrant further industry focus and research. To demonstrate this, various methods of timing are discussed and analyzed, and a demonstration is provided for techniques that have no available statistics. What follows is an examination of existing and new ideas in circuit timing, with a focus on microprocessors. The first method discussed involves eliminating the clock entirely. The resulting asynchronous circuits are a well studied and discussed idea, which was dismissed previously as being not worth the cost. The progress of processor design in the last few years indicates a renewed study of asynchronous circuits is warranted. The other option explored is when the clock becomes aperiodic. If this elastic clock is one whose width can change from cycle to cycle, instructions with varying worst case timing can control the clock to run a system closer to average case time. This method has not received the same attention as asynchronous circuits, so some new ideas are proposed and demonstrated for generating and utilizing elastic clocks. Tests were run on a custom CPU design to prove the elastic clock design viable. The single-cycle processor was implemented with 45nm technology, and simulated using NanoSim. ii The results show that while the average power increases, the total energy required to execute the test program decreases. The savings are enough to offset the power overhead the new components require. The area overhead is 3% or less; better, if used in more complex designs. Given the complexity of typical pipeline CPUs, the area and power savings of a single-cycle design combined with the throughput improvement shown by the test makes this an interesting alternative for low power applications. Other uses of this technology are discussed and logically analyzed. iii Acknowledgment

    펄스 기반 피드 포워드 이퀄라이저를 갖춘 고용량 DRAM을 위한 컨트롤러 PHY 설계

    Get PDF
    학위논문 (박사) -- 서울대학교 대학원 : 공과대학 전기·정보공학부, 2020. 8. 김수환.A controller PHY for managed DRAM solution, which is a new memory structure to maximize capacity while minimizing refresh power, is presented. Inter-symbol interference is critical in such a high-capacity DRAM interface in which many DRAM chips share a command/address (C/A) channel. A pulse-based feed-forward equalizer (PB-FFE) is introduced to reduce ISI on a C/A channel. The controller PHY supports all the training sequences specified in the DDR4 standard. A glitch-free DCDL is also adopted to perform link training efficiently and to reduce training time. The DQ transmitter adopts quarter-rate architecture to reduce output latency. For the quarter-rate transmitters in DQ, we propose a quadrature error corrector (QEC), in which clock signal phase errors are corrected using two replicas of the 4:1 serializer of the output stage. Pulse shrinking is used to compare and equalize the outputs of these two replica serializers. A controller PHY was fabricated in 55nm CMOS. The PB-FFE increases the timing margin from 0.23UI to 0.29UI at 1067Mbps. At 2133Mbps, the read timing and voltage margins are 0.53UI and 211mV after read training, and the write margins are 0.72UI and 230mV after write training. To validate the QEC effectiveness, a prototype quarter-rate transmitter, including the QEC, was fabricated to another chip in 65nm CMOS. Adopting our QEC, the experimental results show that the output phase errors of the transmitter are reduced to a residual error of 0.8ps, and the output eye width and height are improved by 84% and 61%, respectively, at a data-rate of 12.8Gbps.본 연구에서 용량을 최대화하면서도 리프레시 전력을 최소화할 수 있는 새로운 메모리 구조인 관리형 DRAM 솔루션을 위한 컨트롤러 PHY를 제시하였다. 이와 같은 고용량 DRAM 인터페이스에서는 많은 DRAM 칩이 명령 / 주소 (C/A) 채널을 공유하고 있어서 심볼 간 간섭이 발생한다. 본 연구에서는 이러한 C/A 채널에서의 심볼 간 간섭을 줄이기 위해 펄스 기반 피드 포워드 이퀄라이저 (PB-FFE)를 채택하였다. 또한 본 연구의 컨트롤러 PHY는 DDR4 표준에 지정된 모든 트레이닝 시퀀스를 지원한다. 링크 트레이닝을 효율적으로 수행하고 트레이닝 시간을 줄이기 위해 글리치가 발생하지 않는 디지털 제어 지연 라인 (DCDL)을 채택하였다. 컨트롤러 PHY의 DQ 송신기는 출력 대기 시간을 줄이기 위해 쿼터 레이트 구조를 채택하였다. 쿼터 레이트 송신기의 경우에는 직교 클럭 간 위상 오류가 출력 신호의 무결성에 영향을 주게 된다. 이러한 영향을 최소화하기 위해 본 연구에서는 출력 단의 4 : 1 직렬 변환기의 두 복제본을 사용하여 클록 신호 위상 오류를 수정하는 QEC (Quadrature Error Corrector)를 제안하였다. 복제된 2개의 직렬 변환기의 출력을 비교하고 균등화하기 위해 펄스 수축 지연 라인이 사용되었다. 컨트롤러 PHY는 55nm CMOS 공정으로 제조되었다. PB-FFE는 1067Mbps에서 C/A 채널 타이밍 마진을 0.23UI에서 0.29UI로 증가시킨다. 읽기 트레이닝 후 읽기 타이밍 및 전압 마진은 2133Mbps에서 0.53UI 및 211mV이고, 쓰기 트레이닝 후 쓰기 마진은 0.72UI 및 230mV이다. QEC의 효과를 검증하기 위해 QEC를 포함한 프로토 타입 쿼터 레이트 송신기를 65nm CMOS의 다른 칩으로 제작하였다. QEC를 적용한 실험 결과, 송신기의 출력 위상 오류가 0.8ps의 잔류 오류로 감소하고, 출력 데이터 눈의 폭과 높이가 12.8Gbps의 데이터 속도에서 각각 84 %와 61 % 개선되었음을 보여준다.CHAPTER 1 INTRODUCTION 1 1.1 MOTIVATION 1 1.1.1 HEAVY LOAD C/A CHANNEL 5 1.1.2 QUARTER-RATE ARCHITECTURE IN DQ TRANSMITTER 7 1.1.3 SUMMARY 8 1.2 THESIS ORGANIZATION 10 CHAPTER 2 ARCHITECTURE 11 2.1 MDS DIMM STRUCTURE 11 2.2 MDS CONTROLLER 15 2.3 MDS CONTROLLER PHY 17 2.3.1 INITIALIZATION SEQUENCE 20 2.3.2 LINK TRAINING FINITE-STATE MACHINE 23 2.3.3 POWER DOWN MODE 28 CHAPTER 3 PULSE-BASED FEED-FORWARD EQUALIZER 29 3.1 COMMAND/ADDRESS CHANNEL 29 3.2 COMMAND/ADDRESS TRANSMITTER 33 3.3 PULSE-BASED FEED-FORWARD EQUALIZER 35 CHAPTER 4 CIRCUIT IMPLEMENTATION 39 4.1 BUILDING BLOCKS 39 4.1.1 ALL-DIGITAL PHASE-LOCKED LOOP (ADPLL) 39 4.1.2 ALL-DIGITAL DELAY-LOCKED LOOP (ADDLL) 44 4.1.3 GLITCH-FREE DCDL CONTROL 47 4.1.4 DUTY-CYCLE CORRECTOR (DCC) 50 4.1.5 DQ/DQS TRANSMITTER 52 4.1.6 DQ/DQS RECEIVER 54 4.1.7 ZQ CALIBRATION 56 4.2 MODELING AND VERIFICATION OF LINK TRAINING 59 4.3 BUILT-IN SELF-TEST CIRCUITS 66 CHAPTER 5 QUADRATURE ERROR CORRECTOR USING REPLICA SERIALIZERS AND PULSE-SHRINKING DELAY LINES 69 5.1 PHASE CORRECTION USING REPLICA SERIALIZERS AND PULSE-SHRINKING UNITS 69 5.2 OVERALL QEC ARCHITECTURE AND ITS OPERATION 71 5.3 FINE DELAY UNIT IN THE PSDL 76 CHAPTER 6 EXPERIMENTAL RESULTS 78 6.1 CONTROLLER PHY 78 6.2 PROTOTYPE QEC 88 CHAPTER 7 CONCLUSION 94 BIBLIOGRAPHY 96Docto
    corecore