80 research outputs found

    Mixed-mode multicore reliability

    Get PDF
    Future processors are expected to observe increasing rates of hardware faults. Using Dual-Modular Redundancy (DMR), two cores of a multicore can be loosely coupled to redundantly execute a single software thread, providing very high coverage from many difference sources of faults. This reliability, however, comes at a high price in terms of per-thread IPC and overall system throughput. We make the observation that a user may want to run both applications requiring high reliability, such as financial software, and more fault tolerant applications requiring high performance, such as media or web software, on the same machine at the same time. Yet a traditional DMR system must fully operate in redundant mode whenever any application requires high reliability. This paper proposes a Mixed-Mode Multicore (MMM), which enables most applications, including the system software, to run with high reliability in DMR mode, while applications that need high performance can avoid the penalty of DMR. Though conceptually simple, two key challenges arise: 1) care must be taken to protect reliable applications from any faults occurring to applications running in high performance mode, and 2) the desire to execute additional independent software threads for a performance application complicates the scheduling of computation to cores. After solving these issues, an MMM is shown to improve overall system performance, compared to a traditional DMR system, by approximately 2X when one reliable and one performance application are concurrently executing

    A 2-40 Gb/s PAM4/NRZ dual-mode wireline transmitter with 4:1 MUX in 65-nm CMOS

    Get PDF
    This paper presents a 2-40 Gb/s dual-mode wireline transmitter supporting the four-level pulse amplitude modulation (PAM4) and non-return-to-zero (NRZ) modulation with a multiplexer (MUX)-based two-tap feed-forward equalizer (FFE). An edge-acceleration technique is proposed for the 4:1 MUX to increase the bandwidth. By utilizing a dedicated cascode current source, the output swing can achieve 900 mV with a level deviation of only 0.12% for PAM4. Fabricated in a 65-nm CMOS process, the transmitter consumes 117 mW and 89 mW at 40 Gb/s in PAM4 and NRZ at 1.2 V supply. © 2018, Institute of Electronics Engineers of Korea. All rights reserved

    Doctor of Philosophy

    Get PDF
    dissertationSince the late 1950s, scientists have been working toward realizing implantable devices that would directly monitor or even control the human body's internal activities. Sophisticated microsystems are used to improve our understanding of internal biological processes in animals and humans. The diversity of biomedical research dictates that microsystems must be developed and customized specifically for each new application. For advanced long-term experiments, a custom designed system-on-chip (SoC) is usually necessary to meet desired specifications. Custom SoCs, however, are often prohibitively expensive, preventing many new ideas from being explored. In this work, we have identified a set of sensors that are frequently used in biomedical research and developed a single-chip integrated microsystem that offers the most commonly used sensor interfaces, high computational power, and which requires minimum external components to operate. Included peripherals can also drive chemical reactions by setting the appropriate voltages or currents across electrodes. The SoC is highly modular and well suited for prototyping in and ex vivo experimental devices. The system runs from a primary or secondary battery that can be recharged via two inductively coupled coils. The SoC includes a 16-bit microprocessor with 32 kB of on chip SRAM. The digital core consumes 350 μW at 10 MHz and is capable of running at frequencies up to 200 MHz. The integrated microsystem has been fabricated in a 65 nm CMOS technology and the silicon has been fully tested. Integrated peripherals include two sigma-delta analog-to-digital converters, two 10-bit digital-to-analog converters, and a sleep mode timer. The system also includes a wireless ultra-wideband (UWB) transmitter. The fullydigital transmitter implementation occupies 68 x 68 μm2 of silicon area, consumes 0.72 μW static power, and achieves an energy efficiency of 19 pJ/pulse at 200 MHz pulse repetition frequency. An investigation of the suitability of the UWB technology for neural recording systems is also presented. Experimental data capturing the UWB signal transmission through an animal head are presented and a statistical model for large-scale signal fading is developed

    An Energy-Efficient Reconfigurable Mobile Memory Interface for Computing Systems

    Get PDF
    The critical need for higher power efficiency and bandwidth transceiver design has significantly increased as mobile devices, such as smart phones, laptops, tablets, and ultra-portable personal digital assistants continue to be constructed using heterogeneous intellectual properties such as central processing units (CPUs), graphics processing units (GPUs), digital signal processors, dynamic random-access memories (DRAMs), sensors, and graphics/image processing units and to have enhanced graphic computing and video processing capabilities. However, the current mobile interface technologies which support CPU to memory communication (e.g. baseband-only signaling) have critical limitations, particularly super-linear energy consumption, limited bandwidth, and non-reconfigurable data access. As a consequence, there is a critical need to improve both energy efficiency and bandwidth for future mobile devices.;The primary goal of this study is to design an energy-efficient reconfigurable mobile memory interface for mobile computing systems in order to dramatically enhance the circuit and system bandwidth and power efficiency. The proposed energy efficient mobile memory interface which utilizes an advanced base-band (BB) signaling and a RF-band signaling is capable of simultaneous bi-directional communication and reconfigurable data access. It also increases power efficiency and bandwidth between mobile CPUs and memory subsystems on a single-ended shared transmission line. Moreover, due to multiple data communication on a single-ended shared transmission line, the number of transmission lines between mobile CPU and memories is considerably reduced, resulting in significant technological innovations, (e.g. more compact devices and low cost packaging to mobile communication interface) and establishing the principles and feasibility of technologies for future mobile system applications. The operation and performance of the proposed transceiver are analyzed and its circuit implementation is discussed in details. A chip prototype of the transceiver was implemented in a 65nm CMOS process technology. In the measurement, the transceiver exhibits higher aggregate data throughput and better energy efficiency compared to prior works

    A 40-Gb/s Quarter-Rate SerDes Transmitter and Receiver Chipset in 65-nm CMOS

    Get PDF
    This paper presents a 40-Gb/s transmitter (TX) and receiver (RX) chipset for chip-to-chip communications in a 65-nm CMOS process. The TX implements a quarter-rate multi-multiplexer (MUX)-based four-tap feed-forward equalizer (FFE), where a charge-sharing-effect elimination technique is introduced into the 4:1 MUX to optimize its jitter performance and power efficiency. The RX employs a two-stage continuous-time linear equalizer as the analog front end and integrates a low-cost sign-based zero-forcing engine relying on edge-data correlation to automatically adjust the tap weights of the TX-FFE. By embedding low-pass filters with an adaptively adjusting bandwidth into the data-sampling path and adopting high-linearity compensating phase interpolators, the clock data recovery achieves both high jitter tolerance and low jitter generation. The fabricated TX and RX chipset delivers 40-Gb/s PRBS data at BER 16-dB loss at half-baud frequency, while consuming a total power of 370 mW

    WiSync: an architecture for fast synchronization through on-chip wireless communication

    Get PDF
    In shared-memory multiprocessing, fine-grain synchronization is challenging because it requires frequent communication. As technology scaling delivers larger manycore chips, such pattern is expected to remain costly to support.; In this paper, we propose to address this challenge by using on-chip wireless communication. Each core has a transceiver and an antenna to communicate with all the other cores. This environment supports very low latency global communication. Our architecture, called WiSync, uses a per-core Broadcast Memory (BM). When a core writes to its BM, all the other 100+ BMs get updated in less than 10 processor cycles. We also use a second wireless channel with cheaper transfers to execute barriers efficiently. WiSync supports multiprogramming, virtual memory, and context switching. Our evaluation with simulations of 128-threaded kernels and 64-threaded applications shows that WiSync speeds-up synchronization substantially. Compared to using advanced conventional synchronization, WiSync attains an average speedup of nearly one order of magnitude for the kernels, and 1.12 for PARSEC and SPLASH-2.Peer ReviewedPostprint (author's final draft

    Design Techniques for Energy Efficient Multi-GB/S Serial I/O Transceivers

    Get PDF
    Total I/O bandwidth demand is growing in high-performance systems due to the emergence of many-core microprocessors and in mobile devices to support the next generation of multi-media features. High-speed serial I/O energy efficiency must improve in order to enable continued scaling of these parallel computing platforms in applications ranging from data centers to smart mobile devices. The first work, a low-power forwarded-clock I/O transceiver architecture is presented that employs a high degree of output/input multiplexing, supply-voltage scaling with data rate, and low-voltage circuit techniques to enable low-power operation. The transmitter utilizes a 4:1 output multiplexing voltage-mode driver along with 4-phase clocking that is efficiently generated from a passive poly-phase filter. The output driver voltage swing is accurately controlled from 100-200 mV_(ppd) using a low-voltage pseudo-differential regulator that employs a partial negative-resistance load for improved low frequency gain. 1:8 input de-multiplexing is performed at the receiver equalizer output with 8 parallel input samplers clocked from an 8-phase injection-locked oscillator that provides more than 1UI de-skew range. Low-power high-speed serial I/O transmitters which include equalization to compensate for channel frequency dependent loss are required to meet the aggressive link energy efficiency targets of future systems. The second work presents a low power serial link transmitter design that utilizes an output stage which combines a voltage-mode driver, which offers low static-power dissipation, and current-mode equalization, which offers low complexity and dynamic-power dissipation. The utilization of current-mode equalization decouples the equalization settings and termination impedance, allowing for a significant reduction in pre-driver complexity relative to segmented voltage-mode drivers. Proper transmitter series termination is set with an impedance control loop which adjusts the on-resistance of the output transistors in the driver voltage-mode portion. Further reductions in dynamic power dissipation are achieved through scaling the serializer and local clock distribution supply with data rate. Finally, it presents that a scalable quarter-rate transmitter employs an analog-controlled impedance-modulated 2-tap voltage-mode equalizer and achieves fast power-state transitioning with a replica-biased regulator and ILO clock generation. Capacitively-driven 2 mm global clock distribution and automatic phase calibration allows for aggressive supply scaling

    Real-time multi-domain optimization controller for multi-motor electric vehicles using automotive-suitable methods and heterogeneous embedded platforms

    Get PDF
    Los capítulos 2,3 y 7 están sujetos a confidencialidad por el autor. 145 p.In this Thesis, an elaborate control solution combining Machine Learning and Soft Computing techniques has been developed, targeting a chal lenging vehicle dynamics application aiming to optimize the torque distribution across the wheels with four independent electric motors.The technological context that has motivated this research brings together potential -and challenges- from multiple dom ains: new automotive powertrain topologies with increased degrees of freedom and controllability, which can be approached with innovative Machine Learning algorithm concepts, being implementable by exploiting the computational capacity of modern heterogeneous embedded platforms and automated toolchains. The complex relations among these three domains that enable the potential for great enhancements, do contrast with the fourth domain in this context: challenging constraints brought by industrial aspects and safe ty regulations. The innovative control architecture that has been conce ived combines Neural Networks as Virtual Sensor for unmeasurable forces , with a multi-objective optimization function driven by Fuzzy Logic , which defines priorities basing on the real -time driving situation. The fundamental principle is to enhance vehicle dynamics by implementing a Torque Vectoring controller that prevents wheel slip using the inputs provided by the Neural Network. Complementary optimization objectives are effici ency, thermal stress and smoothness. Safety -critical concerns are addressed through architectural and functional measures.Two main phases can be identified across the activities and milestones achieved in this work. In a first phase, a baseline Torque Vectoring controller was implemented on an embedded platform and -benefiting from a seamless transition using Hardware-in -the -Loop - it was integrated into a real Motor -in -Wheel vehicle for race track tests. Having validated the concept, framework, methodology and models, a second simulation-based phase proceeds to develop the more sophisticated controller, targeting a more capable vehicle, leading to the final solution of this work. Besides, this concept was further evolved to support a joint research work which lead to outstanding FPGA and GPU based embedded implementations of Neural Networks. Ultimately, the different building blocks that compose this work have shown results that have met or exceeded the expectations, both on technical and conceptual level. The highly non-linear multi-variable (and multi-objective) control problem was tackled. Neural Network estimations are accurate, performance metrics in general -and vehicle dynamics and efficiency in particular- are clearly improved, Fuzzy Logic and optimization behave as expected, and efficient embedded implementation is shown to be viable. Consequently, the proposed control concept -and the surrounding solutions and enablers- have proven their qualities in what respects to functionality, performance, implementability and industry suitability.The most relevant contributions to be highlighted are firstly each of the algorithms and functions that are implemented in the controller solutions and , ultimately, the whole control concept itself with the architectural approaches it involves. Besides multiple enablers which are exploitable for future work have been provided, as well as an illustrative insight into the intricacies of a vivid technological context, showcasing how they can be harmonized. Furthermore, multiple international activities in both academic and professional contexts -which have provided enrichment as well as acknowledgement, for this work-, have led to several publications, two high-impact journal papers and collateral work products of diverse nature

    Real-time multi-domain optimization controller for multi-motor electric vehicles using automotive-suitable methods and heterogeneous embedded platforms

    Get PDF
    Los capítulos 2,3 y 7 están sujetos a confidencialidad por el autor. 145 p.In this Thesis, an elaborate control solution combining Machine Learning and Soft Computing techniques has been developed, targeting a chal lenging vehicle dynamics application aiming to optimize the torque distribution across the wheels with four independent electric motors.The technological context that has motivated this research brings together potential -and challenges- from multiple dom ains: new automotive powertrain topologies with increased degrees of freedom and controllability, which can be approached with innovative Machine Learning algorithm concepts, being implementable by exploiting the computational capacity of modern heterogeneous embedded platforms and automated toolchains. The complex relations among these three domains that enable the potential for great enhancements, do contrast with the fourth domain in this context: challenging constraints brought by industrial aspects and safe ty regulations. The innovative control architecture that has been conce ived combines Neural Networks as Virtual Sensor for unmeasurable forces , with a multi-objective optimization function driven by Fuzzy Logic , which defines priorities basing on the real -time driving situation. The fundamental principle is to enhance vehicle dynamics by implementing a Torque Vectoring controller that prevents wheel slip using the inputs provided by the Neural Network. Complementary optimization objectives are effici ency, thermal stress and smoothness. Safety -critical concerns are addressed through architectural and functional measures.Two main phases can be identified across the activities and milestones achieved in this work. In a first phase, a baseline Torque Vectoring controller was implemented on an embedded platform and -benefiting from a seamless transition using Hardware-in -the -Loop - it was integrated into a real Motor -in -Wheel vehicle for race track tests. Having validated the concept, framework, methodology and models, a second simulation-based phase proceeds to develop the more sophisticated controller, targeting a more capable vehicle, leading to the final solution of this work. Besides, this concept was further evolved to support a joint research work which lead to outstanding FPGA and GPU based embedded implementations of Neural Networks. Ultimately, the different building blocks that compose this work have shown results that have met or exceeded the expectations, both on technical and conceptual level. The highly non-linear multi-variable (and multi-objective) control problem was tackled. Neural Network estimations are accurate, performance metrics in general -and vehicle dynamics and efficiency in particular- are clearly improved, Fuzzy Logic and optimization behave as expected, and efficient embedded implementation is shown to be viable. Consequently, the proposed control concept -and the surrounding solutions and enablers- have proven their qualities in what respects to functionality, performance, implementability and industry suitability.The most relevant contributions to be highlighted are firstly each of the algorithms and functions that are implemented in the controller solutions and , ultimately, the whole control concept itself with the architectural approaches it involves. Besides multiple enablers which are exploitable for future work have been provided, as well as an illustrative insight into the intricacies of a vivid technological context, showcasing how they can be harmonized. Furthermore, multiple international activities in both academic and professional contexts -which have provided enrichment as well as acknowledgement, for this work-, have led to several publications, two high-impact journal papers and collateral work products of diverse nature

    NASA Tech Briefs, May 2012

    Get PDF
    Topics covered include: An "Inefficient Fin" Non-Dimensional Parameter to Measure Gas Temperatures Efficiently; On-Wafer Measurement of a Multi-Stage MMIC Amplifier with 10 dB of Gain at 475 GHz; Software to Control and Monitor Gas Streams; Miniaturized Laser Heterodyne Radiometer (LHR) for Measurements of Greenhouse Gases in the Atmospheric Column; Anomaly Detection in Test Equipment via Sliding Mode Observers; Absolute Position of Targets Measured Through a Chamber Window Using Lidar Metrology Systems; Goldstone Solar System Radar Waveform Generator; Fast and Adaptive Lossless Onboard Hyperspectral Data Compression System; Iridium Interfacial Stack - IrIS; Downsampling Photodetector Array with Windowing; Optical Phase Recovery and Locking in a PPM Laser Communication Link; High-Speed Edge-Detecting Line Scan Smart Camera; Optical Communications Channel Combiner; Development of Thermal Infrared Sensor to Supplement Operational Land Imager; Amplitude-Stabilized Oscillator for a Capacitance-Probe Electrometer; Automated Performance Characterization of DSN System Frequency Stability Using Spacecraft Tracking Data; Histogrammatic Method for Determining Relative Abundance of Input Gas Pulse; Predictive Sea State Estimation for Automated Ride Control and Handling - PSSEARCH; LEGION: Lightweight Expandable Group of Independently Operating Nodes; Real-Time Projection to Verify Plan Success During Execution; Automated Performance Characterization of DSN System Frequency Stability Using Spacecraft Tracking Data; Web-Based Customizable Viewer for Mars Network Overflight Opportunities; Fabrication of a Cryogenic Terahertz Emitter for Bolometer Focal Plane Calibrations; Fabrication of an Absorber-Coupled MKID Detector; Graphene Transparent Conductive Electrodes for Next- Generation Microshutter Arrays; Method of Bonding Optical Elements with Near-Zero Displacement; Free-Mass and Interface Configurations of Hammering Mechanisms; Wavefront Compensation Segmented Mirror Sensing and Control; Long-Life, Lightweight, Multi-Roller Traction Drives for Planetary Vehicle Surface Exploration; Reliable Optical Pump Architecture for Highly Coherent Lasers Used in Space Metrology Applications; Electrochemical Ultracapacitors Using Graphitic Nanostacks; Improved Whole-Blood-Staining Device; Monitoring Location and Angular Orientation of a Pill; Molecular Technique to Reduce PCR Bias for Deeper Understanding of Microbial Diversity; Laser Ablation Electrodynamic Ion Funnel for In Situ Mass Spectrometry on Mars; High-Altitude MMIC Sounding Radiometer for the Global Hawk Unmanned Aerial Vehicle; PRTs and Their Bonding for Long-Duration, Extreme-Temperature Environments; Mid- and Long-IR Broadband Quantum Well Photodetector; 3D Display Using Conjugated Multiband Bandpass Filters; Real-Time, Non-Intrusive Detection of Liquid Nitrogen in Liquid Oxygen at High Pressure and High Flow; Method to Enhance the Operation of an Optical Inspection Instrument Using Spatial Light Modulators; Dual-Compartment Inflatable Suitlock; Large-Strain Transparent Magnetoactive Polymer Nanocomposites; Thermodynamic Vent System for an On-Orbit Cryogenic Reaction Control Engine; Time Distribution Using SpaceWire in the SCaN Testbed on ISS; and Techniques for Solution- Assisted Optical Contacting
    • …
    corecore