2,687 research outputs found

    Driving the Network-on-Chip Revolution to Remove the Interconnect Bottleneck in Nanoscale Multi-Processor Systems-on-Chip

    The sustained demand for faster, more powerful chips has been met by chip manufacturing processes that allow increasing numbers of computation units to be integrated onto a single die. The result, especially in the embedded domain, is often called a SYSTEM-ON-CHIP (SoC) or MULTI-PROCESSOR SYSTEM-ON-CHIP (MP-SoC). MP-SoC design brings a large number of challenges to the foreground, one of the most prominent of which is the design of the chip interconnection. With the number of on-chip blocks presently ranging in the tens, and quickly approaching the hundreds, the novel issue of how best to provide on-chip communication resources is clearly felt. NETWORKS-ON-CHIPS (NoCs) are the most comprehensive and scalable answer to this design concern. By bringing large-scale networking concepts to the on-chip domain, they guarantee a structured answer to present and future communication requirements. The point-to-point connection and packet switching paradigms they involve are also of great help in minimizing wiring overhead and physical routing issues. However, as with any technology of recent inception, NoC design is still an evolving discipline. Several main areas of interest require deep investigation for NoCs to become viable solutions:
    • The design of the NoC architecture needs to strike the best tradeoff among performance, features and the tight area and power constraints of the on-chip domain.
    • Simulation and verification infrastructure must be put in place to explore, validate and optimize the NoC performance.
    • NoCs offer a huge design space, thanks to their extreme customizability in terms of topology and architectural parameters. Design tools are needed to prune this space and pick the best solutions.
    • Given their global, distributed nature, it is all the more essential to evaluate the physical implementation of NoCs in order to assess their suitability for next-generation designs and their area and power costs.
This dissertation performs a design space exploration of network-on-chip architectures, in order to point out the trade-offs associated with the design of each individual network building block and with the design of the network topology overall. The design space exploration is preceded by a comparative analysis of state-of-the-art interconnect fabrics with one another and with early network-on-chip prototypes. The ultimate objective is to point out the key advantages that NoC realizations provide with respect to state-of-the-art communication infrastructures, and the challenges that lie ahead in making this new interconnect technology a reality. Among the latter, technology-related challenges are emerging that call for dedicated design techniques at all levels of the design hierarchy, in particular leakage power dissipation and the containment of process variations and their effects. These objectives were achieved by means of a NoC simulation environment for cycle-accurate modelling and simulation, and a back-end facility for the study of NoC physical implementation effects. All the results provided by this work have been validated on actual silicon layout

    Combined HW/SW Drift and Variability Mitigation for PCM-based Analog In-memory Computing for Neural Network Applications

    Matrix-Vector Multiplications (MVMs) represent a heavy workload for both training and inference in Deep Neural Network (DNN) applications. Analog In-memory Computing (AIMC) systems based on Phase Change Memory (PCM) have been shown to be valid contenders for improving the energy efficiency of DNN accelerators. Although DNNs are quite resilient to computation inaccuracies, PCM non-idealities could strongly affect the precision of MVM operations, and thus the accuracy of DNNs. In this paper, a combined hardware and software solution to mitigate the impact of PCM non-idealities is presented. The conductance drift of PCM cells is compensated at the circuit level through the introduction of a conductance ratio at the core of the MVM computation. A model of the behaviour of PCM cells is employed to develop a device-aware training for DNNs, and the accuracy is estimated in a CIFAR-10 classification task. This work is supported by a PCM-based AIMC prototype, designed in a 90-nm STMicroelectronics technology and conceived to perform Multiply-and-Accumulate (MAC) computations, which are the kernel of MVMs. Results show that the MAC computation accuracy is around 95% even under the effect of cell drift. The use of a device-aware DNN training makes the networks less sensitive to weight variability, with a 15% increase in classification accuracy over a conventionally trained LeNet-5 DNN, and a 36% gain when drift compensation is applied
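The idea of compensating drift with a conductance ratio can be illustrated with a minimal sketch. It is not the authors' circuit: it assumes the commonly used empirical power-law drift model G(t) = G0 · (t/t0)^(-ν) with the same drift exponent ν for every cell, and all names and numeric values below are hypothetical. Under that assumption the drift factor cancels when each cell conductance is divided by a reference cell that drifts in the same way.

```python
import math

def drifted_conductance(g0, t, t0=1.0, nu=0.05):
    """Assumed power-law drift: G(t) = G0 * (t / t0) ** (-nu)."""
    return g0 * (t / t0) ** (-nu)

def mac_raw(g0_cells, inputs, t, nu=0.05):
    """MAC using the drifted conductances directly (result decays over time)."""
    return sum(drifted_conductance(g, t, nu=nu) * x
               for g, x in zip(g0_cells, inputs))

def mac_ratio(g0_cells, inputs, g0_ref, t, nu=0.05):
    """MAC using each conductance divided by a reference cell that drifts alike."""
    g_ref = drifted_conductance(g0_ref, t, nu=nu)
    return sum(drifted_conductance(g, t, nu=nu) / g_ref * x
               for g, x in zip(g0_cells, inputs))

# Hypothetical programmed conductances (siemens) and input activations.
cells = [10e-6, 20e-6, 5e-6]
inputs = [1.0, 0.5, 0.25]
g_ref = 20e-6

for t in (1.0, 1e3, 1e6):  # elapsed time in units of t0
    print(t, mac_raw(cells, inputs, t), mac_ratio(cells, inputs, g_ref, t))
```

In this idealized model the ratio-based result stays constant while the raw result shrinks with time. Real cells show cell-to-cell variation in ν, which a shared-exponent sketch like this does not capture.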

    High-speed communication circuits: voltage control oscillators and VCO-derived filters

    Voltage Controlled Oscillators (VCOs) and filters are the two main topics of focus in this dissertation. A temperature- and process-compensated VCO, designed to operate at 2 GHz and whose frequency variation due to incoming data is limited to 1% of its center frequency, is presented. The test results show that, without process changes present, the frequency variation due to a temperature change from 0°C to 100°C is around 1.1% of its center frequency. This is a reduction by a factor of 10 compared to the temperature variation of a conventional VCO. A new method of designing continuous-time monolithic filters derived from well-known VCOs is introduced. These VCO-derived filters are capable of operating at very high frequencies in standard CMOS processes. Prototype low-pass and band-pass filters designed in a TSMC 0.25 µm process are discussed. Simulation results for the low-pass filter designed for a cutoff frequency of 4.3 GHz show a THD of -40 dB for a 200 mV peak-to-peak sinusoidal input. The band-pass filter has a resonant frequency programmable from 2.3 GHz to 3.1 GHz, a programmable Q from 3 to 85, and a mid-band THD of -40 dB for an 80 mV peak-to-peak sinusoidal input signal. A third contribution of this dissertation is the design of a new current mirror with accurate mirror gain for low-beta bipolar transistors. High mirror gain accuracy is achieved by using a split-collector transistor to compensate for base currents of the source-coupled

    The ROTSE-III Robotic Telescope System

    The observation of a prompt optical flash from GRB990123 convincingly demonstrated the value of autonomous robotic telescope systems. Pursuing a program of rapid follow-up observations of gamma-ray bursts, the Robotic Optical Transient Search Experiment (ROTSE) has developed a next-generation instrument, ROTSE-III, that will continue the search for fast optical transients. The entire system was designed as an economical robotic facility to be installed at remote sites throughout the world. There are seven major system components: optics, optical tube assembly, CCD camera, telescope mount, enclosure, environmental sensing & protection and data acquisition. Each is described in turn in the hope that the techniques developed here will be useful in similar contexts elsewhere. Comment: 19 pages, including 4 figures. To be published in PASP in January, 2003. PASP Number IP02-11

    Mach-Zehnder Modulator Driver Designs in 28 nm CMOS Technology for Coherent Optical Systems

    Since the beginning of the Internet, the number of connected devices has grown exponentially. As the number of users has increased, a huge number of services and applications have also been made available through the network. Forecasts indicate that we are still at the beginning of this journey, even though the numbers are already extremely high. To satisfy these demands, increasingly capable networks have been developed. Optical links have proven to be the best candidates for long-reach backbone connections, given their low losses. The ultimate goal of a link is to deliver the highest amount of data for a given bit error rate (BER). Coherent modulations move in this direction, providing better spectral efficiency than other schemes. Quadrature Phase Shift Keying (QPSK) and Quadrature Amplitude Modulation (QAM) can be exploited, but linearity and phase accuracy become crucial for both the electrical and the optical portion of the system. Electro-optical modulators (EOMs) are used to combine laser beams with different amplitudes and phases to provide such complex schemes. CMOS technology is not widely used in coherent applications, mainly because of the higher breakdown voltage and gm/ID of BiCMOS devices. Yet CMOS has some interesting features, such as scalability and the integration of analog and digital circuits, that might reduce the overall system cost. Furthermore, in the latest technology nodes, p- and n-type MOS transistors have very similar performance, making available complementary structures that can compensate for the poor MOS transconductance efficiency. The electrical signal required at the EOM input should be large enough to fully steer the light phase, linear to preserve phase and amplitude, and broadband to achieve the highest bitrate. This thesis reports two CMOS designs. A first driver has been designed, fabricated and tested. 
The proposed structure is a four-stage chain, with two gain blocks, a pre-driver and a main driver. To reach good linearity, cascoded pseudo-differential structures have been implemented, except for the pre-driver. The cascode transistor allows the common-source (CS) device to be biased in the triode region, resulting in a linear voltage-to-current conversion. Working in the triode region means a lower transistor gm and a strong dependence of the transconductance on the drain-to-source voltage. In this way, gain variability can be introduced by changing the cascode voltage. The pre-driver is a pn source follower, which feeds the main driver without impairing the gain at high frequency. This solution is capable of providing an output voltage of 1.5 Vpp-diff, with a total harmonic distortion (THD) lower than 1.8%. The gain variation over frequency is always below 3 dB up to 58 GHz. A second design has been realized and sent for fabrication, but was not yet available at the time of writing this dissertation. The first stage of this design is a transconductor, which provides voltage-to-current conversion. Since the amplitude involved is small, the amount of distortion introduced (which is proportional to the voltage swing) is very low. Part of the gain is provided in the current domain through a current-mirror-like structure, allowing, at least in principle, self-cancellation of spurious components. The output current-to-voltage conversion is then realized with a closed-loop transimpedance amplifier (TIA). This solution intends to exploit loop gain (Gloop) in order to reduce the distortion. At the same time, a loop designed with a phase margin (PM) lower than 60° results in high-frequency peaking in the closed-loop transfer function. The simulated THD for a 1.5 Vpp-diff output signal is frequency dependent, ranging from 0.3% at 1 GHz up to 2% at 9 GHz. Ripples in the transfer function are below 3 dB up to 51 GHz for all the gain configurations

    A 90 nm CMOS 16 Gb/s Transceiver for Optical Interconnects

    Interconnect architectures which leverage high-bandwidth optical channels offer a promising solution to address the increasing chip-to-chip I/O bandwidth demands. This paper describes a dense, high-speed, and low-power CMOS optical interconnect transceiver architecture. Vertical-cavity surface-emitting laser (VCSEL) data rate is extended for a given average current and corresponding reliability level with a four-tap current summing FIR transmitter. A low-voltage integrating and double-sampling optical receiver front-end provides adequate sensitivity in a power efficient manner by avoiding linear high-gain elements common in conventional transimpedance-amplifier (TIA) receivers. Clock recovery is performed with a dual-loop architecture which employs baud-rate phase detection and feedback interpolation to achieve reduced power consumption, while high-precision phase spacing is ensured at both the transmitter and receiver through adjustable delay clock buffers. A prototype chip fabricated in 1 V 90 nm CMOS achieves 16 Gb/s operation while consuming 129 mW and occupying 0.105 mm^2
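As a quick worked check of the headline figures, the energy per bit follows directly from the reported power and data rate. The helper name below is hypothetical; only the 129 mW and 16 Gb/s values come from the abstract.

```python
def energy_per_bit_pj(power_mw, rate_gbps):
    """Energy per bit in pJ: mW divided by Gb/s gives pJ/bit."""
    return power_mw / rate_gbps

# Reported prototype figures: 129 mW at 16 Gb/s.
print(energy_per_bit_pj(129, 16))  # 8.0625 pJ/bit
```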

    High-Speed and Low-Energy On-Chip Communication Circuits.

    Continuous technology scaling sharply reduces transistor delays, while fixed-length global wire delays have increased due to smaller wiring pitch, with higher resistance and coupling capacitance. Due to this ever-growing gap, long on-chip interconnects pose well-known latency, bandwidth, and energy challenges to high-performance VLSI systems. Repeaters effectively mitigate wire RC effects but do little to improve their energy costs. Moreover, the increased complexity and high level of integration require higher wire densities, worsening the crosstalk noise and power consumption of conventionally repeated interconnects. Such increasing concerns about global on-chip wires motivate circuits that improve wire performance and energy while reducing the number of repeaters. This work presents circuit techniques and investigations for high-performance and energy-efficient on-chip communication in the areas of encoding, data compression, self-timed current injection, signal pre-emphasis, low-swing signaling, and technology mapping. The improved bus designs also consider the constraints of robust operation and performance/energy gains across process corners and the design space. Measurement results from 5 mm links on 65 nm and 90 nm prototype chips validate a 2.5-3X improvement in energy-delay product. Ph.D., Electrical Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/75800/1/jseo_1.pd

    Efficient start-up of crystal oscillators
