1,546 research outputs found

    MicroTCA implementation of synchronous Ethernet-Based DAQ systems for large scale experiments

    Full text link
    Large LAr TPCs are among the most powerful detectors to address open problems in particle and astro-particle physics, such as CP violation in leptonic sector, neutrino properties and their astrophysical implications, proton decay search etc. The scale of such detector implies severe constraints on their readout and DAQ system. In this article we describe a data acquisition scheme for this new generation of large detectors. The main challenge is to propose a scalable and easy to use solution able to manage a large number of channels at the lowest cost. It is interesting to note that these constraints are very similar to those existing in Network Telecommunication Industry. We propose to study how emerging technologies like ATCA and μ\muTCA could be used in neutrino experiments. We describe the design of an Advanced Mezzanine Board (AMC) including 32 ADC channels. This board receives 32 analogical channels at the front panel and sends the formatted data through the μ\muTCA backplane using a Gigabit Ethernet link. The gigabit switch of the MCH is used to centralize and to send the data to the event building computer. The core of this card is a FPGA (ARIA-GX from ALTERA) including the whole system except the memories. A hardware accelerator has been implemented using a NIOS II μ\muP and a Gigabit MAC IP. Obviously, in order to be able to reconstruct the tracks from the events a time synchronisation system is mandatory. We decided to implement the IEEE1588 standard also called Precision Timing Protocol, another emerging and promising technology in Telecommunication Industry. In this article we describe a Gigabit PTP implementation using the recovered clock of the gigabit link. By doing so the drift is directly cancelled and the PTP will be used only to evaluate and to correct the offset.Comment: Talk presented at the 2009 Real Time Conference, Beijing, May '09, submitted to the proceeding

    Flexible Power Modeling of LTE Base Stations

    Get PDF
    With the explosion of wireless communications in number of users and data rates, the reduction of network power consumption becomes more and more critical. This is especially true for base stations which represent a dominant share of the total power in cellular networks. In order to study power reduction techniques, a convenient power model is required, providing estimates of the power consumption in different scenarios. This paper proposes such a model, accurate but simple to use. It evaluates the base station power consumption for different types of cells supporting the 3GPP LTE standard. It is flexible enough to enable comparisons between state-of-the-art and advanced configurations, and an easy adaptation to various scenarios. The model is based on a combination of base station components and sub-components as well as power scaling rules as functions of the main system parameters

    Low-latency adiabatic quantum-flux-parametron circuit integrated with a hybrid serializer/deserializer

    Full text link
    Adiabatic quantum-flux-parametron (AQFP) logic is an ultra-low-power superconductor logic family. AQFP logic gates are powered and clocked by dedicated clocking schemes using ac excitation currents to implement an energy-efficient switching process, adiabatic switching. We have proposed a low-latency clocking scheme, delay-line clocking, and demonstrated basic AQFP logic gates. In order to test more complex circuits, a serializer/deserializer (SerDes) should be incorporated into the AQFP circuit under test, since the number of input/output (I/O) cables is limited by equipment. Therefore, in the present study we propose and develop a novel SerDes for testing delay-line-clocked AQFP circuits by combining AQFP and rapid single-flux-quantum (RSFQ) logic families, which we refer to as the AQFP/RSFQ hybrid SerDes. The hybrid SerDes comprises RSFQ shift registers to facilitate the data storage during serial-to-parallel and parallel-to-serial conversion. Furthermore, all the component circuits in the hybrid SerDes are clocked by the identical excitation current to synchronize the AQFP and RSFQ parts. We fabricate and demonstrate a delay-line-clocked AQFP circuit (8-to-3 encoder, which is the largest delay-line-clocked circuit ever designed) integrated with the hybrid SerDes at 4.2 K up to 4.5 GHz. Our measurement results indicate that the hybrid SerDes enables the testing of delay-line-clocked AQFP circuits with only a few I/O cables and is thus a powerful tool for the development of very large-scale integration AQFP circuits.Comment: 7 pages, 6 figure

    NaNet: a Low-Latency, Real-Time, Multi-Standard Network Interface Card with GPUDirect Features

    Full text link
    While the GPGPU paradigm is widely recognized as an effective approach to high performance computing, its adoption in low-latency, real-time systems is still in its early stages. Although GPUs typically show deterministic behaviour in terms of latency in executing computational kernels as soon as data is available in their internal memories, assessment of real-time features of a standard GPGPU system needs careful characterization of all subsystems along data stream path. The networking subsystem results in being the most critical one in terms of absolute value and fluctuations of its response latency. Our envisioned solution to this issue is NaNet, a FPGA-based PCIe Network Interface Card (NIC) design featuring a configurable and extensible set of network channels with direct access through GPUDirect to NVIDIA Fermi/Kepler GPU memories. NaNet design currently supports both standard - GbE (1000BASE-T) and 10GbE (10Base-R) - and custom - 34~Gbps APElink and 2.5~Gbps deterministic latency KM3link - channels, but its modularity allows for a straightforward inclusion of other link technologies. To avoid host OS intervention on data stream and remove a possible source of jitter, the design includes a network/transport layer offload module with cycle-accurate, upper-bound latency, supporting UDP, KM3link Time Division Multiplexing and APElink protocols. After NaNet architecture description and its latency/bandwidth characterization for all supported links, two real world use cases will be presented: the GPU-based low level trigger for the RICH detector in the NA62 experiment at CERN and the on-/off-shore data link for KM3 underwater neutrino telescope

    High-Speed Low-Voltage Line Driver for SerDes Applications

    Get PDF
    The driving factor behind this research was to design & develop a line driver capable of meeting the demanding specifications of the next generation of SerDes devices. In this thesis various line driver topologies were analysed to identify a topology suited for a high-speed low-voltage operating environment. This thesis starts of by introducing a relatively new high-speed communication Device called SerDes. SerDes is used in wired chip-to-chip communications and operates by converting a parallel data stream in a serial data stream that can be then transmitted at a higher bit rate, existing SerDes devices operate up to 12.5Gbps. A matching SerDes device at the destination will then convert the serial data stream back into a parallel data stream to be read by the destination ASIC. SerDes typically uses a line driver with a differential output. Using a differential line driver increases the resilience to outside sources of noise and reduces the amount of EM radiation produced by transmission. The focus of this research is to design and develop a line driver that can operate at 40Gbps and can function with a power supply of less than IV. This demanding specification was decided to be an accurate representation of future requirements that a line driver in a SerDes device will have to conform to. A suitable line driver with a differential output was identified to meet the demanding specifications and was modified so that it can perfonn an equalisation technique called pre-distortion. Two variations of the new topology were outlined and a behavioural model was created for both using Matlab Simulink. The behavioural model for both variants proved the concept, however only one variant maintained its perfomance once the designs were implemented at transistor level in Cadence, using a 65nm CMOS technology provided by Texas Instruments. The final line driver design was then converted into a layout design, again using Cadence, and RC parasitics were extracted to perfom a post-layout simulation. The post layout simulation shows that the novel line driver can operate at 40Gbps with a power supply of 1 V - O.8V and has a power consumption of 4.54m W /Gbps. The Deterministic Jitter added by the line driver is 12.9ps

    CMOS VLSI correlator design for radio-astronomical signal processing : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Engineering at Massey University, Auckland, New Zealand

    Get PDF
    Multi-element radio telescopes employ methods of indirect imaging to capture the image of the sky. These methods are in contrast to direct imaging methods whereby the image is constructed from sensor measurements directly and involve extensive signal processing on antenna signals. The Square Kilometre Array, or the SKA, is a future radio telescope of this type that, once built, will become the largest telescope in the world. The unprecedented scale of the SKA requires novel solutions to be developed for its signal processing pipeline one of the most resource-consuming parts of which is the correlator. The SKA uses the FX correlator construction that consists of two parts: the F part that translates antenna signals into frequency domain and the X part that cross-correlates these signals between each other. This research focuses on the integrated circuit design and VLSI implementation issues of the X part of a very large FX correlator in 28 nm and 130 nm CMOS. The correlator’s main processing operation is the complex multiply-accumulation (CMAC) for which custom 28 nm CMAC designs are presented and evaluated. Performance of various memories inside the correlator also affects overall efficiency, and input-buffered and output-buffered approaches are considered with the goal of improving upon it. For output-buffered designs, custom memory control circuits have been designed and prototyped in 130 nm that improve upon eDRAM by taking advantage of sequential access patterns. For the input-buffered architecture, a new scheme is proposed that decreases the usage of the input-buffer memory by a third by making use of multiple accumulators in every CMAC. Because cross-correlation is a very data-intensive process, high-performance SerDes I/O is essential to any practical ASIC implementation. On the I/O design, the 28 nm full-rate transmitter delivering 15 Gbps per lane is presented. This design consists of the scrambler, the serialiser, the digital VCO with analog fine-tuning and the SST driver including features of a 4-tap FFE, impedance tuning and amplitude tuning

    Demystifying the Characteristics of 3D-Stacked Memories: A Case Study for Hybrid Memory Cube

    Full text link
    Three-dimensional (3D)-stacking technology, which enables the integration of DRAM and logic dies, offers high bandwidth and low energy consumption. This technology also empowers new memory designs for executing tasks not traditionally associated with memories. A practical 3D-stacked memory is Hybrid Memory Cube (HMC), which provides significant access bandwidth and low power consumption in a small area. Although several studies have taken advantage of the novel architecture of HMC, its characteristics in terms of latency and bandwidth or their correlation with temperature and power consumption have not been fully explored. This paper is the first, to the best of our knowledge, to characterize the thermal behavior of HMC in a real environment using the AC-510 accelerator and to identify temperature as a new limitation for this state-of-the-art design space. Moreover, besides bandwidth studies, we deconstruct factors that contribute to latency and reveal their sources for high- and low-load accesses. The results of this paper demonstrates essential behaviors and performance bottlenecks for future explorations of packet-switched and 3D-stacked memories.Comment: EEE Catalog Number: CFP17236-USB ISBN 13: 978-1-5386-1232-

    Evaluation of Giga-bit Ethernet Instrumentation for SalSA Electronics Readout (GEISER)

    Full text link
    An instrumentation prototype for acquiring high-speed transient data from an array of high bandwidth antennas is presented. Multi-kilometer cable runs complicate acquisition of such large bandwidth radio signals from an extensive antenna array. Solutions using analog fiber optic links are being explored, though are very expensive. We propose an inexpensive solution that allows for individual operation of each antenna element, operating at potentially high local self-trigger rates. Digitized data packets are transmitted to the surface via commercially available Giga-bit Ethernet hardware. Events are then reconstructed on a computer farm by sorting the received packets using standard networking gear, eliminating the need for custom, very high-speed trigger hardware. Such a system is completely scalable and leverages the hugh capital investment made by the telecommunications industry. Test results from a demonstration prototype are presented.Comment: 8 pages, to be submitted to NIM
    corecore