210 research outputs found

    A case study for NoC based homogeneous MPSoC architectures

    The many-core design paradigm requires flexible and modular hardware and software components to provide the scalability required by next-generation on-chip multiprocessor architectures. A multidisciplinary approach is necessary to consider all the interactions between the different components of the design. In this paper, a complete design methodology that tackles system-level modeling, hardware architecture, and the programming model together has been successfully used to implement a multiprocessor network-on-chip (NoC)-based system, the NoCRay graphic accelerator. The design, based on 16 processors, was prototyped on a field-programmable gate array (FPGA) and then laid out in 90-nm technology. Post-layout results show very low power and area, as well as a clock frequency of 500 MHz. Results show that an array of small and simple processors outperforms a single high-end general-purpose processor

    Development of the DAQ Front-end for the DSSC Detector at the European XFEL

    The European XFEL is an international photon science facility currently under construction at DESY, Hamburg. Its unique characteristics will open up new research opportunities for investigating tiny structures, ultra-fast processes, and matter under extreme conditions. The research will provide invaluable insights for many scientific disciplines, such as biology, medicine, and chemistry, as well as nanotechnology, astrophysics, and others. The DSSC detector is one of three 2D megapixel detectors presently being developed for application at the XFEL facility. One challenge is acquiring the large volume of data produced by the detector system. The total payload data rate is estimated to be on the order of 67.2 Gb/s. This thesis presents the DAQ front-end for the DSSC detector. A special focus is on the development of the I/O Board, which is the basic component of the lower DAQ layer. The DSSC front-end DAQ system exploits the latest technology in microelectronics and high-speed data transmission. Organized as a two-stage hierarchical system, it comprises 20 readout nodes in total, based on FPGA technology. The 16 slave nodes of the first DAQ layer receive data from the detector front-end at an aggregate link bandwidth of 89.6 Gb/s via 256 electrical links. The accumulated data are then concentrated into four 3.125 Gb/s high-speed links per node for transmission to the four master nodes of the second DAQ layer, the Patch Panel Transceivers. Custom-built firmware on the slave node FPGAs implements the readout logic and concentrator mechanism for the acquired detector data. It additionally comprises several controller modules, which are responsible for operating critical detector electronics. The test results and measurements show that the I/O Board is able both to manage data acquisition at the required bandwidth and to perform the low-level control tasks required for proper detector operation
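The link figures quoted in this abstract can be cross-checked with a little arithmetic. The sketch below is only an illustration: the constant and function names are made up here, and it assumes the 256 electrical links are spread evenly over the 16 slave nodes and that the quoted rates can be compared directly (the abstract does not say whether they include any line-coding overhead).

```python
# Figures quoted in the abstract above.
PAYLOAD_GBPS = 67.2          # estimated total payload data rate
FRONTEND_GBPS = 89.6         # aggregate bandwidth of the detector-side links
FRONTEND_LINKS = 256         # electrical links from the detector front-end
SLAVE_NODES = 16             # first-layer readout nodes
UPLINKS_PER_NODE = 4         # high-speed links per slave node
UPLINK_GBPS = 3.125          # rate of each high-speed link


def link_budget() -> dict:
    """Derive per-link and per-node figures from the quoted totals.

    Assumes (not stated in the abstract) that links are distributed
    evenly across the slave nodes.
    """
    uplink_per_node = UPLINKS_PER_NODE * UPLINK_GBPS
    return {
        "frontend_mbps_per_link": FRONTEND_GBPS / FRONTEND_LINKS * 1000,
        "frontend_links_per_node": FRONTEND_LINKS // SLAVE_NODES,
        "frontend_gbps_per_node": FRONTEND_GBPS / SLAVE_NODES,
        "uplink_gbps_per_node": uplink_per_node,
        "uplink_gbps_total": uplink_per_node * SLAVE_NODES,
        "payload_share_of_frontend": PAYLOAD_GBPS / FRONTEND_GBPS,
    }


if __name__ == "__main__":
    for name, value in link_budget().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions each electrical link carries 350 Mb/s, each slave node receives 5.6 Gb/s over 16 links and can forward up to 12.5 Gb/s, and the 16 nodes together offer 200 Gb/s towards the second layer. The estimated payload is 75% of the aggregate front-end link bandwidth.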

    FlexWAFE - eine Architektur für rekonfigurierbare-Bildverarbeitungssysteme

    Demand for high-resolution digital media content has recently increased in both the cinema and television industries. Existing equipment either does not meet the requirements or is too costly. New hardware systems and new programming techniques are needed to meet the requirements for high-resolution, high-quality images and to reduce costs. The industry seeks a flexible architecture capable of running multiple applications on standard off-the-shelf components, with reduced development time. Until now, standard practice has been to develop specialized architectures and systems that target a single application. This approach offers little flexibility and leads to high development costs, because every new application is designed almost from scratch. Our focus was to develop an architecture that is suited to image stream processing and has the flexibility to run multiple applications on the same FPGA-based hardware platform. The novelty of our approach is that we reconfigure parts of the architecture at run-time without incurring the time penalty and added constraints of FPGA partial-reconfiguration techniques. The architecture uses a hierarchical control structure that is well suited to parallel processing and allows single-cycle-latency reconfiguration of parts of the processing pipeline. This is achieved using relatively few resources for the distributed control structures. To test the architecture, a complex film-grain noise reduction algorithm was implemented on an off-the-shelf hardware platform developed by Thomson-Grass Valley. The system met all the requirements and placed very little load on the hierarchical control structures, leaving headroom for much more complex control demands. The architecture has been ported to other hardware platforms, and other applications have been implemented as well. The run-time reconfigurability has proven to be a key factor in the success of the FlexWAFE.

    Readout Electronics for the Upgraded ITS Detector in the ALICE Experiment

    ALICE is undergoing upgrades during the Long Shutdown (LS) 2 of the LHC to improve its performance and capabilities, and to prepare the experiment for the increases in luminosity provided by the LHC in Run 3 and Run 4. One of the most extensive upgrades of the experiment (and the topic of this thesis) is the replacement of the Inner Tracking System (ITS) in its entirety with a new and upgraded system. The new ITS consists exclusively of pixel sensors organized in seven cylindrical layers, and offers significantly improved tracking capabilities at higher interaction rates. In contrast to the previous system, which would only trigger on a subset of the available events that were deemed “interesting”, the upgraded ITS will capture all events, either in a triggered mode using minimum-bias triggers or in a “trigger-less” continuous mode where event data is continuously read out. The key component of the upgrade is a novel pixel sensor chip, the ALPIDE, which was developed at CERN specifically for the ALICE ITS upgrade. The seven layers of the ITS are assembled from sub-assemblies of sensor chips referred to as staves, and the entire detector consists of 24 120 chips in total. The staves come in three different configurations, ranging from 9 chips per stave in the innermost layers up to 196 chips per stave in the outer layers. The number of control and data links, as well as the bit rate of the data links, also differs widely between the staves. Reading out data from the high-speed copper links of the detector requires dedicated readout electronics in the vicinity of the detector. The core component of this system is the FPGA-based Readout Unit (RU). 
It reads out the data links and transfers the data to the experiment’s server farms via optical links; provides control, configuration, and monitoring of the sensor chips over the same optical links, as well as over CAN bus for redundancy; and distributes trigger signals to the sensors, either by forwarding the minimum-bias triggers of the experiment or by generating trigger pulses locally for the continuous mode. The field-programmable devices of the RU allow for future updates and changes of functionality, which can be performed remotely via several redundant paths to the RUs. This is an important feature, since the RUs are not easily accessible once they are installed in the cavern of the experiment and will be exposed to radiation when the LHC is in operation. Radiation tolerance has been an important concern during the development of the FPGA designs, as well as of the RU hardware itself, since radiation-induced errors in the RUs are expected during operation. Techniques such as Triple Modular Redundancy (TMR) were used in the FPGA designs to mitigate these effects. One example is the radiation-tolerant CAN controller design introduced in this thesis. A different challenge, also addressed in this thesis, is the monitoring of internal status and quantities such as temperature and voltage in the ALPIDE chips. This is performed over the ALPIDE’s control bus, but must be carefully coordinated because the control bus is also used for triggers. The detector and readout electronics are designed to operate under a wide range of conditions. Compared with events from Pb–Pb collisions, which may have thousands of pixel hits in the detector, a typical pp event has few pixel hits, but the collision rate is significantly higher for pp runs than for Pb–Pb runs. In addition, the detector can be used with two triggering modes, of which the continuous trigger mode has additional parameters for the trigger period. 
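The idea behind Triple Modular Redundancy can be shown with a bitwise majority voter. This is a generic software illustration of the technique, not the FPGA logic used in the thesis; the function names and the 4-bit example value are assumptions made here. Three copies of a value are kept, and each output bit takes the value held by at least two of the copies, so an upset in a single copy is masked.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-out-of-3 majority of three redundant copies."""
    return (a & b) | (a & c) | (b & c)


def flip_bit(word: int, bit: int) -> int:
    """Model a single-event upset by inverting one bit of a stored word."""
    return word ^ (1 << bit)


if __name__ == "__main__":
    golden = 0b1010                      # hypothetical register contents
    upset = flip_bit(golden, 2)          # one copy is corrupted

    # A single corrupted copy is outvoted by the two intact copies.
    print(majority_vote(golden, golden, upset) == golden)

    # The same bit corrupted in two copies defeats the voter.
    print(majority_vote(golden, upset, upset) == golden)
```

The second case shows the limit of the scheme: plain voting only corrects errors confined to one of the three copies at a time.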
A simulation model of the ALPIDE and ITS, presented in this thesis, was developed to simulate the readout performance and efficiency of the detector under a wide range of circumstances. The simulated results show that the detector should perform with high efficiency at the collision rates planned for Run 3. Initial plans for dedicated hardware to handle and coordinate the busy status of the detector were deemed superfluous and canceled based on these results. Collision rates higher than those planned for Run 3 were also simulated to find parameters for optimal performance at those rates. For the RU, which was designed to interface to three widely different stave designs, the simulations quantified the amount of data the readout electronics will have to handle depending on the detector layer and operating conditions. Furthermore, the simulation model was adapted for simulations of two other ALPIDE-based detector projects: the Proton CT (pCT) project at the University of Bergen (UiB), a Digital Tracking Calorimeter (DTC) used for dose planning of particle therapy in cancer treatment; and the planned Forward Calorimeter (FoCal) for ALICE, which will have two layers of pixel sensors among the 18 layers of Si-W calorimeter pads in the electromagnetic part of the detector (FoCal-E). Since a calorimeter pad is relatively large, around 1 cm², the fine-grained pixels of the ALPIDE (29.24 µm × 26.88 µm) will help distinguish between multiple showers and improve the overall spatial resolution of the detector. The simulations helped prove the feasibility of the ALPIDE for this detector from a readout perspective, and FoCal was later approved by the LHCC committee at CERN.
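To get a feel for the granularity difference mentioned for FoCal, the pixel and pad areas can be compared directly. This is a rough sketch: the abstract gives the pad size only as "around 1 cm²", so the code assumes exactly 1 cm², and the function names are invented for the illustration.

```python
PIXEL_WIDTH_UM = 29.24     # ALPIDE pixel dimensions quoted in the abstract
PIXEL_HEIGHT_UM = 26.88
PAD_AREA_CM2 = 1.0         # assumption: "around 1 cm^2" taken as exactly 1

UM2_PER_CM2 = 1e8          # (10^4 um per cm) squared


def pixel_area_um2() -> float:
    """Area of one ALPIDE pixel in square micrometres."""
    return PIXEL_WIDTH_UM * PIXEL_HEIGHT_UM


def pixels_per_pad() -> float:
    """How many pixel areas fit in one calorimeter pad area."""
    return PAD_AREA_CM2 * UM2_PER_CM2 / pixel_area_um2()


if __name__ == "__main__":
    print(f"pixel area: {pixel_area_um2():.2f} um^2")
    print(f"pixels per pad area: {pixels_per_pad():.0f}")
```

With these numbers one pad covers the area of roughly 1.27 × 10^5 pixels, which is the scale of the resolution gain the pixel layers are meant to provide.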

    Microdot - A Four-Bit Microcontroller Designed for Distributed Low-End Computing in Satellites

    Many satellites are an integrated collection of sensors and actuators that require dedicated real-time control. For single-processor systems, additional sensors require an increase in computing power and speed to provide the multi-tasking capability needed to service each sensor. Faster processors cost more and consume more power, which taxes a satellite's power resources and may lead to shorter satellite lifetimes. An alternative design approach is a distributed network of small, low-power microcontrollers designed for space that handle the computing requirements of each individual sensor and actuator. The design of Microdot, a four-bit microcontroller for distributed low-end computing, is presented. The design is based on previous research completed at the Space Electronics Branch, Air Force Research Laboratory (AFRL/VSSE) at Kirtland AFB, NM, and the Air Force Institute of Technology at Wright-Patterson AFB, OH. The Microdot has 29 instructions and a 1K x 4 instruction memory. The distributed computing architecture is based on the Philips Semiconductor I2C Serial Bus Protocol. A prototype was implemented and tested using an Altera Field Programmable Gate Array (FPGA). The prototype operated at up to 9.1 MHz. The design was targeted for fabrication in a radiation-hardened-by-design gate-array cell library for the TSMC 0.35 micrometer CMOS process
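The memory and instruction-set figures in this abstract imply a few sizes that are easy to work out. The sketch below only does that arithmetic: it reads "1K x 4" as 1024 words of 4 bits, and the names are chosen here for illustration. The abstract does not describe how Microdot actually encodes its instructions, so nothing below should be read as its instruction format.

```python
MEMORY_WORDS = 1024        # "1K x 4" read as 1024 words ...
WORD_BITS = 4              # ... of 4 bits each
INSTRUCTION_COUNT = 29


def bits_to_index(count: int) -> int:
    """Minimum number of bits needed to give `count` items distinct codes."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return (count - 1).bit_length()


if __name__ == "__main__":
    print("total memory bits:", MEMORY_WORDS * WORD_BITS)
    print("address bits:", bits_to_index(MEMORY_WORDS))
    print("bits to distinguish all instructions:",
          bits_to_index(INSTRUCTION_COUNT))
```

The instruction memory holds 4096 bits and needs a 10-bit address. Distinguishing 29 instructions takes at least 5 bits, one more than a single 4-bit word provides, so some instructions presumably occupy more than one word or rely on additional encoding; the abstract does not say which.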