
    Understanding Quantum Technologies 2022

    Understanding Quantum Technologies 2022 is a creative-commons ebook that provides a unique 360-degree overview of quantum technologies, from science and technology to geopolitical and societal issues. It covers quantum physics history, quantum physics 101, gate-based quantum computing, quantum computing engineering (including quantum error correction and quantum computing energetics), quantum computing hardware (all qubit types, including the quantum annealing and quantum simulation paradigms: history, science, research, implementation and vendors), quantum enabling technologies (cryogenics, control electronics, photonics, component fabs, raw materials), quantum computing algorithms, software development tools and use cases, unconventional computing (potential alternatives to quantum and classical computing), quantum telecommunications and cryptography, quantum sensing, quantum technologies around the world, the societal impact of quantum technologies, and even quantum fake science. The main audience is computer science engineers, developers and IT specialists, as well as quantum scientists and students who want to acquire a global view of how quantum technologies work, particularly quantum computing. This version is an extensive update to the edition published in October 2021. Comment: 1132 pages, 920 figures, Letter format.

    Readout Electronics for the Upgraded ITS Detector in the ALICE Experiment

    ALICE is undergoing upgrades during Long Shutdown (LS) 2 of the LHC to improve its performance and capabilities, and to prepare the experiment for the increased luminosity the LHC will provide in Run 3 and Run 4. One of the most extensive upgrades of the experiment, and the topic of this thesis, is the replacement of the Inner Tracking System (ITS) in its entirety with a new and upgraded system. The new ITS consists exclusively of pixel sensors organized in seven cylindrical layers, and offers significantly improved tracking capabilities at higher interaction rates. In contrast to the previous system, which triggered only on a subset of the available events deemed “interesting”, the upgraded ITS will capture all events: either in a triggered mode using minimum-bias triggers, or in a “trigger-less” continuous mode where event data is read out continuously. The key component of the upgrade is a novel pixel sensor chip, the ALPIDE, which was developed at CERN specifically for the ALICE ITS upgrade. The seven layers of the ITS are assembled from sub-assemblies of sensor chips referred to as staves, and the entire detector consists of 24 120 chips in total. The staves come in three different configurations, ranging from 9 chips per stave in the innermost layers up to 196 chips per stave in the outer layers. The number of control and data links, as well as the bit-rate of the data links, also differs widely between the stave types. Data readout from the high-speed copper links of the detector requires dedicated readout electronics in the vicinity of the detector. The core component of this system is the FPGA-based Readout Unit (RU). It reads out the data links and transfers the data to the experiment’s server farms via optical links; provides control, configuration and monitoring of the sensor chips over the same optical links, as well as over CAN bus for redundancy; and distributes trigger signals to the sensors, either by forwarding the experiment’s minimum-bias triggers or by locally generating trigger pulses for the continuous mode. The field-programmable devices of the RU allow functionality to be updated and changed in the future, remotely and via several redundant paths. This is an important feature, since the RUs are not easily accessible once installed in the cavern of the experiment, and they are exposed to radiation when the LHC is in operation. Radiation tolerance has therefore been an important concern during the development of the FPGA designs, as well as of the RU hardware itself, since radiation-induced errors in the RUs are expected during operation. Techniques such as Triple Modular Redundancy (TMR) were used in the FPGA designs to mitigate these effects; a simplified illustration of the voting principle is sketched below. One example is the radiation-tolerant CAN controller design introduced in this thesis. A different challenge, also addressed in this thesis, is the monitoring of internal status and quantities such as temperature and voltage in the ALPIDE chips. This is performed over the ALPIDE’s control bus, but must be carefully coordinated, as the control bus is also used for triggers.
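    As a loose illustration of the TMR voting principle mentioned above (a minimal sketch only; the actual RU designs implement this in FPGA logic, not in software):

```python
# Minimal sketch of triple modular redundancy (TMR): three copies of a
# state register are voted bit-wise, so a single upset copy is outvoted
# and can then be rewritten ("scrubbed"). Illustrative only -- the RU
# implements this in FPGA fabric, not in Python.

def majority_vote(a: int, b: int, c: int) -> int:
    """Bit-wise 2-of-3 majority of three register copies."""
    return (a & b) | (a & c) | (b & c)

# Example: copy `b` suffers a single-event upset in bit 3.
a, b, c = 0b1010, 0b0010, 0b1010
voted = majority_vote(a, b, c)
assert voted == 0b1010   # the flipped copy is outvoted
a = b = c = voted        # scrub: restore all three copies
```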
    The detector and readout electronics are designed to operate under a wide range of conditions. While an event from Pb–Pb collisions may produce thousands of pixel hits in the detector, a typical pp event has comparatively few hits, but the collision rate is significantly higher in pp runs than in Pb–Pb runs. The detector can also be used with two triggering modes, where the continuous mode has additional parameters such as the trigger period. A simulation model of the ALPIDE and ITS, presented in this thesis, was developed to simulate the readout performance and efficiency of the detector under this wide set of circumstances. The simulated results show that the detector should perform with high efficiency at the collision rates planned for Run 3. Based on these results, initial plans for dedicated hardware to handle and coordinate the busy status of the detector were deemed superfluous and canceled. Collision rates higher than those planned for Run 3 were also simulated, to yield parameters for optimal performance at those rates. For the RU, which was designed to interface with three widely different stave designs, the simulations quantified the amount of data the readout electronics will have to handle depending on detector layer and operating conditions. Furthermore, the simulation model was adapted for simulations of two other ALPIDE-based detector projects: the Proton CT (pCT) project at the University of Bergen (UiB), a Digital Tracking Calorimeter (DTC) used for dose planning in particle therapy for cancer treatment; and the planned Forward Calorimeter (FoCal) for ALICE, where two layers of pixel sensors will sit among the 18 layers of Si–W calorimeter pads in the electromagnetic part of the detector (FoCal-E). Since a calorimeter pad is relatively large, around 1 cm², the fine-grained pixels of the ALPIDE (29.24 µm × 26.88 µm) will help distinguish between multiple showers and improve the overall spatial resolution of the detector. The simulations helped prove the feasibility of the ALPIDE for this detector from a readout perspective, and FoCal was later approved by the LHCC committee at CERN. Doctoral thesis.
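    The busy-logic effect that such readout simulations quantify can be caricatured with a toy dead-time Monte Carlo; this is a hedged sketch, not the thesis’s simulation model, and the rate and busy time below are placeholder values, not detector parameters:

```python
# Toy Monte Carlo of readout efficiency under dead time: each accepted
# event makes the detector busy for a fixed interval, during which new
# events are lost. A caricature of busy-logic studies, not the actual
# ALPIDE/ITS simulation model.
import random

def readout_efficiency(rate_hz: float, busy_s: float,
                       n_events: int = 100_000) -> float:
    """Fraction of events accepted at a given Poisson collision rate."""
    t, busy_until, accepted = 0.0, 0.0, 0
    for _ in range(n_events):
        t += random.expovariate(rate_hz)  # exponential inter-arrival times
        if t >= busy_until:               # detector idle: accept the event
            accepted += 1
            busy_until = t + busy_s
    return accepted / n_events

# Placeholder numbers: a 50 kHz interaction rate vs. 10 us of busy time.
print(readout_efficiency(50_000, 10e-6))  # ~0.67 for these values
```

    For a non-paralyzable dead time the expected efficiency is 1/(1 + rate × busy time), which the sketch reproduces.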

    RadioLab between present and future

    The INFN national project on environmental radon monitoring has involved schools across the whole country for over a decade. Recently, some activities have involved many sites at once, strengthening their effectiveness and their impact on students and teachers. Among these are the survey on radon awareness, the national summer school, and calibration activities with common protocols. The pandemic abruptly interrupted in-person activities, and post-lockdown school organization requires rethinking some actions in order to broaden awareness of this issue among citizens, now that the transposition of the European radon legislation has been completed.

    Compiler Techniques for Guaranteeing Error Resilience in GPUs

    Doctoral dissertation (Ph.D.), Graduate School of Seoul National University, Department of Electrical and Computer Engineering, College of Engineering, August 2020 (advisor: Jaejin Lee). Due to semiconductor technology scaling and near-threshold voltage computing, soft error resilience has become more important. GPUs are now widely used in high-performance computing (HPC) because of their efficient parallel processing, and modern GPUs designed for HPC use error correction codes (ECC) to protect their storage, including register files. However, adopting ECC in the register file imposes high area and energy overheads. To replace the expensive hardware cost of ECC, we propose Penny, a lightweight compiler-directed resilience scheme for GPU register file protection. It combines recent advances in idempotent recovery with low-cost error detection codes. Our approach focuses on solving two important problems: 1. Can we guarantee correct error recovery using idempotent execution with an error detection code? We show that when an error detection code is used with idempotent recovery, certain restrictions required by previous idempotent recovery schemes are no longer needed. We also propose a software-based scheme to prevent a checkpoint value from being overwritten before the end of the region in which the value is required for correct recovery. 2. How do we reduce the execution overhead caused by checkpointing? On GPUs, additional checkpointing store instructions inflict considerably higher overhead than on CPUs, due to architectural characteristics such as the lack of store buffers. We propose a number of compiler optimization techniques that significantly reduce this overhead.
    Contents: 1 Introduction; 2 Comparison of Error Detection and Correction Coding Schemes for Register File Protection; 3 Idempotent Recovery and Challenges; 4 Correctness of Recovery; 5 Performance Optimizations; 6 Evaluation; 7 Related Works; 8 Conclusion and Future Works.
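    A hedged sketch of the checkpoint-and-re-execute idea that Penny builds on follows; the names and the fault model are hypothetical, and the real scheme operates on GPU register files at compile time rather than in Python:

```python
# Sketch of idempotent recovery with an error detection code (EDC):
# the live-in registers of a region are checkpointed, the region never
# overwrites its own inputs, and on a detected error the region is
# simply re-executed from the checkpoint. The fault model is simulated.
import random

def run_region(region, live_in: dict) -> dict:
    """Execute an idempotent region; re-execute it on detected errors."""
    checkpoint = dict(live_in)        # checkpoint live-in registers once
    while True:
        regs = dict(checkpoint)       # (re-)enter from the checkpoint
        if region(regs):              # region returns its EDC status
            return regs               # no error detected: commit results
        # detected error: discard `regs`, loop re-executes the region

def example_region(regs):
    """Idempotent: reads a and b, writes only c (never its inputs)."""
    regs["c"] = regs["a"] + regs["b"]
    return random.random() > 0.3      # hypothetical 30% detected-upset rate

print(run_region(example_region, {"a": 2, "b": 3}))  # {'a': 2, 'b': 3, 'c': 5}
```

    Because the region is idempotent and its checkpointed inputs are never clobbered, re-execution is always safe; Penny’s contribution is guaranteeing this on GPUs at low checkpointing cost.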

    GSI Scientific Report 2014 / GSI Report 2015-1


    Reliability in the face of variability in nanometer embedded memories

    In this thesis, we have investigated the impact of parametric variations on the behaviour of one performance-critical processor structure: embedded memories. Since variations manifest as a spread in power and performance, as a first step we propose a novel modeling methodology that helps evaluate the impact of circuit-level optimizations on architecture-level design choices. Choices made at the design stage ensure that conflicting requirements from higher levels are decoupled. We then complement such design-time optimizations with a runtime mechanism that takes advantage of adaptive body-biasing to lower power whilst improving performance in the presence of variability. Our proposal uses novel, fully digital variation-tracking hardware based on embedded DRAM (eDRAM) cells to monitor run-time changes in cache latency and leakage. A special fine-grain body-bias generator uses the measurements to generate the optimal body bias needed to meet the required yield targets. A novel variation-tolerant and soft-error-hardened eDRAM cell is also proposed as an alternative candidate for replacing existing SRAM-based designs in latency-critical memory structures. In the ultra-low-power domain, where reliable operation is limited by the minimum voltage of operation (Vddmin), we analyse the impact of failures on cache functional margin and functional yield. Towards this end, we have developed a fully automated tool (INFORMER) capable of estimating memory-wide metrics such as power, performance and yield accurately and rapidly. Using the developed tool, we then evaluate the effectiveness of a new class of hybrid techniques in improving cache yield through failure prevention and correction. Having a holistic perspective of memory-wide metrics helps us arrive at design choices optimized simultaneously for the multiple metrics needed to maintain lifetime requirements.
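    A toy analytic model in the spirit of such memory-wide yield estimation follows; the distributions, thresholds and repair assumptions are illustrative, not taken from INFORMER or the thesis:

```python
# Toy yield model: a cell fails if its Gaussian noise margin falls below
# a Vdd-dependent threshold; a word fails if more cells fail than the
# repair scheme can correct; the array yields if every word survives.
# All numbers are illustrative assumptions, not results from the thesis.
from statistics import NormalDist
from math import comb

def memory_yield(words=16384, bits=64, thresh_sigma=4.5, correctable=1):
    """Probability that every word has <= `correctable` failing cells."""
    p_cell = NormalDist().cdf(-thresh_sigma)  # per-cell failure probability
    p_word_ok = sum(comb(bits, k) * p_cell**k * (1 - p_cell)**(bits - k)
                    for k in range(correctable + 1))
    return p_word_ok ** words

# Lowering Vdd squeezes margins (smaller thresh_sigma); a single-error
# correcting scheme recovers most of the lost yield:
for t in (5.0, 4.5, 4.0):
    print(t, memory_yield(thresh_sigma=t, correctable=0),
          memory_yield(thresh_sigma=t, correctable=1))
```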