11,539 research outputs found
Maximizing CNN Accelerator Efficiency Through Resource Partitioning
Convolutional neural networks (CNNs) are revolutionizing machine learning,
but they present significant computational challenges. Recently, many
FPGA-based accelerators have been proposed to improve the performance and
efficiency of CNNs. Current approaches construct a single processor that
computes the CNN layers one at a time; the processor is optimized to maximize
the throughput at which the collection of layers is computed. However, this
approach leads to inefficient designs because the same processor structure is
used to compute CNN layers of radically varying dimensions.
We present a new CNN accelerator paradigm and an accompanying automated
design methodology that partitions the available FPGA resources into multiple
processors, each of which is tailored for a different subset of the CNN
convolutional layers. Using the same FPGA resources as a single large
processor, multiple smaller specialized processors increase computational
efficiency and lead to a higher overall throughput. Our design methodology
achieves 3.8x higher throughput than the state-of-the-art approach on
evaluating the popular AlexNet CNN on a Xilinx Virtex-7 FPGA. For the more
recent SqueezeNet and GoogLeNet, the speedups are 2.2x and 2.0x
A review of advances in pixel detectors for experiments with high rate and radiation
The Large Hadron Collider (LHC) experiments ATLAS and CMS have established
hybrid pixel detectors as the instrument of choice for particle tracking and
vertexing in high rate and radiation environments, as they operate close to the
LHC interaction points. With the High Luminosity-LHC upgrade now in sight, for
which the tracking detectors will be completely replaced, new generations of
pixel detectors are being devised. They have to address enormous challenges in
terms of data throughput and radiation levels, ionizing and non-ionizing, that
harm the sensing and readout parts of pixel detectors alike. Advances in
microelectronics and microprocessing technologies now enable large scale
detector designs with unprecedented performance in measurement precision (space
and time), radiation hard sensors and readout chips, hybridization techniques,
lightweight supports, and fully monolithic approaches to meet these challenges.
This paper reviews the world-wide effort on these developments.Comment: 84 pages with 46 figures. Review article.For submission to Rep. Prog.
Phy
The ATLAS Inner Detector operation, data quality and tracking performance
The ATLAS Inner Detector is responsible for particle tracking in ATLAS
experiment at CERN Large Hadron Collider (LHC) and comprises silicon and gas
based detectors. The combination of both silicon and gas based detectors
provides high precision impact parameter and momentum measurement of charged
particles, with high efficiency and small fake rate. The ID has been used to
exploit fully the physics potential of the LHC since the first proton-proton
collisions at 7 TeV were delivered in 2009. The performance of track and vertex
reconstruction is presented, as well as the operation aspects of the Inner
Detector and the data quality during the many months of data taking.Comment: Physics in Collision, Slovakia, 201
Design techniques for low-power systems
Portable products are being used increasingly. Because these systems are battery powered, reducing power consumption is vital. In this report we give the properties of low-power design and techniques to exploit them on the architecture of the system. We focus on: minimizing capacitance, avoiding unnecessary and wasteful activity, and reducing voltage and frequency. We review energy reduction techniques in the architecture and design of a hand-held computer and the wireless communication system including error control, system decomposition, communication and MAC protocols, and low-power short range networks
A Comprehensive Workflow for General-Purpose Neural Modeling with Highly Configurable Neuromorphic Hardware Systems
In this paper we present a methodological framework that meets novel
requirements emerging from upcoming types of accelerated and highly
configurable neuromorphic hardware systems. We describe in detail a device with
45 million programmable and dynamic synapses that is currently under
development, and we sketch the conceptual challenges that arise from taking
this platform into operation. More specifically, we aim at the establishment of
this neuromorphic system as a flexible and neuroscientifically valuable
modeling tool that can be used by non-hardware-experts. We consider various
functional aspects to be crucial for this purpose, and we introduce a
consistent workflow with detailed descriptions of all involved modules that
implement the suggested steps: The integration of the hardware interface into
the simulator-independent model description language PyNN; a fully automated
translation between the PyNN domain and appropriate hardware configurations; an
executable specification of the future neuromorphic system that can be
seamlessly integrated into this biology-to-hardware mapping process as a test
bench for all software layers and possible hardware design modifications; an
evaluation scheme that deploys models from a dedicated benchmark library,
compares the results generated by virtual or prototype hardware devices with
reference software simulations and analyzes the differences. The integration of
these components into one hardware-software workflow provides an ecosystem for
ongoing preparative studies that support the hardware design process and
represents the basis for the maturity of the model-to-hardware mapping
software. The functionality and flexibility of the latter is proven with a
variety of experimental results
Belle II Technical Design Report
The Belle detector at the KEKB electron-positron collider has collected
almost 1 billion Y(4S) events in its decade of operation. Super-KEKB, an
upgrade of KEKB is under construction, to increase the luminosity by two orders
of magnitude during a three-year shutdown, with an ultimate goal of 8E35 /cm^2
/s luminosity. To exploit the increased luminosity, an upgrade of the Belle
detector has been proposed. A new international collaboration Belle-II, is
being formed. The Technical Design Report presents physics motivation, basic
methods of the accelerator upgrade, as well as key improvements of the
detector.Comment: Edited by: Z. Dole\v{z}al and S. Un
- âŠ