Interval simulation: raising the level of abstraction in architectural simulation
Detailed architectural simulators suffer from a long development cycle and extremely long evaluation times. This longstanding problem is further exacerbated in the multi-core processor era. Existing solutions address the simulation problem by either sampling the simulated instruction stream or mapping the simulation models onto FPGAs; these approaches achieve substantial simulation speedups while simulating performance in a cycle-accurate manner. This paper proposes interval simulation, which takes a completely different approach: interval simulation raises the level of abstraction and replaces the core-level cycle-accurate simulation model by a mechanistic analytical model. The analytical model estimates core-level performance by analyzing intervals, or the timing between two miss events (branch mispredictions and TLB/cache misses); the miss events are determined through simulation of the memory hierarchy, cache coherence protocol, interconnection network and branch predictor. By raising the level of abstraction, interval simulation reduces both development time and evaluation time. Our experimental results using the SPEC CPU2000 and PARSEC benchmark suites and the M5 multi-core simulator show good accuracy up to eight cores (average error of 4.6% and max error of 11% for the multi-threaded full-system workloads), while achieving a one order of magnitude simulation speedup compared to cycle-accurate simulation. Moreover, interval simulation is easy to implement: our implementation of the mechanistic analytical model takes only one thousand lines of code. Its high accuracy, fast simulation speed and ease of use make interval simulation a useful complement to the architect's toolbox for exploring system-level and high-level micro-architecture trade-offs.
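The interval idea in the abstract above can be illustrated with a first-order sketch. It is not the paper's implementation: it assumes that a core dispatches a fixed number of instructions per cycle between miss events and that each miss event adds a fixed, non-overlapping penalty. The names `MissEvent` and `interval_cycles` and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MissEvent:
    kind: str       # illustrative label, e.g. "branch_mispredict" or "llc_miss"
    penalty: float  # cycles lost to this event (hypothetical value)


def interval_cycles(num_instructions, dispatch_width, miss_events):
    """First-order estimate: smooth dispatch time plus one penalty per miss event.

    Assumption: penalties do not overlap, which a full model would not assume.
    """
    base = num_instructions / dispatch_width
    return base + sum(event.penalty for event in miss_events)


# Hypothetical workload: 1000 instructions on a 4-wide core with two miss events.
events = [MissEvent("branch_mispredict", 15.0), MissEvent("llc_miss", 200.0)]
cycles = interval_cycles(1000, 4, events)
ipc = 1000 / cycles
print(f"estimated cycles: {cycles}, estimated IPC: {ipc:.3f}")
```

In a setup like the one the abstract describes, the list of miss events would come from simulating the memory hierarchy and branch predictor rather than being written by hand.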
Dynamic Energy Management for Chip Multi-processors under Performance Constraints
We introduce a novel algorithm for dynamic energy management (DEM) under performance constraints in chip multi-processors (CMPs). Using the novel concept of delayed instructions count, performance loss estimations are calculated at the end of each control period for each core. In addition, a Kalman-filtering-based approach is employed to predict the workload in the next control period, for which voltage-frequency pairs must be selected. This selection is done with a novel dynamic voltage and frequency scaling (DVFS) algorithm whose objective is to reduce energy consumption without degrading performance beyond the user-set threshold. Using our customized Sniper-based CMP system simulation framework, we demonstrate the effectiveness of the proposed algorithm for a variety of benchmarks on 16-core and 64-core network-on-chip-based CMP architectures. Simulation results show consistent energy savings across the board. We present our work as an investigation of the tradeoff between performance penalty thresholds and the energy reduction achievable via DVFS when predictions are made with a Kalman filter.
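A minimal sketch of the predict-then-select loop described above, under stated assumptions. The abstract does not give the filter model or the performance-loss formula, so this example assumes a scalar random-walk Kalman filter, a workload expressed as the compute-bound fraction of a control period, and a slowdown of `fraction * (f_max / f - 1)`. The function names, noise parameters, and voltage-frequency table are all hypothetical; the paper's delayed-instructions-count estimate is not modelled.

```python
def kalman_step(x_est, p_est, z, q=1e-2, r=1e-1):
    """One scalar Kalman step under a random-walk model (assumed, not from the paper)."""
    # Predict: the state is assumed unchanged, uncertainty grows by process noise q.
    x_pred = x_est
    p_pred = p_est + q
    # Update: blend the prediction with measurement z using the Kalman gain.
    gain = p_pred / (p_pred + r)
    x_new = x_pred + gain * (z - x_pred)
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new


def select_vf(vf_pairs, compute_fraction, threshold):
    """Pick the lowest-frequency (voltage, frequency) pair within the loss threshold.

    Assumed slowdown model: compute_fraction * (f_max / f - 1).
    """
    f_max = max(f for _, f in vf_pairs)
    feasible = [
        (v, f) for v, f in vf_pairs
        if compute_fraction * (f_max / f - 1.0) <= threshold
    ]
    return min(feasible, key=lambda pair: pair[1])


# Hypothetical (voltage in V, frequency in GHz) table and measured workloads.
VF_PAIRS = [(0.8, 1.0), (0.9, 1.5), (1.0, 2.0)]
x, p = 0.5, 1.0
for measured in (0.5, 0.5, 0.5):
    x, p = kalman_step(x, p, measured)

choice = select_vf(VF_PAIRS, compute_fraction=x, threshold=0.2)
print("predicted compute fraction:", x, "selected pair:", choice)
```

The highest-frequency pair always has zero assumed slowdown, so the feasible list is never empty for a non-negative threshold.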
TAO Conceptual Design Report: A Precision Measurement of the Reactor Antineutrino Spectrum with Sub-percent Energy Resolution
The Taishan Antineutrino Observatory (TAO, also known as JUNO-TAO) is a
satellite experiment of the Jiangmen Underground Neutrino Observatory (JUNO). A
ton-level liquid scintillator detector will be placed at about 30 m from a core
of the Taishan Nuclear Power Plant. The reactor antineutrino spectrum will be
measured with sub-percent energy resolution, to provide a reference spectrum
for future reactor neutrino experiments, and to provide a benchmark measurement
to test nuclear databases. A spherical acrylic vessel containing 2.8 tons of
gadolinium-doped liquid scintillator will be viewed by 10 m^2 of Silicon
Photomultipliers (SiPMs) of >50% photon detection efficiency with almost full
coverage. The photoelectron yield is about 4500 per MeV, an order of magnitude
higher than any existing large-scale liquid scintillator detector. The detector
operates at -50 °C to lower the dark noise of SiPMs to an acceptable level. The
detector will measure about 2000 reactor antineutrinos per day, and is designed
to be well shielded from cosmogenic backgrounds and ambient radioactivity, with
a background-to-signal ratio of about 10%. The experiment is expected to start
operation in 2022.
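The quoted numbers can be put in perspective with some back-of-the-envelope arithmetic. The sketch below assumes that photoelectron counting follows Poisson statistics, so the statistical contribution to the relative energy resolution is 1/sqrt(N_pe). It ignores every other contribution (dark noise, non-uniformity, quenching), so it is a lower bound and not the detector's projected resolution. The function and constant names are illustrative.

```python
import math

PE_PER_MEV = 4500          # photoelectron yield quoted in the abstract
SIGNAL_PER_DAY = 2000      # reactor antineutrinos per day quoted in the abstract
BACKGROUND_TO_SIGNAL = 0.10  # approximate ratio quoted in the abstract


def statistical_resolution(energy_mev, pe_per_mev=PE_PER_MEV):
    """Relative resolution from Poisson photoelectron statistics only (assumption)."""
    return 1.0 / math.sqrt(pe_per_mev * energy_mev)


for energy in (1.0, 2.0, 4.0):
    print(f"{energy:.1f} MeV: {100 * statistical_resolution(energy):.2f}% (statistical term only)")

# Implied background rate at the quoted background-to-signal ratio.
background_per_day = SIGNAL_PER_DAY * BACKGROUND_TO_SIGNAL
print(f"implied background: about {background_per_day:.0f} events per day")
```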
Real time unsupervised learning of visual stimuli in neuromorphic VLSI systems
Neuromorphic chips embody, in microelectronic devices, computational principles
operating in the nervous system. In this domain it is important to
identify computational primitives that theory and experiments suggest as
generic and reusable cognitive elements. One such element is provided by
attractor dynamics in recurrent networks. Point attractors are equilibrium
states of the dynamics (up to fluctuations), determined by the synaptic
structure of the network; a 'basin' of attraction comprises all initial states
leading to a given attractor upon relaxation, hence making attractor dynamics
suitable to implement robust associative memory. The initial network state is
dictated by the stimulus, and relaxation to the attractor state implements the
retrieval of the corresponding memorized prototypical pattern. In a previous
work we demonstrated that a neuromorphic recurrent network of spiking neurons
and suitably chosen, fixed synapses supports attractor dynamics. Here we focus
on learning: activating on-chip synaptic plasticity and using a theory-driven
strategy for choosing network parameters, we show that autonomous learning,
following repeated presentation of simple visual stimuli, shapes a synaptic
connectivity supporting stimulus-selective attractors. Associative memory
develops on chip as the result of the coupled stimulus-driven neural activity
and ensuing synaptic dynamics, with no artificial separation between learning
and retrieval phases.
Comment: submitted to Scientific Repor
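To make the notions of point attractor and basin of attraction concrete, here is a minimal sketch using a classic Hopfield network with a Hebbian rule. This is a swapped-in textbook model, not the spiking network or the on-chip plasticity of the work above. The patterns, network size, and function names are hypothetical.

```python
def train(patterns):
    """Hebbian weights for +1/-1 patterns, with no self-connections."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += pattern[i] * pattern[j] / n
    return weights


def recall(weights, state, max_sweeps=10):
    """Asynchronous updates until no unit changes, i.e. a fixed point is reached."""
    s = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(s)):
            field = sum(weights[i][j] * s[j] for j in range(len(s)))
            # Keep the current value when the local field is exactly zero.
            new_value = 1 if field > 0 else -1 if field < 0 else s[i]
            if new_value != s[i]:
                s[i] = new_value
                changed = True
        if not changed:
            break
    return s


# Two orthogonal stored patterns (hypothetical prototypes).
P1 = [1, 1, 1, 1, -1, -1, -1, -1]
P2 = [1, -1, 1, -1, 1, -1, 1, -1]
W = train([P1, P2])

# A cue inside P1's basin: P1 with its first element flipped.
cue = [-1, 1, 1, 1, -1, -1, -1, -1]
retrieved = recall(W, cue)
print("retrieved P1:", retrieved == P1)
```

The stimulus sets the initial state (`cue`), and relaxation to the fixed point plays the role of retrieving the stored prototype.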
Ku-band system design study and TDRSS interface analysis
The capabilities of the Shuttle/TDRSS link simulation program (LinCsim) were expanded to account for radio frequency interference (RFI) effects on the Shuttle S-band links. The channel models were updated to reflect the RFI-related hardware changes, and the ESTL hardware modeling of the TDRS communication payload was reviewed and evaluated. Shuttle/TDRSS signal acquisition was modeled in LinCsim, LinCsim was upgraded, and possible Shuttle on-orbit navigation techniques were evaluated.
The multilevel trigger system of the DIRAC experiment
The multilevel trigger system of the DIRAC experiment at CERN is presented.
It includes a fast first level trigger as well as various trigger processors to
select events with a pair of pions having a low relative momentum typical of
the physical process under study. One of these processors employs the drift
chamber data, another one is based on a neural network algorithm and the others
use various hit-map detector correlations. Two versions of the trigger system
used at different stages of the experiment are described. The complete system
reduces the event rate by a factor of 1000, with an efficiency of 95% for
detecting the events in the relative momentum range of interest.
Comment: 21 pages, 11 figure