Analysis of Intel's Haswell Microarchitecture Using The ECM Model and Microbenchmarks
This paper presents an in-depth analysis of Intel's Haswell microarchitecture
for streaming loop kernels. Among the new features examined are the dual-ring
Uncore design, Cluster-on-Die mode, Uncore Frequency Scaling, core improvements
such as new and improved execution units, and improvements throughout the
memory hierarchy. The Execution-Cache-Memory diagnostic performance model is
used together with a generic set of microbenchmarks to quantify the efficiency
of the microarchitecture. The set of microbenchmarks is chosen such that it can
serve as a blueprint for other streaming loop kernels.
Comment: arXiv admin note: substantial text overlap with arXiv:1509.0311
Social Security Replacement Rates for Alternative Earnings Benchmarks
Social Security reform proposals are often presented in terms of their differential impacts on hypothetical or “example” workers. Our work explores how different benchmarks produce different replacement rate outcomes. We use the Health and Retirement Study (HRS) to evaluate how Social Security benefit replacement rates differ for actual versus hypothetical earner profiles, and we examine whether these findings are sensitive to alternative definitions of replacement rates. We find that workers with the median HRS profile would be estimated to receive benefits worth 55% of lifetime average earnings, versus 48% for the SSA medium scaled profile. Since US policymakers tend to prefer a replacement rate measure tied to workers’ own past earnings, using these metrics would yield higher replacement rates compared to commonly used scaled illustrative profiles. However, benchmarks that use population as opposed to individual earnings measures to compare individual worker benefits to pre-retirement consumption produce lower replacement rates for HRS versus hypothetical earners.
Analytic Performance Modeling and Analysis of Detailed Neuron Simulations
Big science initiatives are trying to reconstruct and model the brain by
attempting to simulate brain tissue at larger scales and with increasingly more
biological detail than previously thought possible. The exponential growth of
parallel computer performance has been supporting these developments, and at
the same time maintainers of neuroscientific simulation code have strived to
optimally and efficiently exploit new hardware features. Current
state-of-the-art software for the simulation of biological networks has so far been
developed using performance engineering practices, but a thorough analysis and
modeling of the computational and performance characteristics, especially in
the case of morphologically detailed neuron simulations, is lacking. Other
computational sciences have successfully used analytic performance engineering
and modeling methods to gain insight into the computational properties of
simulation kernels, aid developers in performance optimizations and eventually
drive co-design efforts, but to our knowledge a model-based performance
analysis of neuron simulations has not yet been conducted.
We present a detailed study of the shared-memory performance of
morphologically detailed neuron simulations based on the Execution-Cache-Memory
(ECM) performance model. We demonstrate that this model can deliver accurate
predictions of the runtime of almost all the kernels that constitute the neuron
models under investigation. The gained insight is used to identify the main
governing mechanisms underlying performance bottlenecks in the simulation. The
implications of this analysis for the optimization of neural simulation software
and eventually co-design of future hardware architectures are discussed. In
this sense, our work represents a valuable conceptual and quantitative
contribution to understanding the performance properties of biological network
simulations.
Comment: 18 pages, 6 figures, 15 tables