Dynamic time management for improved accuracy and speed in host-compiled multi-core platform models
With increasing complexity and software content, modern embedded platforms employ a heterogeneous mix of multi-core processors along with hardware accelerators in order to provide high performance in limited power budgets. Due to complex interactions and highly dynamic behavior, static analysis of real-time performance and other constraints is challenging. As an alternative, full-system simulations have been widely accepted by designers. With traditional approaches being either slow or inaccurate, so-called host-compiled simulators have recently emerged as a solution for rapid evaluation of complete systems at early design stages. In such approaches, a faster simulation is achieved by natively executing application code at the source level, abstracting execution behavior of target platforms, and thus increasing simulation granularity. However, most existing host-compiled simulators focus on application behavior only while neglecting effects of hardware/software interactions and associated speed and accuracy tradeoffs in platform modeling. In this dissertation, we focus on host-compiled operating system (OS) and processor modeling techniques, and we introduce novel dynamic timing model management approaches that efficiently improve both accuracy and speed of such models via automatically calibrating the simulation granularity. The contributions of this dissertation are twofold: We first establish an infrastructure for efficient host-compiled multi-core platform simulation by developing (a) abstract models of both real-time OSs and processors that replicate timing-accurate hardware/software interactions and enable full-system co-simulation, and (b) quantitative and analytical studies of host-compiled simulation principles to analyze error bounds and investigate possible improvements.
Building on this infrastructure, we further propose specific techniques for improving accuracy and speed tradeoffs in host-compiled simulation by developing (c) an automatic timing granularity adjustment technique based on dynamically observing system state to control the simulation, (d) an out-of-order cache hierarchy modeling approach to efficiently reorder memory access behavior in the presence of temporal decoupling, and (e) a synchronized timing model to align platform threads to run efficiently in parallel simulation. Results as applied to industrial-strength platforms confirm that by providing careful abstractions and dynamic timing management, our models can achieve full-system simulations at equivalent speeds of more than a thousand MIPS with less than 3% timing error. Coupled with the capability to easily adjust simulation parameters and configurations, this demonstrates the benefits of our platform models for early application development and exploration.
Electrical and Computer Engineerin
Interval simulation: raising the level of abstraction in architectural simulation
Detailed architectural simulators suffer from a long development cycle and extremely long evaluation times. This longstanding problem is further exacerbated in the multi-core processor era. Existing solutions address the simulation problem by either sampling the simulated instruction stream or by mapping the simulation models on FPGAs; these approaches achieve substantial simulation speedups while simulating performance in a cycle-accurate manner. This paper proposes interval simulation, which takes a completely different approach: interval simulation raises the level of abstraction and replaces the core-level cycle-accurate simulation model by a mechanistic analytical model. The analytical model estimates core-level performance by analyzing intervals, or the timing between two miss events (branch mispredictions and TLB/cache misses); the miss events are determined through simulation of the memory hierarchy, cache coherence protocol, interconnection network and branch predictor. By raising the level of abstraction, interval simulation reduces both development time and evaluation time. Our experimental results using the SPEC CPU2000 and PARSEC benchmark suites and the M5 multi-core simulator show good accuracy up to eight cores (average error of 4.6% and max error of 11% for the multi-threaded full-system workloads), while achieving a one order of magnitude simulation speedup compared to cycle-accurate simulation. Moreover, interval simulation is easy to implement: our implementation of the mechanistic analytical model incurs only one thousand lines of code. Its high accuracy, fast simulation speed and ease of use make interval simulation a useful complement to the architect's toolbox for exploring system-level and high-level micro-architecture trade-offs.
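The interval idea in the abstract above can be illustrated with a first-order sketch. This is not the paper's model: it assumes that instructions between miss events dispatch at a fixed width and that each miss event adds a fixed, non-overlapping penalty. The names `MissEvent`, `estimate_cycles`, `estimate_cpi`, and all numeric values are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class MissEvent:
    kind: str     # illustrative label, e.g. "branch_mispredict" or "l2_miss"
    penalty: int  # cycles lost to this event (assumed known and non-overlapping)


def estimate_cycles(num_instructions: int, dispatch_width: int,
                    miss_events: Iterable[MissEvent]) -> int:
    """First-order estimate: steady-state dispatch cycles plus miss penalties."""
    # Ceiling division: cycles needed to dispatch all instructions at full width.
    base_cycles = -(-num_instructions // dispatch_width)
    return base_cycles + sum(event.penalty for event in miss_events)


def estimate_cpi(num_instructions: int, dispatch_width: int,
                 miss_events: Iterable[MissEvent]) -> float:
    """Cycles per instruction under the same simplifying assumptions."""
    return estimate_cycles(num_instructions, dispatch_width, miss_events) / num_instructions


events = [MissEvent("branch_mispredict", 15), MissEvent("l2_miss", 200)]
cycles = estimate_cycles(1000, 4, events)  # 250 dispatch cycles + 215 penalty cycles
```

In this sketch the miss events are supplied by hand; in the approach described above they would come from simulating the memory hierarchy and branch predictor.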
New Blind Block Synchronization for Transceivers Using Redundant Precoders
This paper studies the blind block synchronization problem in block transmission systems using linear redundant precoders (LRP). Two commonly used LRP systems, namely, zero padding (ZP) and cyclic prefix (CP) systems, are considered in this paper. In particular, the block synchronization problem in CP systems is a broader version of the timing synchronization problem in the popular orthogonal frequency division multiplexing (OFDM) systems. The proposed algorithms exploit the rank deficiency property of the matrix composed of received blocks when the block synchronization is perfect and use a parameter called the repetition index, which can be chosen as any positive integer. Theoretical results suggest advantages in blind block synchronization performance when using a large repetition index. Furthermore, unlike previously reported algorithms, which require a large amount of received data, the proposed methods, with properly chosen repetition indices, guarantee correct block synchronization in the absence of noise using only two received blocks in ZP systems and three in CP systems. Computer simulations are conducted to evaluate the performance of the proposed algorithms and compare them with previously reported algorithms. Simulation results not only verify the capability of the proposed algorithms to work with limited received data but also show significant improvements in block synchronization error rate over previously reported algorithms.
Increasing exhaust temperature to enable after-treatment operation on a two-stage turbo-charged medium speed marine diesel engine
Nitrogen oxide (NOx) emissions are increasingly regulated. For heavy-duty, medium-speed engines, these emission limits are also being reduced steadily. Selective catalytic reduction (SCR) is a proven technology that reduces NOx emissions with very high efficiency. However, the operating temperature of the catalytic converter has to be maintained within certain limits, as conversion efficiency and ammonia slip are strongly influenced by temperature. In this work, the engine calibration and hardware are modified to allow for a wide engine operating range with SCR. The studied engine has 4 MW nominal power and runs at 750 rpm; with a fuel consumption of roughly 750 kg/h, engine tests become quite expensive for a measurement campaign. For this reason, a simulation model was developed and validated. This model was then used to investigate several strategies to control engine-out temperature: different types of wastegates, injection variation and valve timing adjustments. Simulation showed that wastegate application had the best tradeoff between fuel consumption and exhaust temperature. Finally, this configuration was built on the engine test bench, and results from measurements and simulation agreed very well.
Turbocharger blade vibration: Measurement and validation through laser tip-timing
High Cycle Fatigue (HCF) of turbine blades is a major cause of failure in turbochargers. In order to validate changes to blades intended to reduce fatigue failure, accurate measurement of blade dynamics is necessary. Strain gauging has limitations, so an alternative method is investigated. A description of the tip-timing method, as applied to turbocharger testing, is given. The advantages and disadvantages of laser probes are assessed. Examples of output data and interpretation are presented and compared with computer simulation. It is shown that laser tip-timing technology gives a more complete view of turbine vibration than the alternative measurement system.
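The basic arithmetic behind tip-timing can be sketched as follows, assuming the commonly described principle that a vibrating blade passes a stationary probe slightly earlier or later than a rigid blade would, and that this time offset scales with tip speed to give a deflection. The function name, the sign convention, and the numbers are illustrative assumptions and are not taken from the paper.

```python
import math


def tip_deflection_m(radius_m: float, rpm: float,
                     expected_arrival_s: float, measured_arrival_s: float) -> float:
    """Convert a blade arrival-time offset at the probe into a tip deflection.

    Assumed sign convention: a positive result means the blade arrived late.
    """
    # Circumferential tip speed in m/s: 2 * pi * R * revolutions per second.
    tip_speed = 2.0 * math.pi * radius_m * rpm / 60.0
    return tip_speed * (measured_arrival_s - expected_arrival_s)


# Hypothetical values: 50 mm tip radius, 60,000 rpm, blade arrives 1 microsecond late.
deflection = tip_deflection_m(0.05, 60_000, 0.0, 1e-6)
```

With these hypothetical values the tip speed is about 314 m/s, so a one-microsecond offset corresponds to roughly 0.3 mm of deflection, which shows why fast probes and precise timing matter for this method.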
Modeling and visualizing networked multi-core embedded software energy consumption
In this report we present a network-level multi-core energy model and a
software development process workflow that allows software developers to
estimate the energy consumption of multi-core embedded programs. This work
focuses on a high performance, cache-less and timing predictable embedded
processor architecture, XS1. Prior modelling work is improved to increase
accuracy, then extended to be parametric with respect to voltage and frequency
scaling (VFS) and then integrated into a larger scale model of a network of
interconnected cores. The modelling is supported by enhancements to an open
source instruction set simulator to provide the first network timing aware
simulations of the target architecture. Simulation based modelling techniques
are combined with methods of results presentation to demonstrate how such work
can be integrated into a software developer's workflow, enabling the developer
to make informed, energy aware coding decisions. A set of single-threaded,
multi-threaded and multi-core benchmarks is used to exercise and evaluate the
models and provide use case examples for how results can be presented and
interpreted. The models all yield accuracy within an average ±5% error
margin.
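To show what "parametric with respect to voltage and frequency scaling" can mean in practice, here is a minimal sketch using the generic textbook CMOS relation (dynamic power proportional to C·V²·f). It is not the XS1 model from the report; the function names, the effective capacitance, and the operating points are hypothetical.

```python
def dynamic_power_w(c_eff_farads: float, voltage_v: float, freq_hz: float) -> float:
    """Textbook CMOS dynamic power: P = C_eff * V^2 * f."""
    return c_eff_farads * voltage_v ** 2 * freq_hz


def energy_j(cycles: int, c_eff_farads: float, voltage_v: float,
             freq_hz: float, static_power_w: float = 0.0) -> float:
    """Energy of a program that runs for `cycles` clock cycles at one operating point."""
    runtime_s = cycles / freq_hz
    return (dynamic_power_w(c_eff_farads, voltage_v, freq_hz) + static_power_w) * runtime_s


# Hypothetical operating points: same cycle count, lower voltage and frequency.
fast = energy_j(1_000_000, 1e-9, 1.0, 500e6)
slow = energy_j(1_000_000, 1e-9, 0.8, 250e6)
```

Under these assumptions and with zero static power, the energy of a fixed cycle count depends only on the supply voltage, so the lower-voltage point uses less energy while taking longer to finish; adding a nonzero `static_power_w` changes that balance.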
Maximum-Likelihood Detection of Soliton with Timing Jitter
With the maximum-likelihood detector (MLD) of a soliton with timing jitter
and noise, timing jitter does not degrade the performance of the MLD, except
through walk-out of the bit interval. When the MLD is simulated with the
importance sampling method, even with a timing jitter standard deviation equal
to the full-width-half-maximum (FWHM) of the soliton, the signal-to-noise
ratio (SNR) penalty is only about 0.2 dB. The MLD performs better than the
conventional scheme of lengthening the decision window, in which the additive
noise is proportional to the window width.
Comment: 3 pages, 2 figures, submitted to Optics Letter