The potential of programmable logic in the middle: cache bleaching
Consolidating hard real-time systems onto modern multi-core Systems-on-Chip (SoC) is an open challenge. The extensive sharing of hardware resources in the memory hierarchy raises important unpredictability concerns. The problem is exacerbated as more computationally demanding workloads are expected to be handled with real-time guarantees in next-generation Cyber-Physical Systems (CPS). A large body of work has approached the problem by proposing novel hardware re-designs and software-only solutions to mitigate performance interference. Starting from the observation that unpredictability arises from a lack of fine-grained control over the behavior of shared hardware components, we outline a promising new resource management approach. We demonstrate that it is possible to introduce Programmable Logic In-the-Middle (PLIM) between a traditional multi-core processor and main memory. This provides the unique capability of manipulating individual memory transactions. We propose a proof-of-concept system implementation of PLIM modules on a commercial multi-core SoC. The PLIM approach is then leveraged to solve long-standing issues with cache coloring: thanks to PLIM, colored sparse addresses can be re-compacted in main memory. This is the base principle behind the technique we call Cache Bleaching. We evaluate our design on real applications and propose hypervisor-level adaptations to showcase the potential of the PLIM approach.
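The idea behind cache coloring and its re-compaction can be sketched in a few lines. The sketch below assumes hypothetical parameters (4 KiB pages, 4 colors taken from the low bits of the physical page number); the actual bit fields depend on the SoC's cache geometry, and the address rewriting is performed in hardware by the PLIM modules, not in software as shown here.

```python
# Hypothetical parameters for illustration (not taken from the paper's SoC):
PAGE_SHIFT = 12                    # 4 KiB pages
COLOR_BITS = 2                     # 4 cache colors
COLOR_MASK = (1 << COLOR_BITS) - 1

def page_color(phys_addr: int) -> int:
    """Cache color = the low bits of the physical page number that also
    index the last-level cache sets (simplified model)."""
    return (phys_addr >> PAGE_SHIFT) & COLOR_MASK

def bleach(phys_addr: int) -> int:
    """Re-compact a colored (sparse) address by dropping the color bit
    field, so a single-color allocation maps to contiguous DRAM pages."""
    offset = phys_addr & ((1 << PAGE_SHIFT) - 1)
    page = phys_addr >> PAGE_SHIFT
    compact_page = page >> COLOR_BITS      # remove the color field
    return (compact_page << PAGE_SHIFT) | offset

# Pages of a single color are spaced num_colors apart before bleaching:
colored = [p << PAGE_SHIFT for p in (0, 4, 8, 12)]      # all color 0
compacted = [bleach(a) >> PAGE_SHIFT for a in colored]
print(compacted)                   # contiguous page numbers: [0, 1, 2, 3]
```

The point of the re-compaction is that a partition restricted to one color no longer wastes the skipped DRAM pages, which is the long-standing cost of cache coloring that Cache Bleaching addresses.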
Improving Model-Based Software Synthesis: A Focus on Mathematical Structures
Computer hardware keeps increasing in complexity. Software design needs to keep up with this. The right models and abstractions empower developers to leverage the novelties of modern hardware. This thesis deals primarily with Models of Computation, as a basis for software design, in a family of methods called software synthesis.
We focus on Kahn Process Networks and dataflow applications as abstractions, both for programming and for deriving an efficient execution on heterogeneous multicores. The latter we accomplish by exploring the design space of possible mappings of computation and data to hardware resources. Mapping algorithms are not at the center of this thesis, however. Instead, we examine the mathematical structure of the mapping space, leveraging its inherent symmetries or geometric properties to improve mapping methods in general.
This thesis thoroughly explores the process of model-based design, aiming to go beyond the more established software synthesis on dataflow applications. We start with the problem of assessing these methods through benchmarking, and go on to formally examine the general goals of benchmarks. In this context, we also consider the role modern machine learning methods play in benchmarking.
We explore different established semantics, stretching the limits of Kahn Process Networks. We also discuss novel models, like Reactors, which are designed to be deterministic and adaptive, with time as a first-class citizen. By investigating abstractions and transformations in the Ohua language for implicit dataflow programming, we also focus on programmability.
The focus of the thesis is on the models and methods, but we evaluate them in diverse use-cases, generally centered around Cyber-Physical Systems. These include the 5G telecommunication standard, as well as the automotive and signal processing domains. We even go beyond embedded systems and discuss use-cases in GPU programming and microservice-based architectures.
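One symmetry of the mapping space mentioned above is easy to illustrate: mappings that differ only by a permutation of identical cores are equivalent, so a search need only visit one canonical representative per equivalence class. The sketch below uses hypothetical problem sizes (4 processes, 3 identical cores) and a simple relabeling canonicalization; it is a toy illustration of the idea, not the thesis's actual mapping algorithms.

```python
from itertools import product

def canonical(mapping):
    """Relabel cores by order of first appearance, so mappings that differ
    only by a permutation of identical cores collapse to one representative."""
    relabel, nxt, out = {}, 0, []
    for core in mapping:
        if core not in relabel:
            relabel[core] = nxt
            nxt += 1
        out.append(relabel[core])
    return tuple(out)

n_procs, n_cores = 4, 3            # hypothetical sizes
all_maps = set(product(range(n_cores), repeat=n_procs))
reps = {canonical(m) for m in all_maps}
print(len(all_maps), len(reps))    # 81 raw mappings vs 14 symmetry classes
```

Even at this tiny scale the symmetry shrinks the space from 3^4 = 81 mappings to 14 classes (the partitions of a 4-element set into at most 3 blocks); the reduction grows rapidly with the number of identical cores.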
Look-ahead in the two-sided reduction to compact band forms for symmetric eigenvalue problems and the SVD
We address the reduction to compact band forms, via unitary similarity transformations, for the solution of symmetric eigenvalue problems and the computation of the singular value decomposition (SVD). Concretely, in the first case, we revisit the reduction to symmetric band form, while, for the second case, we propose a similar alternative, which transforms the original matrix to (unsymmetric) band form, replacing the conventional reduction method that produces a triangular–band output. In both cases, we describe algorithmic variants of the standard Level-3 Basic Linear Algebra Subroutines (BLAS)-based procedures, enhanced with look-ahead, to overcome the performance bottleneck imposed by the panel factorization. Furthermore, our solutions employ an algorithmic block size that differs from the target bandwidth, illustrating the important performance benefits of this decision. Finally, we show that our alternative compact band form for the SVD is key to introducing an effective look-ahead strategy into the corresponding reduction procedure.
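The benefit of look-ahead can be seen with a toy critical-path model: without it, each panel factorization serializes with the trailing update; with it, the next panel is computed concurrently with the current trailing update, so only the longer of the two remains on the critical path. The timings below are hypothetical illustrative costs, not measurements from the paper.

```python
# Toy cost model (hypothetical timings): k iterations, each with a panel
# factorization costing `panel` and a trailing update costing `update`.
def reduction_time(panel, update, k, look_ahead):
    if not look_ahead:
        # Panel and trailing update strictly alternate on the critical path.
        return k * (panel + update)
    # With look-ahead, panel i+1 overlaps trailing update i, so only the
    # longer of the two contributes to each step of the critical path.
    t = panel                          # the first panel cannot be overlapped
    for _ in range(k - 1):
        t += max(update, panel)
    return t + update                  # the last trailing update

print(reduction_time(1.0, 4.0, 10, look_ahead=False))   # 50.0
print(reduction_time(1.0, 4.0, 10, look_ahead=True))    # 41.0
```

When the trailing update dominates (as is typical once it is cast in terms of Level-3 BLAS), look-ahead effectively hides the panel cost, which is the bottleneck the abstract refers to.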
CMS software and computing for LHC Run 2
The CMS offline software and computing system has successfully met the challenge of LHC Run 2. In this presentation, we will discuss how the entire system was improved in anticipation of the increased trigger output rate, the increased rate of pileup interactions, and the evolution of computing technology. The primary goals behind these changes were to increase the flexibility of computing facilities wherever possible, to increase our operational efficiency, and to decrease the computing resources needed to accomplish the primary offline computing workflows. These changes have resulted in a new approach to distributed computing in CMS for Run 2 and for the future, as the LHC luminosity should continue to increase. We will discuss changes and plans for our data federation, which was one of the key changes towards a more flexible computing model for Run 2. Our software framework and algorithms also underwent significant changes. We will summarize our experience with a new multi-threaded framework as deployed on our prompt reconstruction farm for 2015 and across the CMS WLCG Tier-1 facilities. We will discuss our experience with an analysis data format which is ten times smaller than our primary Run 1 format. This "miniAOD" format has proven to be easier to analyze while being extremely flexible for analysts. Finally, we describe improvements to our workflow management system that have resulted in increased automation and reliability for all facets of CMS production and user analysis operations.
Comment: Contribution to the proceedings of the 38th International Conference on High Energy Physics (ICHEP 2016).