573 research outputs found
A decoupled local memory allocator
Compilers use software-controlled local memories to provide fast, predictable, and power-efficient access to critical data. We show that the local memory allocation for straight-line, or linearized programs is equivalent to a weighted interval-graph coloring problem. This problem is new when allowing a color interval to "wrap around," and we call it the submarine-building problem. This graph-theoretical decision problem differs slightly from the classical ship-building problem, and exhibits very interesting and unusual complexity properties. We demonstrate that the submarine-building problem is NP-complete, while it is solvable in linear time for not-so-proper interval graphs, an extension of the the class of proper interval graphs. We propose a clustering heuristic to approximate any interval graph into a not-so-proper interval graph, decoupling spill code generation from local memory assignment. We apply this heuristic to a large number of randomly generated interval graphs reproducing the statistical features of standard local memory allocation benchmarks, comparing with state-of-the-art heuristics. © 2013 ACM
From Big Data to Big Displays: High-Performance Visualization at Blue Brain
Blue Brain has pushed high-performance visualization (HPV) to complement its
HPC strategy since its inception in 2007. In 2011, this strategy has been
accelerated to develop innovative visualization solutions through increased
funding and strategic partnerships with other research institutions.
We present the key elements of this HPV ecosystem, which integrates C++
visualization applications with novel collaborative display systems. We
motivate how our strategy of transforming visualization engines into services
enables a variety of use cases, not only for the integration with high-fidelity
displays, but also to build service oriented architectures, to link into web
applications and to provide remote services to Python applications.Comment: ISC 2017 Visualization at Scale worksho
SLA-Oriented Resource Provisioning for Cloud Computing: Challenges, Architecture, and Solutions
Cloud computing systems promise to offer subscription-oriented,
enterprise-quality computing services to users worldwide. With the increased
demand for delivering services to a large number of users, they need to offer
differentiated services to users and meet their quality expectations. Existing
resource management systems in data centers are yet to support Service Level
Agreement (SLA)-oriented resource allocation, and thus need to be enhanced to
realize cloud computing and utility computing. In addition, no work has been
done to collectively incorporate customer-driven service management,
computational risk management, and autonomic resource management into a
market-based resource management system to target the rapidly changing
enterprise requirements of Cloud computing. This paper presents vision,
challenges, and architectural elements of SLA-oriented resource management. The
proposed architecture supports integration of marketbased provisioning policies
and virtualisation technologies for flexible allocation of resources to
applications. The performance results obtained from our working prototype
system shows the feasibility and effectiveness of SLA-based resource
provisioning in Clouds.Comment: 10 pages, 7 figures, Conference Keynote Paper: 2011 IEEE
International Conference on Cloud and Service Computing (CSC 2011, IEEE
Press, USA), Hong Kong, China, December 12-14, 201
CONFLLVM: A Compiler for Enforcing Data Confidentiality in Low-Level Code
We present an instrumenting compiler for enforcing data confidentiality in
low-level applications (e.g. those written in C) in the presence of an active
adversary. In our approach, the programmer marks secret data by writing
lightweight annotations on top-level definitions in the source code. The
compiler then uses a static flow analysis coupled with efficient runtime
instrumentation, a custom memory layout, and custom control-flow integrity
checks to prevent data leaks even in the presence of low-level attacks. We have
implemented our scheme as part of the LLVM compiler. We evaluate it on the SPEC
micro-benchmarks for performance, and on larger, real-world applications
(including OpenLDAP, which is around 300KLoC) for programmer overhead required
to restructure the application when protecting the sensitive data such as
passwords. We find that performance overheads introduced by our instrumentation
are moderate (average 12% on SPEC), and the programmer effort to port OpenLDAP
is only about 160 LoC.Comment: Technical report for CONFLLVM: A Compiler for Enforcing Data
Confidentiality in Low-Level Code, appearing at EuroSys 201
Recommended from our members
High Performance Architecture using Speculative Threads and Dynamic Memory Management Hardware
With the advances in very large scale integration (VLSI) technology, hundreds of billions of transistors can be packed into a single chip. With the increased hardware budget, how to take advantage of available hardware resources becomes an important research area. Some researchers have shifted from control flow Von-Neumann architecture back to dataflow architecture again in order to explore scalable architectures leading to multi-core systems with several hundreds of processing elements. In this dissertation, I address how the performance of modern processing systems can be improved, while attempting to reduce hardware complexity and energy consumptions. My research described here tackles both central processing unit (CPU) performance and memory subsystem performance. More specifically I will describe my research related to the design of an innovative decoupled multithreaded architecture that can be used in multi-core processor implementations. I also address how memory management functions can be off-loaded from processing pipelines to further improve system performance and eliminate cache pollution caused by runtime management functions
NUMASK: High Performance Scalable Skip List for NUMA
This paper presents NUMASK, a skip list data structure specifically designed to exploit the characteristics of Non-Uniform Memory Access (NUMA) architectures to improve performance. NUMASK deploys an architecture around a concurrent skip list so that all metadata accesses (e.g., traversals of the skip list index levels) read and write memory blocks allocated in the NUMA zone where the thread is executing. To the best of our knowledge, NUMASK is the first NUMA-aware skip list design that goes beyond merely limiting the performance penalties introduced by NUMA, and leverages the NUMA architecture to outperform state-of-the-art concurrent high-performance implementations. We tested NUMASK on a four-socket server. Its performance scales for both read-intensive and write-intensive workloads (tested up to 160 threads). In write-intensive workload, NUMASK shows speedups over competitors in the range of 2x to 16x
Recursive internetwork architecture, investigating RINA as an alternative to TCP/IP (IRATI)
Driven by the requirements of the emerging applications and networks, the Internet has become an architectural patchwork of growing complexity which strains to cope with the changes. Moore’s law prevented us from recognising that the problem does not hide in the high demands of today’s applications but lies in the flaws of the Internet’s original design. The Internet needs to move beyond TCP/IP to prosper in the long term, TCP/IP has outlived its usefulness.
The Recursive InterNetwork Architecture (RINA) is a new Internetwork architecture whose fundamental principle is that networking is only interprocess communication (IPC). RINA reconstructs the overall structure of the Internet, forming a model that comprises a single repeating layer, the DIF (Distributed IPC Facility), which is the minimal set of components required to allow distributed IPC between application processes. RINA supports inherently and without the need of extra mechanisms mobility, multi-homing and Quality of Service, provides a secure and configurable environment, motivates for a more competitive marketplace and allows for a seamless adoption.
RINA is the best choice for the next generation networks due to its sound theory, simplicity and the features it enables. IRATI’s goal is to achieve further exploration of this new architecture. IRATI will advance the state of the art of RINA towards an architecture reference model and specifcations that are closer to enable implementations deployable in production scenarios.
The design and implemention of a RINA prototype on top of Ethernet will permit the experimentation and evaluation of RINA in comparison to TCP/IP. IRATI will use the OFELIA testbed to carry on its experimental activities. Both projects will benefit from the collaboration. IRATI will gain access to a large-scale testbed with a controlled network while OFELIA will get a unique use-case to validate the facility: experimentation of a non-IP based Internet
- …