58,883 research outputs found
Late allocation and early release of physical registers
The register file is one of the critical components of current processors in terms of access time and power consumption. Among other things, the potential to exploit instruction-level parallelism is closely related to the size and number of ports of the register file. In conventional register renaming schemes, both register allocation and releasing are conservatively done, the former at the rename stage, before registers are loaded with values, and the latter at the commit stage of the instruction redefining the same register, once registers are not used any more. We introduce VP-LAER, a renaming scheme that allocates registers later and releases them earlier than conventional schemes. Specifically, physical registers are allocated at the end of the execution stage and released as soon as the processor realizes that there will be no further use of them. VP-LAER enhances register utilization, that is, the fraction of allocated registers having a value to be read in the future. Detailed cycle-level simulations show either a significant speedup for a given register file size or a reduction in the register file size for a given performance level, especially for floating-point codes, where the register file pressure is usually high.Peer ReviewedPostprint (published version
Some programming techniques for increasing program versatility and efficiency on CDC equipment
Five programming techniques used to decrease core and increase program versatility and efficiency are explained. The techniques are: (1) dynamic storage allocation, (2) automatic core-sizing and core-resizing, (3) matrix partitioning, (4) free field alphanumeric reads, and (5) incorporation of a data complex. The advantages of these techniques and the basic methods for employing them are explained and illustrated. Several actual program applications which utilize these techniques are described as examples
System configuration and executive requirements specifications for reusable shuttle and space station/base
System configuration and executive requirements specifications for reusable shuttle and space station/bas
The edge cloud: A holistic view of communication, computation and caching
The evolution of communication networks shows a clear shift of focus from
just improving the communications aspects to enabling new important services,
from Industry 4.0 to automated driving, virtual/augmented reality, Internet of
Things (IoT), and so on. This trend is evident in the roadmap planned for the
deployment of the fifth generation (5G) communication networks. This ambitious
goal requires a paradigm shift towards a vision that looks at communication,
computation and caching (3C) resources as three components of a single holistic
system. The further step is to bring these 3C resources closer to the mobile
user, at the edge of the network, to enable very low latency and high
reliability services. The scope of this chapter is to show that signal
processing techniques can play a key role in this new vision. In particular, we
motivate the joint optimization of 3C resources. Then we show how graph-based
representations can play a key role in building effective learning methods and
devising innovative resource allocation techniques.Comment: to appear in the book "Cooperative and Graph Signal Pocessing:
Principles and Applications", P. Djuric and C. Richard Eds., Academic Press,
Elsevier, 201
Analysis of data processing systems
Mathematical simulation models and software monitoring of multiprogramming computer syste
C-MOS array design techniques: SUMC multiprocessor system study
The current capabilities of LSI techniques for speed and reliability, plus the possibilities of assembling large configurations of LSI logic and storage elements, have demanded the study of multiprocessors and multiprocessing techniques, problems, and potentialities. Evaluated are three previous systems studies for a space ultrareliable modular computer multiprocessing system, and a new multiprocessing system is proposed that is flexibly configured with up to four central processors, four 1/0 processors, and 16 main memory units, plus auxiliary memory and peripheral devices. This multiprocessor system features a multilevel interrupt, qualified S/360 compatibility for ground-based generation of programs, virtual memory management of a storage hierarchy through 1/0 processors, and multiport access to multiple and shared memory units
PlinyCompute: A Platform for High-Performance, Distributed, Data-Intensive Tool Development
This paper describes PlinyCompute, a system for development of
high-performance, data-intensive, distributed computing tools and libraries. In
the large, PlinyCompute presents the programmer with a very high-level,
declarative interface, relying on automatic, relational-database style
optimization to figure out how to stage distributed computations. However, in
the small, PlinyCompute presents the capable systems programmer with a
persistent object data model and API (the "PC object model") and associated
memory management system that has been designed from the ground-up for high
performance, distributed, data-intensive computing. This contrasts with most
other Big Data systems, which are constructed on top of the Java Virtual
Machine (JVM), and hence must at least partially cede performance-critical
concerns such as memory management (including layout and de/allocation) and
virtual method/function dispatch to the JVM. This hybrid approach---declarative
in the large, trusting the programmer's ability to utilize PC object model
efficiently in the small---results in a system that is ideal for the
development of reusable, data-intensive tools and libraries. Through extensive
benchmarking, we show that implementing complex objects manipulation and
non-trivial, library-style computations on top of PlinyCompute can result in a
speedup of 2x to more than 50x or more compared to equivalent implementations
on Spark.Comment: 48 pages, including references and Appendi
Apollo applications program data archives
Apollo applications program data archives to collect, store, retrieve, and distribute experiments-related dat
- …