
    Virtual Memory on a Many-Core NoC

    Many-core devices are likely to become increasingly common in real-time and embedded systems as computational demands grow and as expectations for higher performance can generally only be met by increasing core numbers rather than relying on higher clock speeds. Network-on-chip (NoC) devices, where multiple cores share a single slice of silicon and employ packetised communications, are a widely-deployed many-core option for system designers. As NoCs are expected to run larger and more complex programs, the small amount of fast, on-chip memory available to each core is unlikely to be sufficient for all but the simplest of tasks, and it is necessary to find an efficient, effective, and time-bounded means of accessing resources stored in off-chip memory, such as DRAM or Flash storage. The abstraction of paged virtual memory is a familiar technique for managing similar tasks in general computing, but it has often been shunned by real-time developers because of concerns about time predictability. We show it can be a poor choice for a many-core NoC system as, unmodified, it typically uses page sizes optimised for interaction with spinning disks rather than solid-state media, and transports significant volumes of subsequently unused data across already congested links. In this work we outline and simulate an efficient partial paging algorithm where only those memory resources that are locally accessed are transported between global and local storage. We further show that smaller page sizes add to efficiency. We examine the factors that lead to timing delays in such systems, and show we can predict worst-case execution times at even safety-critical thresholds by using statistical methods from extreme value theory. We also show these results are applicable to systems with a variety of connections to memory.
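
    As a rough illustration of the partial-paging idea summarised above, the following Python sketch fetches only the blocks of a page that are actually referenced and compares the bytes moved against whole-page demand paging. All identifiers and sizes (PartialPager, FullPager, BLOCK_SIZE, PAGE_SIZE) are assumptions for illustration, not code or parameters from the paper.

```python
# Minimal sketch of partial paging: each virtual page is split into fixed-size
# blocks, and only the blocks a core actually touches are fetched from
# off-chip memory into local storage.

BLOCK_SIZE = 64    # bytes per transfer unit (assumed)
PAGE_SIZE = 1024   # bytes per virtual page (assumed; smaller than a disk-era 4 KiB page)

class PartialPager:
    def __init__(self):
        self.local_blocks = set()    # (page, block_index) resident in local memory
        self.bytes_transferred = 0

    def access(self, address):
        page, offset = divmod(address, PAGE_SIZE)
        block = offset // BLOCK_SIZE
        if (page, block) not in self.local_blocks:
            # Fetch only the referenced block, not the whole page.
            self.local_blocks.add((page, block))
            self.bytes_transferred += BLOCK_SIZE

class FullPager:
    """Whole-page demand paging, for comparison on the same trace."""
    def __init__(self):
        self.local_pages = set()
        self.bytes_transferred = 0

    def access(self, address):
        page = address // PAGE_SIZE
        if page not in self.local_pages:
            self.local_pages.add(page)
            self.bytes_transferred += PAGE_SIZE

if __name__ == "__main__":
    trace = [0, 8, 72, 4096, 4100, 9000]    # toy address trace
    partial, full = PartialPager(), FullPager()
    for addr in trace:
        partial.access(addr)
        full.access(addr)
    print("partial paging bytes moved:", partial.bytes_transferred)
    print("full-page paging bytes moved:", full.bytes_transferred)
```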

    C-MOS array design techniques: SUMC multiprocessor system study

    The current capabilities of LSI techniques for speed and reliability, plus the possibilities of assembling large configurations of LSI logic and storage elements, have demanded the study of multiprocessors and multiprocessing techniques, problems, and potentialities. Three previous system studies for a space ultrareliable modular computer (SUMC) multiprocessing system are evaluated, and a new multiprocessing system is proposed that is flexibly configured with up to four central processors, four I/O processors, and 16 main memory units, plus auxiliary memory and peripheral devices. This multiprocessor system features a multilevel interrupt, qualified S/360 compatibility for ground-based generation of programs, virtual memory management of a storage hierarchy through I/O processors, and multiport access to multiple and shared memory units.
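
    The multiport access to shared memory units mentioned above can be pictured with a small, hypothetical model: each memory unit exposes a fixed number of ports, and processor requests beyond that capacity in a cycle stall to the next cycle. The port count, names, and arbitration policy below are assumptions made only for illustration.

```python
# Illustrative sketch (not from the study) of multiport access to shared
# memory units with simple per-cycle arbitration.
from collections import defaultdict, deque

PORTS_PER_UNIT = 2    # assumed port count per memory unit

def simulate(requests):
    """requests: list of (processor_id, memory_unit) pairs issued in cycle 0."""
    pending = deque(requests)
    cycle = 0
    while pending:
        served_this_cycle = defaultdict(int)
        still_waiting = deque()
        while pending:
            proc, unit = pending.popleft()
            if served_this_cycle[unit] < PORTS_PER_UNIT:
                served_this_cycle[unit] += 1
                print(f"cycle {cycle}: CPU {proc} -> memory unit {unit}")
            else:
                still_waiting.append((proc, unit))   # port conflict, retry next cycle
        pending = still_waiting
        cycle += 1

if __name__ == "__main__":
    # Four CPUs all targeting unit 3, plus one request to unit 7.
    simulate([(0, 3), (1, 3), (2, 3), (3, 3), (0, 7)])
```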

    Dynamic Binary Translation for Embedded Systems with Scratchpad Memory

    Embedded software development has recently changed with advances in computing. Rather than software and hardware being fully co-designed to perform a relatively simple task, embedded and mobile devices are nowadays designed as platforms where multiple applications can be run, new applications can be added, and existing applications can be updated. In this scenario, traditional constraints in embedded systems design (i.e., performance, memory and energy consumption, and real-time guarantees) are more difficult to address. New concerns (e.g., security) have become important and increase software complexity as well. In general-purpose systems, Dynamic Binary Translation (DBT) has been used to address these issues with services such as Just-In-Time (JIT) compilation, dynamic optimization, virtualization, power management and code security. In embedded systems, however, DBT is not usually employed due to performance, memory and power overhead. This dissertation presents StrataX, a low-overhead DBT framework for embedded systems. StrataX addresses the challenges faced by DBT in embedded systems using novel techniques. To reduce DBT overhead, StrataX loads code from NAND-Flash storage and translates it into a Scratchpad Memory (SPM), a software-managed on-chip SRAM with limited capacity. SPM has similar access latency to a hardware cache, but consumes less power and chip area. StrataX manages the SPM as a software instruction cache, and employs victim compression and pinning to reduce retranslation cost and capture frequently executed code in the SPM. To prevent performance loss due to excessive code expansion, StrataX minimizes the amount of code inserted by DBT to maintain control of program execution. When a hardware instruction cache is available, StrataX dynamically partitions translated code between the SPM and main memory. With these techniques, StrataX has low performance overhead relative to native execution for MiBench programs. Further, it simplifies embedded software and hardware design by operating transparently to applications without any special hardware support. StrataX achieves sufficiently low overhead to make it feasible to use DBT in embedded systems to address important design goals and requirements.
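
    A minimal sketch of how an SPM-managed software instruction cache with pinning and compressed victims might look is shown below; the class, method names, and eviction policy are invented here for illustration and do not reflect StrataX's actual interfaces.

```python
# Conceptual sketch (not StrataX itself) of a software-managed instruction
# cache in scratchpad memory with pinning and compressed victims.
import zlib

class SpmCodeCache:
    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.resident = {}    # fragment id -> machine code bytes (in SPM)
        self.pinned = set()   # fragments that must never be evicted
        self.victims = {}     # fragment id -> zlib-compressed code (in DRAM)

    def insert(self, frag_id, code, pin=False):
        while self.used + len(code) > self.capacity:
            self._evict_one()
        self.resident[frag_id] = code
        self.used += len(code)
        if pin:
            self.pinned.add(frag_id)

    def _evict_one(self):
        for frag_id in self.resident:
            if frag_id not in self.pinned:
                code = self.resident.pop(frag_id)
                self.used -= len(code)
                # Keep a compressed copy so the fragment can be restored
                # without invoking the (expensive) translator again.
                self.victims[frag_id] = zlib.compress(code)
                return
        raise MemoryError("all resident fragments are pinned")

    def fetch(self, frag_id, translate):
        """Return code for frag_id, invoking the translator only on a true miss."""
        if frag_id in self.resident:
            return self.resident[frag_id]
        if frag_id in self.victims:
            code = zlib.decompress(self.victims.pop(frag_id))
        else:
            code = translate(frag_id)    # translate from Flash-resident binary
        self.insert(frag_id, code)
        return code
```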

    A Proposal of Software and Hardware Functions to Support a Multi-OS Environment

    Degree system: new; report number: Kou 3534; degree type: Doctor of Engineering; date conferred: 2012/2/25; Waseda University degree record number: Shin 587

    Reducing exception management overhead with software restart markers

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. Includes bibliographical references (p. 181-196). Modern processors rely on exception handling mechanisms to detect errors and to implement various features such as virtual memory. However, these mechanisms are typically hardware-intensive because of the need to buffer partially-completed instructions to implement precise exceptions and enforce in-order instruction commit, often leading to issues with performance and energy efficiency. The situation is exacerbated in highly parallel machines with large quantities of programmer-visible state, such as VLIW or vector processors. As architects increasingly rely on parallel architectures to achieve higher performance, the problem of exception handling is becoming critical. In this thesis, I present software restart markers as the foundation of an exception handling mechanism for explicitly parallel architectures. With this model, the compiler is responsible for delimiting regions of idempotent code. If an exception occurs, the operating system will resume execution from the beginning of the region. One advantage of this approach is that instruction results can be committed to architectural state in any order within a region, eliminating the need to buffer those values. Enabling out-of-order commit can substantially reduce the exception management overhead found in precise exception implementations, and enable the use of new architectural features that might be prohibitively costly with conventional precise exception implementations. Additionally, software restart markers can be used to reduce context switch overhead in a multiprogrammed environment. This thesis demonstrates the applicability of software restart markers to vector, VLIW, and multithreaded architectures. It also contains an implementation of this exception handling approach that uses the Trimaran compiler infrastructure to target the Scale vector-thread architecture. I show that using software restart markers incurs very little performance overhead for vector-style execution on Scale. Finally, I describe the Scale compiler flow developed as part of this work and discuss how it targets certain features facilitated by the use of software restart markers. By Mark Jerome Hampton, Ph.D.
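
    The restart-region semantics described above can be pictured with a small, hypothetical model: any fault inside an idempotent region is handled by re-executing the region from its start marker, so no partially completed results need to be buffered. The function names and the fault-injection trick are illustrative, not taken from the thesis.

```python
# Illustrative model (not the thesis's implementation) of restart-marker
# semantics: a region is safe to re-execute from the top, so an exception
# inside it is handled by restarting the whole region.

class TransientFault(Exception):
    """Stands in for a restartable event, e.g. a fault serviced by the OS."""

def run_region(region, *args, max_restarts=10):
    for _ in range(max_restarts):
        try:
            return region(*args)    # commit order inside the region is unconstrained
        except TransientFault:
            continue                # resume from the region's start marker
    raise RuntimeError("region kept faulting")

# Example idempotent region: each output element depends only on the inputs,
# so repeating the region after a partial run yields the same final state.
def scale_region(src, dst, factor, _fault_once=[True]):
    for i in range(len(src)):
        if _fault_once[0] and i == 2:
            _fault_once[0] = False   # inject a single fault mid-region
            raise TransientFault()
        dst[i] = src[i] * factor
    return dst

if __name__ == "__main__":
    src, dst = [1, 2, 3, 4], [0, 0, 0, 0]
    print(run_region(scale_region, src, dst, 10))   # -> [10, 20, 30, 40]
```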

    The Design of a System Architecture for Mobile Multimedia Computers

    This chapter discusses the system architecture of a portable computer, called the Mobile Digital Companion, which provides support for handling multimedia applications energy-efficiently. Because battery life is limited and battery weight is an important factor for the size and the weight of the Mobile Digital Companion, energy management plays a crucial role in the architecture. As the Companion must remain usable in a variety of environments, it has to be flexible and adaptable to various operating conditions. The Mobile Digital Companion has an unconventional architecture that saves energy by using system decomposition at different levels of the architecture and exploits locality of reference with dedicated, optimised modules. The approach is based on dedicated functionality and the extensive use of energy reduction techniques at all levels of system design. The system has an architecture with a general-purpose processor accompanied by a set of heterogeneous autonomous programmable modules, each providing an energy-efficient implementation of dedicated tasks. A reconfigurable internal communication network switch exploits locality of reference and eliminates wasteful data copies.
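
    A toy sketch of the connection-based routing idea follows: the general-purpose processor only sets up a connection, after which data flows module-to-module through the switch without being copied through main memory. All class names and the API shape are assumptions made purely for illustration, not the Companion's actual design.

```python
# Conceptual sketch of autonomous modules exchanging data through a switch,
# with the control processor involved only in connection setup.

class Module:
    def __init__(self, name):
        self.name = name

    def consume(self, data):
        # Data arrives in the module's local buffer and is handled locally.
        print(f"{self.name} processed {len(data)} bytes locally")

class Switch:
    def __init__(self):
        self.connections = {}    # source name -> destination module

    def connect(self, source_name, dest_module):
        # One-time setup performed by the general-purpose processor.
        self.connections[source_name] = dest_module

    def send(self, source_name, data):
        # Stream data travels directly to the destination module.
        self.connections[source_name].consume(data)

if __name__ == "__main__":
    switch = Switch()
    display = Module("display-module")
    switch.connect("video-decoder", display)      # CPU involvement ends here
    switch.send("video-decoder", b"\x00" * 4096)  # streamed without CPU copies
```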

    FPGA dynamic and partial reconfiguration: a survey of architectures, methods, and applications

    Dynamic and partial reconfiguration are key differentiating capabilities of field programmable gate arrays (FPGAs). While they have been studied extensively in academic literature, they find limited use in deployed systems. We review FPGA reconfiguration, looking at architectures built for the purpose, and the properties of modern commercial architectures. We then investigate design flows, and identify the key challenges in making reconfigurable FPGA systems easier to design. Finally, we look at applications where reconfiguration has found use, as well as proposing new areas where this capability places FPGAs in a unique position for adoption.

    The Ridge Operating System: High Performance through Message Passing and Virtual Memory

    The Ridge operating system is decomposed into processes and relies on message passing for its interprocess communication. Messages and processes are used to improve reliability and extensibility and to facilitate networking. The challenge was to provide a high-performance UNIX implementation in this environment. The technique used was to blend in other operating system facilities, such as virtual memory, with the message system. Key aspects of the design were to minimize the number of primitives and to provide support from the Ridge instruction set architecture.
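
    One way to picture blending virtual memory with a message system, as described above, is a toy model in which a page fault becomes a request message to a pager process and the faulting thread blocks until the reply arrives. The queues, function names, and page size here are illustrative assumptions, not Ridge's actual primitives.

```python
# Toy illustration of page faults serviced via message passing: the faulting
# side sends a request to a pager process and blocks on the reply.
import queue
import threading

page_requests = queue.Queue()
frames = {}    # page number -> frame contents (the "resident" set)

def pager():
    """Server process: services page-in requests arriving as messages."""
    while True:
        page, reply = page_requests.get()
        if page is None:                         # shutdown sentinel
            break
        frames.setdefault(page, bytes(4096))     # fetch from backing store (simulated)
        reply.put(page)                          # reply message unblocks the faulter

def touch(page):
    """Client side: fault on a missing page by messaging the pager."""
    if page not in frames:
        reply = queue.Queue()
        page_requests.put((page, reply))         # send the page-fault message
        reply.get()                              # block until the pager replies
    return frames[page]

if __name__ == "__main__":
    t = threading.Thread(target=pager, daemon=True)
    t.start()
    touch(7)
    touch(7)                                     # second access hits; no message sent
    print("resident pages:", sorted(frames))
    page_requests.put((None, None))
    t.join()
```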