Search CORE

164 research outputs found

Recommended from our members

A microprogrammed operating system kernel

Author: Herbert Andrew Jame
Publication venue: University of Cambridge
Publication date: 19/03/1979
Field of study

The subject of the thesis is the design and implementation of an operating system kernel for the Cambridge Capability Computer (CAP). The kernel of an operating syst em is its most primitive level of facilities and forms the foundation stone a round which t he rest of the system is structured. The particular emphasis of the CAP kernel is concerned with protection - the control of access to information. The kernel uses the notion of capabilities to provide a flexible and controlled mechanism for the sharing of information within a computer system. The protection mechanisms include provision for the efficient control of access to memory as well as facilities for handling abstract resources like files and virtual peripherals. The kernel allows the introduction of new types of resources in addition to the basic set of hardware resourcee to permit user extension of the system. Attention is given to the problem of recall of privilege or revocation in capability systems and the kernel includes operations for both permanent and temporary revocation of particular access rights to information in a selective manner. In the past many of these functions have only been found in kernels implemented in user-level software which arc frequently cumbersome and inefficient. An examination is made of why this should be and·how efficiency and simplicity can be gained by a microprogrammed implementation. The thesis draws on the experience of a number of soft.ware kernels to discover the various design decisions that have to be made and the techniques that may be used to implement a successful kernel. The feasibility of the design arrived at by considering these issues is demonstratec1 by describinq its implementation on the Cambridge Capability Computer in terms of the primitives provided and the internal organisation of the proposed kernel. In an evaluation, the kernel is examined in the light of the analysis of other kernels to point out its strength s and weaknesses and to gain insights into the utility of the deign as a practical operating system kernel.Digitisation of this thesis was sponsored by Arcadia Fund, a charitable fund of Lisbet Rausing and Peter Baldwin

Apollo (Cambridge)

HAL-ASOS accelerator model: evolutive elasticity by design

Author: Cabral Jorge
Cardoso Paulo
Pinto Paulo
Silva Vítor Alberto Teixeira
Tavares Adriano
Publication venue: 'MDPI AG'
Publication date: 01/08/2021
Field of study

To address the integration of software threads and hardware accelerators into the Linux Operating System (OS) programming models, an accelerator architecture is proposed, based on micro-programmable hardware system calls, which fully export these resources into the Linux OS user-space through a design-specific virtual file system. The proposed HAL-ASOS accelerator model is split into a user-defined Hardware Task and a parameterizable Hardware Kernel with three differentiated transfer channels, aiming to explore distinct BUS technology interfaces and promote the accelerator to a first-class computing unit. This paper focuses on the Hardware Kernel and mainly its microcode control unit, which will leverage the elasticity to naturally evolve with Linux OS through key differentiating capabilities of field programmable gate arrays (FPGAs) when compared to the state of the art. To comply with the evolutive nature of Linux OS, or any Hardware Task incremental features, the proposed model generates page-faults signaling runtime errors that are handled at the kernel level as part of the virtual file system runtime. To evaluate the accelerator model’s programmability and its performance, a client-side application based on the AES 128-bit algorithm was implemented. Experiments demonstrate a flexible design approach in terms of hardware and software reconfiguration and significant performance increases consistent with rising processing demands or clock design frequencies.This work has been supported by FCT-Fundação para a Ciência e Tecnologia within the R&D Units Project Scope: UIDB/00319/2020

Multidisciplinary Digital Publishing Institute

Universidade do Minho: RepositoriUM

Directory of Open Access Journals

The FEM-2 design method

Author: Adams L. M.
Mehrotra P.
Patrick M.
Pratt T. W.
Vanrosendale J.
Voigt R. G.
Publication venue
Publication date
Field of study

The FEM-2 parallel computer is designed using methods differing from those ordinarily employed in parallel computer design. The major distinguishing aspects are: (1) a top-down rather than bottom-up design process; (2) the design considers the entire system structure in terms of layers of virtual machines; and (3) each layer of virtual machine is defined formally during the design process. The result is a complete hardware/software system design. The basic design method is discussed and the advantages of the method are considered. A status report on the FEM-2 design is included

NASA Technical Reports Server

Enabling preemptive multiprogramming on GPUs

Author: Cabezas Javier
Gelado Fernandez Isaac
Navarro Nacho
Ramírez Bellido Alejandro
Tanasic Ivan
Valero Cortés Mateo
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2014
Field of study

GPUs are being increasingly adopted as compute accelerators in many domains, spanning environments from mobile systems to cloud computing. These systems are usually running multiple applications, from one or several users. However GPUs do not provide the support for resource sharing traditionally expected in these scenarios. Thus, such systems are unable to provide key multiprogrammed workload requirements, such as responsiveness, fairness or quality of service. In this paper, we propose a set of hardware extensions that allow GPUs to efficiently support multiprogrammed GPU workloads. We argue for preemptive multitasking and design two preemption mechanisms that can be used to implement GPU scheduling policies. We extend the architecture to allow concurrent execution of GPU kernels from different user processes and implement a scheduling policy that dynamically distributes the GPU cores among concurrently running kernels, according to their priorities. We extend the NVIDIA GK110 (Kepler) like GPU architecture with our proposals and evaluate them on a set of multiprogrammed workloads with up to eight concurrent processes. Our proposals improve execution time of high-priority processes by 15.6x, the average application turnaround time between 1.5x to 2x, and system fairness up to 3.4x.We would like to thank the anonymous reviewers, Alexan- der Veidenbaum, Carlos Villavieja, Lluis Vilanova, Lluc Al- varez, and Marc Jorda on their comments and help improving our work and this paper. This work is supported by Euro- pean Commission through TERAFLUX (FP7-249013), Mont- Blanc (FP7-288777), and RoMoL (GA-321253) projects, NVIDIA through the CUDA Center of Excellence program, Spanish Government through Programa Severo Ochoa (SEV-2011-0067) and Spanish Ministry of Science and Technology through TIN2007-60625 and TIN2012-34557 projects.Peer ReviewedPostprint (author’s final draft

Crossref

UPCommons. Portal del coneixement obert de la UPC

Developing a support for FPGAs in the Controller parallel programming model

Author: Rodríguez Canal Gabriel
Publication venue
Publication date: 01/01/2020
Field of study

La computación heterogénea se presenta como la solución para conseguir supercomputadores cada vez más rápidos capaces de resolver problemas más grandes y complejos en diferentes áreas de conocimiento. Para ello, integra aceleradores con distintas arquitecturas capaces de explotar las características de los problemas desde distintos enfoques obteniendo, de este modo, un mayor rendimiento. Las FPGAs son hardware reconfigurable, i.e., es posible modificarlas después de su fabricación. Esto permite una gran flexibilidad y una máxima adaptación al problema en cuestión. Además, tienen un consumo energético muy bajo. Todas estas ventajas tienen el gran inconveniente de una más difícil programaci ón mediante los propensos a errores HDLs (Hardware Description Language), tales como Verilog o VHDL, y requisitos de conocimientos avanzados de electrónica digital. En los últimos años los principales fabricantes de FPGAs han enfocado sus esfuerzos en desarrollar herramientas HLS (High Level Synthesis) que permiten programarlas a través de lenguajes de programación de alto nivel estilo C. Esto ha favorecido su adopción por la comunidad HPC y su integración en los nuevos supercomputadores. Sin embargo, el programador aún tiene que ocuparse de aspectos como la gestión de colas de comandos, parámetros de lanzamiento o transferencias de datos. El modelo Controller es una librería que facilita la gestión de la coordinación, comunicación y los detalles de lanzamiento de los kernels en aceleradores hardware. Explota de forma transparente sus modelos de programación nativos, en concreto OpenCL y CUDA, y, por tanto, consigue un alto rendimiento independientemente del compilador. Permite al programador utilizar los distintos recursos hardware disponibles de forma combinada en entornos heterogéneos. Este trabajo extiende el modelo Controller mediante el desarrollo de un backend que permite la integración de FPGAs, manteniendo los cambios sobre la interfaz de usuario al mínimo. A través de los resultados experimentales se comprueba que se consigue una disminución del esfuerzo de programación significativa en comparación con la implementación nativa en OpenCL. Del mismo modo, se consigue un elevado solapamiento entre computación y comunicación y un sobrecoste por el uso de la librería despreciable.Heterogeneous computing appears to be the solution to achieve ever faster computers capable of solving bigger and more complex problems in difierent fields of knowledge. To that end, it integrates accelerators with difierent architectures capable of exploiting the features of problems from difierent perspectives thus achieving higher performance. FPGAs are reconfigurable hardware, i.e., it is possible to modify them after manufacture. This allows great flexibility and maximum adaptability to the given problem. In addition, they have low power consumption. All these advantages have the great objection of more dificult programming with the errorprone HDLs (Hardware Description Language), such as Verilog or VHDL, and the requirement of advanced knowledge of digital electronics. The main FPGA vendors have concentrated on developing HLS (High Level Synthesis) tools that allow to program them with C-like high level programming languages. This favoured their adoption by the HPC community and their integration in new supercomputers. However, the programmer still has to take care of aspects such as management of command queues, launching parameters or data transfers. The Controller model is a library to easily manage the coordination, communication and kernel launching details on hardware accelerators. It transparently exploits their native or vendor specific programming models, namely OpenCL and CUDA, thus enabling the potential performance obtained by using them in a compiler agnostic way. It is intended to enable the programmer to make use of the diferent available hardware resources in combination in heterogeneous environments. This work extends the Controller model through the development of a backend that allows the integration of FPGAs, keeping the changes over the user-facing interface to the minimum. The experimental results validate that a significant decrease in programming effort compared to the native OpenCL implementation is achieved. Similarly, high overlap of computation and communication and a negligible overhead due to the use of the library are attained.Grado en Ingeniería Informátic

Repositorio Documental de la Universidad de Valladolid

Timing Architecture for ESS

Author: Cereijo García Javier
Publication venue
Publication date: 01/01/2020
Field of study

Programa Oficial de Doutoramento en Investigación en Tecnoloxías da Información. 5023V01[Resumo] O sistema de temporización é unha compoñente fundamental para o control e sincronización de instalacións industriais e científicas, coma aceleradores de partículas. Nesta tese traballamos na especificación e desenvolvemento do sistema de temporización para a European Spallation Source (ESS), a maior fonte de neutróns actualmente en construción. Abordamos este tra ballo a dous niveis: a especificación do sistema de temporización, e a imple mentación física de sistemas de control empregando circuítos reconfigurables. Con respecto á especificación do sistema de temporización, deseñamos e implementamos a configuración do protocolo de temporización para cumprir cos requirimentos do ESS e ideamos un modo de operación e unha aplicación para a configuración e control do sistema de temporización. Tamén presentamos unha ferramenta e unha metodoloxía para imple mentar sistemas de control empregando FPGAs, coma os nodos do sistema de temporización. ámbalas <lúas están baseadas en statecharts, unha repre sentación gráfica de sistemas que expande o concepto de máquinas de estados finitos, orientada a sistemas que necesitan ser reconfigurados rápidamente en múltiples localizacións minimizando a posibilidade de erros. A ferramenta crea automaticamente código VHDL sintetizable a partir do statechart do sistema. A metodoloxía explica o procedemento para implementar o state chart como unha arquitectura microprogramada en FPGAs.[Resumen] El sistema de temporización es un componente fundamental para el control y sincronización de instalaciones industriales y científicas, como aceleradores e partículas. En esta tesis trabajamos en la especificación y desarrollo el sistema de temporización para la European Spallation Source (ESS), la mayor fuente de neutrones actualmente en construcción. Abordamos este trabajo en dos niveles: la especificación del sistema de temporización, y la mplementación física de sistemas de control empleando circuitos reconfig rables. Con respecto a la especificación del sistema de temporización, diseñamos e implementamos la configuración del protocolo de temporización para cumplir on los requisitos de ESS e ideamos un modo de operación y una aplicación ara la configuración y control del sistema de temporización. También presentamos una herramienta y una metodología para imple entar sistemas de control empleando FPGAs, como los nodos del sistema e temporización. Ambas están basadas en statecharts) una representación gráfica de sistemas que expande el concepto de máquinas de estados fini os, orientada a sistemas que necesitan ser reconfigurados rápidamente en últiples localizaciones minimizando la posibilidad de errores. La herramienta crea automáticamente código VHDL sintetizable a partir del statechart del sistema. La metodología explica el procedimiento para implementar el statechart como una arquitectura microprogramada en FPGAs.[Abstract] The timing system is a key component for the control and synchronization of industrial and scientific facilities, such as particle accelerators. In this thesis we tackle the specification and development of the timing system for the European Spallation Source (ESS), the largest neutron source currently in construction. We approach this work at two levels: the specification of the timing system and the physical implementation of control systems using reconfigurable hardware. Regarding the specification of the timing system, we designed and imple mented the configuration of the timing protocol to fulfil the requirements of ESS and devised an operation mode andan application for the configuration and control of the timing system. We also present one too! and one methodology to implement control systems using FPGAs, such as the nodes of the timing system. Both are based on statecharts, a graphical representation of systems that expand the concepts of Finite State Machines, targeted at systems that need to be re configured quickly in multiple locations minimizing the chance of errors. The too! automatically creates synthesizable VHDL code from a statechart of the system. The methodology explains the procedure to implement the statechart as a microprogrammed architecture in FPGAs

Repositorio da Universidade da Coruña

The mission oriented terminal area simulation facility

Author: Grove R. D.
Houck J. A.
Kaylor J. T.
Naftel P. B.
Simmons H. I.
Publication venue
Publication date
Field of study

The Mission Oriented Terminal Area Simulation (MOTAS) was developed to provide an ATC environment in which flight management and flight operations research studies can be conducted with a high degree of realism. This facility provides a flexible and comprehensive simulation of the airborne, ground-based and communication aspects of the airport terminal area environment. Major elements of the simulation are: an airport terminal area environment model, two air traffic controller stations, several aircraft models and simulator cockpits, four pseudo pilot stations, and a realistic air-ground communications network. MOTAS has been used for one study with the DC-9 simulator and a series of data link studies are planned in the near future

NASA Technical Reports Server

Survey of currently available high-resolution raster graphics systems

Author: Jones Denise R.
Publication venue
Publication date
Field of study

Presented are data obtained on high-resolution raster graphics engines currently available on the market. The data were obtained through survey responses received from various vendors and also from product literature. The questionnaire developed for this survey was basically a list of characteristics desired in a high performance color raster graphics system which could perform real-time aircraft simulations. Several vendors responded to the survey, with most reporting on their most advanced high-performance, high-resolution raster graphics engine

NASA Technical Reports Server

A Multi-Precision Bit-Serial Hardware Accelerator IP for Deep Learning Enabled Internet-of-Things

Author: Francesco Conti
Maurizio Capra
Maurizio Martina
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2021
Field of study

Deep Neural Networks (DNNs) computation-hungry algorithms demand hardware platforms capable of meeting rigid power and timing requirements. We introduce the Serial-MAC-engine (SMAC-engine), a fully-digital hardware accelerator for inference of quantized DNNs suitable for integration in a heterogeneous System-on-Chip (SoC). The accelerator is completely embedded in the form of a Hardware Processing Engine (HWPE) in the PULPissimo platform, a RISCV-based programmable architecture that targets the computational requirements of IoT applications. The SMAC-engine supports configurable precision for both weights (8/6/4 bits) and activations (8/4 bits), with scalable performance. Results in 65 nm technology demonstrate that the serial-MAC approach enables the accelerator to achieve a maximum throughput of 14.28 GMAC/s, consuming 0.58 pJ/MAC @ 1.0 V when operating at a precision of 4 bits for weights and 8 bits for activations

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

The Second Hungarian Workshop on Image Analysis : Budapest, June 7-9, 1988.

Author: Csetverikov Dimitrij
Álló Géza
Publication venue: MTA Számítástechnikai és Automatizálási Kutató Intézet
Publication date: 01/01/1988
Field of study

REAL-EOD