A Novel Thread Scheduler Design for Polymorphic Embedded Systems
Abstract:
The ever-increasing complexity of present-day embedded systems demands that they be adaptable and scalable to user needs. With the growing use of consumer electronic devices, embedded computing is steadily approaching the capabilities once reserved for desktop computing. End users expect their devices to run faster than before and to support a wide range of applications. Accommodating such a broad range of applications requires an efficient design for the embedded system's scheduler. The primary goal of this thesis is therefore to design a thread scheduler for a polymorphic-thread-computing embedded system. To our knowledge this is the first attempt at designing a polymorphic thread scheduler, as no existing or conventional scheduler accounts for thread polymorphism. In summary, a dynamic thread scheduler for a multiple-application, multithreaded polymorphic system has been implemented with user satisfaction as its objective function. A sigmoid function models end-user perception more faithfully than conventional objectives that simply maximize or minimize a single metric such as performance, power, or energy. The polymorphic thread scheduler framework, which operates in a dynamic environment with N multithreaded applications, is explained and evaluated using randomly generated application graphs. The result graphs clearly demonstrate both the benefits of using user satisfaction as the objective function and the performance gains obtained with the new scheduler. The advantages of the proposed greedy thread-scheduling algorithm are demonstrated by comparison against conventional approaches such as First Come First Serve (FCFS) and priority scheduling.
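A minimal sketch of the idea described above, with hypothetical thread names and parameters (the thesis's actual model and algorithm are not reproduced here): a sigmoid maps a thread's finish time to a satisfaction value in (0, 1), and a greedy scheduler repeatedly runs the ready thread whose completion contributes the most satisfaction.

```python
import math

def satisfaction(finish_time, expected, steepness=1.0):
    """Sigmoid model of end-user satisfaction: close to 1 while the thread
    finishes well before the user's expected time, dropping smoothly toward
    0 as it finishes later and later."""
    return 1.0 / (1.0 + math.exp(steepness * (finish_time - expected)))

def greedy_schedule(threads):
    """Greedy order: repeatedly run the ready thread whose completion,
    given the time already elapsed, contributes the most satisfaction.
    Each thread is a dict with 'name', 'exec' (run time), 'expected'."""
    order, now, total = [], 0.0, 0.0
    remaining = list(threads)
    while remaining:
        best = max(remaining,
                   key=lambda th: satisfaction(now + th["exec"], th["expected"]))
        remaining.remove(best)
        now += best["exec"]
        total += satisfaction(now, best["expected"])
        order.append(best["name"])
    return order, total
```

For two hypothetical threads A (5 time units, expected done by t = 3) and B (1 unit, expected by t = 2), the greedy order B, A yields a higher total satisfaction than the FCFS order A, B, illustrating why a satisfaction objective can beat arrival-order scheduling.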
Run-time services for hybrid CPU/FPGA systems on chip
Modern FPGA devices, which include one or more processor cores as diffused IP on the silicon die, provide an excellent platform for developing custom multiprocessor system-on-programmable-chip (MPSoPC) architectures. As researchers investigate new methods for migrating portions of applications into custom hardware circuits, it is equally critical to develop run-time service frameworks that support these capabilities. Hthreads (HybridThreads) is a multithreaded RTOS kernel for hybrid FPGA/CPU systems designed to meet this growing need. A key capability of hthreads is the migration of thread management, synchronization primitives, and run-time scheduling services for both hardware and software threads into hardware. This paper describes the hthreads scheduler, a key component for controlling both software-resident threads (SW threads) and threads implemented in programmable logic (HW threads). Run-time analysis shows that the hthreads scheduler module reduces unwanted system overhead and jitter compared to historical software schedulers, while fielding scheduling requests from both hardware and software threads in parallel with application execution. It achieves constant-time scheduling for up to 256 active threads across 128 priority levels, while providing uniform APIs for threads requesting OS services from either side of the hardware/software boundary.
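The constant-time behavior described above can be illustrated in software with the classic bitmap-of-priority-queues structure: one FIFO per priority level plus a bitmap of non-empty levels, so picking the next thread never depends on how many threads are ready. This is an illustrative sketch only; the actual hthreads scheduler is implemented in FPGA logic, its internal data structures are not reproduced here, and the lower-index-is-higher-priority convention is an assumption of this sketch.

```python
from collections import deque

class O1Scheduler:
    """Constant-time ready queue in the spirit of the hthreads scheduler:
    a FIFO per priority level and a bitmap of non-empty levels."""
    LEVELS = 128  # the abstract states 128 priority levels

    def __init__(self):
        self.queues = [deque() for _ in range(self.LEVELS)]
        self.bitmap = 0  # bit i set <=> queue i is non-empty

    def enqueue(self, tid, prio):
        """Make thread `tid` ready at priority `prio` (0 = highest here)."""
        self.queues[prio].append(tid)
        self.bitmap |= 1 << prio

    def pick_next(self):
        """Pop the next thread: find the lowest set bit of the bitmap
        (highest priority) in O(1), independent of the thread count."""
        if self.bitmap == 0:
            return None
        prio = (self.bitmap & -self.bitmap).bit_length() - 1
        tid = self.queues[prio].popleft()
        if not self.queues[prio]:
            self.bitmap &= ~(1 << prio)
        return tid
```

Threads at the same priority are served FIFO, and the bitmap trick (`x & -x` isolates the lowest set bit) is what makes the selection step constant time, which is exactly the property a hardware implementation can guarantee without jitter.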
Application-aware static specialization of formerly dynamic system calls to improve non-functional properties of embedded real-time systems
Embedded systems have become an indispensable part of modern life.
They are ubiquitous, surrounding us in almost every moment of the day and supporting our everyday routines.
We expect these systems to be highly cost-efficient in both development and production.
At the same time, we expect them to work reliably and always respond as expected.
Given the large unit counts and the still-growing prevalence of these systems, this puts immense pressure on the process of developing new systems.
While a finished system has a task fixed by its environment, and thus a fixed software application that it executes, the tools used for implementing and executing that application are not designed for this one task but for a multitude of possible applications.
This means they provide considerably more functionality, and more flexibility in its use, than the concrete application requires.
In this work I focus on the real-time operating systems (RTOSs) that serve as the execution foundation.
These provide a broad spectrum of primitives of various system-object classes and their associated interaction methods, of which an application uses only a subset.
In the dynamically configured systems considered here, all system objects are configured at run time, and their interactions are determined solely by the flow of the program code.
An operating system must accordingly be able to accept arbitrary system calls at any time, even those the application never issues.
This freedom causes pessimistic assumptions about possible interaction patterns and forces dynamic management of all system states and system objects.
In this work I therefore present techniques for systematically and automatically specializing formerly dynamic system calls statically, respecting the requirements of a given application, so that the non-functional properties of the overall system improve.
Using static analysis, I determine the system objects used by the application and their possible interactions.
With this knowledge, I perform compile-time specializations for both the startup phase and the operational phase of the system.
I optimize system startup by instantiating semantically static system objects already at compile time.
I optimize interactions during the operational phase by deploying implementations of system objects and their interactions that are specialized to the actual usage patterns.
With these specializations I can reduce both the run time and the memory footprint of a specialized system.
System startup can be accelerated by up to 67 %.
The execution time of a single system call for communication between two system objects can be reduced by up to 43 %.
As a result of this work, I show that automatic application-aware static specialization of formerly dynamic system calls is worthwhile.
The result of a system call can be precomputed ahead of run time, which both reduces the run time otherwise required and allows system-call implementations that are no longer needed to be removed from the operating system.
Application-tailored implementations of system calls yield a further improvement.
This transition can be made gradually, so that components that do require the flexibility of the dynamic operating-system interface continue to have it available without restriction.
The functional properties and requirements are never violated.
DFG/Sachbeihilfe im Normalverfahren/LO 1719/4-1/E
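The boot-time specialization described above can be sketched as follows, with hypothetical names and no particular RTOS API implied: once static analysis has found every object-creation call the application can reach, those objects can be emitted as a pre-initialized compile-time table, turning the former creation system calls into constant handle lookups.

```python
def specialize_boot(create_calls):
    """create_calls: list of (name, capacity) pairs that static analysis
    found to be the only object creations the application performs.
    Returns a pre-initialized object table (as it would be emitted at
    compile time) and a name -> handle map that replaces the formerly
    dynamic create calls."""
    table = [{"name": name, "capacity": cap, "used": 0}
             for name, cap in create_calls]
    handles = {entry["name"]: idx for idx, entry in enumerate(table)}
    return table, handles
```

At run time, what used to be a dynamic call such as the hypothetical `queue_create("rx", 8)` becomes the constant handle `handles["rx"]`, so the startup phase performs no dynamic allocation for these objects, which is the mechanism behind the reported startup speedup.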
Composition and synchronization of real-time components upon one processor
Many industrial systems have various hardware and software functions for controlling mechanics. If these functions act independently, as they do in legacy situations, their overall performance is not optimal. There is a trend towards optimizing the overall system performance and creating a synergy between the different functions in a system, which is achieved by replacing more and more dedicated, single-function hardware with software components running on programmable platforms. This increases the re-usability of the functions, but their synergy also requires that (parts of) the multiple software functions share the same embedded platform. In this work, we look at the composition of inter-dependent software functions on a shared platform from a timing perspective. We consider platforms comprised of one preemptive processor resource and, optionally, multiple non-preemptive resources. Each function is implemented by a set of tasks; the group of tasks of a function that executes on the same processor, along with its scheduler, is called a component. The tasks of a component typically have hard timing constraints. Fulfilling these timing constraints of a component requires analysis. Looking at a single function, co-operative scheduling of the tasks within a component has already proven to be a powerful tool to make the implementation of a function more predictable. For example, co-operative scheduling can accelerate the execution of a task (making it easier to satisfy timing constraints), it can reduce the cost of arbitrary preemptions (leading to more realistic execution-time estimates) and it can guarantee access to other resources without the need for arbitration by other protocols. Since timeliness is an important functional requirement, (re-)use of a component for composition and integration on a platform must deal with timing.
To enable us to analyze and specify the timing requirements of a particular component in isolation from other components, we reserve and enforce the availability of all its specified resources during run-time. The real-time systems community has proposed hierarchical scheduling frameworks (HSFs) to implement this isolation between components. After admitting a component on a shared platform, a component in an HSF keeps meeting its timing constraints as long as it behaves as specified. If it violates its specification, it may be penalized, but other components are temporally isolated from the malignant effects. A component in an HSF is said to execute on a virtual platform with a dedicated processor at a speed proportional to its reserved processor supply. Three effects disturb this point of view. Firstly, processor time is supplied discontinuously. Secondly, the actual processor is faster. Thirdly, the HSF no longer guarantees the isolation of an individual component when two arbitrary components violate their specification during access to non-preemptive resources, even when access is arbitrated via well-defined real-time protocols. The scientific contributions of this work focus on these three issues. Our solutions to these issues cover the system design from component requirements to run-time allocation. Firstly, we present a novel scheduling method that enables us to integrate the component into an HSF. It guarantees that each integrated component executes its tasks exactly in the same order regardless of a continuous or a discontinuous supply of processor time. Using our method, the component executes on a virtual platform and it only experiences that the processor speed is different from the actual processor speed. As a result, we can focus on the traditional scheduling problem of meeting deadline constraints of tasks on a uni-processor platform. 
For such platforms, we show how scheduling tasks co-operatively within a component helps to meet the deadlines of this component. We compare the strength of these co-operative scheduling techniques to theoretically optimal schedulers. Secondly, we standardize the way of computing the resource requirements of a component, even in the presence of non-preemptive resources. We can therefore apply the same timing analysis to the components in an HSF as to the tasks inside, regardless of their scheduling or the protocol being used for non-preemptive resources. This increases the re-usability of the timing analysis of components. We also make non-preemptive resources transparent during the development cycle of a component, i.e., the developer of a component can be unaware of the actual protocol being used in an HSF. Components can therefore be unaware that access to non-preemptive resources requires arbitration. Finally, we complement the existing real-time protocols for arbitrating access to non-preemptive resources with mechanisms to confine temporal faults to those components in the HSF that share the same non-preemptive resources. We compare the overheads of sharing non-preemptive resources between components with and without mechanisms for confinement of temporal faults. We do this by means of experiments within an HSF-enabled real-time operating system.
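The "discontinuous supply" issue discussed above has a standard quantitative form: under the periodic resource model Gamma = (Pi, Theta) of Shin and Lee, widely used in HSF analysis, the supply bound function sbf(t) gives the least processor time a component is guaranteed in any window of length t. The sketch below implements that textbook model; it is illustrative context, not necessarily the exact analysis used in this thesis.

```python
import math

def sbf_periodic(t, period, budget):
    """Worst-case supply bound function of the periodic resource model
    (Pi = period, Theta = budget): the minimum processor time guaranteed
    to the component in any interval of length t, accounting for the
    worst-case starvation gap of 2*(Pi - Theta)."""
    blackout = period - budget
    if t <= 2 * blackout:
        return 0  # within the longest possible starvation gap
    k = max(1, math.ceil((t - blackout) / period))
    lo = (k + 1) * period - 2 * budget
    hi = (k + 1) * period - budget
    if lo <= t <= hi:
        return t - (k + 1) * blackout  # partially through the k-th supply
    return (k - 1) * budget            # between supplies
```

For example, with Pi = 5 and Theta = 2, a window of length 3 may receive no supply at all, while a window of length 13 is guaranteed at least 4 units; comparing sbf(t) against a component's demand bound is the admission test that makes the "virtual platform" view sound.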
Proceedings of the 5th International Workshop on Reconfigurable Communication-centric Systems on Chip 2010 - ReCoSoC'10 - May 17-19, 2010, Karlsruhe, Germany. (KIT Scientific Reports ; 7551)
ReCoSoC is intended to be an annual meeting to expose and discuss gathered expertise as well as state-of-the-art research around SoC-related topics through plenary invited papers and posters. The workshop aims to provide a prospective view of tomorrow's challenges in the multibillion-transistor era, taking into account the emerging techniques and architectures exploring the synergy between flexible on-chip communication and system reconfigurability.
Implementation of an AMIDAR-based Java Processor
This thesis presents a Java processor based on the Adaptive Microinstruction Driven Architecture (AMIDAR). This processor is intended as a research platform for investigating adaptive processor architectures. Combined with a configurable accelerator, it is able to detect and speed up hot spots of arbitrary applications dynamically. In contrast to classical RISC processors, an AMIDAR-based processor consists of four main types of components: a token machine, functional units (FUs), a token distribution network and an FU interconnect structure. The token machine is a specialized functional unit and controls the other FUs by means of tokens. These tokens are delivered to the FUs over the token distribution network. The tokens inform the FUs about what to do with input data and where to send the results. Data is exchanged among the FUs over the FU interconnect structure. Based on the virtual machine architecture defined by the Java bytecode, a total of six FUs have been developed for the Java processor, namely a frame stack, a heap manager, a thread scheduler, a debugger, an integer ALU and a floating-point unit. Using these FUs, the processor can already execute the SPEC JVM98 benchmark suite properly. This indicates that it can be employed to run a broad variety of applications rather than embedded software only. Besides bytecode execution, several enhanced features have also been implemented in the processor to improve its performance and usability. First, the processor includes an object cache using a novel cache index generation scheme that provides a better average hit rate than the classical XOR-based scheme. Second, a hardware garbage collector has been integrated into the heap manager, which greatly reduces the overhead caused by the garbage collection process. Third, thread scheduling has been realized in hardware as well, which allows it to be performed concurrently with the running application. 
Furthermore, a complete debugging framework has been developed for the processor, which provides powerful debugging functionality at both the software and hardware levels.
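For context on the object cache mentioned above, the classical XOR-based indexing scheme that the thesis's novel scheme is compared against folds bits of the object handle and the field offset into a cache set index. This is an illustrative sketch of that baseline with hypothetical parameters; the improved AMIDAR index-generation scheme itself is not reproduced here.

```python
def xor_cache_index(handle, offset, index_bits=8):
    """Classical XOR-based object-cache set index: XOR the object handle
    with the field offset and keep the low index_bits bits, so accesses
    to different fields of the same object spread across cache sets."""
    mask = (1 << index_bits) - 1
    return (handle ^ offset) & mask
```

The weakness of this baseline is that regular handle/offset bit patterns can still collide in the same set, which is the kind of behavior a better index-generation scheme aims to avoid in order to raise the average hit rate.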
Interaction-aware analysis and optimization of real-time application and operating system
Mechanical and electronic automation has been a key component of the technological advances of the past two hundred years.
With the use of special-purpose machines, manual labor was replaced by mechanical motion, leaving workers to operate these machines, until this task, too, was taken over by embedded control systems.
With the advances of general-purpose computing, the development of these control systems shifted more and more from a problem-specific approach to a one-size-fits-all mentality, as the trade-off between per-instance overheads and development costs favored flexible and reusable implementations.
However, with a scaling factor of thousands, if not millions, of deployed devices, overheads and inefficiencies accumulate, calling for a higher degree of specialization.
In the area of real-time operating systems (RTOSs), which form the base layer of many of these computerized control systems, we deploy far more flexibility than the applications running on top of them actually require.
Since only the solution, not the problem, became less specific to the control task at hand, we have the chance to cut away inefficiencies, improve system-analysis results, and optimize resource consumption.
However, such a tailoring will only be favorable if it can be performed without much developer interaction and in an automated fashion.
Here, real-time systems are a good starting point, since we already have to have a large degree of static knowledge in order to guarantee their timeliness.
Until now, this static nature is not exploited to its full extent and optimization potentials are left unused.
The requirements of a system, with regard to the RTOS, manifest in the interactions between the application and the kernel.
Threads request resources from the RTOS, which in return determines and enforces a scheduling order that will ensure the timely completion of all necessary computations.
Since the RTOS runs only when invoked, its reaction to requests from the application (or from the environment) is its defining feature.
In this thesis, I capture these interactions, and thereby the required RTOS semantics, in a control-flow-sensitive fashion.
Extracted automatically, this knowledge about the reciprocal influence allows me to fit the implementation of a system closer to its actual requirements.
The result is a system that is not only in its usage a special-purpose system, but also in its implementation and in its provided guarantees.
In the development of my approach, it became clear that the focus on these interactions is not only highly fruitful for the optimization of a system, but also for its end-to-end analysis.
Therefore, this thesis not only provides methods to reduce the kernel-execution overhead and a system's memory consumption, but also includes methods to calculate tighter response-time bounds and to give guarantees about the correct behavior of the kernel.
All these contributions are enabled by my proposed interaction-aware methodology that takes the whole system, RTOS and application, into account.
With this thesis, I show that a control-flow-sensitive whole-system view on the interactions is feasible and highly rewarding.
With this approach, we can overcome many inefficiencies that arise from analyses that have an isolating focus on individual system components.
Furthermore, the interaction-aware methods stay close to the actual implementation, and are therefore able to consider the behavioral patterns of the real-time computing system as finally deployed.
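One concrete payoff of such whole-system knowledge can be sketched as follows (hypothetical service names; this is not the thesis's actual toolchain): if control-flow analysis yields the set of system calls each thread can ever reach, every kernel service outside that union can be dropped from the tailored kernel image.

```python
def tailor_kernel(all_services, reachable_calls):
    """all_services: set of services the generic kernel offers.
    reachable_calls: thread name -> set of system calls that static
    control-flow analysis shows the thread can reach.
    Returns the services the tailored kernel must actually retain;
    everything else can be stripped from the image."""
    used = set().union(*reachable_calls.values()) if reachable_calls else set()
    return {svc for svc in all_services if svc in used}
```

Because the retained set is derived from the interactions rather than from a hand-written configuration, the tailoring is automatic, and the same reachability information can simultaneously tighten the response-time analysis (fewer kernel paths to account for).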
Specialization and synthesis of processors for application in real-time systems
Ph.D. in Electrical Engineering (Doutoramento em Engenharia Electrotécnica).
The continuous evolution of the microelectronics technology during the last
decades has allowed an impressive growth of the logic capacity that can be
integrated on a single chip. It is now possible to manufacture complex digital
circuits fully integrated on application specific or field programmable devices.
By 2010 it will be possible to build processors and FPGAs containing about 10
thousand million transistors on a single chip. The efficient use of this huge
transistor budget is a challenge, the approach being highly dependent on the
application domain.
Current superscalar processors employ sophisticated techniques to achieve
high levels of performance, such as parallel instruction issue, superpipelining,
prediction, speculation, out-of-order execution and complex memory
hierarchies. These techniques proved very effective at improving the
average performance of general-purpose processors, while remaining
backward compatible and maintaining the programming model and sequential
execution semantics of conventional scalar processors. However, their
implementation requires complex architectures and considerable hardware
resources, with inherently time-consuming validation and test procedures.
Such processors also consume large amounts of power and exhibit
non-deterministic performance.
The improvement of computational power and the reduction in size have allowed
microprocessor-based systems to be used (or embedded) within much
equipment and many real-world applications such as transportation,
telecommunications, security, industrial automation, etc. Due to the close
interaction between these systems and the surrounding environment, this class
of applications has real-time operation constraints that must be fulfilled,
otherwise serious human and material damage can occur. Thus, real-time
systems require specific design approaches to ensure correct functional and
timing behaviors. However, economic reasons motivate the use of
commercial off-the-shelf, general-purpose components in the
design of embedded systems. In particular, general-purpose processors are
often used in real-time embedded systems, which can cause several problems,
mainly due to their power inefficiency and non-deterministic performance. For this
reason it is necessary to adopt design techniques, sometimes very
conservative, to ensure correct behavior even under worst-case conditions.
The rising complexity of systems and the ever-shrinking time to market have led to
an increasing use of existing frameworks, middlewares, multitasking executives
and real-time operating systems, which implement abstraction layers and
provide a set of services that reduce design time. However, these software
layers require processing time, reducing the processor time available for the
application and sometimes are also a source of non determinism.
This dissertation discusses ideas, presents architectures and evaluates
implementations of customizable and synthesizable processor models
optimized for multitasking real-time embedded systems, which explore
efficiently the integration capacity and flexibility provided by current FPGAs.
The main goal of this work is to validate the following thesis: A processor
optimized for multitasking embedded systems must exhibit a deterministic
performance, be energy efficient, as well as provide, through specialized
hardware, the adequate support for this class of applications. Such a processor
can be based on a simpler structure and built with less hardware resources
than general purpose processors, being easier to validate and to implement.
The use of synthesizable and parameterizable models and their implementation
in field programmable logic devices make possible the construction of
processors customized for the target application.
The main original contributions of this Ph.D. are the conception of architectures
and synthesizable models for a deterministic, multitasking, pipelined processor
and the respective coprocessor for real-time operating system support.
The starting point of this work was the elaboration of a model created from
scratch for FPGA implementation of the MIPS32 architecture. This model,
named ARPA-CP (Advanced Real-time Processor Architecture - with
Configurable Pipeline), is parameterizable, synthesizable and technology
independent.
The ARPA-CP pipelined processor model was extended with Simultaneous
MultiThreading (SMT) capabilities, resulting in the ARPA-MT (ARPA -
MultiThreaded) processor, also implemented and prototyped in FPGA. The
motivation for using SMT techniques is the improvement of the processor
performance for multitasking real-time systems without employing prediction or
speculative execution techniques, keeping the performance deterministic.
Within the scope of this work, the ARPA-OSC coprocessor (ARPA - Operating
System Coprocessor) was also conceived and designed, for hardware
implementation of basic real-time operating system functions, such as timing,
task scheduling, synchronization for accessing shared resources, task
switching, verification of timing constraints and interrupt servicing. The
hardware implementation of these functions allows executing them in less time
and in a more predictable manner when compared with a software
implementation, reducing the overhead of operating system execution. The
performance evaluation of this coprocessor has shown reductions of one to two
orders of magnitude in the execution time of some of the functions of the OReK
real-time executive, developed to provide an adequate application
programming interface for the ARPA-OSC coprocessor.
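One of the coprocessor duties listed above, verification of timing constraints, reduces in software to a per-tick deadline check like the following sketch (illustrative only; ARPA-OSC performs this comparison in hardware, concurrently with the processor pipeline, and the task names here are hypothetical).

```python
def check_deadlines(now, deadlines):
    """deadlines: task name -> absolute deadline (in ticks).
    Returns the set of tasks whose deadline has already passed at time
    `now`; a hardware implementation evaluates all comparisons in
    parallel, so the check takes constant, deterministic time."""
    return {task for task, d in deadlines.items() if now > d}
```

Doing this in software costs time proportional to the number of tasks on every tick, which is precisely the kind of overhead and jitter that moving the check into the coprocessor removes.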
All architectures were modelled at the RT level using VHDL. The models are
technology-independent and parameterizable, so that several aspects can be
modified during the synthesis and implementation phase using CAD/CAE tools.
Prototyping was performed with Xilinx FPGAs.
FCTFS