
    The potential of programmable logic in the middle: cache bleaching

    Consolidating hard real-time systems onto modern multi-core Systems-on-Chip (SoC) is an open challenge. The extensive sharing of hardware resources in the memory hierarchy raises important unpredictability concerns. The problem is exacerbated as more computationally demanding workloads are expected to be handled with real-time guarantees in next-generation Cyber-Physical Systems (CPS). A large body of work has approached the problem by proposing novel hardware re-designs, and by proposing software-only solutions to mitigate performance interference. Building on the observation that unpredictability arises from a lack of fine-grained control over the behavior of shared hardware components, we outline a promising new resource management approach. We demonstrate that it is possible to introduce Programmable Logic In-the-Middle (PLIM) between a traditional multi-core processor and main memory. This provides the unique capability of manipulating individual memory transactions. We propose a proof-of-concept system implementation of PLIM modules on a commercial multi-core SoC. The PLIM approach is then leveraged to solve long-standing issues with cache coloring. Thanks to PLIM, colored sparse addresses can be re-compacted in main memory. This is the base principle behind the technique we call Cache Bleaching. We evaluate our design on real applications and propose hypervisor-level adaptations to showcase the potential of the PLIM approach.
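
    The re-compaction idea in the abstract above can be illustrated with plain address arithmetic. The sketch below is not the authors' PLIM implementation. It is a toy model that assumes a hypothetical cache geometry (4 KiB pages, 1 MiB cache, 16 ways) and hypothetical helper names (`page_color`, `colored_pages`, `bleach`). It shows why pages of one color are sparse in physical memory and how dropping the color bits from the page number makes them contiguous.

```python
# Toy model of cache coloring and address re-compaction.
# All constants and function names are assumptions for illustration only.
PAGE_SIZE = 4096            # bytes per page (assumed)
CACHE_SIZE = 1024 * 1024    # bytes in the shared cache (assumed)
WAYS = 16                   # cache associativity (assumed)

# Number of distinct page colors for this geometry.
NUM_COLORS = CACHE_SIZE // (WAYS * PAGE_SIZE)


def page_color(phys_addr: int) -> int:
    """Color of the page containing phys_addr."""
    return (phys_addr // PAGE_SIZE) % NUM_COLORS


def colored_pages(color: int, count: int) -> list[int]:
    """Base addresses of the first `count` pages that have the given color."""
    return [(i * NUM_COLORS + color) * PAGE_SIZE for i in range(count)]


def bleach(phys_addr: int) -> int:
    """Drop the color bits of the page number, keeping the page offset.

    Pages of a single color, which are NUM_COLORS pages apart before the
    mapping, become adjacent after it.
    """
    page, offset = divmod(phys_addr, PAGE_SIZE)
    return (page // NUM_COLORS) * PAGE_SIZE + offset


if __name__ == "__main__":
    sparse = colored_pages(color=3, count=4)
    print("colored (sparse):   ", [hex(a) for a in sparse])
    print("bleached (compact): ", [hex(bleach(a)) for a in sparse])
```

    In this toy mapping, addresses of different colors collapse onto the same compact range, so a real design would also need a per-color (or per-partition) base offset. That detail is omitted here and is not taken from the abstract.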

    Design and evaluation of acceleration strategies for speeding up the development of dialog applications

    In this paper, we describe a complete development platform that features several innovative acceleration strategies, not included in any other current platform, that simplify and speed up the definition of the different elements required to design a spoken dialog service. The proposed accelerations are mainly based on using information from the backend database schema and contents, as well as cumulative information produced throughout the different steps of the design. Thanks to these accelerations, the interaction between the designer and the platform is improved, and in most cases the design is reduced to simple confirmations of the “proposals” that the platform dynamically provides at each step. In addition, the platform provides several other accelerations: configurable templates that can be used to define the different tasks in the service or the dialogs that obtain information from or show information to the user; automatic proposals for the best way to request slot contents from the user (i.e. using mixed-initiative forms or directed forms); an assistant that offers the set of most probable actions required to complete the definition of the different tasks in the application; and another assistant for solving specific modality details, such as confirmations of user answers or how to present to users the lists of results retrieved from the backend database. Additionally, the platform allows the creation of speech grammars and prompts and of database access functions, and supports mixed-initiative and over-answering dialogs. In the paper we also describe each assistant in the platform in detail, emphasizing the different kinds of methodologies followed to facilitate the design process in each one. Finally, we describe the results obtained in both a subjective and an objective evaluation with different designers, which confirm the viability, usefulness, and functionality of the proposed accelerations. Thanks to the accelerations, design time is reduced by more than 56% and the number of keystrokes by 84%.

    The "MIND" Scalable PIM Architecture

    MIND (Memory, Intelligence, and Network Device) is an advanced parallel computer architecture for high performance computing and scalable embedded processing. It is a Processor-in-Memory (PIM) architecture integrating both DRAM bit cells and CMOS logic devices on the same silicon die. MIND is multicore, with multiple memory/processor nodes on each chip, and supports global shared memory across systems of MIND components. MIND is distinguished from other PIM architectures in that it incorporates mechanisms for efficient support of a global parallel execution model based on the semantics of message-driven multithreaded split-transaction processing. MIND is designed to operate either in conjunction with other conventional microprocessors or in standalone arrays of like devices. It also incorporates mechanisms for fault tolerance, real-time execution, and active power management. This paper describes the major elements and operational methods of the MIND architecture.

    Configurable Version Management Hardware Transactional Memory for Multi-processor Platform

    Programming shared-memory multi-processor platforms efficiently is difficult because lock-based synchronization limits efficiency. Transactional memory (TM) is a promising approach to creating an abstraction layer for multi-threaded programming. However, the performance of TM is application-specific. In general, the configuration of a TM is divided into version management and conflict management. Each scheme has its strengths and weaknesses depending on the executing application. Previous TM implementations for embedded systems were built on a fixed version management configuration, which results in significant performance loss when transaction behaviour changes. In this paper, we propose a hardware transactional memory (HTM) with interchangeable version management. Random requests at different contention levels are used to verify the performance of the proposed TM. The proposed architecture is targeted at embedded applications and is area-efficient compared to current implementations that apply cache coherence protocols.

    Adapting an IP MC6805 core for multiprocessing and multitasking

    The availability of high-density field-configurable devices provides the opportunity to design highly integrated solutions (SOPC: System On a Programmable Chip). One common SOPC solution integrates a single embedded processor equipped with a multitasking operating system. As an alternative to a single processor, embedding several processors on a chip, possibly heterogeneous and each with multitasking capacity, may be considered. A distinctive characteristic of an SOPC device is that the tasks to be performed are well known before the design starts. This contrasts with traditional multiprocessing and multitasking systems, which are designed for general-purpose applications. The benefit of this knowledge is that hardware as well as software can be adapted to fit the application’s requirements, combining the speed of the former with the versatility of the latter. This paper presents the hardware modifications performed on the IP (Intellectual Property) core of an embedded microcontroller to allow its inclusion as a multitasking device in a “multiprocessor on a chip”, through the addition of a hardware task manager (scheduler) and communication channels among processors.

    The Design of a System Architecture for Mobile Multimedia Computers

    This chapter discusses the system architecture of a portable computer, called the Mobile Digital Companion, which supports energy-efficient handling of multimedia applications. Because battery life is limited and battery weight is an important factor in the size and weight of the Mobile Digital Companion, energy management plays a crucial role in the architecture. As the Companion must remain usable in a variety of environments, it has to be flexible and adaptable to various operating conditions. The Mobile Digital Companion has an unconventional architecture that saves energy by using system decomposition at different levels of the architecture and that exploits locality of reference with dedicated, optimised modules. The approach is based on dedicated functionality and the extensive use of energy reduction techniques at all levels of system design. The system has an architecture with a general-purpose processor accompanied by a set of heterogeneous autonomous programmable modules, each providing an energy-efficient implementation of dedicated tasks. A reconfigurable internal communication network switch exploits locality of reference and eliminates wasteful data copies.

    Energy-efficient and high-performance lock speculation hardware for embedded multicore systems

    Embedded systems are becoming increasingly common in everyday life and, like their general-purpose counterparts, they have shifted towards shared-memory multicore architectures. However, they are much more resource-constrained, and as they often run on batteries, energy efficiency becomes critically important. In such systems, achieving high concurrency is a key demand for delivering satisfactory performance at low energy cost. To achieve this high concurrency, consistency across the shared memory hierarchy must be accomplished in a manner that is cost-effective in terms of performance, energy, and implementation complexity. In this article, we propose Embedded-Spec, a hardware solution for supporting transparent lock speculation without the requirement for special supporting instructions. Using this approach, we evaluate the energy consumption and performance of a suite of benchmarks, exploring a range of contention management and retry policies. We conclude that for resource-constrained platforms, lock speculation can provide real benefits in terms of improved concurrency and energy efficiency, as long as the underlying hardware support is carefully configured. This work is supported in part by NSF under Grants CCF-0903384, CCF-0903295, CNS-1319495, and CNS-1319095, as well as the Semiconductor Research Corporation under grant number 1983.001.