48 research outputs found

    Performance analysis of multicore processors using multi-scaling techniques

    Get PDF
    Integrating more cores per chip enables more programs to run simultaneously, and more easily switch from one program to another, and the system performance will be improved significantly. However, this current trend of central processing unit (CPU) performance cannot be maintained since the budget of power per chip has not risen while the consumption of power per core has slowly reduced. Generally, the processor’s maximum performance is proportional to the product of the number of their cores and the frequency they are running at. However, this is usually limited by constraints of power. In this study, first, the voltage/frequency adjustment of the running cores has been analyzed for several programs to improve the processor’s performance within the constraint of power. Second, the impact of dynamically scaling the number of running cores is summarized for additional performance improvements of the active programs and applications. Finally, it has been verified that scaling the number of the running cores and their voltage/frequency simultaneously can improve the processor’s performance for a higher power dissipation or under power constraints. The performance analysis and improvements are obtained in a real-time simulation on a Linux operating system using a GEM5 simulator. Results indicated that performance improvement was attained at 59.98%, 33.33%, and 66.65% for the three scenarios, respectively

    High performance IPC hardware accelerator and communication network for MPSoCs

    Get PDF
    In this paper, we explain a configurable IPC module for multimedia MPSoCs, which was implemented in a MPW chip that include three ARM7 CPU cores. According to the test results for an M-JPEG and a H.264 decoder, its IPC synchronization overheads are not more than 1% when the synchronization period is about 5000 cycles.This work was supported by the IC Design Education Center (IDEC) in KAIST, and the Seoul R&BD Program

    A composable, energy-managed, real-time MPSOC platform.

    Get PDF
    Multi-processors systems on chip (MPSOC) platforms emerged in embedded systems as hardware solutions to support the continuously increasing functionality and performance demands in this domain. Such a platform has to execute a mix of applications with diverse performance and timing constraints, i.e., real-time or non-real-time, thus different application schedulers should co-exist on an MPSOC. Moreover, applications share many MPSOC resources, thus their timing depends on the arbitration at these resources. Arbitration may create inter-application dependencies, e.g., the timing of a low priority application depends on the timing of all higher priority ones. Application inter-dependencies make the functional and timing verification and the integration process harder. This is especially problematic for real-time applications, for which fulfilling the time-related constraints should be guaranteed by construction. Moreover, energy and power management, commonly employed in embedded systems, make this verification even more difficult. Typically, energy and power management involves scaling the resources operating point, which has a direct impact on the resource performance, thus influences the application time behaviour. Finally, a small change in one application leads to the need to re-verify all other applications, incurring a large effort. Composability is a property meant to ease the verification and integration process. A system is composable if the functionality and the timing behaviour of each application is independent of other applications mapped on the same platform. Composability is achieved by utilising arbiters that ensure applications independence. In this paper we present the concepts behind a composable, scalable, energy-managed MPSOC platform, able to support different real-time and nonreal time schedulers concurrently, and discuss its advantages and limitations

    A List Scheduling Heuristic with New Node Priorities and Critical Child Technique for Task Scheduling with Communication Contention

    Get PDF
    Task scheduling is becoming an important aspect for parallel programming of modern embedded systems. In this chapter, the application to be scheduled is modeled as a Directed Acyclic Graph (DAG), and the architecture targets parallel embedded systems composed of multiple processors interconnected by buses and/or switches. This chapter presents new list scheduling heuristics with communication contention. Furthermore, we define new node priorities (top level and bottom level) to sort nodes, and propose an advanced technique named critical child to select a processor to execute a node. Experimental results show that the proposed method is effective to reduce the schedule length, and the runtime performance is greatly improved in the cases of medium and high communication. Since the communication cost is increasing from medium to high in modern applications like digital communication and video compression, the proposed method is well-adapted for scheduling these applications over parallel embedded systems

    Hardware implementation of inter-processor communication in MPSoCs for multimedia applications

    Get PDF
    In this paper we present a scalable and flexible architecture that implements inter-processor communication (IPC) synchronization among FIFO channels for multimedia applications. We also compare it to the simple mail-box architecture, especially for tasks of finer granularity. With experimental results we confirmed the proposed architecture is suitable for various cases including a Motion JPEG example

    A communication profiler to optimize embedded resource usage

    Get PDF
    While the number of cores in both embedded MultiProcessor Systems-on-Chip and general purpose processors keeps rising, on-chip communication becomes more and more important. In order to write efficient programs for these architectures, it is therefore necessary to have a good idea of the communication behavior of an application. We present a communication profiler that extracts this behavior from compiled sequential C/C++ programs, and constructs a dynamic dataflow graph at the level of major functional blocks. In contrast to existing methods of measuring inter-program communication, our tool automatically generates the program's dataflow graph and is less demanding for the developer. It can also be used to view differences between program phases (such as different video frames), which allows both input- and phase-specific optimizations to be made. We also look at how this information can subsequently be used to guide the effort of parallelizing the application, to co-design the software, memory hierarchy and communication hardware, and to provide new sources of communication-related runtime optimizations

    Modeling, Synthesis, and Validation of Heterogeneous Biomedical Embedded Systems

    Get PDF
    Abstract-The increasing performance and availability of embedded systems increases their attractiveness for biomedical applications. With advances in sensor processing and classification algorithms, real-time decision support in patient monitoring becomes feasible. However, the gap between algorithm design and their embedded realization is growing. This paper overviews an approach for development of biomedical devices at an abstract algorithm level with automatic generation of an embedded implementation. Based on a case study of a Brain Computer Interface (BCI), this paper demonstrates capturing, modeling and synthesis of such applications

    Sistema gràfic de depuració per xarxes (NoC)

    Get PDF
    Aquest projecte descriu el disseny i desenvolupament d'una eina gràfica per a la depuració de projectes desenvolupats amb un llenguatge de descripció de sistemes com és el SystemC. Amb aquest llenguatge s'ha desenvolupat una NoC (Network on Chip). L'eina desenvolupada mostra de forma visual l'arquitectura de la xarxa NoC, els valors dels senyals que es transmeten a través de la xarxa i estadístiques sobre aquests per tal de poder fer un seguiment exhaustiu i agilitzar la recerca d'errors com interbloquejos, pèrdua de dades i d'altres. Al concentrar en un únic entorn la descripció de la NoC i les dades relatives a les senyals en temps de simulació, proporciona un valor afegit a altres eines disponibles per a realitzar aquesta tasca.Este proyecto describe el diseño y desarrollo de una herramienta gráfica para la depuración de proyectos desarrollados con un lenguaje de descripción de sistemas como es el SystemC. Con este lenguaje se ha desarrollado una NoC (Network on Chip). La herramienta desarrollada muestra de forma visual la arquitectura de la red NoC, los valores de las señales que se transmiten a través de la red y estadísticas sobre estas para poder hacer un seguimiento exhaustivo y agilizar la búsqueda de errores como interbloqueos, pérdida de datos u otros. Al concentrar en un único entorno la descripción de la NoC y los datos relativos a las señales en tiempo de simulación, proporciona un valor añadido a otras herramientas disponibles para realizar esta tarea.This project describes the design and development of a graphical tool to debug projects that are developed with a system description language like is the SystemC. A NoC(Network on Chip) was developed with this language. The developed tool shows visually the NoC network architecture, the signals values that are transmitted through the network and statistics about this signals to get an exhaustive record and improve the errors search such as: deadlocks, data loss and others. Including in just one environment the NoC description and the information relative to the signals in simulation time providing an added value to other available tools to realize this task.Nota: Aquest document conté originàriament altre material i/o programari només consultable a la Biblioteca de Ciència i Tecnologia
    corecore