63 research outputs found

    Multidimensional model of estimated resource usage for multimedia NoC QoS

    Multiprocessor systems are rapidly entering various high-performance computing segments, such as multimedia processing. Instead of increasing the processor clock frequency, the new trend is to use multiple cores for processing, e.g. dual or quadruple CPUs in one subsystem. In this contribution, we address the problem of modeling the resource requirements of multimedia applications for distributed computation on a multiprocessor system. This paper shows that estimating resource requirements from the input data enables the dynamic activation of tasks and the run-time redistribution of application tasks. We also formally specify the optimal selection of co-executed applications, with the aim of providing the best end results for such streaming applications within one network-on-chip (NoC) system. We present a new concept for system optimization which involves the major system parameters and resource usage. Experimental results are based on mapping an arbitrary-shaped MPEG-4 video decoder onto a multiprocessor NoC.

    Dataflow Analysis for Real-Time Embedded Multiprocessor System Design

    Dataflow analysis techniques are key to reducing the number of design iterations and shortening the design time of real-time embedded network-based multiprocessor systems that process data streams. With these analysis techniques, the worst-case end-to-end temporal behavior of hard real-time applications can be derived from a dataflow model in which computation, communication and arbitration are modeled. For soft real-time applications, these static dataflow analysis techniques are combined with simulation of the dataflow model to test statistical assertions about their temporal behavior. The simulation results, in combination with properties of the dataflow model, are used to derive the sensitivity of design parameters and to estimate parameters such as the capacity of data buffers.
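The abstract above does not spell out its analysis steps. As a minimal sketch of one standard building block of static dataflow analysis, the following Python solves the balance equations of a synchronous dataflow graph to obtain its repetition vector (how often each actor fires per graph iteration). The function name, the edge format, and the example graph are assumptions made for illustration, not taken from the paper.

```python
from fractions import Fraction
from math import lcm  # multi-argument lcm needs Python 3.9+


def repetition_vector(actors, edges):
    """Return the smallest integer firing counts that balance every edge.

    edges is a list of (producer, consumer, produced, consumed) tuples,
    where produced/consumed are tokens per firing (hypothetical format).
    """
    # Fix the first actor at rate 1 and propagate relative rates.
    rate = {actors[0]: Fraction(1)}
    changed = True
    while changed:
        changed = False
        for src, dst, produced, consumed in edges:
            if src in rate and dst not in rate:
                rate[dst] = rate[src] * produced / consumed
                changed = True
            elif dst in rate and src not in rate:
                rate[src] = rate[dst] * consumed / produced
                changed = True
    if len(rate) != len(actors):
        raise ValueError("graph is not connected")
    # Every edge must satisfy: firings(src) * produced == firings(dst) * consumed.
    for src, dst, produced, consumed in edges:
        if rate[src] * produced != rate[dst] * consumed:
            raise ValueError("inconsistent rates: no bounded-buffer schedule")
    # Scale the rational rates to the smallest integer solution.
    scale = lcm(*(r.denominator for r in rate.values()))
    return {actor: int(rate[actor] * scale) for actor in actors}


# Hypothetical three-actor pipeline: A -(2:3)-> B -(1:2)-> C
example = repetition_vector(
    ["A", "B", "C"],
    [("A", "B", 2, 3), ("B", "C", 1, 2)],
)
```

In this example A fires three times, B twice and C once per iteration, so each edge carries as many tokens as it consumes. A consistent repetition vector is a precondition for bounded buffers, which is where buffer-capacity estimation would start.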

    Performance and QoS-aware MPEG-4 video-object coding for multiprocessor architecture

    The introduction of Arbitrary-Shaped (AS) Video Objects (VO) in the MPEG-4 coding standard has enabled various applications using both natural and synthetic composition of video scenes. The work presented in this thesis aims at realizing an embedded-systems design involving the mapping of this type of application onto a multiprocessor platform, such as a Network-on-Chip (NoC). The research has focused on the upper design layers, dealing with the applications and their control for efficient execution. The aspects addressed for the mapping are performance modeling of the MPEG-4 decoding, granularity optimization of the algorithm, introduction of task-level scalability, and controlling the quality of the applications by a Quality-of-Service (QoS) manager. The AS VO MPEG-4 decoding algorithm comprises the conventional DCT coding techniques from MPEG-1/2, extended with the coding of object shapes and specific processing for improving the picture quality of object borders, employing padding and block-based filtering. At the system level, the AS VO MPEG-4 coding allows the designer to think in individual planes and objects that together compose the scene. The target platform for such an application should be able to handle the features of MPEG-4 coding: the combination of high-level control-driven operations and streaming-oriented processing at the video-data level. The platform features a tile-based computing network, in which each tile is separated from the network by buffered communication. This allows multiple instantiations of object decoding, each having its own dynamic behavior. The Synchronous Data Flow (SDF) graph is a traditional model of computation for multimedia applications mapped on a multiprocessor system. However, SDF cannot cope with the dynamic behavior of object-based video. Therefore, this research has extended SDF with a linear parametrical model of the required computation resources.
The model is based on the coding parameters of the input stream (BAB-type of the block, number of non-transparent sub-blocks, number of AC coefficients coded by an ESC code, etc.) and weighting coefficients depending on the target processor architecture. Similarly, the thesis proposes a parametrical model for the communication resources. It was found that the obtained parametrical timing model deviates by about 5% from the real execution on an Æthereal NoC with ARM7 cores. Compared with the commonly used worst-case approach for communication resource allocation, the model reduces the required resources by a factor of 2.5. For more efficient system control, the thesis presents a hierarchical Quality-of-Service (QoS) concept in combination with a scalable MPEG-4 decoder. To serve scalable execution, we have classified the tasks involved in the AS VO MPEG-4 decoding into two classes. The first class contains essential tasks that cannot be skipped, while the second class contains the enhancement functions. Scalability of AS VO MPEG-4 decoding was obtained by enabling/disabling optional functions of the non-essential tasks next to the essential tasks. The resource distribution is controlled by hierarchical QoS management, which is based on two QoS managers. In our experimental implementation, the Local QoS estimates the resource usage of an application and monitors the real execution. The Global QoS selects the best quality levels of the active applications and reserves resources for each application. The key contribution of our work on QoS is the design of a heuristic algorithm that searches for suitable combinations of quality levels for individual jobs, so that a set of jobs can be mapped on the available resources. To further improve the efficiency of the mapping, we have distinguished reservation-based QoS control and, as an addition on top of it, best-effort computing.
This combination was studied for controlling the bandwidth of the communication resources. The reservation-based approach guarantees that the video object will always be decoded at least at the lowest quality level, while the best-effort computing improves the quality by using resources as far as they are available, as controlled by the Global QoS. The complete system was experimentally verified with a network of eight ARM processor cores, using an MPEG-4 Video Object decoder at the ACE profile and at CCIR-601 resolution. The proposed framework showed that adaptation at finer granularity, e.g. at VOP level within a GOV, significantly improves the image quality (provided that resources are constrained). The mapping exploration of AS VO MPEG-4 decoding for execution on an NoC addresses a general case of running modern multimedia applications, because of the variability and dynamics of tasks. It has been shown that parametrical models help in planning the execution, and that QoS management and best-effort computing clearly improve the efficiency of multiple tasks executed in parallel.
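The two ideas in this abstract, a linear parametrical resource model and a search for quality-level combinations that fit the available resources, can be illustrated with a short Python sketch. It is not the thesis's algorithm: the parameter names, weights, costs, and the greedy downgrade rule are all hypothetical stand-ins chosen only to show the shape of the approach.

```python
def estimate_cycles(params, weights, offset=0.0):
    """Linear model: offset plus the weighted sum of stream coding parameters.

    params maps a parameter name to its value for the current block or VOP;
    weights maps the same names to processor-dependent cost coefficients.
    """
    return offset + sum(weights[name] * value for name, value in params.items())


def select_quality_levels(jobs, capacity):
    """Greedy stand-in for a Global QoS search (assumption, not the thesis's heuristic).

    jobs maps a job name to a list of resource costs, index 0 being the
    lowest quality level. Returns the chosen level per job, or None if even
    the lowest levels do not fit within capacity.
    """
    levels = {name: len(costs) - 1 for name, costs in jobs.items()}

    def total():
        return sum(jobs[name][level] for name, level in levels.items())

    while total() > capacity:
        candidates = [name for name, level in levels.items() if level > 0]
        if not candidates:
            return None  # even the minimum quality of every job does not fit
        # Downgrade the job whose next step down frees the most resources.
        best = max(
            candidates,
            key=lambda name: jobs[name][levels[name]] - jobs[name][levels[name] - 1],
        )
        levels[best] -= 1
    return levels


# Hypothetical weights (cycles per unit) and one block's coding parameters.
weights = {"opaque_blocks": 900.0, "boundary_blocks": 1500.0, "esc_coefficients": 40.0}
params = {"opaque_blocks": 10, "boundary_blocks": 4, "esc_coefficients": 25}
cycles = estimate_cycles(params, weights, offset=2000.0)

# Two hypothetical video-object jobs with three quality levels each.
jobs = {"vo1": [10, 20, 30], "vo2": [5, 15, 40]}
chosen = select_quality_levels(jobs, capacity=45)
```

Here the per-level costs would in practice come from estimates such as `estimate_cycles`, which is the role the abstract assigns to the Local QoS, while the selection step plays the role of the Global QoS.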

    Modeling Predictable Multiprocessor Performance for Video Decoding

    Real-time aware rendering of scalable arbitrary-shaped MPEG-4 decoder for multiprocessor systems

    The MPEG-4 video standard extends traditional frame-based processing with the option to compose several video objects (VO) superimposed on a background sprite image. In our previous work, we presented a distributed, multiprocessor-based, scalable implementation of an MPEG-4 arbitrary-shaped decoder, which together with the background sprite decoder forms an essential part of further scene rendering. For control of the multiprocessor architecture, we have constructed a Quality-of-Service (QoS) management that monitors the availability of required data and distributes the processing of individual tasks over the guaranteed or best-effort services of the platform. However, the proposed architecture with combined guaranteed and best-effort services poses problems for real-time scene rendering. In this paper, we present a technique for proper run-time rendering of the final scene after decoding one VO Layer. The individual video-object monitors check the data availability and select the highest quality for the final scene rendering. The algorithm operates hierarchically, both at the scene level and at the task level of the video-object processing. Whereas the earlier work on scalable implementation concentrated only on guaranteed services, we now introduce a new element in the system architecture for the real-time control and fallback mechanism of the best-effort services. This element is based on, first, controlling data availability at task level and, second, introducing the propagation service to QoS management. We compare our simulation results with the standard "frame-skipping" technique, which is the only currently available solution for this type of rendering with scalable processing.