1,629 research outputs found

System-Level Design Methodologies for Networked Multiprocessor Systems-on-Chip

Author: Virk Kashif Munir
Publication date: 01/11/2008

A Reactive and Cycle-True IP Emulator for MPSoC Exploration

Author: Angiolini Federico
Benini Luca
Madsen Jan
Mahadevan Shankar
Sparsø Jens
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2008

The design of MultiProcessor Systems-on-Chip (MPSoC) emphasizes intellectual-property (IP)-based communication-centric approaches. Therefore, for the optimization of the MPSoC interconnect, the designer must develop traffic models that realistically capture the application behavior as it executes on the IP core. In this paper, we introduce a Reactive IP Emulator (RIPE) that enables an effective emulation of the IP-core behavior in multiple environments, including bit- and cycle-true simulation. The RIPE is built as a multithreaded abstract instruction-set processor, and it can generate reactive traffic patterns. We compare the RIPE models with cycle-true functional simulation of complex application behavior (task synchronization, multitasking, and input/output operations). Our results demonstrate high accuracy and significant speedups. Furthermore, via a case study, we show the potential use of the RIPE in a design-space-exploration context.

Crossref

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Online Research Database In Technology

Analysis and Optimisation of Hierarchically Scheduled Multiprocessor Embedded Systems

Author: C. Douglas Locke
C.L. Liu
D. Ullman
G. Agrawal
J.Y.T. Leung
K. Richter
K. Tindell
L. Almeida
L. Sha
P. Eles
P. Koopman
P. Pop
P. Pop
Paul Pop
Petru Eles
S. Prakash
T. Demmeler
T.F. Abdelhazer
Traian Pop
Zebo Peng
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2008

Crossref

Online Research Database In Technology

Predictable embedded multiprocessor architecture for streaming applications

Author: Moonen A.J.M.
Publication venue: Technische Universiteit Eindhoven
Publication date: 01/01/2009

The focus of this thesis is on embedded media systems that execute applications from the car-infotainment application domain. These applications, which we refer to as jobs, typically fall in the class of streaming, i.e. they operate on a stream of data. The jobs are executed on heterogeneous multiprocessor platforms, for performance and power efficiency reasons. Most of these jobs have firm real-time requirements, like throughput and end-to-end latency. Car-infotainment systems are becoming increasingly complex, due to an increase in the supported number of jobs and an increase of resource sharing. Therefore, it is hard to verify, for each job, that the real-time requirements are satisfied. To reduce the verification effort, we elaborate on an architecture for a predictable system from which we can verify, at design time, that the job’s throughput and end-to-end latency requirements are satisfied. This thesis introduces a network-based multiprocessor system that is predictable. This is achieved by starting with an architecture where processors have private local memories and execute tasks in a static order, so that the uncertainty in the temporal behaviour is minimised. As an interconnect, we use a network that supports guaranteed communication services so that it is guaranteed that data is delivered in time. The architecture is extended with shared local memories, run-time scheduling of tasks, and a memory hierarchy. Dataflow modelling and analysis techniques are used for verification, because they allow cyclic data dependencies that influence the job’s performance. We show how to construct a dataflow model from a job that is mapped onto our predictable multiprocessor platforms. This dataflow model takes into account computation of tasks, communication between tasks, buffer capacities, and scheduling of shared resources.
The job’s throughput and end-to-end latency bounds are derived from a self-timed execution of the dataflow graph, by making use of existing dataflow-analysis techniques. It is shown that the derived bounds are tight, e.g. for our channel equaliser job, the accuracy of the derived throughput bound is within 10.1%. Furthermore, it is shown that the dataflow modelling and analysis techniques can be used despite the use of shared memories, run-time scheduling of tasks, and caches.

Repository TU/e

Pure OAI Repository

Design of an asynchronous processor

Author: Sotiriou Christos Panagiotis
Publication venue: The University of Edinburgh
Publication date: 01/01/2001

Edinburgh Research Archive

Experimental evaluation and selection of data consistency mechanisms for hard real-time applications on multicore platforms

Author: Di Natale Marco
Dou Wenhua
Han Gang
Liu Xue
Zeng Haibo
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2014

Archivio della ricerca della Scuola Superiore Sant'Anna

On the design of multimedia architectures : proceedings of a one-day workshop, Eindhoven, December 18, 2003

Publication venue: Technische Universiteit Eindhoven
Publication date: 01/01/2003

Pure OAI Repository

A Theoretical Approach Involving Recurrence Resolution, Dependence Cycle Statement Ordering and Subroutine Transformation for the Exploitation of Parallelism in Sequential Code.

Author: Chu Chih-ping
Publication venue: LSU Digital Commons
Publication date: 01/01/1991

To exploit parallelism in Fortran code, this dissertation consists of a study of the following three issues: (1) recurrence resolution in Do-loops for vector processing, (2) dependence cycle statement ordering in Do-loops for parallel processing, and (3) subroutine parallelization. For recurrence resolution, the major findings include: (1) the node splitting algorithm cannot be used directly to break an essential antidependence link, of which the source variable that results in antidependence is itself the sink variable of another true dependence, so a correction method is proposed, (2) a sink variable renaming technique is capable of breaking an antidependence and/or output-dependence link, (3) for recurrences formed by only true dependences, a dynamic dependence concept and the derived technique are powerful, and (4) by integrating related techniques, an algorithm for resolving a general multistatement recurrence is developed. The performance of a parallel loop is determined by the level of parallelism and the time delay due to interprocessor communication and synchronization. For a dependence cycle of a single parallel loop executed in a general synchronization mode, the parallelism exposed varies with the alignment of statements. Statements are reordered on the basis of the execution time of the loop as estimated at compile time. An improved timing formula and a derived statement ordering algorithm are proposed. Further extension of this algorithm to multiple perfectly nested Do-loops with a simple global dependence cycle is also presented. The subroutine is a potential source for parallel processing. Several problems must be solved for subroutine parallelization: (1) the precedence of parallel executions of subroutines, (2) identification of the optimum execution mode for each subroutine and (3) the restructuring of a serial program.
A five-step approach to parallelize called subroutines for a calling subroutine is proposed: (1) computation of control dependence, (2) approximation of the global effects of subroutines, (3) analysis of data dependence, (4) identification of execution mode, and (5) restructuring of calling and called subroutines. Application of these five steps in a recursive manner to different levels of calling subroutines in a program addresses the parallelization of subroutines.
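
As a rough illustration of the renaming idea mentioned in the abstract above, the sketch below shows a loop-carried antidependence and how writing to a renamed array removes it. This is a generic textbook-style example in Python, not the dissertation's algorithm; the function names `serial_loop` and `renamed_loop` and the loop body are assumptions made up for illustration.

```python
def serial_loop(x, b):
    """Reference loop with a loop-carried antidependence on x.

    Iteration i reads x[i + 1]; iteration i + 1 then overwrites x[i + 1].
    The read must happen before the write, so iterations cannot be
    reordered freely. Expects len(x) == len(b) + 1.
    """
    x = list(x)              # work on a copy so the caller's list is untouched
    n = len(b)
    a = [0] * n
    for i in range(n):
        a[i] = x[i + 1] + 1  # S1: read of x[i + 1] (source of the antidependence)
        x[i] = b[i] * 2      # S2: write of x[i] (sink, hit one iteration later)
    return a, x


def renamed_loop(x, b):
    """Same computation after renaming the written array (hypothetical x_new).

    S1 reads only the original x and S2 writes only x_new, so neither
    statement depends on the other across iterations and each can be
    executed as an independent vector-style operation.
    """
    n = len(b)
    x_new = list(x)                       # carries over elements S2 never writes
    a = [x[i + 1] + 1 for i in range(n)]  # S1 over all iterations
    for i in range(n):                    # S2 over all iterations
        x_new[i] = b[i] * 2
    return a, x_new
```

Calling both functions with the same inputs, for example `serial_loop([1, 2, 3, 4], [10, 20, 30])` and `renamed_loop([1, 2, 3, 4], [10, 20, 30])`, should give identical results, which is the property a renaming transformation has to preserve.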

Louisiana State University