Partitioning Strategies for Concurrent Programming
This work presents four partitioning strategies, or patterns, useful for decomposing a serial application into multiple concurrently executing parts. These partitioning strategies augment the commonly used task- and data-parallel design patterns by recognizing that applications are spatiotemporal in nature. Data and instruction decomposition are therefore further distinguished by whether the partitioning is done in the spatial or the temporal dimension. Thus, this work describes four decomposition strategies: spatial data partitioning (SDP), temporal data partitioning (TDP), spatial instruction partitioning (SIP), and temporal instruction partitioning (TIP), and catalogs the benefits and drawbacks of each. In addition, the practical use of these strategies is demonstrated through a case study in which they are applied to implement several different parallelizations of a multicore H.264 encoder for HD video. This case study illustrates both the application of the patterns and their effects on the performance of the encoder.
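The spatial/temporal distinction for data partitioning can be illustrated with a toy sketch: spatial data partitioning splits one unit of data (a frame) into pieces processed concurrently, while temporal data partitioning keeps each unit whole but processes several units concurrently. The names (`encode_slice`, `frames`) and the workload are illustrative stand-ins, not the paper's code.

```python
# Sketch contrasting spatial vs. temporal data partitioning on a toy
# "frame" workload; encode_slice is a stand-in for real per-slice work.
from concurrent.futures import ThreadPoolExecutor

def encode_slice(slice_rows):
    # Stand-in for per-slice encoding: sum of squared pixel values.
    return sum(p * p for row in slice_rows for p in row)

def encode_frame_sdp(frame, workers=4):
    """Spatial data partitioning (SDP): split ONE frame into
    horizontal slices and encode the slices concurrently."""
    n = len(frame)
    chunk = (n + workers - 1) // workers
    slices = [frame[i:i + chunk] for i in range(0, n, chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(encode_slice, slices))

def encode_stream_tdp(frames, workers=4):
    """Temporal data partitioning (TDP): keep each frame whole but
    encode SEVERAL frames concurrently."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(encode_slice, frames))

frame = [[1, 2], [3, 4], [5, 6], [7, 8]]
print(encode_frame_sdp(frame))            # 204, same as serial encoding
print(encode_stream_tdp([frame, frame]))  # [204, 204]
```

SDP trades intra-frame dependencies for parallelism inside one frame; TDP trades memory (several frames in flight) for throughput across frames, which mirrors the benefit/drawback catalog the abstract describes.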
Parallel scalability of video decoders
An important question is whether emerging and future applications exhibit sufficient parallelism, in particular thread-level parallelism, to exploit the large numbers of cores future chip multiprocessors (CMPs) are expected to contain. As a case study we investigate the parallelism available in video decoders, an important application domain now and in the future. Specifically, we analyze the parallel scalability of the H.264 decoding process. First we discuss the data structures and dependencies of H.264 and show what types of parallelism can be exploited. We also show that previously proposed parallelization strategies, such as slice-level, frame-level, and intra-frame macroblock (MB) level parallelism, are not sufficiently scalable. Based on the observation that inter-frame dependencies have a limited spatial range, we propose a new parallelization strategy, called Dynamic 3D-Wave, which allows certain MBs of consecutive frames to be decoded in parallel. Using this new strategy we analyze the limits to the available MB-level parallelism in H.264. Using real movie sequences we find a maximum MB parallelism ranging from 4000 to 7000. We also perform a case study to assess the practical value and possibilities of a highly parallelized H.264 application. The results show that H.264 exhibits sufficient parallelism to efficiently exploit the capabilities of future manycore CMPs.
A Framework for Parallelizing OWL Classification in Description Logic Reasoners
The Web Ontology Language (OWL) is a widely used knowledge representation language for describing knowledge in application domains by using classes, properties, and individuals. Ontology classification is an important and widely used service that computes a taxonomy of all classes occurring in an ontology. It can require significant amounts of runtime, but most OWL reasoners do not support any kind of parallel processing.
This thesis reports on a black-box approach to parallelizing existing description logic (DL) reasoners for the Web Ontology Language. We focus on OWL ontology classification, an important inference service supported by every major OWL/DL reasoner. To the best of our knowledge, we are the first to propose a flexible parallel framework that can be applied to existing OWL reasoners in order to speed up their classification process. Two versions of our method are discussed: (i) the first version implements a novel thread-level parallel architecture with two parallel strategies that achieves a good speedup factor with an increasing number of threads, yet does not rely on locking techniques and thus avoids possible race conditions; (ii) the second version adds a refined data structure and various parallel computing techniques for precomputation and classification, reducing the overhead of processing ontologies and making it competitive with other DL reasoners on wall-clock classification time.
To test the performance of both versions of our approach, we selected the test ontologies from a real-world repository. For the first version, we evaluated our prototype implementation on a set of selected real-world ontologies. Our experiments demonstrate very good scalability, resulting in a speedup that is linear in the number of available cores. The second version's performance is evaluated by parallelizing major OWL reasoners for concept classification. Currently, we focus mainly on comparison with two popular DL reasoners: HermiT and JFact. In comparison to the selected black-box reasoners, our results demonstrate that the wall-clock time of ontology classification can be improved by one order of magnitude for most real-world ontologies in the repository.
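The lock-free flavor of the first version can be sketched as follows: treat the reasoner as an opaque subsumption test, statically partition the candidate tests among threads, and have each thread write only to its own result list so no shared mutable state (and hence no locking) is needed. All names here (`is_subsumed_by`, `classify_parallel`, the toy hierarchy) are illustrative assumptions, not the thesis's actual API.

```python
# Minimal sketch of black-box thread-level classification: partition
# the candidate subsumption tests among threads; each thread owns its
# output list, so no locks are required.
from concurrent.futures import ThreadPoolExecutor
from itertools import permutations

HIERARCHY = {                 # toy ontology: class -> direct parents
    "Dog": {"Mammal"}, "Cat": {"Mammal"},
    "Mammal": {"Animal"}, "Animal": set(),
}

def is_subsumed_by(sub, sup):
    """Opaque 'reasoner' call: does sub fall under sup?"""
    seen, todo = set(), [sub]
    while todo:
        c = todo.pop()
        if c == sup:
            return True
        if c not in seen:
            seen.add(c)
            todo.extend(HIERARCHY.get(c, ()))
    return False

def classify_parallel(classes, workers=4):
    pairs = list(permutations(classes, 2))
    chunk = (len(pairs) + workers - 1) // workers
    batches = [pairs[i:i + chunk] for i in range(0, len(pairs), chunk)]
    def run(batch):            # each thread writes only its own list
        return [(a, b) for a, b in batch if is_subsumed_by(a, b)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(run, batches)
    return sorted(p for part in results for p in part)

print(classify_parallel(list(HIERARCHY)))
```

A real classifier avoids testing all pairs by pruning with already-known subsumptions; the point of the sketch is only the partitioning scheme that makes locking unnecessary.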
Optimization of Video Coding Algorithms on Platforms with Multiple Parallel Processors
H.264 is the most recent and most powerful video coding standard. Compared with its predecessors, it increases the compression ratio by a factor of at least two, but at the cost of higher complexity. To reduce encoding time, several H.264 encoders use a parallel approach. In this research, our primary objective is to design an approach offering a better speedup than the one implemented in Intel's H.264 encoder, shipped as sample code in its IPP library.
We present our multi-frame, multi-slice (MTMT) parallel video encoding approach and its motion estimation modes, which offer a trade-off between speedup and loss of visual quality. The first mode, the fastest but the most quality-degrading, restricts the motion estimation search region to the boundaries of the current slice. The second mode, slower but degrading quality less than the first, widens the search region to neighbouring slices once the corresponding reference slices have been processed. The third mode, slower than the second but degrading quality even less, marks a slice as ready for encoding only when the reference slices covering its search region have been processed.
Our experiments show that the first mode of our approach achieves an average speedup about 55% higher than that obtained by Intel's approach. They also show that we obtain a speedup comparable to the state of the art without the drawback of forcing the use of B-frames. Moreover, our approach can be integrated quickly into an H.264 encoder that, like Intel's, is based on a multi-slice approach.
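The third mode's readiness rule can be sketched as a simple geometric check: a slice of the current frame is ready only once every reference-frame slice overlapping its vertical motion-search window is done. The slice height, frame height, and search range below are illustrative values, not the thesis's actual parameters.

```python
# Hypothetical sketch of the third mode's readiness rule: a slice is
# ready when all reference slices overlapping its motion-search
# window (its own rows +/- search_range) have been processed.

def slices_covering(y_top, y_bottom, slice_height):
    """Indices of the slices that overlap rows [y_top, y_bottom]."""
    return range(max(0, y_top) // slice_height, y_bottom // slice_height + 1)

def slice_ready(slice_idx, done_ref_slices, *, slice_height=64,
                frame_height=1080, search_range=32):
    y0 = slice_idx * slice_height - search_range
    y1 = min(frame_height - 1,
             (slice_idx + 1) * slice_height - 1 + search_range)
    needed = slices_covering(y0, y1, slice_height)
    return all(s in done_ref_slices for s in needed)

# Slice 2 spans rows 128..191; a +/-32-row search window widens that
# to rows 96..223, which reference slices 1..3 cover.
print(slice_ready(2, {0, 1, 2}))   # False: reference slice 3 not done
print(slice_ready(2, {1, 2, 3}))   # True
```

The first mode corresponds to skipping this check entirely (search clamped to the current slice), and the second to relaxing it per neighbouring slice, which is how the three modes trade encoding-order constraints against motion-search freedom.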