182 research outputs found

    Three-dimensional memory vectorization for high bandwidth media memory systems

    Get PDF
    Vector processors have good performance, cost and adaptability when targeting multimedia applications. However, for a significant number of media programs, conventional memory configurations fail to deliver enough memory references per cycle to feed the SIMD functional units. This paper addresses the problem of the memory bandwidth. We propose a novel mechanism suitable for 2-dimensional vector architectures and targeted at providing high effective bandwidth for SIMD memory instructions. The basis of this mechanism is the extension of the scope of vectorization at the memory level, so that 3-dimensional memory patterns can be fetched into a second-level register file. By fetching long blocks of data and by reusing 2-dimensional memory streams at this second-level register file, we obtain a significant increase in the effective memory bandwidth. As side benefits, the new 3-dimensional load instructions provide a high robustness to memory latency and a significant reduction of the cache activity, thus reducing power and energy requirements. At the investment of a 50% more area than a regular SIMD register file, we have measured and average speed-up of 13% and the potential for power savings in the L2 cache of a 30%.Peer ReviewedPostprint (published version

    Etude et mise en place d'une plateforme d'adaptation multiservice embarquée pour la gestion de flux multimédia à différents niveaux logiciels et matériels

    Get PDF
    Les avancées technologiques ont permis la commercialisation à grande échelle de terminaux mobiles. De ce fait, l homme est de plus en plus connecté et partout. Ce nombre grandissant d usagers du réseau ainsi que la forte croissance du contenu disponible, aussi bien d un point de vue quantitatif que qualitatif saturent les réseaux et l augmentation des moyens matériels (passage à la fibre optique) ne suffisent pas. Pour surmonter cela, les réseaux doivent prendre en compte le type de contenu (texte, vidéo, ...) ainsi que le contexte d utilisation (état du réseau, capacité du terminal, ...) pour assurer une qualité d expérience optimum. A ce sujet, la vidéo fait partie des contenus les plus critiques. Ce type de contenu est non seulement de plus en plus consommé par les utilisateurs mais est aussi l un des plus contraignant en terme de ressources nécéssaires à sa distribution (taille serveur, bande passante, ). Adapter un contenu vidéo en fonction de l état du réseau (ajuster son débit binaire à la bande passante) ou des capacités du terminal (s assurer que le codec soit nativement supporté) est indispensable. Néanmoins, l adaptation vidéo est un processus qui nécéssite beaucoup de ressources. Cela est antinomique à son utilisation à grande echelle dans les appareils à bas coûts qui constituent aujourd hui une grande part dans l ossature du réseau Internet. Cette thÚse se concentre sur la conception d un systÚme d adaptation vidéo à bas coût et temps réel qui prendrait place dans ces réseaux du futur. AprÚs une analyse du contexte, un systÚme d adaptation générique est proposé et évalué en comparaison de l état de l art. Ce systÚme est implémenté sur un FPGA afin d assurer les performances (temps-réels) et la nécessité d une solution à bas coût. Enfin, une étude sur les effets indirects de l adaptation vidéo est menée.On the one hand, technology advances have led to the expansion of the handheld devices market. Thanks to this expansion, people are more and more connected and more and more data are exchanged over the Internet. On the other hand, this huge amound of data imposes drastic constrains in order to achieve sufficient quality. The Internet is now showing its limits to assure such quality. To answer nowadays limitations, a next generation Internet is envisioned. This new network takes into account the content nature (video, audio, ...) and the context (network state, terminal capabilities ...) to better manage its own resources. To this extend, video manipulation is one of the key concept that is highlighted in this arising context. Video content is more and more consumed and at the same time requires more and more resources. Adapting videos to the network state (reducing its bitrate to match available bandwidth) or to the terminal capabilities (screen size, supported codecs, ) appears mandatory and is foreseen to take place in real time in networking devices such as home gateways. However, video adaptation is a resource intensive task and must be implemented using hardware accelerators to meet the desired low cost and real time constraints.In this thesis, content- and context-awareness is first analyzed to be considered at the network side. Secondly, a generic low cost video adaptation system is proposed and compared to existing solutions as a trade-off between system complexity and quality. Then, hardware conception is tackled as this system is implemented in an FPGA based architecture. Finally, this system is used to evaluate the indirect effects of video adaptation; energy consumption reduction is achieved at the terminal side by reducing video characteristics thus permitting an increased user experience for End-Users.BORDEAUX1-Bib.electronique (335229901) / SudocSudocFranceF

    Etude et mise en place d’une plateforme d’adaptation multiservice embarquĂ©e pour la gestion de flux multimĂ©dia Ă  diffĂ©rents niveaux logiciels et matĂ©riels

    Get PDF
    On the one hand, technology advances have led to the expansion of the handheld devices market. Thanks to this expansion, people are more and more connected and more and more data are exchanged over the Internet. On the other hand, this huge amound of data imposes drastic constrains in order to achieve sufficient quality. The Internet is now showing its limits to assure such quality. To answer nowadays limitations, a next generation Internet is envisioned. This new network takes into account the content nature (video, audio, ...) and the context (network state, terminal capabilities ...) to better manage its own resources. To this extend, video manipulation is one of the key concept that is highlighted in this arising context. Video content is more and more consumed and at the same time requires more and more resources. Adapting videos to the network state (reducing its bitrate to match available bandwidth) or to the terminal capabilities (screen size, supported codecs, 
) appears mandatory and is foreseen to take place in real time in networking devices such as home gateways. However, video adaptation is a resource intensive task and must be implemented using hardware accelerators to meet the desired low cost and real time constraints.In this thesis, content- and context-awareness is first analyzed to be considered at the network side. Secondly, a generic low cost video adaptation system is proposed and compared to existing solutions as a trade-off between system complexity and quality. Then, hardware conception is tackled as this system is implemented in an FPGA based architecture. Finally, this system is used to evaluate the indirect effects of video adaptation; energy consumption reduction is achieved at the terminal side by reducing video characteristics thus permitting an increased user experience for End-Users.Les avancĂ©es technologiques ont permis la commercialisation Ă  grande Ă©chelle de terminaux mobiles. De ce fait, l’homme est de plus en plus connectĂ© et partout. Ce nombre grandissant d’usagers du rĂ©seau ainsi que la forte croissance du contenu disponible, aussi bien d’un point de vue quantitatif que qualitatif saturent les rĂ©seaux et l’augmentation des moyens matĂ©riels (passage Ă  la fibre optique) ne suffisent pas. Pour surmonter cela, les rĂ©seaux doivent prendre en compte le type de contenu (texte, vidĂ©o, ...) ainsi que le contexte d’utilisation (Ă©tat du rĂ©seau, capacitĂ© du terminal, ...) pour assurer une qualitĂ© d’expĂ©rience optimum. A ce sujet, la vidĂ©o fait partie des contenus les plus critiques. Ce type de contenu est non seulement de plus en plus consommĂ© par les utilisateurs mais est aussi l’un des plus contraignant en terme de ressources nĂ©cĂ©ssaires Ă  sa distribution (taille serveur, bande passante, 
). Adapter un contenu vidĂ©o en fonction de l’état du rĂ©seau (ajuster son dĂ©bit binaire Ă  la bande passante) ou des capacitĂ©s du terminal (s’assurer que le codec soit nativement supportĂ©) est indispensable. NĂ©anmoins, l’adaptation vidĂ©o est un processus qui nĂ©cĂ©ssite beaucoup de ressources. Cela est antinomique Ă  son utilisation Ă  grande echelle dans les appareils Ă  bas coĂ»ts qui constituent aujourd’hui une grande part dans l’ossature du rĂ©seau Internet. Cette thĂšse se concentre sur la conception d’un systĂšme d’adaptation vidĂ©o Ă  bas coĂ»t et temps rĂ©el qui prendrait place dans ces rĂ©seaux du futur. AprĂšs une analyse du contexte, un systĂšme d’adaptation gĂ©nĂ©rique est proposĂ© et Ă©valuĂ© en comparaison de l’état de l’art. Ce systĂšme est implĂ©mentĂ© sur un FPGA afin d’assurer les performances (temps-rĂ©els) et la nĂ©cessitĂ© d’une solution Ă  bas coĂ»t. Enfin, une Ă©tude sur les effets indirects de l’adaptation vidĂ©o est menĂ©e

    Flexi-WVSNP-DASH: A Wireless Video Sensor Network Platform for the Internet of Things

    Get PDF
    abstract: Video capture, storage, and distribution in wireless video sensor networks (WVSNs) critically depends on the resources of the nodes forming the sensor networks. In the era of big data, Internet of Things (IoT), and distributed demand and solutions, there is a need for multi-dimensional data to be part of the Sensor Network data that is easily accessible and consumable by humanity as well as machinery. Images and video are expected to become as ubiquitous as is the scalar data in traditional sensor networks. The inception of video-streaming over the Internet, heralded a relentless research for effective ways of distributing video in a scalable and cost effective way. There has been novel implementation attempts across several network layers. Due to the inherent complications of backward compatibility and need for standardization across network layers, there has been a refocused attention to address most of the video distribution over the application layer. As a result, a few video streaming solutions over the Hypertext Transfer Protocol (HTTP) have been proposed. Most notable are Apple’s HTTP Live Streaming (HLS) and the Motion Picture Experts Groups Dynamic Adaptive Streaming over HTTP (MPEG-DASH). These frameworks, do not address the typical and future WVSN use cases. A highly flexible Wireless Video Sensor Network Platform and compatible DASH (WVSNP-DASH) are introduced. The platform's goal is to usher video as a data element that can be integrated into traditional and non-Internet networks. A low cost, scalable node is built from the ground up to be fully compatible with the Internet of Things Machine to Machine (M2M) concept, as well as the ability to be easily re-targeted to new applications in a short time. Flexi-WVSNP design includes a multi-radio node, a middle-ware for sensor operation and communication, a cross platform client facing data retriever/player framework, scalable security as well as a cohesive but decoupled hardware and software design.Dissertation/ThesisDoctoral Dissertation Electrical Engineering 201

    Parallelism and the software-hardware interface in embedded systems

    Get PDF
    This thesis by publications addresses issues in the architecture and microarchitecture of next generation, high performance streaming Systems-on-Chip through quantifying the most important forms of parallelism in current and emerging embedded system workloads. The work consists of three major research tracks, relating to data level parallelism, thread level parallelism and the software-hardware interface which together reflect the research interests of the author as they have been formed in the last nine years. Published works confirm that parallelism at the data level is widely accepted as the most important performance leverage for the efficient execution of embedded media and telecom applications and has been exploited via a number of approaches the most efficient being vectorlSIMD architectures. A further, complementary and substantial form of parallelism exists at the thread level but this has not been researched to the same extent in the context of embedded workloads. For the efficient execution of such applications, exploitation of both forms of parallelism is of paramount importance. This calls for a new architectural approach in the software-hardware interface as its rigidity, manifested in all desktop-based and the majority of embedded CPU's, directly affects the performance ofvectorized, threaded codes. The author advocates a holistic, mature approach where parallelism is extracted via automatic means while at the same time, the traditionally rigid hardware-software interface is optimized to match the temporal and spatial behaviour of the embedded workload. This ultimate goal calls for the precise study of these forms of parallelism for a number of applications executing on theoretical models such as instruction set simulators and parallel RAM machines as well as the development of highly parametric microarchitectural frameworks to encapSUlate that functionality.EThOS - Electronic Theses Online ServiceGBUnited Kingdo

    On the design of multimedia architectures : proceedings of a one-day workshop, Eindhoven, December 18, 2003

    Get PDF

    On the design of multimedia architectures : proceedings of a one-day workshop, Eindhoven, December 18, 2003

    Get PDF

    Parallel Modelling Paradigm in Multimedia Applications: Mapping and Scheduling onto a Multi-Processor System-on-Chip Platform

    Get PDF
    Multi-processor systems have appeared as a promising alternative to face the difficulties of creating even faster uni-processor systems using latest technologies. Emerg-ing design paradigms such as Multiprocessor System-on-a-Chip (MpSoC) offer high levels of performance and flex-ibility and at the same time promise low-cost, reliable and power-efficient implementations. However, the design com-plexity of such systems have increased tremendously. One source of the complexity stems from highly parallel het-erogeneous nature of the underlying hardware architecture, which poses many challenges for mapping of an applica-tion to the architecture. This motivates the development of a unified programming paradigm that facilitates the map-ping by hiding the architectural complexity and exposing the parallel resources of the architecture. To enable de-sign reuse, such a programming paradigm has to support a smooth translation of sequentially-coded software algo-rithms into their parallel implementations. In this paper we address the parallelization of sequential multimedia appli-cations written in C/C++ for their mapping and schedul-ing onto a flexible MpSoC platform. We show that using our approach an architecture-independent multi-threaded model of a MPEG–2 video decoder algorithm can be ob-tained with only few modifications to an existing sequential implementation of the algorithm. 1
    • 

    corecore