425 research outputs found

Exploring Processor and Memory Architectures for Multimedia

Author: Iranpour Ali
Publication date: 01/01/2012

Multimedia has become one of the cornerstones of 21st-century society and, when combined with mobility, has enabled a tremendous evolution of that society. However, joining these two concepts introduces many technical challenges. These range from having sufficient performance for handling multimedia content to having the battery stamina for acceptable mobile usage. Projecting where we are heading, we see these issues becoming ever more challenging through increased mobility as well as advancements in multimedia content, such as the introduction of stereoscopic 3D and augmented reality. The increased performance needs for handling multimedia come not only from an ongoing step-up in resolution, going from QVGA (320x240) to Full HD (1920x1080), a 27x increase in less than half a decade, but also from codec evolution (MPEG-2 to H.264 AVC), which adds to the computational load. To meet these performance challenges there have been processing and memory architecture advances (SIMD, out-of-order superscalarity, multicore processing and heterogeneous multilevel memories) in the mobile domain, in conjunction with ever-increasing operating frequencies (200MHz to 2GHz) and on-chip memory sizes (128KB to 2-3MB). At the same time there is an increase in requirements for mobility, placing higher demands on battery-powered systems despite the steady increase in battery capacity (500 to 2000mAh). This leaves a negative net result in terms of battery capacity versus performance advances. In order to make optimal use of these architectural advances and to meet the power limitations in mobile systems, there is a need for an overall approach on how best to utilize these systems. The right trade-off between performance and power is crucial. On top of these constraints, the flexibility aspects of the system need to be addressed. All this makes it very important to reach the right architectural balance in the system.
The first goal of this thesis is to examine multimedia applications and propose a flexible solution that can meet the architectural requirements in a mobile system. The second is to propose an automated methodology for optimally mapping multimedia data and instructions to a heterogeneous multilevel memory subsystem. The proposed methodology uses constraint programming for solving a multidimensional optimization problem. Results from this work indicate that using today’s most advanced mobile processor technology together with a multi-level heterogeneous on-chip memory subsystem can meet the performance requirements for handling multimedia. By utilizing the automated optimal memory mapping method presented in this thesis, lower total power consumption can be achieved while performance for multimedia applications is improved through enhanced memory management. This is achieved through reduced external accesses and better reuse of memory objects. The automatic method shows high accuracy, up to 90%, in predicting multimedia memory accesses for a given architecture.
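
The 27x figure quoted in this abstract can be checked from the two resolutions it names. The short sketch below only reproduces that arithmetic; the function and variable names are illustrative and do not come from the thesis.

```python
# Resolutions quoted in the abstract, as (width, height) in pixels.
QVGA = (320, 240)
FULL_HD = (1920, 1080)


def pixel_count(resolution):
    """Return the number of pixels in a (width, height) frame."""
    width, height = resolution
    return width * height


# Ratio of pixels per frame between Full HD and QVGA.
ratio = pixel_count(FULL_HD) / pixel_count(QVGA)
print(ratio)  # 27.0
```

The ratio covers pixels per frame only; any change in frame rate or codec complexity would add to the load on top of it.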

CiteSeerX

Lund University Publications

A configurable vector processor for accelerating speech coding algorithms

Author: Konstantia Koutsomyti
Publication date: 01/01/2007

The growing demand for voice-over-packet (VoIP) services and multimedia-rich applications has made the efficient, real-time implementation of low-bit-rate speech coders on embedded VLSI platforms increasingly important. Such speech coders are designed to substantially reduce bandwidth requirements, thus enabling dense multichannel gateways in a small form factor. This, however, comes at a high computational cost, which mandates the use of very high-performance embedded processors. This thesis investigates the potential acceleration of two major ITU-T speech coding algorithms, namely G.729A and G.723.1, through their efficient implementation on a configurable, extensible vector embedded CPU architecture. New scalar and vector ISAs were introduced, which resulted in up to 80% reduction in the dynamic instruction count of both workloads. These instructions were subsequently encapsulated into a parametric, hybrid SISD (scalar processor)–SIMD (vector) processor. This work presents the research and implementation of the vector datapath of this vector coprocessor, which is tightly coupled to a Sparc-V8 compliant CPU, the optimization and simulation methodologies employed, and the use of Electronic System Level (ESL) techniques to rapidly design SIMD datapaths.

Loughborough University Institutional Repository

Design of a smartphone with a Digital Signal Processor

Author: Lecluse Joep
Publication date: 01/01/1996

Repository TU/e

Pure OAI Repository

Media gateway utilizando um GPU

Author: Portugal Ricardo
Publication venue: Universidade de Aveiro
Publication date: 01/01/2012

Master's degree in Computer and Telematics Engineering

Repositório Institucional da Universidade de Aveiro

Network streaming and compression for mixed reality tele-immersion

Author: Mekuria R.N. (Rufael)
Publication date: 01/01/2017

Bulterman, D.C.A. [Promotor]; Cesar, P.S. [Copromotor]

VU Research Portal

CWI's Institutional Repository

Large-scale unsupervised audio pre-training for video-to-speech synthesis

Authors: Kefalas Triantafyllos; Panagakis Yannis; Pantic Maja
Publication date: 27/06/2023

Video-to-speech synthesis is the task of reconstructing the speech signal from a silent video of a speaker. Most established approaches to date involve a two-step process, whereby an intermediate representation from the video, such as a spectrogram, is extracted first and then passed to a vocoder to produce the raw audio. Some recent work has focused on end-to-end synthesis, whereby the generation of raw audio and any intermediate representations is performed jointly. All such approaches involve training on data from almost exclusively audio-visual datasets, i.e. every audio sample has a corresponding video sample. This precludes the use of abundant audio-only datasets which may not have a corresponding visual modality (e.g. audiobooks, radio podcasts, speech recognition datasets etc.), as well as audio-only architectures that have been developed by the audio machine learning community over the years. In this paper we propose to train encoder-decoder models on more than 3,500 hours of audio data at 24kHz, and then use the pre-trained decoders to initialize the audio decoders for the video-to-speech synthesis task. The pre-training step uses audio samples only and does not require labels or corresponding samples from other modalities (visual, text). We demonstrate that this pre-training step improves the reconstructed speech and that it is an unexplored way to improve the quality of the generator in a cross-modal task while only requiring samples from one of the modalities. We conduct experiments using both raw audio and mel spectrograms as target outputs and benchmark our models with existing work. Comment: Submitted to IEE
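
To give a sense of scale for the figures in this abstract, the sketch below works out how many samples 3,500 hours of 24 kHz audio contain, and shows one widely used Hz-to-mel conversion. The abstract does not say which mel formula the authors use, so the `2595 * log10(1 + f / 700)` variant here is an assumption, and all names are illustrative.

```python
import math

SAMPLE_RATE_HZ = 24_000  # sampling rate stated in the abstract
HOURS_OF_AUDIO = 3_500   # lower bound stated in the abstract


def hz_to_mel(freq_hz):
    """Convert Hz to mels with the common 2595*log10(1 + f/700) formula (assumed)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


# Highest frequency representable at this sampling rate.
nyquist_hz = SAMPLE_RATE_HZ / 2

# Number of raw audio samples in the stated amount of pre-training data.
total_samples = HOURS_OF_AUDIO * 3600 * SAMPLE_RATE_HZ

print(nyquist_hz)
print(total_samples)
print(hz_to_mel(nyquist_hz))
```

A mel spectrogram target spaces its frequency bands evenly on this warped axis between 0 Hz and the Nyquist frequency, which is why the sampling rate bounds the usable mel range.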

arXiv.org e-Print Archive

Algorithm/Architecture Co-Exploration of Visual Computing: Overview and Future Perspectives

Authors: Chen Yen-Kuang; Lee Gwo Giun (Chris); Mattavelli Marco; S. Jang Euee
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 21/01/2010

Concurrently exploring both algorithmic and architectural optimizations is a new design paradigm. This survey paper addresses the latest research and future perspectives on the simultaneous development of video coding, processing, and computing algorithms with emerging platforms that have multiple cores and reconfigurable architecture. As the algorithms in forthcoming visual systems become increasingly complex, many applications must have different profiles with different levels of performance. Hence, with expectations that the visual experience in the future will become continuously better, it is critical that advanced platforms provide higher performance, better flexibility, and lower power consumption. To achieve these goals, algorithm and architecture co-design is significant for characterizing the algorithmic complexity used to optimize targeted architecture. This paper shows that seamless weaving of the development of previously autonomous visual computing algorithms and multicore or reconfigurable architectures will unavoidably become the leading trend in the future of video technology.

Infoscience - École polytechnique fédérale de Lausanne

Opus audiokoodekki matkapuhelinverkoissa

Author: Sundvall Mika
Publication date: 15/12/2014

The latest generations of mobile networks have made it possible to include high-quality audio coding in data transmission. On the other hand, an ongoing effort to move audio signal processing from dedicated hardware to data centers with generalized hardware introduces the challenge of providing enough computational power for the virtualized network elements. This thesis evaluates the usage of a modern hybrid audio codec called Opus in a virtualized network element. This is done by integrating the codec, testing it for functionality and performance on a general purpose processor, and evaluating that performance in comparison to the digital signal processor's performance. Functional testing showed that the codec was integrated successfully and that bit compliance with the Opus standard was met. The performance results showed that although the digital signal processor computes the encoder's algorithms with fewer clock cycles, relative to the processor's whole capacity the general purpose processor performs more efficiently due to its higher clock frequency. For the decoder this was even clearer, as the generic hardware spends on average fewer clock cycles performing the algorithms.
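
The comparison in this abstract turns on the difference between raw cycle counts and the share of a processor's capacity those cycles consume. The sketch below illustrates that reasoning only. Every number in it (cycle counts, clock frequencies, the 20 ms frame length) is hypothetical and is not taken from the thesis.

```python
def processor_load(cycles_per_frame, clock_hz, frame_seconds):
    """Fraction of one core's capacity used to process one frame in real time."""
    cycles_available = clock_hz * frame_seconds
    return cycles_per_frame / cycles_available


# Hypothetical figures for illustration only, NOT results from the thesis.
FRAME_SECONDS = 0.020  # assumed 20 ms audio frame

# The assumed DSP needs fewer cycles per frame but runs at a lower clock.
dsp_load = processor_load(cycles_per_frame=400_000, clock_hz=1.0e9,
                          frame_seconds=FRAME_SECONDS)

# The assumed general purpose processor needs more cycles but has a faster clock.
gpp_load = processor_load(cycles_per_frame=600_000, clock_hz=2.5e9,
                          frame_seconds=FRAME_SECONDS)

print(f"DSP load: {dsp_load:.3f}")
print(f"GPP load: {gpp_load:.3f}")
```

With these assumed inputs the general purpose processor spends more cycles per frame yet uses a smaller fraction of its capacity, which is the pattern the abstract describes for the encoder.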

Aaltodoc Publication Archive

Network emulation focusing on QoS-Oriented satellite communication

Authors: Dairaine Laurent; Gineste Mathieu; Thalmensy Hervé
Publication venue: Springer Fachmedien Wiesbaden GmbH
Publication date: 31/12/2007

This chapter presents the basics of network emulation and a complete case study of QoS-oriented satellite communication.

Open Archive Toulouse Archive Ouverte

Scipedia

Intelligent Side Information Generation in Distributed Video Coding

Author: Akinola Mobolaji Olukunle
Publication date: 15/05/2015

Distributed video coding (DVC) reverses the traditional coding paradigm of complex encoders allied with basic decoding to one where the computational cost is largely incurred by the decoder. This is attractive because the proven theoretical work of Wyner-Ziv (WZ) and Slepian-Wolf (SW) shows that the performance of such a system should be exactly the same as that of a conventional coder. Despite the solid theoretical foundations, current DVC qualitative and quantitative performance falls short of existing conventional coders, and crucial limitations remain. A key constraint governing DVC performance is the quality of side information (SI), a coarse representation of original video frames which are not available at the decoder. Techniques to generate SI have usually been based on linear motion compensated temporal interpolation (LMCTI), though these do not always produce satisfactory SI quality, especially in sequences exhibiting non-linear motion. This thesis presents an intelligent higher order piecewise trajectory temporal interpolation (HOPTTI) framework for SI generation with original contributions that afford better SI quality in comparison to existing LMCTI-based approaches. The major elements in this framework are: (i) a cubic trajectory interpolation algorithm model that significantly improves the accuracy of motion vector estimations; (ii) an adaptive overlapped block motion compensation (AOBMC) model which reduces both blocking and overlapping artefacts in the SI emanating from the block matching algorithm; (iii) the development of an empirical mode switching algorithm; and (iv) an intelligent switching mechanism to construct SI by automatically selecting the best macroblock from the intermediate SI generated by HOPTTI and AOBMC algorithms. Rigorous analysis and evaluation confirm that significant quantitative and perceptual improvements in SI quality are achieved with the new framework.
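
The contrast between linear and higher-order trajectory interpolation can be sketched on a one-dimensional example. The code below is not the HOPTTI algorithm from the thesis. It swaps in a generic cubic Lagrange interpolation through four equally spaced samples, applied to a hypothetical accelerating motion, to show why a linear midpoint estimate drifts when motion is non-linear.

```python
def linear_midpoint(p0, p1):
    """Linear estimate of the position halfway between two frames."""
    return 0.5 * (p0 + p1)


def cubic_midpoint(p_prev, p0, p1, p_next):
    """Cubic Lagrange estimate at t=0.5 from samples at t=-1, 0, 1, 2."""
    return (-p_prev + 9.0 * p0 + 9.0 * p1 - p_next) / 16.0


def position(t):
    """Hypothetical block position under constant acceleration."""
    return t * t


# Block positions observed in four consecutive frames.
p_prev, p0, p1, p_next = (position(t) for t in (-1, 0, 1, 2))

true_mid = position(0.5)
linear_error = abs(linear_midpoint(p0, p1) - true_mid)
cubic_error = abs(cubic_midpoint(p_prev, p0, p1, p_next) - true_mid)

print(linear_error)  # 0.25
print(cubic_error)   # 0.0
```

For this quadratic trajectory the cubic estimate is exact while the linear one is off by a quarter of a unit. Real video motion is noisier, so the gain in practice depends on how well the motion vectors themselves are estimated.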

Open Research Online (The Open University)