A virtual model for simulation and design of architectures in MPEG-4 audio and multimedia context
Research in multimedia for consumer electronics is dominated by the problem of extremely short times-to-market, which calls for fast complexity estimation and fast design of new architectures. On one side, increasingly sophisticated and flexible applications are developed rapidly; on the other, the exponential growth in IC computational power can hardly keep pace with the requirements of real-time applications, since their complexity is growing exponentially as well. Processor performance, often slowed down by bottlenecks in memories and buses, is further reduced by the time wasted in communication among the several application layers. This problem motivates integrated development frameworks for simulation and design. Tools for the simulation and analysis of architectures have emerged from academic and industrial research laboratories, above all in the last decade. Many of them are conceived to provide exact low-level simulation of the supported devices, at the price of heavy slowdowns in simulation time and very large trace reports. Others, in recent years, have introduced some degree of approximation into the simulation in order to speed up execution and to make the tools flexible enough to support multiprocessors. The resulting, more or less abstract, models are nevertheless not suitable for analyzing real multimedia-oriented applications, where programs are usually available in languages that include dedicated libraries and the only meaningful results are those measured as a function of time. On another level, tools for hardware/software codesign or for block-based system design provide more useful results, but to be effective they must rely on rigid cores that allow only a few degrees of reconfigurability; some very recent tools are conceived to design complex systems by modeling blocks through a high-level description language.
Conversely, the virtual model and related tool proposed in this dissertation are rooted in an approach to complexity analysis that aims to be, as far as possible, platform independent. The method is based on the concepts of abstract classes of operations and of simulation as a function of performance time. The work finds its application field in multimedia, more precisely in multimedia-oriented audio applications. Media applications are commonly programmed in imperative or object-oriented languages, which are composed of many different statements, operators and, above all, standard libraries. Careful profiling of typical applications makes it possible to identify fundamental operations and functions and to define a virtual instruction set, by grouping more or less similar operators and breaking functions into basic building blocks. The resulting virtual instruction set does not correspond to any actual one, but it can easily be mapped onto a large number of existing ones. Simulating an architecture then requires measurements, benchmarks or estimates for at least one member of each abstract class. The set of classes, i.e. the complexity vector, can be adapted in length and detail to the required degree of precision and to the available set of actual measurements and/or benchmarks. The input to the simulation is described in a high-level standardized programming language, the new MPEG-4 Structured Audio Orchestra Language (SAOL); in principle, an application that is already available in this format needs no translation at all. Moreover, simulation through a high-level language makes it possible to trace the behavior of the target architecture as a function of the internal time of the application itself, a result that is fundamental when the related workload is highly variable, as in downloadable and/or interactive scenarios.
In such cases complexity has always been treated as guesswork. The new virtual model for complexity analysis led to two main practical results. First, the proposed simulation method has been used to define complexity levels for Structured Audio in the MPEG-4 standard, and consequently the platform-independent analyzer has become the MPEG-4 reference software for the MPEG-4 SA conformance test. Second, a virtual instruction set has been conceived that has permitted the implementation of an efficient Structured Audio decoder, SAINT, based on a virtual interpreted DSP core. The flexibility of the SAINT virtual DSP approach has permitted a fast port of its execution engine to a superscalar VLIW DSP, making it one of the building blocks of the ThreeDSPACE system, a framework for advanced rendering of 3-D audio scenes based on MPEG-4 descriptions. The complete tool for simulation of architectures, including a cache simulator, has shown promising results, estimating the execution time of SAINT with a mean error of 10% for a general-purpose processor, and of 20% for a very complex superscalar VLIW processor. The complexity of the estimated programs sometimes varies by a factor of 10 along the time axis. In general, the experimental results can be considered in line with those of the most recent state-of-the-art simulators presented in the literature. A limitation of the language adopted to model the applications is that its generality is restricted to one-dimensional floating-point computation; its most relevant advantage is that simulation as a function of performance time is straightforward and provides the time-dependent results that are fundamental for the optimization of real-time applications.
With the tool developed in this PhD work, complex programs can be quickly modeled thanks to SA-specific libraries and also quickly analyzed; moreover, the tool can easily be extended into a tool for automatic generation and configuration of the main building blocks of a system running the application, or class of applications, under consideration. The proposed model is finally intended as an alternative approach for coordinating two sides: a principal goal of this work was to conceive and specify a systematic analysis method useful to both the software programmer and the hardware system engineer. The former can benefit from a reconfigurable, dedicated high-level software tool able to profile programs in a simple and platform-independent manner and to simulate easily, with some margin of error, the behavior of a specific platform, existing or virtual; the latter can exploit complexity estimates in an abstract format, with the possibility of studying the target application in its several aspects (operations, memory usage, data flows among the different program blocks) and the potential to extend these results to the automatic generation of high-level system architectures.
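The complexity-vector idea described in this abstract can be illustrated with a minimal sketch. It is not the dissertation's tool: the class names, the per-class cycle costs and the trace format below are all hypothetical, chosen only to show how operation counts per abstract class, weighted by per-platform measurements, give a time-dependent cost estimate.

```python
from collections import Counter

# Hypothetical abstract operation classes (the axes of the complexity vector).
CLASSES = ("add", "mul", "div", "table_read", "branch")

# Assumed per-class costs in cycles for one imaginary target platform. In
# practice these would come from measurements or benchmarks of at least one
# member of each abstract class.
COST_CYCLES = {"add": 1.0, "mul": 1.0, "div": 20.0, "table_read": 3.0, "branch": 2.0}


def complexity_vector(ops):
    """Count the operations of each abstract class found in `ops`.

    Operations that belong to no known class are ignored.
    """
    counts = Counter(ops)
    return [counts.get(c, 0) for c in CLASSES]


def estimate_cycles(vector, costs=COST_CYCLES):
    """Weighted sum of a complexity vector with the platform's class costs."""
    return sum(n * costs[c] for c, n in zip(CLASSES, vector))


def profile_over_time(trace, costs=COST_CYCLES):
    """Estimate the cost of each time slot of the application.

    `trace` is a list of (time_in_seconds, list_of_operation_classes) pairs.
    """
    return [(t, estimate_cycles(complexity_vector(ops), costs)) for t, ops in trace]


if __name__ == "__main__":
    trace = [
        (0.0, ["add", "mul", "mul", "table_read"]),
        (0.1, ["add"] * 10 + ["div"] * 2 + ["branch"]),
    ]
    for t, cycles in profile_over_time(trace):
        print(f"t={t:.1f}s  estimated cycles={cycles:.0f}")
```

Swapping `COST_CYCLES` for another platform's table re-targets the estimate without touching the operation counts, which is the sense in which such a profile is platform independent.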
Parametrization, auralization, and authoring of room acoustics for virtual reality applications
The primary goal of this work was to develop a means of representing the acoustic properties of an environment with a set of spatial-sound-related parameters. These parameters are used for creating virtual environments in which sounds are expected to be perceived by the user as if they were heard in a corresponding real space. The virtual world may consist of both visual and audio components. Ideally, in such an application the audio and visual parts of the virtual scene are coherent with each other, which should improve the user's immersion in the virtual environment.
The second aim was to verify the feasibility of the created sound environment parameter set in practice. A virtual acoustic modeling system was implemented in which any spatial sound scene defined using the developed parameters can be rendered audible in real time. In other words, the user can listen to sound auralized according to the defined sound scene parameters.
Thirdly, the authoring of such parametric sound scene representations was addressed. In this authoring framework, sound scenes and an associated visual scene can be created, then encoded and transmitted in real time to a remotely located renderer. The visual scene counterpart was created as a part of the multimedia scene, acting simultaneously as a user interface for renderer-side interaction.
Bimodal Audiovisual Perception in Interactive Application Systems of Moderate Complexity
This dissertation deals with aspects of quality perception in
interactive audiovisual application systems of moderate complexity, as
defined, for example, in the MPEG-4 standard. Because in these systems the available
computing power is limited, it is decisive to know which factors influence
the perceived quality. Only then can the available computing power be
distributed in the most effective and efficient way for the simulation and
display of audiovisual 3D scenes. Whereas quality factors for the unimodal
auditory and visual stimuli are well known and respective models of
perception have been successfully devised based on this knowledge, this is
not true for bimodal audiovisual perception. For the latter, it is only
known that some kind of interdependency between auditory and visual
perception does exist. The exact mechanisms of human audiovisual perception
have not been described. It is assumed that interaction with an application
or scene has a major influence upon the perceived overall quality.
The goal of this work was to devise a system capable of performing
subjective audiovisual assessments in the given context in a largely
automated way. The system was then to be used to collect initial evidence
regarding audiovisual interdependency and the influence of interaction upon
perception. This work therefore comprised three fields of activity:
the creation of a test bench based on the available but (regarding the
audio functionality) somewhat restricted MPEG-4 player, the study of
methods and framework requirements that ensure comparability and
reproducibility of audiovisual assessments and results, and the performance
of a series of coordinated experiments, including the analysis and
interpretation of the collected data. An object-based modular audio
rendering engine was co-designed and co-implemented, which makes it possible
to perform simple room-acoustic simulations based on the MPEG-4 scene
description paradigm in real time. Apart from the MPEG-4 player, the test
bench consists of a haptic input device used by test subjects to enter their
quality ratings and a logging tool that records all relevant
events during an assessment session. The collected data can be exported
conveniently for further analysis using appropriate statistical tools.
A thorough analysis of the well-established test methods and
recommendations for unimodal subjective assessments was performed to find
out whether a transfer to the audiovisual bimodal case is easily possible.
It became evident that - due to the limited knowledge about the underlying
perceptual processes - a novel categorization of experiments according to
their goals could be helpful to organize the research in the field.
Furthermore, a number of influencing factors could be identified that
exercise control over bimodal perception in the given context.
By performing the perceptual experiments using the devised system, its
functionality and ease of use were verified. Apart from that, some first
indications for the role of interaction in perceived overall quality have
been collected: interaction in the auditory modality reduces a person's
ability to rate the audio quality correctly, whereas visually based
(cross-modal) interaction does not necessarily generate this effect.
This dissertation deals with aspects of quality perception in interactive
audiovisual application systems of moderate complexity, as defined, for
example, by the MPEG-4 standard. The question of which factors influence
the perceived quality of audiovisual application systems is decisive for
how the limited computing power available for the real-time simulation of
3D scenes and their presentation should be sensibly distributed. While
quality factors for unimodal auditory as well as visual stimuli have long
been known and corresponding models exist, these still have to be derived
for bimodal audiovisual perception. It is known that there is an
interaction between auditory and visual quality, but not exactly how the
mechanisms of human audiovisual perception work. It is also assumed that
the factor of interaction has a substantial influence on perceived quality.
The goal of this work was to create a system for the time-saving and largely
automated performance of subjective audiovisual perception tests in the
given context, and to use it for some exemplary experiments that should
allow first statements about audiovisual interdependencies and the
influence of interaction on perception. Accordingly, the work was divided
into three task areas: the creation of a suitable test system on the basis
of an existing MPEG-4 player whose audio functionality was still
restricted, ensuring the comparability and repeatability of audiovisual
perception tests through defined test methods and conditions, and the
actual performance of the coordinated experiments with subsequent
evaluation and interpretation of the data obtained. To this end, an
object-based, modular audio engine was co-designed and co-implemented
which, based on the possibilities of the MPEG-4 scene description, offers
all the capabilities needed for real-time computation of room acoustics.
Within the developed test system, the MPEG-4 player communicates with a
hardware-based user interface for the entry of quality ratings by the test
subjects. All relevant events that occur during a test session can be
recorded with the help of a logging tool and exported for further data
analysis with statistics programs.
An analysis of the existing test methods and recommendations for unimodal
perception tests was intended to show whether they can be transferred to
the audiovisual case. It became clear that, owing to the lack of knowledge
of the underlying perceptual processes, a classification according to the
goals of the experiments performed initially appears sensible. Furthermore,
influencing factors that govern bimodal perception in the given context
could be identified.
In performing the perception experiments, the functionality of the test
system was verified. In addition, first indications emerged of the
influence of interaction on perceived overall quality: interaction in the
auditory modality reduces the ability to judge audio quality correctly,
whereas visually supported (cross-modal) interaction does not necessarily
produce this effect.
Platforms for handling and development of audiovisual data
Internship carried out at MOG Solutions and supervised by Vítor Teixeira. Integrated master's thesis. Informatics and Computing Engineering. Faculdade de Engenharia. Universidade do Porto. 200