1,986 research outputs found

    A Review of Audio Features and Statistical Models Exploited for Voice Pattern Design

    Audio fingerprinting, also known as audio hashing, is well known as a powerful technique for audio identification and synchronization. It involves two major steps: fingerprint (voice pattern) design and matching search. While the first step concerns the derivation of a robust and compact audio signature, the second step usually requires knowledge of databases and quick-search algorithms. Though this technique offers a wide range of real-world applications, to the best of the authors' knowledge, the most recent comprehensive survey of existing algorithms appeared more than eight years ago. Thus, in this paper, we present a more up-to-date review and, to emphasize the audio signal processing aspect, we focus our state-of-the-art survey on the fingerprint design step, for which various audio features and their tractable statistical models are discussed. Comment: http://www.iaria.org/conferences2015/PATTERNS15.html ; Seventh International Conferences on Pervasive Patterns and Applications (PATTERNS 2015), Mar 2015, Nice, France

    Service Platform for Converged Interactive Broadband Broadcast and Cellular Wireless

    A converged broadcast and telecommunication service platform is presented that is able to create, deliver, and manage interactive multimedia content and services for consumption on three different terminal types. The motivations of service providers for designing converged interactive multimedia services, which are crafted for their individual requirements, are investigated. The overall design of the system is presented, with particular emphasis placed on the operational features of each of the sub-systems, the flows of media and metadata through the sub-systems, and the formats and protocols required for inter-communication between them. The key features of tools required for creating converged interactive multimedia content for a range of different end-user terminal types are examined. Finally, possible enhancements to this system are discussed. This study is of particular interest to those organizations currently conducting trials and commercial launches of DVB-H services because it provides them with insight into the various additional functions required in the service provisioning platforms to provide fully interactive services to a range of different mobile terminal types

    High Dynamic Range Images Coding: Embedded and Multiple Description

    The aim of this work is to highlight and discuss a new paradigm for representing high-dynamic range (HDR) images that can be used both for coding them and for describing their multimedia content. In particular, the new approach defines a new representation domain that, unlike the classical compressed one, makes it possible to identify and exploit content metadata. Information related to content is used here to control both the encoding and the decoding process and is directly embedded in the compressed data stream. Firstly, thanks to the proposed solution, the content description can be quickly accessed without the need to fully decode the compressed stream. This ensures a significant improvement in the performance of search and retrieval systems, such as for semantic browsing of image databases. Then, other potential benefits can be envisaged, especially in the field of management and distribution of multimedia content, because the direct embedding of content metadata preserves the consistency between content stream and content description without the need for other external frameworks, such as MPEG-21. The paradigm proposed here may also be shifted to multiple description coding, where different representations of the HDR image can be generated according to its content. The advantages provided by the newly proposed method are visible at different levels, i.e. when evaluating the redundancy reduction. Moreover, the descriptors extracted from the compressed data stream could be actively used in complex applications, such as fast retrieval of similar images from huge databases

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Based on the information provided by European projects and national initiatives related to multimedia search, as well as domain experts who participated in the CHORUS Think-tanks and workshops, this document reports on the state of the art related to multimedia content search from a technical and a socio-economic perspective. The technical perspective includes an up-to-date view on content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines. From a socio-economic perspective, we inventory the impact and legal consequences of these technical advances and point out future directions of research

    Pattern Matching Techniques for Replacing Missing Sections of Audio Streamed across Wireless Networks

    Streaming media on the Internet can be unreliable. Services such as audio-on-demand drastically increase the loads on networks; therefore, new, robust, and highly efficient coding algorithms are necessary. One method overlooked to date, which can work alongside existing audio compression schemes, is that which takes into account the semantics and natural repetition of music. Similarity detection within polyphonic audio has presented problematic challenges within the field of music information retrieval. One approach to deal with bursty errors is to use self-similarity to replace missing segments. Many existing systems are based on packet loss and replacement on a network level, but none attempt repairs of large dropouts of 5 seconds or more. Music exhibits standard structures that can be used as a forward error correction (FEC) mechanism. FEC is an area that addresses the issue of packet loss with the onus of repair placed as much as possible on the listener's device. We have developed a server-client-based framework (SoFI) for automatic detection and replacement of large packet losses on wireless networks when receiving time-dependent streamed audio. Whenever dropouts occur, SoFI swaps audio presented to the listener between a live stream and previous sections of the audio stored locally. Objective and subjective evaluations of SoFI where subjects were presented with other simulated approaches to audio repair together with simulations of replacements including varying lengths of time in the repair give positive results.

    MASCOT : metadata for advanced scalable video coding tools : final report

    The goal of the MASCOT project was to develop new video coding schemes and tools that provide both increased coding efficiency and extended scalability features compared to technology that was available at the beginning of the project. Towards that goal the following tools would be used: metadata-based coding tools, new spatiotemporal decompositions, and new prediction schemes. Although the initial goal was to develop one single codec architecture that was able to combine all new coding tools that were foreseen when the project was formulated, it became clear that this would limit the selection of the new tools. Therefore the consortium decided to develop two codec frameworks within the project, a standard hybrid DCT-based codec and a 3D wavelet-based codec, which together are able to accommodate all tools developed during the course of the project

    Multimedia

    Today's ubiquitous and effortless digital data capture and processing capabilities, offered by the majority of devices, lead to an unprecedented penetration of multimedia content in our everyday life. To make the most of this phenomenon, the rapidly increasing volume and usage of digitised content requires constant re-evaluation and adaptation of multimedia methodologies, in order to meet the relentless change of requirements from both the user and system perspectives. Advances in Multimedia provides readers with an overview of the ever-growing field of multimedia by bringing together various research studies and surveys from different subfields that point out such important aspects. Some of the main topics that this book deals with include: multimedia management in peer-to-peer structures & wireless networks, security characteristics in multimedia, semantic gap bridging for multimedia content and novel multimedia applications

    Video browsing interfaces and applications: a review

    We present a comprehensive review of the state of the art in video browsing and retrieval systems, with special emphasis on interfaces and applications. There has been a significant increase in activity (e.g., storage, retrieval, and sharing) employing video data in the past decade, both for personal and professional use. The ever-growing amount of video content available for human consumption and the inherent characteristics of video data—which, if presented in its raw format, is rather unwieldy and costly—have become driving forces for the development of more effective solutions to present video contents and allow rich user interaction. As a result, there are many contemporary research efforts toward developing better video browsing solutions, which we summarize. We review more than 40 different video browsing and retrieval interfaces and classify them into three groups: applications that use video-player-like interaction, video retrieval applications, and browsing solutions based on video surrogates. For each category, we present a summary of existing work, highlight the technical aspects of each solution, and compare them against each other