2,634 research outputs found

    A Review of Audio Features and Statistical Models Exploited for Voice Pattern Design

    Audio fingerprinting, also known as audio hashing, is a well-established technique for audio identification and synchronization. It involves two major steps: fingerprint (voice pattern) design and matching search. While the first step concerns the derivation of a robust and compact audio signature, the second usually requires knowledge of databases and quick-search algorithms. Though this technique offers a wide range of real-world applications, to the best of the authors' knowledge, the last comprehensive survey of existing algorithms appeared more than eight years ago. Thus, in this paper, we present a more up-to-date review and, to emphasize the audio signal processing aspect, we focus our state-of-the-art survey on the fingerprint design step, for which various audio features and their tractable statistical models are discussed. Comment: http://www.iaria.org/conferences2015/PATTERNS15.html ; Seventh International Conferences on Pervasive Patterns and Applications (PATTERNS 2015), Mar 2015, Nice, France

    A Robust Perceptual Audio Hashing Using Balanced Multiwavelets

    Digital multimedia content (especially audio) is becoming a major part of the average computer user experience. Large digital collections of music, audio and sound effects are also used by the entertainment, music, movie and animation industries. Therefore, the need for identification and management of audio content grows in proportion to the increasingly widespread availability of such media virtually "anytime and anywhere" over the Internet. In this paper, we propose a novel framework for robust perceptual hashing of audio content using balanced multiwavelets (BMW). The framework for generating robust perceptual hash values (or fingerprints) is described. The generated hash values are used for identifying, searching, and retrieving audio content from large audio databases. Furthermore, we illustrate, through extensive computer simulation, the ability of the proposed framework to represent audio content efficiently and to withstand several signal processing attacks and manipulations.

    Compact and Robust MFCC-based Space-Saving Audio Fingerprint Extraction for Efficient Music Identification on FM Broadcast Monitoring

    The Myanmar music industry urgently needs an efficient broadcast monitoring system to solve copyright infringement issues and illegal benefit-sharing between artists and broadcasting stations. In this paper, a broadcast monitoring system is proposed for Myanmar FM radio stations by utilizing space-saving audio fingerprint extraction based on the Mel Frequency Cepstral Coefficient (MFCC). This study focused on reducing the memory requirement for fingerprint storage while preserving the robustness of the audio fingerprints to common distortions such as compression, noise addition, etc. In this system, a three-second audio clip is represented by a 2,712-bit fingerprint block. This significantly reduces the memory requirement when compared to Philips Robust Hashing (PRH), one of the dominant audio fingerprinting methods, where a three-second audio clip is represented by an 8,192-bit fingerprint block. The proposed system is easy to implement and achieves correct and speedy music identification even on noisy and distorted broadcast audio streams. In this research work, we deployed an audio fingerprint database of 7,094 songs and broadcast audio streams of four local FM channels in Myanmar to evaluate the performance of the proposed system. The experimental results showed that the system achieved reliable performance
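
The storage figures quoted in this abstract can be checked with a little arithmetic. The sketch below is only an illustration: the two block sizes (2,712 and 8,192 bits per three-second clip) come from the abstract, while the 240-second song length and the assumption of non-overlapping three-second blocks are hypothetical and not taken from the paper.

```python
# Block sizes quoted in the abstract (bits per three-second clip).
PROPOSED_BITS = 2712
PRH_BITS = 8192
CLIP_SECONDS = 3


def storage_bytes(duration_seconds, bits_per_block, block_seconds=CLIP_SECONDS):
    """Bytes needed for one recording, ASSUMING non-overlapping blocks."""
    blocks = duration_seconds / block_seconds
    return blocks * bits_per_block / 8


def saving_fraction():
    """Fraction of fingerprint storage saved relative to PRH."""
    return 1 - PROPOSED_BITS / PRH_BITS


if __name__ == "__main__":
    song_seconds = 240  # hypothetical four-minute song
    print("proposed:", storage_bytes(song_seconds, PROPOSED_BITS), "bytes")
    print("PRH:     ", storage_bytes(song_seconds, PRH_BITS), "bytes")
    print("saving:  ", round(saving_fraction() * 100, 1), "%")
```

Under these assumptions the proposed block is roughly one third the size of the PRH block, so the per-song saving does not depend on the song length chosen.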

    ARCHANGEL: Tamper-proofing Video Archives using Temporal Content Hashes on the Blockchain

    We present ARCHANGEL, a novel distributed-ledger-based system for assuring the long-term integrity of digital video archives. First, we describe a novel deep network architecture for computing compact temporal content hashes (TCHs) from audio-visual streams with durations of minutes or hours. Our TCHs are sensitive to accidental or malicious content modification (tampering) but invariant to the codec used to encode the video. This is necessary due to the curatorial requirement for archives to format shift video over time to ensure future accessibility. Second, we describe how the TCHs (and the models used to derive them) are secured via a proof-of-authority blockchain distributed across multiple independent archives. We report on the efficacy of ARCHANGEL within the context of a trial deployment in which the national government archives of the United Kingdom, Estonia and Norway participated. Comment: Accepted to CVPR Blockchain Workshop 201

    Panako: a scalable acoustic fingerprinting system handling time-scale and pitch modification

    In this paper, a scalable granular acoustic fingerprinting system robust against time and pitch scale modification is presented. The aim of acoustic fingerprinting is to identify identical, or recognize similar, audio fragments in a large set using condensed representations of audio signals, i.e. fingerprints. A robust fingerprinting system generates similar fingerprints for perceptually similar audio signals. The new system presented here handles a variety of distortions well. It is designed to be robust against pitch shifting, time stretching and tempo changes, while remaining scalable. After a query, the system returns the start time in the reference audio, and the amount of pitch shift and tempo change that has been applied. The design of the system that offers this unique combination of features is the main contribution of this research. The fingerprint itself consists of a combination of key points in a Constant-Q spectrogram. The system is evaluated on commodity hardware using a freely available reference database with fingerprints of over 30,000 songs. The results show that the system responds quickly and reliably to queries, while handling time and pitch scale modifications of up to ten percent.

    Twofold Video Hashing with Automatic Synchronization

    Video hashing finds a wide array of applications in content authentication, robust retrieval and anti-piracy search. While much of the existing research has focused on extracting robust and secure content descriptors, a significant open challenge remains: most existing video hashing methods are vulnerable to temporal desynchronization. That is, when the query video results from deleting or inserting some frames in the reference video, most existing methods assume the positions of the deleted (or inserted) frames are either perfectly known or reliably estimated. This assumption may be acceptable under typical transcoding and frame-rate changes but is highly inappropriate in adversarial scenarios such as anti-piracy video search. For example, an illegal uploader will try to bypass the 'piracy check' mechanism of YouTube, Dailymotion, etc. by performing a cleverly designed non-uniform resampling of the video. We present a new solution based on dynamic time warping (DTW), which can implement automatic synchronization and can be used together with existing video hashing methods. The second contribution of this paper is a new robust feature extraction method called flow hashing (FH), based on frame averaging and optical flow descriptors. Finally, a fusion mechanism called distance boosting is proposed to combine the information extracted by DTW and FH. Experiments on real video collections show that such a hash extraction and comparison enables unprecedented robustness under both spatial and temporal attacks. Comment: submitted to Image Processing (ICIP), 2014 21st IEEE International Conference on
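
To make the synchronization idea concrete, the following is a minimal sketch of textbook dynamic time warping applied to two sequences of per-frame hashes. It is not the authors' method: the integer frame hashes, the Hamming distance between them, and the function names `hamming` and `dtw_distance` are all assumptions chosen for illustration, and flow hashing and distance boosting are not modelled.

```python
def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


def dtw_distance(query, reference, dist=hamming):
    """Classic O(n*m) DTW cost between two sequences of frame hashes."""
    n, m = len(query), len(reference)
    inf = float("inf")
    # cost[i][j] = best alignment cost of query[:i] against reference[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(query[i - 1], reference[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # query frame repeated or inserted
                cost[i][j - 1],      # reference frame skipped in the query
                cost[i - 1][j - 1],  # frames matched one to one
            )
    return cost[n][m]


# Hypothetical 4-bit frame hashes, for illustration only.
reference = [0b0001, 0b0011, 0b0111, 0b1111]
repeated = [0b0001, 0b0011, 0b0011, 0b0111, 1 * 0b1111]  # one frame duplicated
deleted = [0b0001, 0b0111, 0b1111]                        # one frame removed

if __name__ == "__main__":
    print(dtw_distance(repeated, reference))
    print(dtw_distance(deleted, reference))
```

Because the warping path may stay on one frame while the other sequence advances, the alignment is found without knowing where frames were inserted or removed, which is the property the abstract relies on.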

    Optimization of star research algorithm for esmo star tracker

    This paper explains in detail the design and development of a software star research algorithm, embedded on a star tracker, by the ISAE/SUPAERO team. This research algorithm is inspired by musical techniques. This work will be carried out as part of the ESMO (European Student Moon Orbiter) project by different teams of students and professors from ISAE/SUPAERO (Institut Supérieur de l'Aéronautique et de l'Espace). To date, the system engineering studies have been completed, and the work presented here concerns the algorithmic and embedded software development. The physical architecture of the sensor relies on the APS 750 developed by the CIMI laboratory of ISAE/SUPAERO. First, a star research algorithm based on the image acquired in lost-in-space mode (one of the star tracker operational modes) will be presented; it is inspired by music recognition techniques that correlate a digital signature (hash) with those stored in databases. The music recognition principle is based on fingerprinting, i.e. the extraction of points of interest in the studied signal. In the musical context, the signal spectrogram is used to identify these points. Applying this technique in the image processing domain requires a tool equivalent to the spectrogram. Those points of interest create a hash and are used to search efficiently within the database, previously sorted, in order to be compared. The main goals of this research algorithm are to minimise the number of computation steps in order to deliver information at a higher frequency and to increase the robustness of the computation against the different possible disturbances.
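
The music recognition principle mentioned in this abstract (points of interest turned into hashes, then looked up in a database) can be sketched as follows. This is a generic landmark-pair scheme offered as an assumption, not the ISAE/SUPAERO algorithm: the peak lists, the `(f1, f2, dt)` hash layout, the `fan_out` and `max_dt` parameters, and all function names are hypothetical, and peak extraction from a spectrogram or star image is taken as already done.

```python
from collections import Counter, defaultdict


def landmark_hashes(peaks, fan_out=3, max_dt=50):
    """Yield ((f1, f2, dt), t1) for each peak paired with a few later peaks."""
    peaks = sorted(peaks)  # (time, frequency) tuples, ordered by time
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt > max_dt:
                break
            if dt <= 0:
                continue  # skip peaks that share the anchor's time
            yield (f1, f2, dt), t1
            paired += 1
            if paired == fan_out:
                break


def build_index(tracks):
    """Map each hash to the (track_id, anchor_time) pairs that produced it."""
    index = defaultdict(list)
    for track_id, peaks in tracks.items():
        for h, t in landmark_hashes(peaks):
            index[h].append((track_id, t))
    return index


def best_match(index, query_peaks):
    """Vote on (track, time offset); return (track_id, offset, votes) or None."""
    votes = Counter()
    for h, tq in landmark_hashes(query_peaks):
        for track_id, tr in index.get(h, ()):
            votes[(track_id, tr - tq)] += 1
    if not votes:
        return None
    (track_id, offset), count = votes.most_common(1)[0]
    return track_id, offset, count


# Hypothetical peak lists: (time frame, frequency bin).
tracks = {
    "a": [(0, 10), (5, 20), (9, 30), (14, 15), (20, 40), (26, 22)],
    "b": [(0, 11), (4, 25), (10, 33), (15, 18), (21, 44)],
}
# Excerpt of track "a" starting at frame 9, re-timed to start at 0.
query = [(0, 30), (5, 15), (11, 40), (17, 22)]

if __name__ == "__main__":
    print(best_match(build_index(tracks), query))
```

Pairing peaks makes each hash far more selective than a single point, and voting on a consistent time offset is what lets a short excerpt be located inside a longer reference.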