749 research outputs found

    Robust short clip representation and fast search through large video collections

    Master's; Master of Engineering

    A Highly Robust Audio Monitoring System for Radio Broadcasting

    Monitoring the songs played on radio broadcasting channels is very important to singers, writers and musicians in the music industry. They hold intellectual property rights in their songs and, under intellectual property rights law, should be paid whenever those songs are broadcast over any radio channel. We therefore propose a real-time audio monitoring approach to this problem, built on our own audio recognition algorithm. A song is easy to recognize when the original high-quality recording is provided as input. Radio channels cannot be expected to supply such input, however, since many transformations are possible before the audio reaches the listener: added environmental effects such as noise, commercials overlaid on the song as watermarks, several songs played back to back with no silence between them, only part of a song being played, the same song played at various speeds, and so on. These transformations alter the distinguishing characteristics of a song and make the problem even more difficult. The proposed algorithm is resistant to noise and distortion and can recognize a short segment of a song broadcast over a radio channel. At the end of processing, the system generates a descriptive report covering all songs on all radio broadcasting channels for a particular period, giving each song's title, singer, writer and composer, the number of times it was played and when it was played. We evaluated the system against various real-time scenarios and achieved a high overall accuracy (96%).

    Classifying and Mapping Aquatic Vegetation in Heterogeneous Stream Ecosystems Using Visible and Multispectral UAV Imagery

    The need to assess and manage aquatic vegetation in stream ecosystems is well recognized, given its impact on water quality, hydrodynamics, and aquatic biota. However, existing monitoring approaches are laborious, and it is currently not feasible to track spatial and temporal differences at broad scales. The objective of this study was therefore to map and classify the aquatic vegetation of a shallow stream with heterogeneous mixtures of emergent and submerged aquatic vegetation. Data were collected in the Camden Creek watershed within the Inner Bluegrass Region of central Kentucky. Unmanned aerial vehicles (UAVs) were used to collect both visible (RGB) and multispectral imagery. Machine learning techniques were applied in off-the-shelf software (the QGIS environment) to develop visible and multispectral land-cover classification maps following an object-based image analysis workflow. Visible images were additionally coupled with high-frequency water quality data to examine the spatial and temporal behavior of the aquatic vegetation. Results showed high overall classification accuracies for the visible imagery (OA=83.5% for the training dataset and OA=83.73% for the validation dataset), with excellent user's and producer's accuracies for duckweed in both training and validation. Surprisingly, the multispectral overall accuracies, although substantial (OA=77.8% for the training dataset and OA=70.2% for the validation dataset), were inferior to the visible classification results, and user's and producer's accuracies were lower for almost all classes. However, this approach was unsuccessful in detecting, segmenting and classifying submerged aquatic vegetation (algae) for both datasets. Finally, a change detection algorithm was applied to the visible classified maps, and the changes in duckweed areal coverage were successfully estimated.
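
The accuracy figures quoted in this abstract (overall, user's and producer's accuracy) are all conventionally derived from a confusion matrix. The following is a minimal sketch of that calculation, not the study's own workflow. The class names, the counts and the function name `accuracy_metrics` are hypothetical values chosen for illustration.

```python
def accuracy_metrics(confusion, labels):
    """Compute overall, producer's and user's accuracy.

    confusion[i][j] is the number of samples whose reference class is
    labels[i] and whose mapped (classified) class is labels[j].
    """
    n = len(labels)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))
    overall = correct / total

    producers, users = {}, {}
    for i, name in enumerate(labels):
        row_total = sum(confusion[i])                       # reference samples of this class
        col_total = sum(confusion[r][i] for r in range(n))  # samples mapped as this class
        # Producer's accuracy: share of reference samples that were mapped correctly.
        producers[name] = confusion[i][i] / row_total if row_total else float("nan")
        # User's accuracy: share of mapped samples that are truly this class.
        users[name] = confusion[i][i] / col_total if col_total else float("nan")
    return overall, producers, users


# Hypothetical three-class example (illustrative counts, not data from the study).
labels = ["duckweed", "water", "bank"]
confusion = [
    [45, 3, 2],
    [4, 40, 6],
    [1, 7, 42],
]
overall, producers, users = accuracy_metrics(confusion, labels)
print(f"OA = {overall:.1%}")
for name in labels:
    print(f"{name}: producer's = {producers[name]:.1%}, user's = {users[name]:.1%}")
```

Producer's accuracy reflects omission errors (reference samples the map missed), while user's accuracy reflects commission errors (mapped samples that are wrong), which is why the two are reported separately for each class.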

    CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap

    After addressing the state of the art during the first year of CHORUS and establishing the existing landscape in multimedia search engines, we identified and analyzed gaps within the European research effort during our second year. In this period we focused on three directions: technological issues, user-centred issues and use-cases, and socio-economic and legal aspects. These were assessed by two central studies: first, a concerted vision of the functional breakdown of a generic multimedia search engine, and second, a set of representative use-case descriptions with a related discussion of the requirements for technological challenges. Both studies were carried out in cooperation and consultation with the community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our Think-Tank, presentations at international conferences, and surveys addressed to coordinators of EU projects and of national initiatives. Based on the feedback obtained, we identified two types of gaps: core technological gaps that involve research challenges, and “enablers”, which are not necessarily technical research challenges but have an impact on innovation progress. New socio-economic trends are presented, as well as emerging legal challenges.

    Digital rights management techniques for H.264 video

    This work aims to present a number of low-complexity digital rights management (DRM) methodologies for the H.264 standard. Initially, requirements to enforce DRM are analyzed and understood. Based on these requirements, a framework is constructed which puts forth different possibilities that can be explored to satisfy the objective. To implement computationally efficient DRM methods, watermarking and content-based copy detection are then chosen as the preferred methodologies. The first approach is based on robust watermarking which modifies the DC residuals of 4×4 macroblocks within I-frames. Robust watermarks are appropriate for content protection and proving ownership. Experimental results show that the technique exhibits encouraging rate-distortion (R-D) characteristics while at the same time being computationally efficient. The problem of content authentication is addressed with the help of two methodologies: irreversible and reversible watermarks. The first approach utilizes the highest frequency coefficient within 4×4 blocks of the I-frames after CAVLC entropy encoding to embed a watermark. The technique was found to be very effective in detecting tampering. The second approach applies the difference expansion (DE) method on IPCM macroblocks within P-frames to embed a high-capacity reversible watermark. Experiments show the technique to be not only fragile and reversible but also to exhibit minimal variation in its R-D characteristics. The final methodology adopted to enforce DRM for H.264 video is based on the concept of signature generation and matching. Specific types of macroblocks within each predefined region of an I-, B- and P-frame are counted at regular intervals in a video clip, and an ordinal matrix is constructed based on their count. The matrix is considered to be the signature of that video clip and is matched with longer video sequences to detect copies within them. Simulation results show that the matching methodology is capable of not only detecting copies but also locating them within a longer video sequence. Performance analysis shows acceptable false positive and false negative rates and encouraging receiver operating characteristics. Finally, the time taken to match and locate copies is low, which makes the method well suited to broadcast and streaming applications.
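
The signature-matching idea in this abstract (per-region counts turned into an ordinal matrix, then matched against a longer sequence) can be sketched as follows. This is a simplified illustration under stated assumptions, not the thesis's implementation. The per-region macroblock counts are assumed to be already extracted from the bitstream, ties are broken by region index, and an L1 distance over a sliding window is assumed as the matching measure. All function names and sample counts are hypothetical.

```python
def ordinal_ranks(counts):
    """Rank each region's count within one interval (0 = smallest).

    Ties are broken by region index (an assumption made for this sketch).
    """
    order = sorted(range(len(counts)), key=lambda i: (counts[i], i))
    ranks = [0] * len(counts)
    for rank, idx in enumerate(order):
        ranks[idx] = rank
    return ranks


def ordinal_matrix(count_rows):
    """One row of ranks per sampling interval: the clip's ordinal signature."""
    return [ordinal_ranks(row) for row in count_rows]


def distance(sig_a, sig_b):
    """L1 distance between two equally sized ordinal matrices."""
    return sum(abs(x - y) for ra, rb in zip(sig_a, sig_b) for x, y in zip(ra, rb))


def locate_copy(query_counts, reference_counts):
    """Slide the query signature over the reference; return (best offset, distance)."""
    query = ordinal_matrix(query_counts)
    reference = ordinal_matrix(reference_counts)
    best_offset, best_dist = None, None
    for offset in range(len(reference) - len(query) + 1):
        d = distance(query, reference[offset:offset + len(query)])
        if best_dist is None or d < best_dist:
            best_offset, best_dist = offset, d
    return best_offset, best_dist


# Hypothetical counts: 5 intervals x 4 regions for the reference clip.
reference_counts = [
    [12, 5, 9, 3],
    [4, 10, 6, 8],
    [7, 2, 11, 5],
    [9, 9, 1, 6],
    [3, 8, 12, 4],
]
# A re-encoded copy of intervals 2-3: counts differ, but their ordering is preserved.
query_counts = [
    [8, 3, 13, 6],
    [10, 11, 2, 7],
]
offset, dist = locate_copy(query_counts, reference_counts)
print(offset, dist)  # prints "2 0"
```

Because only the ordering of the counts is kept, the signature tolerates changes that shift the counts without reordering them. In practice a distance threshold would decide whether the best window counts as a copy at all.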

    MediaSync: Handbook on Multimedia Synchronization

    This book provides an approachable overview of the most recent advances in the field of media synchronization (mediasync), gathering contributions from the most representative and influential experts. Understanding the challenges of this field in the current multi-sensory, multi-device, and multi-protocol world is not an easy task. The book revisits the foundations of mediasync, including theoretical frameworks and models, highlights ongoing research efforts, like hybrid broadband broadcast (HBB) delivery and users' perception modeling (i.e., Quality of Experience or QoE), and paves the way for the future (e.g., towards the deployment of multi-sensory and ultra-realistic experiences). Although many advances around mediasync have been devised and deployed, this area of research is getting renewed attention to overcome remaining challenges in the next-generation (heterogeneous and ubiquitous) media ecosystem. Given the significant advances in this research area, its current relevance and the multiple disciplines it involves, a reference book on mediasync has become necessary, and this book fills that gap. In particular, it addresses key aspects and reviews the most relevant contributions within the mediasync research space from different perspectives. MediaSync: Handbook on Multimedia Synchronization is intended for scholars and practitioners who want to acquire strong knowledge of this research area and to approach the challenges of ensuring the best mediated experiences by providing adequate synchronization between the media elements that constitute these experiences.

    Content-based video copy detection using multimodal analysis

    Ankara: The Department of Computer Engineering and the Institute of Engineering and Science of Bilkent University, 2009. Thesis (Master's) -- Bilkent University, 2009. Includes bibliographical references, leaves 67-76. The huge and increasing amount of video broadcast through networks has raised the need for automatic video copy detection for copyright protection. Recent developments in multimedia technology have introduced content-based copy detection (CBCD) as a new research field and an alternative to the watermarking approach for identifying video sequences. This thesis presents a multimodal framework for matching video sequences using a three-step approach. First, a high-level face detector identifies facial frames/shots in a video clip. Matching faces with extended body regions gives the flexibility to discriminate the same person (e.g., an anchorman or a political leader) in different events or scenes. In the second step, a spatiotemporal sequence matching technique is employed to match video clips/segments that are similar in terms of activity. Finally, the non-facial shots are matched using low-level visual features. In addition, we use a fuzzy logic approach to extract color histograms for detecting shot boundaries in heavily manipulated video clips. Methods for detecting noise, frame dropping and picture-in-picture transformation windows, and for extracting masks for still regions, are also proposed and evaluated. The proposed method was tested on the query and reference dataset of the CBCD task of TRECVID 2008. Our results were compared with those of the eight most successful techniques submitted to this task. Experimental results show that the proposed method performs better than most of the state-of-the-art techniques in terms of both effectiveness and efficiency. Küçüktunç, Onur. M.S.
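
Histogram-based shot boundary detection, mentioned in this abstract, can be illustrated with a much simpler stand-in than the thesis's fuzzy color histogram. The sketch below uses a plain grayscale intensity histogram and a fixed threshold on the difference between consecutive frames. The bin count, the threshold, the function names and the tiny sample frames are all assumptions made for illustration.

```python
def gray_histogram(frame, bins=8):
    """Normalized histogram of 0-255 intensities; frame is a list of pixel rows."""
    hist = [0] * bins
    n = 0
    for row in frame:
        for value in row:
            hist[min(value * bins // 256, bins - 1)] += 1
            n += 1
    return [h / n for h in hist]


def shot_boundaries(frames, threshold=0.5):
    """Return indices i where frame i starts a new shot.

    The difference measure is half the L1 distance between consecutive
    normalized histograms, which lies in the range [0, 1].
    """
    hists = [gray_histogram(f) for f in frames]
    cuts = []
    for i in range(1, len(hists)):
        diff = 0.5 * sum(abs(a - b) for a, b in zip(hists[i - 1], hists[i]))
        if diff > threshold:
            cuts.append(i)
    return cuts


# Hypothetical 2x2 frames: two dark frames followed by two bright frames.
frames = [
    [[10, 20], [30, 15]],
    [[12, 22], [28, 18]],
    [[200, 220], [240, 210]],
    [[205, 215], [235, 212]],
]
cuts = shot_boundaries(frames)
print(cuts)  # prints "[2]"
```

A hard threshold on crisp bins is sensitive to brightness shifts and gradual transitions, which is one reason an approach aimed at heavily manipulated clips might prefer fuzzy bin membership.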

    Action Recognition in Videos: from Motion Capture Labs to the Web

    This paper presents a survey of human action recognition approaches based on visual data recorded from a single video camera. We propose an organizing framework that highlights the evolution of the area, with techniques moving from heavily constrained motion capture scenarios towards more challenging, realistic, "in the wild" videos. The proposed organization is based on the representation used as input for the recognition task, emphasizing the hypotheses assumed and thus the constraints imposed on the type of video that each technique is able to address. Making the hypotheses and constraints explicit makes the framework particularly useful for selecting a method for a given application. Another advantage of the proposed organization is that it allows the newest approaches to be categorized seamlessly alongside traditional ones, while providing an insightful perspective on the evolution of the action recognition task up to now. That perspective is the basis for the discussion at the end of the paper, where we also present the main open issues in the area. Comment: Preprint submitted to CVIU, survey paper, 46 pages, 2 figures, 4 tables

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Based on the information provided by European projects and national initiatives related to multimedia search, as well as by domain experts who participated in the CHORUS Think-Tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and a socio-economic perspective. The technical perspective includes an up-to-date view of content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines. From a socio-economic perspective, we inventory the impact and legal consequences of these technical advances and point out future directions of research.