7 research outputs found

    Listen, Look, and Gotcha: Instant Video Search with Mobile Phones by Layered Audio-Video Indexing

    ABSTRACT Mobile video is quickly becoming a mass consumer phenomenon. More and more people are using their smartphones to search and browse video content while on the move. In this paper, we develop an instant mobile video search system through which users can discover videos by simply pointing their phones at a screen to capture a few seconds of what they are watching. The system indexes large-scale video data in the cloud using a new layered audio-video indexing approach, and extracts light-weight joint audio-video signatures in real time to perform progressive search on mobile devices. Unlike most existing mobile video search applications, which simply send the original video query to the cloud, the proposed system is one of the first attempts at instant and progressive video search that leverages the light-weight computing capacity of mobile devices. The system is characterized by four unique properties: 1) a joint audio-video signature to deal with the large aural and visual variances associated with query video captured by a mobile phone, 2) layered audio-video indexing to holistically exploit the complementary nature of audio and video signals, 3) light-weight fingerprinting to comply with mobile processing capacity, and 4) a progressive query process that significantly reduces computational cost and improves the user experience: the search can stop anytime once a confident result is achieved, so the user does not need to wait for a fixed time lag. We collected 1,400 query videos captured by 25 mobile users against a dataset of 600 hours of video. The experiments show that our system outperforms state-of-the-art methods, achieving 90.79% precision when the query video is less than 10 seconds long and 70.07% even when it is less than 5 seconds long.
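
    The "stop anytime" behavior of the progressive query can be pictured as a simple control loop: extend the captured query one second at a time, re-rank the candidates, and return as soon as the top match is confidently separated from the runner-up. Below is a minimal Python sketch of that loop; capture, extract_signature, and search_index are hypothetical stand-ins for the paper's fingerprinting and layered index, and the confidence margin is an illustrative value, not the paper's actual decision rule.

        CONFIDENCE_MARGIN = 0.25   # illustrative: top score must beat the runner-up by this
        MAX_QUERY_SECONDS = 10     # give up extending the query after 10 s

        def progressive_search(capture, extract_signature, search_index):
            """Extend the query second by second; stop early once confident.

            capture(t)           -> audio/video frames for second t (hypothetical)
            extract_signature(f) -> light-weight joint audio-video fingerprint
            search_index(sigs)   -> [(video_id, score), ...], best first
            """
            signatures, ranked = [], []
            for t in range(MAX_QUERY_SECONDS):
                signatures.append(extract_signature(capture(t)))
                ranked = search_index(signatures)
                # Stop anytime: return as soon as the result is confident.
                if len(ranked) >= 2 and ranked[0][1] - ranked[1][1] >= CONFIDENCE_MARGIN:
                    return ranked[0][0], t + 1   # result plus seconds of query used
            return (ranked[0][0] if ranked else None), MAX_QUERY_SECONDS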

    Effective and efficient query processing for video subsequence identification

    With the growing demand for rich visual content, effective and efficient manipulation of large video databases is increasingly desired. Many investigations have been made into content-based video retrieval. Despite its importance, however, video subsequence identification, which aims to find content similar to a short query clip within a long video sequence, has not been well addressed. This paper presents a graph transformation and matching approach to the problem, extended to identify occurrences whose ordering or length may differ from the query due to content editing. Using a novel batch query algorithm to retrieve similar frames, the mapping relationship between the query and the database video is first represented by a bipartite graph. The densely matched parts along the long sequence are then extracted, followed by a filter-and-refine search strategy to prune irrelevant subsequences. During the filtering stage, Maximum Size Matching is applied to each subgraph constructed from the query and a candidate subsequence to obtain a smaller set of candidates. During the refinement stage, Sub-Maximum Similarity Matching is devised to identify the subsequence with the highest aggregate score among all candidates, according to a robust video similarity model that incorporates visual content, temporal order, and frame alignment information. Performance studies conducted on a 50-hour video recording validate that our approach is promising in terms of both search accuracy and speed.
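
    The filtering stage amounts to a maximum-cardinality matching on the bipartite graph between query frames and the frames of one candidate subsequence. Here is a minimal sketch using networkx's bipartite matching; the candidate/edge structures and the min_ratio threshold are assumptions for illustration, and the Sub-Maximum Similarity Matching refinement stage is not reproduced.

        import networkx as nx

        def filter_candidates(query_len, candidates, min_ratio=0.5):
            """Keep candidates whose maximum matching covers enough query frames.

            candidates: [(subseq_id, edges)], where edges are (query_frame, db_frame)
            pairs produced by the batch frame-retrieval step (assumed format).
            """
            kept = []
            for subseq_id, edges in candidates:
                g = nx.Graph()
                q_nodes = {('q', q) for q, _ in edges}
                g.add_nodes_from(q_nodes, bipartite=0)
                g.add_nodes_from((('d', d) for _, d in edges), bipartite=1)
                g.add_edges_from(((('q', q), ('d', d)) for q, d in edges))
                matching = nx.bipartite.maximum_matching(g, top_nodes=q_nodes)
                # The returned dict maps both directions; count matched query nodes.
                matched = sum(1 for node in matching if node[0] == 'q')
                if matched / query_len >= min_ratio:   # enough of the query matched
                    kept.append((subseq_id, matched))
            return kept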

    Multi-level Video Filtering Using Non-textual Contents


    A Study of the Behavior of iDistance in Content-Based Video Retrieval

    This project presents iDistance as an indexing method for high-dimensional data based on dimensionality reduction, and studies its behavior within a Content-Based Video Retrieval (CBVR) system. Building the iDistance index requires reference points for the dim-dimensional dataset, which are obtained with the k-means clustering technique. Once built, the index can be plugged into a CBVR system to test its behavior in video subsequence identification: iDistance retrieves the similar frames, which are then post-processed to identify the query subsequence. For comparison, the VA-File, another technique for dealing with the curse of dimensionality based on approximation vectors, is used as a baseline, while on the video search side, improvements for video subsequence identification are proposed. Regarding content, the project describes the most important characteristics of high-dimensional data and the distance metrics used for their classification. The B+ tree, the core structure on which iDistance is built, is introduced along with all the operations associated with it, and bipartite graph and matching theory is covered, since it is essential for video subsequence identification. iDistance is then studied and implemented as an indexing engine for high-dimensional databases, with particular attention to the indexing methodology and to K-nearest-neighbor search. A series of experiments on real video data is then proposed to study performance as key configuration parameters of iDistance are varied. Once iDistance has been studied, the indexing engine is integrated into a content-based video retrieval system for video subsequence identification; in addition, the project proposes retrieving the K best-ranked subsequences and studies their accuracy in a subsequent battery of experiments.
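
    The core of iDistance is a one-dimensional mapping: each point is assigned to its nearest reference point and keyed by partition_id * C + distance_to_reference, and a kNN query probes an expanding annulus around each reference. Below is a minimal Python sketch with k-means reference points, as in the project; a sorted array stands in for the B+ tree, and C, n_refs, and step are illustrative values rather than tuned ones.

        import numpy as np
        from bisect import bisect_left, bisect_right
        from scipy.cluster.vq import kmeans2

        C = 10_000.0  # partition stride; must exceed any within-partition distance

        def build_idistance(points, n_refs=16, seed=0):
            """Map each point to a 1-D iDistance key and sort the keys.

            key(p) = partition_id * C + dist(p, ref[partition_id])
            """
            refs, labels = kmeans2(points, n_refs, minit='++', seed=seed)
            dists = np.linalg.norm(points - refs[labels], axis=1)
            keys = labels * C + dists
            order = np.argsort(keys)
            return refs, keys[order], order   # sorted keys plus original row indices

        def knn_search(q, points, refs, sorted_keys, order, k=5, step=0.05):
            """Grow the search radius until the k nearest neighbours are certain."""
            r = step
            while True:
                cand = set()
                for i, ref in enumerate(refs):
                    d = np.linalg.norm(q - ref)
                    # Points of partition i whose distance to ref lies in [d-r, d+r].
                    lo = bisect_left(sorted_keys, i * C + max(d - r, 0.0))
                    hi = bisect_right(sorted_keys, i * C + d + r)
                    cand.update(order[lo:hi].tolist())
                if len(cand) >= k:
                    best = sorted(cand, key=lambda j: np.linalg.norm(q - points[j]))
                    if np.linalg.norm(q - points[best[k - 1]]) <= r:  # k-th hit inside r
                        return best[:k]
                r += step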