2,626 research outputs found
A quick search method for audio signals based on a piecewise linear representation of feature trajectories
This paper presents a new method for a quick similarity-based search through
long unlabeled audio streams to detect and locate audio clips provided by
users. The method involves feature-dimension reduction based on a piecewise
linear representation of a sequential feature trajectory extracted from a long
audio stream. Two techniques enable us to obtain a piecewise linear
representation: the dynamic segmentation of feature trajectories and the
segment-based Karhunen-Loève (KL) transform. The proposed search method
guarantees, in principle, the same search results as a search without the
proposed feature-dimension reduction. Experimental results indicate
significant improvements in search speed. For example, the proposed method
reduced the total search time to approximately 1/12 that of previous methods
and detected queries in approximately 0.3 seconds from a 200-hour audio
database.
Comment: 20 pages, to appear in IEEE Transactions on Audio, Speech and Language Processing
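The segment-based KL transform described above can be sketched with ordinary PCA machinery. The sketch below is an illustration of the idea (per-segment dimension reduction of a feature trajectory), not the paper's exact algorithm; the segment boundaries here are fixed rather than found by dynamic segmentation.

```python
import numpy as np

def segment_kl_reduce(trajectory, boundaries, k=2):
    """Reduce feature dimension per segment via the Karhunen-Loeve
    (PCA) transform; an illustrative sketch, not the paper's method."""
    reduced = []
    for start, end in zip(boundaries[:-1], boundaries[1:]):
        seg = trajectory[start:end]              # frames x dims
        centered = seg - seg.mean(axis=0)
        # Right singular vectors of the centered segment = KL basis
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        reduced.append(centered @ vt[:k].T)      # project onto top-k axes
    return reduced

# Toy 100-frame, 12-dim trajectory split into two fixed segments
traj = np.random.default_rng(0).normal(size=(100, 12))
parts = segment_kl_reduce(traj, boundaries=[0, 50, 100], k=2)
print([p.shape for p in parts])  # [(50, 2), (50, 2)]
```

Because each segment keeps its own low-dimensional basis, short near-linear runs of the trajectory compress well even when the full trajectory does not.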
A semantic feature for human motion retrieval
With the explosive growth of motion capture data, an efficient search engine for retrieving motions from large motion repositories has become imperative in animation production. However, because of the high dimensionality of the data space and the complexity of matching methods, most existing approaches cannot return results in real time. This paper proposes a high-level semantic feature in a low-dimensional space that represents the essential characteristics of different motion classes. Based on statistical training of a Gaussian Mixture Model, this feature can effectively achieve motion matching at both the global clip level and the local frame level. Experimental results show that our approach can retrieve similar motions with rankings from a large motion database in real time and can also annotate motions automatically on the fly. Copyright © 2013 John Wiley & Sons, Ltd.
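A minimal numpy-only sketch of the GMM idea above: represent each clip by its soft-assignment profile over a small set of Gaussian components, yielding a low-dimensional descriptor. Component means here are just sampled frames with isotropic covariances, a stand-in for the paper's actual statistical training.

```python
import numpy as np

rng = np.random.default_rng(1)
frames = rng.normal(size=(500, 30))            # 500 frames, 30-dim features
means = frames[rng.choice(len(frames), 4, replace=False)]  # 4 components

def clip_descriptor(clip):
    # Isotropic-Gaussian responsibilities, averaged over the clip's frames
    d2 = ((clip[:, None, :] - means[None]) ** 2).sum(-1)   # frames x 4
    d2 -= d2.min(axis=1, keepdims=True)        # stabilize the exponentials
    resp = np.exp(-0.5 * d2)
    resp /= resp.sum(axis=1, keepdims=True)
    return resp.mean(axis=0)                   # 4-dim clip-level feature

print(clip_descriptor(frames[:100]).shape)     # (4,)
```

Clip-level matching then reduces to comparing these short descriptors, which is what makes real-time retrieval over a large repository plausible.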
SeeSaw: Interactive Ad-hoc Search Over Image Databases
As image datasets become ubiquitous, the problem of ad-hoc searches over
image data is increasingly important. Many high-level data tasks in machine
learning, such as constructing datasets for training and testing object
detectors, imply finding ad-hoc objects or scenes within large image datasets
as a key sub-problem. New foundational visual-semantic embeddings trained on
massive web datasets such as Contrastive Language-Image Pre-Training (CLIP) can
help users start searches on their own data, but we find there is a long tail
of queries where these models fall short in practice. SeeSaw is a system for
interactive ad-hoc searches on image datasets that integrates state-of-the-art
embeddings like CLIP with user feedback in the form of box annotations to help
users quickly locate images of interest in their data even in the long tail of
harder queries. One key challenge for SeeSaw is that, in practice, many
sensible approaches to incorporating feedback into future results, including
state-of-the-art active-learning algorithms, can worsen results compared to
introducing no feedback, partly due to CLIP's high average performance.
Therefore, SeeSaw includes several algorithms that empirically result in larger
and also more consistent improvements. We compare SeeSaw's accuracy to both
using CLIP alone and to a state-of-the-art active-learning baseline and find
SeeSaw consistently helps improve results for users across four datasets and
more than a thousand queries. SeeSaw increases Average Precision (AP) on search
tasks by an average of 0.08 on a broad benchmark (from a base of 0.72), and by
0.27 on a subset of more difficult queries where CLIP alone performs poorly.
Comment: SIGMOD 2024 camera ready
BilVideo: A video database management system
The BilVideo video database management system provides integrated support for spatiotemporal and semantic queries over video. BilVideo can support any application with video-data search needs. Its query language provides a simple way to extend the system's query capabilities. Users can add application-dependent rules and facts to the knowledge base.
CBCD Based on Color Features and Landmark MDS-Assisted Distance Estimation
Content-Based Copy Detection (CBCD) of digital videos is an important research field that aims at the identification of modified copies of an original clip, e.g., on the Internet. In this application, the video content is uniquely identified by the content itself, by extracting compact features that are robust to a certain set of video transformations. Given the huge amount of data present in online video databases, the computational complexity of feature extraction and comparison is a very important issue. In this paper, a landmark-based multi-dimensional scaling technique is proposed to speed up a detection procedure based on exhaustive search over the MPEG-7 Dominant Color Descriptor. The method is evaluated under the MPEG Video Signature Core Experiment conditions, and simulation results show impressive time savings at the cost of a slightly reduced detection performance.
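The core of the landmark trick above can be sketched as follows: instead of comparing full high-dimensional descriptors exhaustively, precompute each item's distances to a few landmark items and compare those short vectors. The paper builds a proper landmark MDS embedding; this sketch keeps only the distance-to-landmarks idea, with random vectors standing in for color descriptors.

```python
import numpy as np

rng = np.random.default_rng(3)
features = rng.normal(size=(2000, 128))        # one descriptor per video
landmarks = features[rng.choice(2000, 8, replace=False)]

def landmark_vec(x):
    # 8-dim proxy: distances from x to each landmark
    return np.linalg.norm(landmarks - x, axis=1)

proxy = np.array([landmark_vec(f) for f in features])

def nearest_copy(query_feature, k=3):
    # Cheap comparison in the 8-dim proxy space instead of 128-dim
    d = np.linalg.norm(proxy - landmark_vec(query_feature), axis=1)
    return np.argsort(d)[:k]                   # candidate copies to verify

print(nearest_copy(features[42])[0])  # 42 (the item itself ranks first)
```

The speed/accuracy trade-off in the abstract shows up here directly: comparisons drop from 128-dim to 8-dim, but the proxy distance only approximates the true one, so top candidates would still be verified with the full descriptor.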
STRG-QL: Spatio-Temporal Region Graph Query Language for Video Databases
Copyright 2008 Society of Photo-Optical Instrumentation Engineers. In this paper, we present a new graph-based query language and its query processing for a Graph-based Video Database Management System (GVDBMS). Although extensive research has proposed various query languages for video databases, most of them are limited in handling general-purpose video queries; each handles a specific data model, query type, or application. To develop a general-purpose video query language, we first produce a Spatio-Temporal Region Graph (STRG) for each video, which represents the spatial and temporal information of video objects. An STRG data model is generated from the STRG by exploiting an object-oriented model. Based on the STRG data model, we propose a new graph-based query language named STRG-QL, which supports various types of video queries. To process STRG-QL queries, we introduce a rule-based query optimization that considers the characteristics of video data, i.e., the hierarchical correlations among video segments. The results of our extensive experimental study show that the proposed STRG-QL is promising in terms of accuracy and cost.
http://dx.doi.org/10.1117/12.76553
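To make the region-graph idea above concrete, here is a toy stand-in: object regions per frame with spatial-relation edges, and a "query" that scans for a relational pattern across time. The object names and relations are illustrative only, and this is a crude stand-in for STRG-QL's graph pattern matching, not its actual syntax or processing.

```python
# One tiny relation graph per frame: edge (subject, object) -> relation
frames = [
    {("car", "person"): "left_of"},
    {("car", "person"): "overlaps"},
    {("car", "person"): "left_of"},
]

def match(pattern, relation):
    # Return frame indices where the relational pattern holds
    return [t for t, g in enumerate(frames) if g.get(pattern) == relation]

print(match(("car", "person"), "left_of"))  # [0, 2]
```

A real STRG query would range over many objects and exploit the hierarchical correlations among video segments that the paper's rule-based optimizer targets; the sketch only shows the shape of the data model.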