1,567 research outputs found
A quick search method for audio signals based on a piecewise linear representation of feature trajectories
This paper presents a new method for a quick similarity-based search through
long unlabeled audio streams to detect and locate audio clips provided by
users. The method involves feature-dimension reduction based on a piecewise
linear representation of a sequential feature trajectory extracted from a long
audio stream. Two techniques enable us to obtain a piecewise linear
representation: the dynamic segmentation of feature trajectories and the
segment-based Karhunen-Loève (KL) transform. In principle, the proposed search
method guarantees the same results as a search performed without the proposed
feature-dimension reduction. Experimental results indicate
significant improvements in search speed. For example, the proposed method
reduced the total search time to approximately 1/12 that of previous methods
and detected queries in approximately 0.3 seconds from a 200-hour audio
database.
Comment: 20 pages, to appear in IEEE Transactions on Audio, Speech and
Language Processing
Design of Multiplier for Medical Image Compression Using Urdhava Tiryakbhyam Sutra
Compressing medical images is one of the challenging areas in the healthcare industry and calls for effective design of compression algorithms. Conventional compression algorithms used on medical images do not offer fast processing and depend heavily on hardware resources. The present paper identifies the potential of Vedic mathematics, in the form of the Urdhava Tiryakbhyam sutra, for designing an efficient multiplier that enhances the capabilities of an existing processor and thereby improves compression performance. The design of the proposed system is discussed with respect to 5 significant algorithms, and the outcome of the study was tested with heterogeneous samples of medical images, finding that the proposed system offers approximately 57% reduction in size without any significant loss of data.
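The abstract does not describe the multiplier design itself. As a rough illustration of the "vertically and crosswise" idea behind the sutra, the following minimal Python sketch forms each result column as a sum of cross products of digits and then propagates the carry. The function name `urdhva_multiply` and the `base` parameter are hypothetical; this is a software analogy, not the paper's hardware design.

```python
def urdhva_multiply(a: int, b: int, base: int = 2) -> int:
    """Multiply two non-negative integers column by column.

    Hypothetical software sketch of the "vertically and crosswise" scheme;
    a hardware multiplier would compute the column sums in parallel.
    """
    if a < 0 or b < 0:
        raise ValueError("operands must be non-negative")

    def digits(n: int) -> list:
        # Digits of n in the given base, least-significant digit first.
        out = []
        while n:
            n, d = divmod(n, base)
            out.append(d)
        return out or [0]

    x, y = digits(a), digits(b)
    width = max(len(x), len(y))
    x += [0] * (width - len(x))  # pad both operands to the same width
    y += [0] * (width - len(y))

    result, carry = 0, 0
    for k in range(2 * width - 1):
        # Column k collects every cross product x[i] * y[k - i].
        lo = max(0, k - width + 1)
        hi = min(k, width - 1)
        column = carry + sum(x[i] * y[k - i] for i in range(lo, hi + 1))
        carry, digit = divmod(column, base)
        result += digit * base ** k
    # Whatever carry remains belongs above the last column.
    result += carry * base ** (2 * width - 1)
    return result
```

With `base=2` the digits are bits, which is the case closest to a binary multiplier; with `base=10` the same loop reproduces the pen-and-paper form of the method.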
Flexible and efficient IR using array databases
The Matrix Framework is a recent proposal by IR researchers to flexibly represent all important information retrieval models in a single multi-dimensional array framework. Computational support for exactly this framework is provided by the array database system SRAM (Sparse Relational Array Mapping) that works on top of a DBMS. Information retrieval models can be specified in its comprehension-based array query language, in a way that directly corresponds to the underlying mathematical formulas. SRAM efficiently stores sparse arrays in (compressed) relational tables and translates and optimizes array queries into relational queries. In this work, we describe a number of array query optimization rules and demonstrate their effect on text retrieval in the TREC TeraByte track (TREC-TB) efficiency task, using the Okapi BM25 model as our example. It turns out that these optimization rules enable SRAM to automatically translate the BM25 array queries into the relational equivalent of inverted list processing including compression, score materialization and quantization, such as employed by custom-built IR systems. The use of the high-performance MonetDB/X100 relational backend, which provides transparent database compression, allows the system to achieve very fast response times with good precision and low resource usage.
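For readers unfamiliar with the Okapi BM25 model used as the running example, the following minimal Python sketch scores tokenized documents against a query. It is plain Python, not SRAM's array query language. The function name `bm25_scores`, the parameter defaults `k1=1.2` and `b=0.75`, and the non-negative IDF variant with `+ 1.0` inside the logarithm are assumptions chosen for illustration, not details taken from the paper.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Return one BM25 score per document.

    `query` is a list of terms and `docs` is a non-empty list of token lists.
    k1 and b are commonly used defaults, assumed here for illustration.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)  # term frequencies within this document
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Non-negative IDF variant (assumption; other variants exist).
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Length-normalized term-frequency saturation.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

A document that shares no term with the query scores exactly 0.0, which is why inverted-list processing only needs to touch documents that appear in the lists of the query terms.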
MD-VQA: Multi-Dimensional Quality Assessment for UGC Live Videos
User-generated content (UGC) live videos often suffer from various
distortions during capture and thus exhibit diverse visual
qualities. Such source videos are further compressed and transcoded by media
server providers before being distributed to end-users. Because of the
flourishing of UGC live videos, effective video quality assessment (VQA) tools
are needed to monitor and perceptually optimize live streaming videos in the
distributing process. In this paper, we address UGC Live VQA problems
by constructing a first-of-a-kind subjective UGC Live VQA database and
developing an effective evaluation tool. Concretely, 418 source UGC videos are
collected in real live streaming scenarios and 3,762 compressed ones at
different bit rates are generated for the subsequent subjective VQA
experiments. Based on the built database, we develop a
Multi-Dimensional VQA (MD-VQA)
evaluator to measure the visual quality of UGC live videos from semantic,
distortion, and motion aspects respectively. Extensive experimental results
show that MD-VQA achieves state-of-the-art performance on both our UGC Live VQA
database and existing compressed UGC VQA databases.
Comment: Accepted to CVPR202
Matemaattisen morfologian käyttö geometrisessa musiikinhaussa (Use of Mathematical Morphology in Geometric Music Retrieval)
The usual task in music information retrieval (MIR) is to find occurrences of a monophonic query pattern within a music database, which can contain both monophonic and polyphonic content. So-called query-by-humming systems are a well-known instance of content-based MIR. In such a system, the user's hummed query is converted into symbolic form to perform search operations in a similarly encoded database. The symbolic representation (e.g., textual, MIDI or vector data) is typically a quantized and simplified version of the sampled audio data, yielding faster search algorithms and space requirements that can be met in real-life situations.
In this thesis, we investigate geometric approaches to MIR. We first study some musicological properties often needed in MIR algorithms, and then give a literature review of traditional (e.g., string-matching-based) MIR algorithms and novel techniques based on geometry. We also introduce some concepts from digital image processing, namely mathematical morphology, which we use to develop and implement four algorithms for geometric music retrieval.
The symbolic representation in the case of our algorithms is a binary 2-D image. We use various morphological pre- and post-processing operations on the query and the database images to perform template matching and pattern recognition on the images. The algorithms are essentially extensions of the classic image correlation and hit-or-miss transformation techniques widely used in template matching applications. They aim to be a future extension to the retrieval engine of C-BRAHMS, which is a research project of the Department of Computer Science at the University of Helsinki.
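To make the hit-or-miss idea concrete, the following minimal Python sketch applies a classic hit-or-miss transform to a small binary image, read here as a piano roll with rows as pitch and columns as time. It is not one of the four algorithms of the thesis. The function name `hit_or_miss`, the offset-list encoding of the structuring elements, and the treatment of out-of-image pixels as background are all assumptions made for illustration.

```python
def hit_or_miss(image, hit, miss):
    """Return the (row, col) origins where the pattern occurs.

    `image` is a binary 2-D list. `hit` lists (d_row, d_col) offsets that
    must be 1 and `miss` lists offsets that must be 0, both relative to the
    candidate origin. Pixels outside the image count as 0 (assumption).
    """
    rows, cols = len(image), len(image[0])

    def pixel(r, c):
        # Everything outside the image is treated as background.
        return image[r][c] if 0 <= r < rows and 0 <= c < cols else 0

    matches = []
    for r in range(rows):
        for c in range(cols):
            hits_ok = all(pixel(r + dr, c + dc) == 1 for dr, dc in hit)
            misses_ok = all(pixel(r + dr, c + dc) == 0 for dr, dc in miss)
            if hits_ok and misses_ok:
                matches.append((r, c))
    return matches


# Toy piano roll: rows are pitches, columns are time steps.
roll = [
    [1, 0, 0, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 0, 1],
]
# Query: a note followed one step later by a note one row down,
# with no repetition of the first pitch in between.
print(hit_or_miss(roll, hit=[(0, 0), (1, 1)], miss=[(0, 1)]))
# prints [(0, 0), (1, 3)]
```

Because the pattern is tested at every origin, the same two-note figure is found at both time and pitch shifts, which is the property that makes image-based template matching attractive for symbolic music retrieval.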