Towards an All-Purpose Content-Based Multimedia Information Retrieval System
The growth of multimedia collections - in terms of size, heterogeneity, and
variety of media types - necessitates systems that are able to conjointly deal
with several forms of media, especially when it comes to searching for
particular objects. However, existing retrieval systems are organized in silos
and treat different media types separately. As a consequence, retrieval across
media types is either not supported at all or subject to major limitations. In
this paper, we present vitrivr, a content-based multimedia information
retrieval stack. As opposed to the keyword search approach implemented by most
media management systems, vitrivr makes direct use of the object's content to
facilitate different types of similarity search, such as Query-by-Example or
Query-by-Sketch, for and, most importantly, across different media types -
namely, images, audio, videos, and 3D models. Furthermore, we introduce a new
web-based user interface that enables easy-to-use, multimodal retrieval from
and browsing in mixed media collections. The effectiveness of vitrivr is shown
on the basis of a user study that involves different query and media types. To
the best of our knowledge, the full vitrivr stack is unique in that it is the
first multimedia retrieval system that seamlessly integrates support for four
different types of media. As such, it paves the way towards an all-purpose,
content-based multimedia information retrieval system.
A Highly Robust Audio Monitoring System for Radio Broadcasting
Proposing a novel approach for monitoring songs on radio broadcasting channels is very important to the interests of singers, writers, and musicians in the music industry. Singers, writers, and musicians hold intellectual property rights to their songs broadcast over all radio channels. According to these rights, they should be paid for their songs broadcast over all radio channels. We therefore propose a real-time audio monitoring approach to this problem, which includes our own audio recognition algorithm. A song is easy to recognize when the original high-quality recording is provided as input. But we cannot expect that kind of input from radio channels, since many transformations are possible before the audio reaches the listener: for example, added environmental effects such as noise, commercials overlaid on the song as watermarks, several songs played as a chain without any silence between them, partial playback of a song, and playback of the same song at various speeds. These transformations alter the distinguishing characteristics of a song and make the problem even more difficult. The algorithm we propose is resistant to noise and distortion and can recognize a short segment of a song broadcast over radio channels. At the end of processing, our system generates a descriptive report that includes the title, singer, writer, and composer of each song, the number of times it was played, and when it was played, for all songs in a given period across all radio broadcasting channels. We evaluated our system against various real-time scenarios and achieved a high overall accuracy (96%).
Spectrogram classification using dissimilarity space
In this work, we combine a Siamese neural network and different clustering techniques to generate a dissimilarity space that is then used to train a support vector machine (SVM) for automated animal audio classification. The animal audio datasets used are (i) bird sounds and (ii) cat sounds, both freely available. We exploit different clustering methods to reduce the spectrograms in the dataset to a number of centroids, which are used to generate the dissimilarity space through the Siamese network. Once computed, the dissimilarity space is used to generate a vector space representation of each pattern, which is then fed into an SVM that classifies a spectrogram by its dissimilarity vector. Our study shows that the proposed approach based on dissimilarity space performs well on both classification problems without ad hoc optimization of the clustering methods. Moreover, results show that the fusion of CNN-based approaches applied to the animal audio classification problem works better than the stand-alone CNNs.
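The mapping into a dissimilarity space described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract computes dissimilarities with a trained Siamese network, whereas this sketch swaps in plain Euclidean distance as a stand-in, and the function names and toy data are hypothetical.

```python
import math


def euclidean(a, b):
    """Stand-in dissimilarity (assumption): the abstract uses a Siamese network here."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dissimilarity_vector(pattern, centroids, dissimilarity=euclidean):
    """Represent a pattern by its dissimilarity to each centroid.

    The resulting vector has one entry per centroid and is what a
    classifier such as an SVM would be trained on.
    """
    return [dissimilarity(pattern, c) for c in centroids]


# Hypothetical toy data: two centroids standing in for clustered spectrograms.
centroids = [[0.0, 0.0], [3.0, 4.0]]
pattern = [3.0, 0.0]
vector = dissimilarity_vector(pattern, centroids)
print(vector)
```

The dimensionality of the new representation equals the number of centroids, which is why the clustering step controls the size of the feature vector passed to the classifier.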
Review of algorithms, methods, and techniques for the detection of UAVs and UAS in audio, radio-frequency, and video applications
Unmanned Aerial Vehicles (UAVs), also known as drones, have evolved exponentially in recent times, due in large part to advances in the technologies that underpin them. This has resulted in increasingly affordable and better-equipped devices, which in turn has opened up new fields of application such as agriculture, transport, monitoring, and aerial photography. However, drones have also been used in terrorist acts, privacy violations, and espionage, and have caused unintended accidents in high-risk zones such as airports. In response to these events, multiple technologies have been introduced to control and monitor the airspace in order to ensure protection in risk areas. This paper reviews the state of the art of the techniques, methods, and algorithms used in video, radio-frequency, and audio-based applications to detect UAVs and Unmanned Aircraft Systems (UAS). This study can serve as a starting point for developing future drone detection systems with the most suitable technologies to meet given requirements of optimal scalability, portability, reliability, and availability.
Landmark Based Audio Fingerprinting for Naval Vessels
This paper presents a novel landmark-based audio fingerprinting algorithm for matching naval vessels' acoustic signatures. The algorithm incorporates a joint time-frequency approach with parameters optimized for application to acoustic signatures of naval vessels. The technique exploits the relative time difference between neighboring frequency onsets, which is found to remain consistent across different samples originating over time from the same vessel. The algorithm has been implemented in MATLAB and trialed with real acoustic signatures of submarines. The training and test samples of submarines have been acquired from resources provided by San Francisco National Park Association [14]. Storage requirements to populate the database with 500 tracks, allowing a maximum of 0.5 million feature hashes per track, remained below 1 GB. On an average PC, the database hash table can be populated with feature hashes of database tracks at 1250 hashes/second, achieving conversion of 120 seconds of audio data into hashes in less than a second. Under varying attributes such as time skew, noise, and sample length, the results demonstrate the algorithm's robustness in identifying a correct match. Experimental results show a classification rate of 94% using the proposed approach, which is a considerable improvement over the 88% achieved by [17] employing existing state-of-the-art techniques such as Detection Envelope Modulation On Noise (DEMON) [15] and Low Frequency Analysis and Recording (LOFAR) [16].
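The idea of hashing the relative time difference between neighboring frequency onsets can be sketched as follows. This is a generic landmark-fingerprinting illustration in Python, not the paper's MATLAB implementation: the onset lists, the `fan_out` and `max_dt` parameters, the track names, and the voting scheme are all assumptions made for the example.

```python
from collections import Counter


def landmark_hashes(onsets, fan_out=3, max_dt=50):
    """Pair each onset with up to `fan_out` later onsets.

    `onsets` is a time-sorted list of (time_frame, frequency_bin) tuples.
    Each pair yields a key (f1, f2, dt) that depends only on the relative
    time difference, plus the anchor time t1.
    """
    hashes = []
    for i, (t1, f1) in enumerate(onsets):
        for t2, f2 in onsets[i + 1 : i + 1 + fan_out]:
            dt = t2 - t1
            if 0 < dt <= max_dt:
                hashes.append(((f1, f2, dt), t1))
    return hashes


def build_index(tracks):
    """Map each hash key to the (track_id, anchor_time) pairs that produced it."""
    index = {}
    for track_id, onsets in tracks.items():
        for key, t in landmark_hashes(onsets):
            index.setdefault(key, []).append((track_id, t))
    return index


def best_match(index, query_onsets):
    """Vote for (track, time offset) pairs; a true match piles up on one offset."""
    votes = Counter()
    for key, t_query in landmark_hashes(query_onsets):
        for track_id, t_db in index.get(key, []):
            votes[(track_id, t_db - t_query)] += 1
    if not votes:
        return None
    (track_id, _offset), _count = votes.most_common(1)[0]
    return track_id


# Hypothetical onset data for two tracks.
tracks = {
    "vessel_a": [(0, 10), (4, 22), (9, 15), (15, 30), (21, 12)],
    "vessel_b": [(0, 11), (6, 19), (8, 27), (17, 14), (25, 33)],
}
index = build_index(tracks)

# Query: an excerpt of vessel_a, shifted by 100 frames.
query = [(t + 100, f) for t, f in tracks["vessel_a"][1:]]
match = best_match(index, query)
print(match)
```

Because the keys encode only time differences, shifting the query in time leaves the hashes unchanged; the shift shows up solely in the offset that the votes agree on.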