13,560 research outputs found

A neural network approach to audio-assisted movie dialogue detection

Author: Alatan, Birge, Constantine Kotropoulos, Emmanouil Benetos, Freund, Hosmer, Ioannis Pitas, Jelinek, Kotti, Král, Lehane, Margarita Kotti, Papoulis, Platt, Reiss, Stoica, Trelea, Webb, Zhai
Publication venue: 'Elsevier BV'
Publication date: 01/01/2007

A novel framework for audio-assisted dialogue detection based on indicator functions and neural networks is investigated. An indicator function specifies whether an actor is present at a particular time instant. The cross-correlation function of a pair of indicator functions and the magnitude of the corresponding cross-power spectral density are fed as input to neural networks for dialogue detection. Several types of artificial neural networks, including multilayer perceptrons, voted perceptrons, radial basis function networks, support vector machines, and particle swarm optimization-based multilayer perceptrons, are tested. Experiments are carried out to validate the feasibility of the aforementioned approach by using ground-truth indicator functions determined by human observers on 6 different movies. A total of 41 dialogue instances and another 20 non-dialogue instances are employed. The average detection accuracy achieved is high, ranging between 84.78%±5.499% and 91.43%±4.239%.
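The two input features named in this abstract can be illustrated with a minimal sketch. It assumes binary indicator sequences sampled at uniform time instants and estimates the cross-power spectral density as the DFT of the full cross-correlation. The function name `dialogue_features` and the toy sequences are hypothetical and are not taken from the paper.

```python
import numpy as np

def dialogue_features(ind_a, ind_b):
    """Cross-correlation and cross-power spectral density magnitude of two
    binary indicator sequences (1 = actor present at that time instant)."""
    a = np.asarray(ind_a, dtype=float)
    b = np.asarray(ind_b, dtype=float)
    # Full linear cross-correlation over all lags -(N-1) .. (N-1);
    # index N-1 of the result corresponds to zero lag.
    xcorr = np.correlate(a, b, mode="full")
    # Assumption: the cross-power spectral density is estimated as the DFT
    # of the cross-correlation, and only its magnitude is kept.
    cpsd_mag = np.abs(np.fft.fft(xcorr))
    return xcorr, cpsd_mag

# Hypothetical toy data: two actors who alternate and never overlap.
actor_a = [1, 1, 0, 0, 1, 1, 0, 0]
actor_b = [0, 0, 1, 1, 0, 0, 1, 1]
xcorr, cpsd_mag = dialogue_features(actor_a, actor_b)
```

In a pipeline like the one described, `xcorr` and `cpsd_mag` would be concatenated into the feature vector passed to a classifier. How the paper scales or truncates them is not stated in the abstract.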

CiteSeerX

City Research Online

Crossref

Spiral - Imperial College Digital Repository

Autoencoding the Retrieval Relevance of Medical Images

Author: Camlica Zehra, Khalvati Farzad, Tizhoosh H. R.
Publication date: 05/07/2015

Content-based image retrieval (CBIR) of medical images is a crucial task that can contribute to a more reliable diagnosis if applied to big data. Recent advances in feature extraction and classification have enormously improved CBIR results for digital images. However, considering the increasing accessibility of big data in medical imaging, we are still in need of reducing both memory requirements and computational expenses of image retrieval systems. This work proposes to exclude the features of image blocks that exhibit a low encoding error when learned by a $n/p/n$ autoencoder ($p<n$). We examine the histogram of autoencoding errors of image blocks for each image class to facilitate the decision which image regions, or roughly what percentage of an image, shall be declared relevant for the retrieval task. This leads to a reduction of feature dimensionality and speeds up the retrieval process. To validate the proposed scheme, we employ local binary patterns (LBP) and support vector machines (SVM), which are both well-established approaches in the CBIR research community. We also use the IRMA dataset with 14,410 x-ray images as test data. The results show that the dimensionality of annotated feature vectors can be reduced by up to 50%, resulting in speedups greater than 27% at the expense of less than 1% decrease in the accuracy of retrieval when validating the precision and recall of the top 20 hits. Comment: To appear in proceedings of The 5th International Conference on Image Processing Theory, Tools and Applications (IPTA'15), Nov 10-13, 2015, Orleans, France
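The block-selection idea in this abstract can be sketched as follows. As a stand-in for a trained n/p/n autoencoder, the sketch uses a rank-p PCA projection, which is the optimum of a linear bottleneck. This is an assumption and not the authors' network. The names `block_errors` and `relevant_blocks`, the quantile threshold, and the random toy blocks are all hypothetical.

```python
import numpy as np

def block_errors(blocks, p):
    """Per-block reconstruction error of a linear n/p/n bottleneck.

    blocks: array of shape (num_blocks, n), one flattened image block per row.
    A rank-p PCA projection stands in for a trained autoencoder (assumption).
    """
    X = np.asarray(blocks, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of vt are principal directions; the first p act as encoder/decoder.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    W = vt[:p]                          # shape (p, n)
    recon = Xc @ W.T @ W + mean         # encode to p dims, decode back to n
    return np.mean((X - recon) ** 2, axis=1)

def relevant_blocks(errors, keep_fraction=0.5):
    """Boolean mask keeping the blocks with the highest encoding error.

    Blocks with a low error are excluded, following the abstract; choosing
    the cut by a quantile of the error histogram is an assumption.
    """
    threshold = np.quantile(errors, 1.0 - keep_fraction)
    return errors >= threshold

rng = np.random.default_rng(0)
blocks = rng.normal(size=(40, 16))      # hypothetical: 40 blocks of 4x4 pixels
errors = block_errors(blocks, p=4)
mask = relevant_blocks(errors, keep_fraction=0.5)
```

Features such as LBP histograms would then be computed only for blocks where `mask` is true, which is where the dimensionality reduction comes from.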

arXiv.org e-Print Archive

Crossref

Image Quantification Learning Technique through Content based Image Retrieval

Author: Dr. R. Usha Rani
Publication venue: Auricle Global Society of Education and Research
Publication date: 31/01/2018

This paper proposes incorporating radial basis functions into learning the quantification of images using content-based image retrieval (CBIR). The approach examines the effectiveness of a multi-layer perceptron (MLP) variant, the radial basis function (RBF) network, for CBIR. Image features are extracted by framing the numeric value of each pixel into a fixed input data set, for which the neural network's prediction accuracy is reported. The paper argues that the accuracy of the network output improves when neuron weights are adjusted through an RBF layer whose units are obtained by k-means clustering, with each cluster parameterized by a Gaussian function. Finally, the actual output is compared with the observed output to drive further weight adjustment toward an accurate output. The work presents this as a new direction for combining neural network technology with image processing.
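A minimal sketch of an RBF network of the kind described here: centers from k-means, Gaussian hidden units, and output weights fitted against targets. The abstract does not say how the weights are updated, so a least-squares fit is swapped in as an assumption. All function names, the shared width `sigma`, and the XOR toy data are hypothetical.

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns the k cluster centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):     # leave empty clusters where they are
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def rbf_features(X, centers, sigma):
    """Gaussian activation of every sample with respect to every center."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def design_matrix(X, centers, sigma):
    """Hidden-layer activations plus a constant bias column."""
    return np.hstack([rbf_features(X, centers, sigma), np.ones((len(X), 1))])

def fit_rbf(X, y, k, sigma):
    centers = kmeans(X, k)
    # Assumption: output weights are solved by least squares rather than by
    # the iterative weight adjustment the abstract alludes to.
    w, *_ = np.linalg.lstsq(design_matrix(X, centers, sigma), y, rcond=None)
    return centers, w

def predict_rbf(X, centers, w, sigma):
    return design_matrix(X, centers, sigma) @ w

# Hypothetical toy problem (XOR), which a single linear unit cannot fit.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0.0, 1.0, 1.0, 0.0])
centers, w = fit_rbf(X, y, k=4, sigma=1.0)
pred = predict_rbf(X, centers, w, sigma=1.0)
```

For image data, each row of `X` would be a feature vector built from pixel values, and `k` and `sigma` would need tuning. Neither value is given in the abstract.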

International Journal on Future Revolution in Computer Science & Communication Engineering

A Comprehensive Review on Multimedia Retrieval Techniques

Author: Neha Gupta
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 30/06/2015

Abstract: With the prevalence of multimedia technologies and web media, users are no longer satisfied with customary techniques for data retrieval. As a result, content-based image retrieval is becoming a new and fast method of data retrieval. Content-based image retrieval is the technique of retrieving data, especially images, from a large collection of databases. The retrieval is carried out using features. Content Based Image Retrieval (CBIR) is a technique for organizing a wide variety of images by their visual features. Several feature-based retrieval techniques are available for retrieving images, and this review surveys them. The first section addresses some basics of a CBIR system and presents fundamental features of an image, such as shape, texture, and color, along with different techniques for computing them. The review also presents different distance measures used for similarity estimation between images and discusses indexing methods. Finally, conclusions and future scope are discussed. DOI: 10.17762/ijritcc2321-8169.15061
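The feature-plus-distance pattern this review describes can be sketched with a simple example. The abstract does not name specific features or measures, so the gray-level histogram and the Euclidean and Manhattan distances below are illustrative assumptions. The function names and the constant-valued toy images are hypothetical.

```python
import numpy as np

def gray_histogram(image, bins=8):
    """Normalized gray-level histogram used as a simple visual feature."""
    hist, _ = np.histogram(image, bins=bins, range=(0, 256))
    return hist / hist.sum()

def rank_by_distance(query_feat, db_feats, metric="euclidean"):
    """Return database indices sorted from most to least similar,
    together with the distance of every database entry to the query."""
    diff = db_feats - query_feat
    if metric == "euclidean":
        dist = np.sqrt((diff ** 2).sum(axis=1))
    elif metric == "manhattan":
        dist = np.abs(diff).sum(axis=1)
    else:
        raise ValueError(f"unknown metric: {metric}")
    return np.argsort(dist), dist

# Hypothetical toy database: three flat images with different gray levels.
database = [np.full((8, 8), 10), np.full((8, 8), 100), np.full((8, 8), 200)]
db_feats = np.array([gray_histogram(img) for img in database])

# The query falls into the same histogram bin as the second database image.
query = np.full((8, 8), 110)
order, dist = rank_by_distance(gray_histogram(query), db_feats)
```

Here `order[0]` is the index of the closest database image. A real system would combine several features (shape, texture, color) and add an index structure so it does not have to scan the whole database.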

International Journal on Recent and Innovation Trends in Computing and Communication