Could Multimedia approaches help Remote Sensing Analysis?

Author: Ben Hamida Amina
Ben-Amar Chokri
Benoit Alexandre
Lambert Patrick
Publication venue: HAL CCSD
Publication date: 29/10/2015

The paper explores how multimedia approaches used in image understanding tasks could be adapted for remote sensing image analysis. Two approaches are investigated: the classical Bag of Visual Words (BoVW) approach and the Deep Learning approach. Tests on the classification of the UC Merced Land Use Dataset yield better results than the state of the art.

Hal - Université Grenoble Alpes

HAL Université de Savoie

Revising Knowledge Discovery for Object Representation with Spatio-Semantic Feature Integration

Author: Madhuri B. Dhas, Prof. S. A. Shinde
Publication venue: Auricle Technologies, Pvt., Ltd.
Publication date: 30/06/2015

In large social networks, web objects are becoming increasingly popular. Multimedia object classification and representation is a necessary step in multimedia information retrieval. Indexing and organizing these web objects supports convenient browsing and search, and helps reveal interesting patterns among the objects. For all these tasks, classifying the web objects into manipulable semantic categories is an essential procedure. One important issue in object classification is the representation of images. To perform supervised classification tasks, knowledge is extracted from unlabeled objects through unsupervised learning. To represent images in a more meaningful and effective way than the basic Bag-of-words (BoW) model, a novel image representation model called Bag-of-visual phrases (BoP) is used. In this model, visual words are obtained using hierarchical clustering and visual phrases are generated by a vector classifier of visual words. To obtain the spatio-semantic correlation knowledge, the frequently co-occurring pairs are calculated from the visual vocabulary. After successful object representation, the tags, comments, and descriptions of web objects are separated using the most likelihood method. The spatial and semantic differentiation power of image features can be enhanced via this BoP model and likelihood method. DOI: 10.17762/ijritcc2321-8169.15065
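The co-occurrence step mentioned in this abstract can be illustrated with a minimal sketch. It assumes each image has already been reduced to a list of integer visual-word IDs and treats "co-occurring" as "present in the same image"; the spatial criterion used by the paper is not given in the abstract. The names `frequent_word_pairs` and `min_count` are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations

def frequent_word_pairs(images, min_count=2):
    """Count unordered visual-word pairs that appear together in an image.

    `images` is a list of lists of visual-word IDs (an assumed input format).
    Only pairs seen in at least `min_count` images are returned.
    """
    counts = Counter()
    for words in images:
        # Count each distinct word once per image; sorting makes (a, b) == (b, a).
        for pair in combinations(sorted(set(words)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}

# Three toy images described by visual-word IDs.
images = [[3, 7, 7, 1], [7, 3, 9], [1, 9, 4]]
pairs = frequent_word_pairs(images)
```

Here only words 3 and 7 share more than one image, so they form the single candidate visual phrase in this toy vocabulary.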

International Journal on Recent and Innovation Trends in Computing and Communication

Bag-of-Features Image Indexing and Classification in Microsoft SQL Server Relational Database

Author: Korytkowski Marcin
Scherer Rafal
Staszewski Pawel
Woldan Piotr
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 25/06/2015

This paper presents a novel relational database architecture aimed at visual object classification and retrieval. The framework is based on the bag-of-features image representation model combined with Support Vector Machine classification and is integrated in a Microsoft SQL Server database. Comment: 2015 IEEE 2nd International Conference on Cybernetics (CYBCONF), Gdynia, Poland, 24-26 June 201
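As a rough illustration of the bag-of-features representation named in this abstract, the sketch below assigns each local descriptor to its nearest codeword and returns a normalized histogram. It assumes a codebook is already available (for example from clustering) and uses squared Euclidean distance; the function name `bof_histogram` is hypothetical. The SVM classifier and the SQL Server integration described by the paper are not covered.

```python
def bof_histogram(descriptors, codebook):
    """Return the normalized histogram of nearest-codeword assignments."""
    counts = [0] * len(codebook)
    for d in descriptors:
        # Squared Euclidean distance from this descriptor to every codeword.
        dists = [sum((x - c) ** 2 for x, c in zip(d, word)) for word in codebook]
        counts[dists.index(min(dists))] += 1
    total = sum(counts)
    # An image with no descriptors maps to an all-zero histogram.
    return [n / total for n in counts] if total else [0.0] * len(codebook)

# Toy 2-D descriptors and a three-word codebook (illustrative values only).
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, -0.2), (0.9, 1.2), (1.1, 0.8), (4.0, 6.0)]
hist = bof_histogram(descriptors, codebook)
```

The resulting fixed-length vector is what a classifier such as an SVM would take as input, regardless of how many local descriptors the image produced.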

arXiv.org e-Print Archive

Crossref

Learning Local Feature Aggregation Functions with Backpropagation

Author: csurka
krizhevsky
liu
moosmann
perronnin
simonyan
soomro
Publication date: 26/06/2017

This paper introduces a family of local feature aggregation functions and a novel method to estimate their parameters, such that they generate optimal representations for classification (or any task that can be expressed as a cost function minimization problem). To achieve that, we compose the local feature aggregation function with the classifier cost function and we backpropagate the gradient of this cost function in order to update the local feature aggregation function parameters. Experiments on synthetic datasets indicate that our method discovers parameters that model the class-relevant information in addition to the local feature space. Further experiments on a variety of motion and visual descriptors, both on image and video datasets, show that our method outperforms other state-of-the-art local feature aggregation functions, such as Bag of Words, Fisher Vectors and VLAD, by a large margin. Comment: In Proceedings of the 25th European Signal Processing Conference (EUSIPCO 2017
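The idea of composing an aggregation function with a classifier cost and backpropagating into the aggregation parameters can be sketched on a toy problem. This is not the paper's family of functions: it assumes scalar local features, a single Gaussian-shaped soft count with one learnable centre `mu`, a logistic classifier with parameters `a` and `c`, and hand-derived gradients. All names and values are illustrative.

```python
import math

def aggregate(bag, mu):
    """Soft count of local features near a learnable centre `mu`."""
    return sum(math.exp(-(x - mu) ** 2) for x in bag) / len(bag)

def loss_and_grads(bags, labels, mu, a, c):
    """Mean cross-entropy of sigmoid(a * aggregate(bag, mu) + c) and its gradients."""
    loss = g_mu = g_a = g_c = 0.0
    for bag, y in zip(bags, labels):
        z = aggregate(bag, mu)
        p = 1.0 / (1.0 + math.exp(-(a * z + c)))
        loss -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
        err = p - y  # derivative of the loss w.r.t. the classifier input
        dz_dmu = sum(2.0 * (x - mu) * math.exp(-(x - mu) ** 2) for x in bag) / len(bag)
        g_mu += err * a * dz_dmu  # chain rule through the aggregation function
        g_a += err * z
        g_c += err
    n = len(bags)
    return loss / n, g_mu / n, g_a / n, g_c / n

# Two toy bags of scalar local features: one positive, one negative.
bags = [[1.8, 2.2, 2.0], [-2.1, -1.9, -2.0]]
labels = [1, 0]
mu, a, c = 0.5, 1.0, 0.0
initial_loss = loss_and_grads(bags, labels, mu, a, c)[0]
for _ in range(100):  # plain gradient descent with a fixed step of 0.1
    _, g_mu, g_a, g_c = loss_and_grads(bags, labels, mu, a, c)
    mu, a, c = mu - 0.1 * g_mu, a - 0.1 * g_a, c - 0.1 * g_c
final_loss = loss_and_grads(bags, labels, mu, a, c)[0]
```

The point of the sketch is that `mu` is updated by the classification loss itself, so the aggregation centre moves toward the features that separate the classes instead of being fixed in advance by unsupervised clustering.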

arXiv.org e-Print Archive

Crossref

Action Recognition in Videos: from Motion Capture Labs to the Web

Author: Ana Paula Br
Arnaldo Albuquerque De Araújo
De Almeida
Eduardo Alves
Jussara Marques
Publication date: 17/06/2010

This paper presents a survey of human action recognition approaches based on visual data recorded from a single video camera. We propose an organizing framework which highlights the evolution of the area, with techniques moving from heavily constrained motion capture scenarios towards more challenging, realistic, "in the wild" videos. The proposed organization is based on the representation used as input for the recognition task, emphasizing the hypotheses assumed and thus the constraints imposed on the type of video that each technique is able to address. Making the hypotheses and constraints explicit makes the framework particularly useful for selecting a method, given an application. Another advantage of the proposed organization is that it allows the newest approaches to be categorized seamlessly alongside traditional ones, while providing an insightful perspective on the evolution of the action recognition task up to now. That perspective is the basis for the discussion at the end of the paper, where we also present the main open issues in the area. Comment: Preprint submitted to CVIU, survey paper, 46 pages, 2 figures, 4 table

arXiv.org e-Print Archive

CiteSeerX