616 research outputs found
Kernel Discriminant Analysis Using Triangular Kernel for Semantic Scene Classification
Semantic scene classification is a challenging research problem that aims to categorise images into semantic classes such as beaches, sunsets or mountains. This problem can be formulated as a multi-label classification problem in which an image can belong to more than one conceptual class, such as sunsets and beaches, at the same time. Recently, Kernel Discriminant Analysis combined with spectral regression (SR-KDA) has been successfully used for face, text and spoken letter recognition. However, the SR-KDA method works only with positive definite symmetric matrices. In this paper, we modify this method to support both definite and indefinite symmetric matrices. The main idea is to use LDL^T decomposition instead of Cholesky decomposition. The modified SR-KDA is applied to a scene database involving 6 concepts. We validate the advocated approach and demonstrate that it yields significant performance gains when a conditionally positive definite triangular kernel is used instead of positive definite symmetric kernels such as linear, polynomial or RBF. The results also indicate performance gains when compared with the state-of-the-art multi-label methods for semantic scene classification.
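The switch from Cholesky to LDL^T described above can be sketched in a few lines. This is only an illustration, not the authors' implementation: the kernel form k(x, y) = -||x - y||^p and the helper names `triangular_kernel` and `solve_symmetric_ldl` are assumptions, and the sketch solves a plain linear system K A = B rather than the full SR-KDA pipeline.

```python
import numpy as np
from scipy.linalg import ldl, solve_triangular


def triangular_kernel(X, Y, p=1.0):
    # Assumed form k(x, y) = -||x - y||^p; the abstract does not state the formula.
    dists = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)
    return -(dists ** p)


def solve_symmetric_ldl(K, B):
    # Factor K = L D L^T. Unlike Cholesky, this also works when K is indefinite.
    L, D, perm = ldl(K, lower=True)
    Lp = L[perm, :]  # the row-permuted factor is lower triangular
    Y = solve_triangular(Lp, B[perm], lower=True)  # solves L Y = B
    Z = np.linalg.solve(D, Y)  # D is block diagonal (1x1 and 2x2 blocks)
    W = solve_triangular(Lp.T, Z, lower=False)  # solves Lp^T W = Z
    A = np.empty_like(W)
    A[perm] = W  # undo the row permutation
    return A


# Hypothetical data: 20 points in 3-D and two response columns.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 3))
K_demo = triangular_kernel(X_demo, X_demo)
B_demo = rng.normal(size=(20, 2))
A_demo = solve_symmetric_ldl(K_demo, B_demo)
```

The kernel matrix here has a zero diagonal, so it is not positive definite and a Cholesky factorisation fails on it, while the LDL^T route still returns a solution.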
Generic object classification for autonomous robots
One of the main problems in the interaction of autonomous robots is knowledge of the scene. Recognition is fundamental to solving this problem and to allowing robots to interact in uncontrolled environments. In this paper, we present a practical application of object capture, normalization and classification of triangular and circular signs. The system is integrated into Sony's Aibo robot to improve the robot's interaction behaviour. The presented methodology has been tested in simulations and in real categorization problems, such as traffic sign classification, with very promising results.
Note: This document originally contains other material and/or software that can only be consulted at the Biblioteca de Ciència i Tecnologia.
Towards Effective Codebookless Model for Image Classification
The bag-of-features (BoF) model for image classification has been thoroughly
studied over the last decade. Unlike the widely used BoF methods, which
model images with a pre-trained codebook, the alternative codebook-free image
modeling method, which we call the Codebookless Model (CLM), has attracted little
attention. In this paper, we present an effective CLM that represents an image
with a single Gaussian for classification. By embedding Gaussian manifold into
a vector space, we show that the simple incorporation of our CLM into a linear
classifier achieves very competitive accuracy compared with state-of-the-art
BoF methods (e.g., Fisher Vector). Since our CLM lies in a high dimensional
Riemannian manifold, we further propose a joint learning method of low-rank
transformation with support vector machine (SVM) classifier on the Gaussian
manifold, in order to reduce computational and storage cost. To study and
alleviate the side effect of background clutter on our CLM, we also present a
simple yet effective partial background removal method based on saliency
detection. Experiments are extensively conducted on eight widely used databases
to demonstrate the effectiveness and efficiency of our CLM method.
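The idea of describing an image by a single Gaussian and mapping it into a vector space can be sketched as follows. This is a rough illustration under stated assumptions, not the embedding used in the paper: it swaps in a simple log-Euclidean style mapping (mean concatenated with the matrix logarithm of the covariance), and the function name `gaussian_embedding` and the regulariser `eps` are hypothetical.

```python
import numpy as np


def gaussian_embedding(features, eps=1e-3):
    """Map a set of local descriptors (n x d) to one fixed-length vector."""
    mu = features.mean(axis=0)
    centered = features - mu
    sigma = centered.T @ centered / max(len(features) - 1, 1)
    sigma = sigma + eps * np.eye(sigma.shape[0])  # keep the covariance positive definite
    # Matrix logarithm of a symmetric positive definite matrix via eigendecomposition.
    w, V = np.linalg.eigh(sigma)
    log_sigma = (V * np.log(w)) @ V.T
    # Vectorise the symmetric matrix; off-diagonal entries are scaled by sqrt(2)
    # so that the Euclidean norm of the vector matches the Frobenius norm.
    iu = np.triu_indices(sigma.shape[0], k=1)
    return np.concatenate([mu, np.diag(log_sigma), np.sqrt(2.0) * log_sigma[iu]])


# Hypothetical input: 500 local descriptors of dimension 4 from one image.
rng = np.random.default_rng(0)
descriptors = rng.normal(size=(500, 4))
vector = gaussian_embedding(descriptors)
```

The resulting vector has length d + d(d+1)/2 and can be passed to any linear classifier. No codebook is trained at any point.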
Multi-Label Dimensionality Reduction
Multi-label learning, which deals with data associated with multiple labels simultaneously, is ubiquitous in real-world applications. To overcome the curse of dimensionality in multi-label learning, in this thesis I study multi-label dimensionality reduction, which extracts a small number of features by removing the irrelevant, redundant, and noisy information while considering the correlation among different labels in multi-label learning. Specifically, I propose Hypergraph Spectral Learning (HSL) to perform dimensionality reduction for multi-label data by exploiting correlations among different labels using a hypergraph. The regularization effect on the classical dimensionality reduction algorithm known as Canonical Correlation Analysis (CCA) is elucidated in this thesis. The relationship between CCA and Orthonormalized Partial Least Squares (OPLS) is also investigated. To perform dimensionality reduction efficiently for large-scale problems, two efficient implementations are proposed for a class of dimensionality reduction algorithms, including canonical correlation analysis, orthonormalized partial least squares, linear discriminant analysis, and hypergraph spectral learning. The first approach is a direct least squares approach which allows the use of different regularization penalties, but is applicable under a certain assumption; the second one is a two-stage approach which can be applied in the regularization setting without any assumption. Furthermore, an online implementation for the same class of dimensionality reduction algorithms is proposed when the data comes sequentially. A Matlab toolbox for multi-label dimensionality reduction has been developed and released. The proposed algorithms have been applied successfully in the Drosophila gene expression pattern image annotation. The experimental results on some benchmark data sets in multi-label learning also demonstrate the effectiveness and efficiency of the proposed algorithms.
Dissertation/Thesis. Ph.D. Computer Science 201
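For readers unfamiliar with regularized CCA, a minimal sketch is given here. It is a textbook whitening-plus-SVD formulation with a ridge term on both covariance matrices, not the least squares or two-stage implementation proposed in the thesis. The names `regularized_cca`, `inv_sqrt` and the parameter `reg` are assumptions made for illustration.

```python
import numpy as np


def inv_sqrt(S):
    # Inverse square root of a symmetric positive definite matrix.
    w, V = np.linalg.eigh(S)
    return (V / np.sqrt(w)) @ V.T


def regularized_cca(X, Y, reg=1e-3):
    """Return projections Wx, Wy and canonical correlations rho."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = Xc.T @ Xc / n + reg * np.eye(X.shape[1])  # ridge-regularized covariance
    Syy = Yc.T @ Yc / n + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / n
    Sxx_ih, Syy_ih = inv_sqrt(Sxx), inv_sqrt(Syy)
    # Singular values of the whitened cross-covariance are the canonical correlations.
    U, rho, Vt = np.linalg.svd(Sxx_ih @ Sxy @ Syy_ih, full_matrices=False)
    return Sxx_ih @ U, Syy_ih @ Vt.T, rho


# Hypothetical data: 200 samples, 5 features, 2 label columns.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5))
Y_demo = X_demo @ rng.normal(size=(5, 2))
Wx, Wy, rho = regularized_cca(X_demo, Y_demo, reg=1e-6)
```

In a multi-label setting, Y would typically be the binary label matrix, and the columns of Wx give the reduced feature space. Larger values of `reg` shrink the canonical correlations.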
Mining Semantically Consistent Patterns for Cross view data with CCA and CJFL
In real-world applications such as information retrieval and data classification, we often face the situation that the same semantic concept can be expressed using different views with similar information. It therefore becomes necessary for those applications to obtain Semantically Consistent Patterns (SCP) for cross-view data, which embed the complementary information from different views. However, eliminating heterogeneity among cross-view representations is a significant challenge in mining the SCP. Existing work has proposed the Isomorphic Relevant Redundant Transformation (IRRT) and the Correlation-based Joint Feature Learning (CJFL) method for mining SCP from cross-view data representations. Although the existing system uses IRRT to move from the low-level to the mid-level feature space, some redundant data and noise remain. The CJFL algorithm is used to remove redundant information and noise when moving from the mid-level to the high-level feature space. We use Canonical Correlation Analysis (CCA) in place of the more complex IRRT, which also falls short in removing noise and redundant information.
Survey on Mining Semantically Consistent Patterns for Cross-View Data
In real-world applications such as information retrieval and data classification, we often face the situation that similar information is represented by different views with different backgrounds. It therefore becomes necessary for those applications to obtain Semantically Consistent Patterns (SCP) for cross-view data, which embed the complementary information from different views. However, eliminating heterogeneity among cross-view representations is a significant challenge in mining the SCP. This paper reviews the research work on a general framework to discover the SCP for cross-view data.
DOI: 10.17762/ijritcc2321-8169.160411
A Clustering-guided Contrastive Fusion for Multi-view Representation Learning
The past two decades have seen increasingly rapid advances in the field of
multi-view representation learning because it extracts useful information from
diverse domains to facilitate the development of multi-view applications.
However, the community faces two challenges: i) how to learn representations
from a large amount of unlabeled data that are robust to noise and
incomplete-view settings, and ii) how to balance view consistency and
complementarity for various downstream tasks. To this end, we utilize a deep
fusion network to fuse view-specific representations into the view-common
representation, extracting high-level semantics for obtaining robust
representation. In addition, we employ a clustering task to guide the fusion
network to prevent it from leading to trivial solutions. To balance
consistency and complementarity, we then design an asymmetrical contrastive
strategy that aligns the view-common representation and each view-specific
representation. These modules are incorporated into a unified method known as
CLustering-guided cOntrastiVE fusioN (CLOVEN). We quantitatively and
qualitatively evaluate the proposed method on five datasets, demonstrating that
CLOVEN outperforms 11 competitive multi-view learning methods in clustering and
classification. In the incomplete view scenario, our proposed method resists
noise interference better than its competitors. Furthermore, the
visualization analysis shows that CLOVEN can preserve the intrinsic structure
of view-specific representation while also improving the compactness of
view-common representation. Our source code will be available soon at
https://github.com/guanzhou-ke/cloven.
Comment: 13 pages, 9 figures