6 research outputs found

    Discovering Multi-relational Latent Attributes by Visual Similarity Networks

    Abstract. The key problems in visual object classification are learning discriminative features that distinguish between two or more visually similar categories (e.g. dogs and cats), modeling the variation of visual appearance within instances of the same class (e.g. Dalmatian and Chihuahua in the same category of dogs), and tolerating imaging distortion (3D pose). In machine learning terminology these amount to within-class and between-class variance, but recent works have shown that these additional pieces of information, latent dependencies, benefit the learning process. A latent attribute space was recently proposed and verified to capture the latent dependencies between classes. Attributes can be annotated manually, but it is more tempting to extract them in an unsupervised manner. Clustering is one of the popular unsupervised approaches, and the recent literature introduces similarity measures that help to discover visual attributes by clustering. However, the latent attribute structure in real life is multi-relational: given two different sports cars in different poses versus a sports car and a family car in the same pose, which attribute should dominate similarity? Instead of clustering, a network (graph) containing multiple connections is a natural way to represent such multi-relational attributes between images. In light of this, we introduce an unsupervised framework for network construction based on pairwise visual similarities and experimentally demonstrate that the constructed network can be used to automatically discover multiple discrete (e.g. sub-classes) and continuous (pose change) latent attributes. Illustrative examples on public benchmark datasets verify that the proposed network captures multiple relations between images in an unsupervised manner.
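The abstract does not say how the network is built from the pairwise similarities. As a minimal sketch only, assuming a symmetric k-nearest-neighbor rule (an illustrative choice, not the authors' method), a similarity matrix can be turned into a graph as follows. The function name `build_knn_network` and the toy similarity values are hypothetical.

```python
def build_knn_network(similarity, k):
    """Return an undirected graph as {node: set(neighbors)}.

    Assumption for illustration: each image is linked to its k most similar
    other images, and the union of these links is kept, so a node may end up
    with more than k neighbors.
    """
    n = len(similarity)
    graph = {i: set() for i in range(n)}
    for i in range(n):
        # Rank all other images by decreasing similarity to image i.
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: similarity[i][j], reverse=True)
        for j in others[:k]:
            graph[i].add(j)
            graph[j].add(i)
    return graph


# Toy pairwise similarities for four images (made-up values in [0, 1]).
similarity = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
network = build_knn_network(similarity, k=1)
print(network)
```

Because a node can keep links to several neighbors, such a graph can hold more than one relation per image, which a hard cluster assignment cannot.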

    Unsupervised alignment of objects in images

    With the advent of computer vision, various applications have become interested in using it to interpret 3D and 2D scenes. At the core of computer vision is visual object detection, which deals with detecting and representing objects in an image. Visual object detection requires learning a model of each class (e.g. car, cat) in order to detect objects belonging to that class. Class learning benefits from a method that automatically aligns class examples, which makes learning more straightforward. The objective of this thesis is to further develop the state-of-the-art feature-based alignment method, which rigidly and automatically aligns object class images to a manually selected seed image. We try to compensate for the weakness of manual seed selection by providing a method that automatically selects the best seed from the dataset. Our method first extracts features using dense sampling, and then scale-invariant feature transform (SIFT) descriptors are used to find the best matches as initial local feature matches. The final alignment is based on a spatial scoring procedure in which the initial matches are refined to a set of spatially verified matches. The spatial score is then used to calculate similarity scores. We propose an algorithm that operates on the spatial and similarity scores and selects the best seed. We also investigate the performance of step-wise alignment using a minimum spanning tree (MST) and Dijkstra's shortest path instead of direct alignment to a single seed. We conduct our experiments on classes of Caltech-101, for which our unsupervised seed selection and step-wise alignment achieve state-of-the-art performance.
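The abstract gives no details of the seed-selection algorithm or of how the step-wise paths are built, so the following is only a rough sketch under stated assumptions: the seed is taken to be the image with the highest total similarity to all others, and the cost of aligning one image to another is taken to be `1 - similarity` with similarities in [0, 1]. Both rules, the function names `select_seed` and `alignment_paths`, and the toy values are hypothetical. Only the Dijkstra variant is shown, not the MST one.

```python
import heapq


def select_seed(similarity):
    """Pick the image with the highest total similarity to all others.

    This selection rule is an assumption made for illustration.
    """
    n = len(similarity)
    totals = [sum(similarity[i][j] for j in range(n) if j != i) for i in range(n)]
    return max(range(n), key=lambda i: totals[i])


def alignment_paths(similarity, seed):
    """Run Dijkstra from the seed over the complete image graph.

    Edge cost is assumed to be 1 - similarity, which is non-negative when
    similarities lie in [0, 1]. Returns {image: [image, ..., seed]}, the
    chain of intermediate images through which each image reaches the seed.
    """
    n = len(similarity)
    dist = [float("inf")] * n
    prev = [None] * n
    dist[seed] = 0.0
    heap = [(0.0, seed)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v in range(n):
            if v == u:
                continue
            nd = d + (1.0 - similarity[u][v])
            if nd < dist[v]:
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    paths = {}
    for target in range(n):
        path = [target]
        while path[-1] != seed:
            path.append(prev[path[-1]])
        paths[target] = path
    return paths


# Toy similarities: images 0 and 2 are dissimilar, but both resemble image 1.
similarity = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.9],
    [0.1, 0.9, 1.0],
]
seed = select_seed(similarity)
paths_from_seed = alignment_paths(similarity, seed)
paths_from_zero = alignment_paths(similarity, 0)
```

With image 0 forced as the seed, image 2 reaches it through image 1 rather than directly, which is the kind of step-wise route that direct alignment to a single seed cannot take.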

    Local Feature Based Unsupervised Alignment of Object Class Images
