    Automated Visual Fin Identification of Individual Great White Sharks

    This paper discusses the automated visual identification of individual great white sharks from dorsal fin imagery. We propose a computer vision photo ID system and report recognition results over a database of thousands of unconstrained fin images. To the best of our knowledge this line of work establishes the first fully automated contour-based visual ID system in the field of animal biometrics. The approach put forward appreciates shark fins as textureless, flexible and partially occluded objects with an individually characteristic shape. In order to recover animal identities from an image we first introduce an open contour stroke model, which extends multi-scale region segmentation to achieve robust fin detection. Secondly, we show that combinatorial, scale-space selective fingerprinting can successfully encode fin individuality. We then measure the species-specific distribution of visual individuality along the fin contour via an embedding into a global 'fin space'. Exploiting this domain, we finally propose a non-linear model for individual animal recognition and combine all approaches into a fine-grained multi-instance framework. We provide a system evaluation, compare results to prior work, and report performance and properties in detail. Comment: 17 pages, 16 figures. To be published in IJCV. Article replaced to update first author contact details and to correct a Figure reference on page

    Grounding semantics in robots for Visual Question Answering

    In this thesis I describe an operational implementation of an object detection and description system that is incorporated into an end-to-end Visual Question Answering system, and I evaluate it on two visual question answering datasets for compositional language and elementary visual reasoning.

    Multiresolution hierarchy co-clustering for semantic segmentation in sequences with small variations

    This paper presents a co-clustering technique that, given a collection of images and their hierarchies, clusters nodes from these hierarchies to obtain a coherent multiresolution representation of the image collection. We formalize the co-clustering as a Quadratic Semi-Assignment Problem and solve it with a linear programming relaxation approach that makes effective use of information from hierarchies. Initially, we address the problem of generating an optimal, coherent partition per image and, afterwards, we extend this method to a multiresolution framework. Finally, we particularize this framework to an iterative multiresolution video segmentation algorithm in sequences with small variations. We evaluate the algorithm on the Video Occlusion/Object Boundary Detection Dataset, showing that it produces state-of-the-art results in these scenarios. Comment: International Conference on Computer Vision (ICCV) 201

    Segmentation and semantic labelling of RGBD data with convolutional neural networks and surface fitting

    We present an approach for segmentation and semantic labelling of RGBD data that jointly exploits geometrical cues and deep learning techniques. An initial over-segmentation is performed using spectral clustering and a set of non-uniform rational B-spline surfaces is fitted on the extracted segments. Then a convolutional neural network (CNN) receives as input colour and geometry data together with surface fitting parameters. The network is made of nine convolutional stages followed by a softmax classifier and produces a vector of descriptors for each sample. In the next step, an iterative merging algorithm recombines the output of the over-segmentation into larger regions matching the various elements of the scene. Pairs of adjacent segments with higher similarity according to the CNN features are candidates for merging, and the surface fitting accuracy is used to detect which pairs of segments belong to the same surface. Finally, a set of labelled segments is obtained by combining the segmentation output with the descriptors from the CNN. Experimental results show that the proposed approach outperforms state-of-the-art methods and provides an accurate segmentation and labelling.

    Visual-hint Boundary to Segment Algorithm for Image Segmentation

    Image segmentation has been a very active research topic in the image analysis area. Currently, most image segmentation algorithms are designed on the idea that images are partitioned into a set of regions that are homogeneous within each region and inhomogeneous between regions. However, human visual intuition does not always follow this pattern. A new image segmentation method named Visual-Hint Boundary to Segment (VHBS) is introduced, which is more consistent with human perception. VHBS abides by two visual hint rules based on human perception: (i) the global scale boundaries tend to be the real boundaries of the objects; (ii) two adjacent regions with quite different colors or textures tend to result in real boundaries between them. Experiments demonstrate that, compared with traditional image segmentation methods, VHBS has better performance and also preserves higher computational efficiency. Comment: 45 page

    Plant image retrieval using color, shape and texture features

    We present a content-based image retrieval system for plant image retrieval, intended especially for the house plant identification problem. A plant image consists of a collection of overlapping leaves and possibly flowers, which makes the problem challenging. We studied the suitability of various well-known color, shape and texture features for this problem, and also introduced some new texture matching techniques and shape features. Feature extraction is applied after segmenting the plant region from the background using the max-flow min-cut technique. Results on a database of 380 plant images belonging to 78 different types of plants show the promise of the proposed new techniques and the overall system: in 55% of the queries, the correct plant image is retrieved among the top-15 results. Furthermore, the accuracy goes up to 73% when a 132-image subset of well-segmented plant images is considered.
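
    The "correct image among the top-15 results" figure quoted above is a top-k hit rate. The sketch below shows one way such a score could be computed. The function name, the argument layout and the toy labels are illustrative assumptions, not taken from the paper.

```python
def top_k_hit_rate(ranked_labels, true_labels, k=15):
    """Fraction of queries whose true label appears in the first k results.

    ranked_labels: one list per query, best match first (hypothetical layout).
    true_labels:   the correct label for each query, in the same order.
    """
    if len(ranked_labels) != len(true_labels):
        raise ValueError("one ranked list is needed per query")
    if not true_labels:
        raise ValueError("at least one query is required")
    hits = sum(
        1
        for ranked, truth in zip(ranked_labels, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


# Toy data, invented for illustration only.
ranked = [
    ["ficus", "fern", "ivy"],    # truth "fern" is at rank 2
    ["ivy", "orchid", "ficus"],  # truth "cactus" is absent
]
truth = ["fern", "cactus"]
rate_top2 = top_k_hit_rate(ranked, truth, k=2)
rate_top1 = top_k_hit_rate(ranked, truth, k=1)
```

    With k=2 only the first toy query counts as a hit. With k=1 neither does.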

    The Lagrangian description of aperiodic flows: a case study of the Kuroshio Current

    This article reviews several recently developed Lagrangian tools and shows how their combined use yields a detailed description of purely advective transport events in general aperiodic flows. In particular, because of the climate impact of ocean transport processes, we illustrate a 2D application on altimeter data sets over the area of the Kuroshio Current, although the proposed techniques are general and applicable to arbitrary time-dependent aperiodic flows. The first challenge in describing transport in aperiodic time-dependent flows is obtaining a representation of the phase portrait in which the most relevant dynamical features may be identified. This representation is accomplished by using global Lagrangian descriptors that, when applied for instance to the altimeter data sets, retrieve over the ocean surface a phase portrait where the geometry of interconnected dynamical systems is visible. The phase portrait picture is essential because it reveals which transport routes are acting on the whole flow. Once these routes are roughly recognised, it is possible to complete a detailed description by the direct computation of the finite time stable and unstable manifolds of special hyperbolic trajectories that act as organising centres of the flow. Comment: 40 pages, 24 figure
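
    The abstract does not say which Lagrangian descriptor is used. One common choice is the arc length of the trajectory through a point over a time window [t0 - tau, t0 + tau]. The sketch below computes that quantity with a fixed-step Runge-Kutta integrator. The saddle velocity field, the function names and the step count are assumptions made for illustration and stand in for the altimeter-derived velocities.

```python
import math


def saddle_flow(x, y, t):
    """Hypothetical stand-in velocity field: a steady saddle, u = x, v = -y."""
    return x, -y


def rk4_step(f, x, y, t, dt):
    """Advance a particle by one classical fourth-order Runge-Kutta step."""
    k1x, k1y = f(x, y, t)
    k2x, k2y = f(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y, t + 0.5 * dt)
    k3x, k3y = f(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y, t + 0.5 * dt)
    k4x, k4y = f(x + dt * k3x, y + dt * k3y, t + dt)
    x_new = x + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
    y_new = y + dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
    return x_new, y_new


def arc_length_descriptor(f, x0, y0, t0, tau, n_steps=200):
    """Arc length of the trajectory through (x0, y0) over [t0 - tau, t0 + tau].

    The path is integrated forward and backward from t0 and its length is
    approximated by summing the straight segments between successive steps.
    """
    total = 0.0
    for direction in (1.0, -1.0):
        x, y, t = x0, y0, t0
        dt = direction * tau / n_steps
        for _ in range(n_steps):
            x_new, y_new = rk4_step(f, x, y, t, dt)
            total += math.hypot(x_new - x, y_new - y)
            x, y, t = x_new, y_new, t + dt
    return total


# A particle on the unstable axis of the saddle, followed for one time unit
# in each direction; the exact arc length here is 2*sinh(1).
m_value = arc_length_descriptor(saddle_flow, 1.0, 0.0, 0.0, 1.0)
```

    Evaluating such a function on a grid of initial positions gives a scalar field. In the Lagrangian descriptor literature, sharp features of that field are used to locate stable and unstable manifolds.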

    Cumulative object categorization in clutter

    In this paper we present an approach based on scene- or part-graphs for geometrically categorizing touching and occluded objects. We use additive RGBD feature descriptors and hashing of graph configuration parameters for describing the spatial arrangement of constituent parts. The presented experiments show quantitatively that this method outperforms our earlier part-voting and sliding window classification. We evaluated our approach on cluttered scenes and on a 3D dataset containing over 15000 Kinect scans of over 100 objects, which were grouped into general geometric categories. Additionally, color, geometric, and combined features were compared for categorization tasks.