7 research outputs found

    Learning RGB-D descriptors of garment parts for informed robot grasping

    Robotic handling of textile objects in household environments is an emerging application that has recently received considerable attention thanks to the development of domestic robots. Most current approaches follow a multiple re-grasp strategy, in which clothes are sequentially grasped from different points until one of them yields a desired configuration. In this work we propose a vision-based method, built on the Bag of Visual Words approach, that combines appearance and 3D information to detect parts suitable for grasping in clothes, even when they are highly wrinkled. We also contribute a new, annotated garment part dataset that can be used for benchmarking classification, part detection, and segmentation algorithms. The dataset is used to evaluate our approach and several state-of-the-art 3D descriptors for the task of garment part detection. Results indicate that appearance is a reliable source of information, but that augmenting it with 3D information can help the method perform better on new clothing items. This research is partially funded by the Spanish Ministry of Science and Innovation under Project PAU+ DPI2011-2751, the EU Project IntellAct FP7-ICT2009-6-269959 and the ERA-Net Chistera Project ViSen PCIN-2013-047. A. Ramisa worked under the JAE-Doc grant from CSIC and FSE. Peer Reviewed
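
    The Bag of Visual Words encoding this abstract builds on can be sketched in a few lines: cluster local descriptors into a visual vocabulary, then represent each image as a normalized histogram of word assignments. The sketch below is a generic illustration with toy descriptors and a plain k-means, not the paper's actual pipeline (which also folds in depth information):

```python
import numpy as np

def build_vocabulary(descriptors, k, iters=10, seed=0):
    """Cluster local descriptors into k visual words with plain k-means."""
    rng = np.random.default_rng(seed)
    words = descriptors[rng.choice(len(descriptors), k, replace=False)]
    for _ in range(iters):
        # assign each descriptor to its nearest word, then recompute means
        dists = np.linalg.norm(descriptors[:, None] - words[None], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):
                words[j] = members.mean(axis=0)
    return words

def bovw_histogram(descriptors, words):
    """Encode an image's descriptors as a normalized word-frequency histogram."""
    dists = np.linalg.norm(descriptors[:, None] - words[None], axis=2)
    hist = np.bincount(dists.argmin(axis=1), minlength=len(words)).astype(float)
    return hist / hist.sum()

# toy data: two well-separated descriptor clusters in 8 dimensions
rng = np.random.default_rng(1)
descs = np.vstack([rng.normal(0, 0.1, (50, 8)), rng.normal(5, 0.1, (50, 8))])
vocab = build_vocabulary(descs, k=2)
h = bovw_histogram(descs, vocab)
```

The resulting histogram is what a classifier (an SVM, say) would consume to decide whether a window contains a graspable garment part.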

    Deep Learning based Automated Forest Health Diagnosis from Aerial Images

    Global climate change has had a drastic impact on our environment. Previous studies have shown that pest disasters arising from global climate change can kill a tremendous number of trees, and the dead trees inevitably become a factor in forest fires. The condition of a forest is therefore an important portent of forest fire, and aerial image-based forest analysis can give an early detection of dead and living trees. In this paper, we apply a synthetic method to enlarge the imagery dataset and present a new framework for automated dead tree detection from aerial images using a re-trained Mask R-CNN (Mask Region-based Convolutional Neural Network) approach with a transfer learning scheme. We apply our framework to our aerial imagery datasets and compare eight fine-tuned models. The mean average precision (mAP) score for the best of these models reaches 54%. Following the automated detection, we are able to automatically produce dead-tree masks and count the dead trees in an image, as an indicator of forest health that could be linked to the causal analysis of environmental changes and the predictive likelihood of forest fire.
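
    The mAP figure reported above is, per class, the area under the precision-recall curve of score-ranked detections. A minimal single-class sketch, assuming detections have already been matched to ground truth at a fixed IoU threshold (the matching step itself is omitted):

```python
import numpy as np

def average_precision(scores, is_true_positive, num_gt):
    """AP for one class: area under the precision-recall curve, given
    detections pre-matched to ground truth at a fixed IoU threshold."""
    order = np.argsort(scores)[::-1]              # rank detections by confidence
    tp = np.asarray(is_true_positive, float)[order]
    cum_tp = np.cumsum(tp)
    precision = cum_tp / np.arange(1, len(tp) + 1)
    recall = cum_tp / num_gt
    # integrate precision over recall (all-point form, no interpolation)
    ap, prev_r = 0.0, 0.0
    for p, r in zip(precision, recall):
        ap += p * (r - prev_r)
        prev_r = r
    return ap

# 3 detections against 2 ground-truth trees: TP, FP, TP in score order
ap = average_precision([0.9, 0.8, 0.7], [1, 0, 1], num_gt=2)
```

Averaging this quantity over classes (here there is effectively one class, dead trees) gives the mAP the abstract reports.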

    Utilization and experimental evaluation of occlusion aware kernel correlation filter tracker using RGB-D

    Unlike deep learning, which requires large training datasets, correlation filter-based trackers such as the Kernelized Correlation Filter (KCF) exploit implicit properties of tracked images (circulant matrices) to train in real time. Despite their practical use in tracking, a better theoretical, mathematical, and experimental understanding of the fundamentals of KCF is still needed. This thesis first details a working prototype of the tracker and investigates its effectiveness in real-time applications, with supporting visualizations. We further address some of the tracker's drawbacks in cases of occlusion, scale change, object rotation, out-of-view targets and model drift with our novel RGB-D Kernel Correlation tracker. We also study the use of a particle filter to improve the tracker's accuracy. Our results are evaluated experimentally (a) on a standard dataset and (b) in real time using a Microsoft Kinect V2 sensor. We believe this work sets the basis for better understanding the effectiveness of kernel-based correlation filter trackers and for further defining some of their possible advantages in tracking.
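
    The circulant-matrix property mentioned above is what lets KCF train a kernel ridge regressor with a handful of FFTs instead of inverting an n-by-n kernel matrix. A 1-D sketch of the standard formulation (the Gaussian-kernel bandwidth and regularizer below are illustrative values, not the thesis's settings):

```python
import numpy as np

def gaussian_correlation(x, z, sigma=0.5):
    """Kernel correlation of z with every cyclic shift of x, computed
    with FFTs thanks to the circulant structure KCF exploits."""
    n = len(x)
    c = np.fft.ifft(np.fft.fft(x) * np.conj(np.fft.fft(z))).real
    d = (x @ x + z @ z - 2 * c) / n
    return np.exp(-np.maximum(d, 0) / sigma**2)

def train(x, y, lam=1e-4):
    """Kernel ridge regression in the Fourier domain: element-wise
    division replaces an n-by-n matrix inversion."""
    return np.fft.fft(y) / (np.fft.fft(gaussian_correlation(x, x)) + lam)

def detect(alpha_hat, x, z):
    """Response of the learned filter over all cyclic shifts of z."""
    k = gaussian_correlation(x, z)
    return np.fft.ifft(np.fft.fft(k) * alpha_hat).real

n = 16
x = np.zeros(n); x[3] = 1.0                        # toy 1-D "template"
dist = np.minimum(np.arange(n), n - np.arange(n))
y = np.exp(-0.5 * dist**2)                         # regression target: peak at zero shift
alpha_hat = train(x, y)
response = detect(alpha_hat, x, x)                 # same frame: peak at zero shift
```

In a real tracker the peak of the response map over a search window gives the target's translation between frames; the 2-D version is identical apart from 2-D FFTs and a cosine window on the patch.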

    Image Fusion and Axial Labeling of the Spine

    In order to improve radiological diagnosis of back pain and spine disease, two new algorithms have been developed to aid the 75% of Canadians who will suffer from back pain in a given year. With the associated medical imaging required for many of these patients, there is a potential for improvement in both patient care and healthcare economics by increasing the accuracy and efficiency of spine diagnosis. A real-time spine image fusion system and an automatic vertebra/disc labeling system have been developed to address this. Both magnetic resonance (MR) images and computed tomography (CT) images are often acquired for patients. The MR image highlights soft tissue detail while the CT image highlights bone detail. It is desirable to present both modalities on a single fused image containing the clinically relevant detail. The fusion problem was encoded in an energy functional balancing three competing goals for the fused image: 1) similarity to the MR image, 2) similarity to the CT image and 3) smoothness (containing natural transitions). Graph-Cut and convex solutions have been developed. They have similar performance to each other and outperform other fusion methods from recent literature. The convex solution has real-time performance on modern graphics processing units, allowing for interactive control of the fused image. Clinical validation has been conducted on the convex solution based on 15 patient images. The fused images have been shown to increase confidence of diagnosis compared to unregistered MR and CT images, with no change in time for diagnosis, based on readings from 5 radiologists. Spinal vertebrae serve as a reference for the location of surrounding tissues, but vertebrae have a very similar appearance to each other, making it time-consuming for radiologists to keep track of their locations. To automate this, an axial MR labeling algorithm was developed that runs in near real-time. Probability product kernels and fast integral images combined with simple geometric rules were used to classify pixels, slices and vertebrae. Evaluation was conducted on 32 lumbar spine images and 24 cervical spine images. The algorithm demonstrated 99% and 79% accuracy on the lumbar and cervical spine respectively.
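
    The three-term energy functional described above can be illustrated on a 1-D signal. This is a plain gradient-descent sketch, not the thesis's Graph-Cut or convex GPU solver, and the weights are arbitrary:

```python
import numpy as np

def energy(f, mr, ct, w_mr, w_ct, smooth):
    """The three competing goals: match MR, match CT, stay smooth."""
    data = w_mr * np.sum((f - mr)**2) + w_ct * np.sum((f - ct)**2)
    return data + smooth * np.sum(np.diff(f)**2)

def fuse(mr, ct, w_mr=1.0, w_ct=1.0, smooth=2.0, steps=3000, lr=0.02):
    """Minimize the energy by plain gradient descent on the fused signal."""
    f = 0.5 * (mr + ct)                       # start from the plain average
    for _ in range(steps):
        grad = 2 * w_mr * (f - mr) + 2 * w_ct * (f - ct)
        d = np.diff(f)                        # smoothness term gradient
        grad[:-1] -= 2 * smooth * d
        grad[1:] += 2 * smooth * d
        f = f - lr * grad
    return f

mr = np.array([0., 0., 0., 1., 1., 1.])       # toy soft-tissue profile
ct = np.array([1., 1., 0., 0., 1., 1.])       # toy bone profile
fused = fuse(mr, ct)
```

The energy is convex, so any reasonable solver reaches the same minimizer; the thesis's contribution is reaching it fast enough (and with the right discrete formulation) for interactive use.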

    Morphological Analysis for Object Recognition, Matching, and Applications

    This thesis deals with the detection and classification of objects in visual images and with the analysis of shape changes between object instances. Whereas the task of object recognition focuses on learning models which describe common properties between instances of a specific category, the analysis of the specific differences between instances is also relevant to understand the objects and the categories themselves. This research is governed by the idea that important properties for the automatic perception and understanding of objects are transmitted through their geometry or shape. Therefore, models for object recognition and shape matching are devised which exploit the geometry and properties of the objects, using as little user supervision as possible. In order to learn object models for detection in a reliable manner, suitable object representations are required. The key idea in this work is to use a richer representation of the object shape within the object model in order to increase the description power and thus the performance of the whole system. For this purpose, we first investigate the integration of curvature information of shapes into the learned object model. Since natural objects intrinsically exhibit curved boundaries, an object is better described if this shape cue is integrated. This approach extends the widely used object representation based on gradient orientation histograms by incorporating a robust histogram-based description of curvature. We show that integrating this information substantially improves detection results over descriptors that solely rely upon histograms of oriented gradients. The impact of using richer shape representations for object recognition is further investigated through a novel method which goes beyond traditional bounding-box representations for objects. Visual recognition requires learning object models from training data. Commonly, training samples are annotated by marking only the bounding-box of objects, since this appears to be the best trade-off between labeling information and effectiveness. However, objects are typically not box-shaped, so the usual parametrization of objects using a bounding box seems inappropriate: such a box contains a significant amount of background clutter. Therefore, the presented approach learns object models for detection while simultaneously learning to segregate objects from clutter and extracting their overall shape, without, however, requiring manual segmentation of the training samples. Shape equivalence is another interesting property related to shape. It refers to the ability of perceiving two distinct objects as having the same or similar shape. This thesis also explores the usage of this ability to detect objects in unsupervised scenarios, that is, where no annotation of training data is available for learning a statistical model. For this purpose, a dataset of historical Chinese cartoons drawn during the Cultural Revolution and immediately thereafter is analyzed. Relevant objects in this dataset are emphasized through annuli of light rays. The idea of our method is to consider the different annuli as shape-equivalent objects, that is, as objects sharing the same shape, and to devise a method to detect them. Thereafter, it is possible to indirectly infer the position, size and scale of the emphasized objects using the annuli detections. Not only commonalities among objects, but also the specific differences between them are perceived by a visual system. These differences can be understood through the analysis of how objects and their shape change. For this reason, this thesis also develops a novel methodology for analyzing the shape deformation between a single pair of images under missing correspondences. The key observation is that objects cannot deform arbitrarily; rather, the deformation itself follows the geometry and constraints imposed by the object itself. We describe the overall complex object deformation using a piecewise linear model. Thereby, we are able to identify each part of the shape that shares the same deformation, and thus to understand how an object and its parts were transformed. A remarkable property of the algorithm is the ability to automatically estimate the model complexity according to the overall complexity of the shape deformation. Specifically, the introduced methodology is used to analyze the deformation between original instances and reproductions of artworks. The nature of the analyzed alterations ranges from deliberate modifications by the artist to geometrical errors accumulated during the reproduction process of the image. The usage of this method within this application shows how productive the interaction between computer vision and the humanities is. The goal is not to supplant human expertise, but to enhance and deepen connoisseurship about a given problem.
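
    The gradient-orientation-histogram representation that the curvature descriptor above extends can be sketched as follows: each pixel votes for an orientation bin with its gradient magnitude. This is a bare-bones single-cell version, without the block normalization of real HOG descriptors or the curvature channel added in the thesis:

```python
import numpy as np

def orientation_histogram(patch, n_bins=9):
    """Histogram of unsigned gradient orientations over a patch, each
    pixel voting with its gradient magnitude (the core of HOG)."""
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)          # unsigned angle in [0, pi)
    bins = np.minimum((ang / np.pi * n_bins).astype(int), n_bins - 1)
    hist = np.zeros(n_bins)
    np.add.at(hist, bins.ravel(), mag.ravel())       # magnitude-weighted votes
    s = hist.sum()
    return hist / s if s > 0 else hist

# a vertical edge: all gradient energy is horizontal, so every vote
# lands in the first orientation bin
patch = np.zeros((8, 8)); patch[:, 4:] = 1.0
h = orientation_histogram(patch)
```

Concatenating such histograms over a grid of cells yields the descriptor that the curvature histograms are appended to.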

    Development of an image processing method for automated, non-invasive and scale-independent monitoring of adherent cell cultures

    Adherent cell culture is a key experimental method for biological investigations in diverse areas such as developmental biology, drug discovery and biotechnology. Light microscopy-based methods, for example phase contrast microscopy (PCM), are routinely used for visual inspection of adherent cells cultured in transparent polymeric vessels. However, the outcome of such inspections is qualitative and highly subjective. Analytical methods that produce quantitative results can be used, but often at the expense of culture integrity or viability. In this work, an imaging-based strategy for monitoring adherent cell cultures was investigated. Automated image processing and analysis of PCM images enabled quantitative measurements of key cell culture characteristics. Two types of segmentation algorithms for the detection of cellular objects in PCM images were evaluated. The first, based on contrast filters and dynamic programming, was quick (<1s per 1280×960 image) and performed well for different cell lines over a wide range of imaging conditions. The second approach, termed ‘trainable segmentation’, was based on machine learning using a variety of image features such as local structures and symmetries. It accommodated complex segmentation tasks while maintaining low processing times (<5s per 1280×960 image). Based on the output from these segmentation algorithms, imaging-based monitoring of a large palette of cell responses was demonstrated, including proliferation, growth arrest, differentiation, and cell death. This approach is non-invasive and applicable to any transparent culture vessel, including microfabricated culture devices, where a lack of suitable analytical methods often limits applicability. This work was a significant contribution towards the establishment of robust, standardised, and affordable monitoring methods for adherent cell cultures. Finally, automated image processing was combined with computer-controlled cultures in small-scale devices. This provided a first demonstration of how adaptive culture protocols could be established, i.e. culture protocols based on cellular response instead of arbitrary time points.
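
    A contrast filter of the kind the first segmentation algorithm builds on can be illustrated with a local standard-deviation map: textured (cellular) regions stand out against the flat PCM background, and thresholding the map gives a rough confluency estimate. This is a generic stand-in, not the thesis's filter or its dynamic-programming stage, and the threshold is arbitrary:

```python
import numpy as np

def local_std(img, radius=1):
    """Local standard deviation in a (2*radius+1)^2 window — a simple
    contrast filter: flat background scores 0, textured regions score high."""
    img = img.astype(float)
    h, w = img.shape
    out = np.zeros_like(img)
    for i in range(h):
        for j in range(w):
            win = img[max(0, i - radius):i + radius + 1,
                      max(0, j - radius):j + radius + 1]
            out[i, j] = win.std()
    return out

def confluency(img, thresh=0.02):
    """Fraction of the image covered by textured (cellular) pixels."""
    return float((local_std(img) > thresh).mean())

# toy PCM frame: flat background with one textured "cell" patch
rng = np.random.default_rng(0)
frame = np.zeros((20, 20))
frame[5:15, 5:15] = rng.normal(0.5, 0.2, (10, 10))
frac = confluency(frame)
```

A production version would replace the nested loops with integral images of the image and its square, which makes the filter independent of window size per pixel.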

    Efficient Histogram-Based Sliding Window

    Many computer vision problems rely on computing histogram-based objective functions with a sliding window. A main limiting factor is the high computational cost. Existing computational methods have a complexity linear in the histogram dimension. In this paper, we propose an efficient method that has a constant complexity in the histogram dimension and therefore scales well with high-dimensional histograms. This is achieved by harnessing the spatial coherence of natural images and computing the objective function in an incremental manner. We demonstrate the significant performance enhancement by our method through important vision tasks including object detection, object tracking and image saliency analysis. Compared with state-of-the-art techniques, our method typically achieves from tens to hundreds of times speedup for those tasks.
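
    The incremental idea the abstract exploits — updating the window's statistics from the previous position instead of recomputing them — is easiest to see with the classic sliding-window histogram update, shown here in 1-D (the paper's contribution goes further, making the objective evaluation itself independent of histogram dimension):

```python
import numpy as np

def sliding_histograms(row, win, n_bins):
    """Histogram of every length-`win` window over a 1-D array of bin
    indices. The per-step update touches only two bins, regardless of
    window size; the .copy() is just to record each window's result."""
    hist = np.bincount(row[:win], minlength=n_bins)
    out = [hist.copy()]
    for i in range(win, len(row)):
        hist[row[i - win]] -= 1   # element leaving the window
        hist[row[i]] += 1         # element entering the window
        out.append(hist.copy())
    return out

row = np.array([0, 1, 1, 2, 0, 0, 2])
hists = sliding_histograms(row, win=3, n_bins=3)
```

For a 2-D window sliding by one pixel, the same trick subtracts the leaving column and adds the entering one; the recomputation that naive methods perform over the whole window disappears.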