884 research outputs found

    Fruit Detection and Tree Segmentation for Yield Mapping in Orchards

    Get PDF
    Accurate information gathering and processing is critical for precision horticulture, as growers aim to optimise their farm management practices. An accurate inventory of the crop that details its spatial distribution along with health and maturity, can help farmers efficiently target processes such as chemical and fertiliser spraying, crop thinning, harvest management, labour planning and marketing. Growers have traditionally obtained this information by using manual sampling techniques, which tend to be labour intensive, spatially sparse, expensive, inaccurate and prone to subjective biases. Recent advances in sensing and automation for field robotics allow for key measurements to be made for individual plants throughout an orchard in a timely and accurate manner. Farmer operated machines or unmanned robotic platforms can be equipped with a range of sensors to capture a detailed representation over large areas. Robust and accurate data processing techniques are therefore required to extract high level information needed by the grower to support precision farming. This thesis focuses on yield mapping in orchards using image and light detection and ranging (LiDAR) data captured using an unmanned ground vehicle (UGV). The contribution is the framework and algorithmic components for orchard mapping and yield estimation that is applicable to different fruit types and orchard configurations. The framework includes detection of fruits in individual images and tracking them over subsequent frames. The fruit counts are then associated to individual trees, which are segmented from image and LiDAR data, resulting in a structured spatial representation of yield. The first contribution of this thesis is the development of a generic and robust fruit detection algorithm. Images captured in the outdoor environment are susceptible to highly variable external factors that lead to significant appearance variations. Specifically in orchards, variability is caused by changes in illumination, target pose, tree types, etc. The proposed techniques address these issues by using state-of-the-art feature learning approaches for image classification, while investigating the utility of orchard domain knowledge for fruit detection. Detection is performed using both pixel-wise classification of images followed instance segmentation, and bounding-box regression approaches. The experimental results illustrate the versatility of complex deep learning approaches over a multitude of fruit types. The second contribution of this thesis is a tree segmentation approach to detect the individual trees that serve as a standard unit for structured orchard information systems. The work focuses on trellised trees, which present unique challenges for segmentation algorithms due to their intertwined nature. LiDAR data are used to segment the trellis face, and to generate proposals for individual trees trunks. Additional trunk proposals are provided using pixel-wise classification of the image data. The multi-modal observations are fine-tuned by modelling trunk locations using a hidden semi-Markov model (HSMM), within which prior knowledge of tree spacing is incorporated. The final component of this thesis addresses the visual occlusion of fruit within geometrically complex canopies by using a multi-view detection and tracking approach. Single image fruit detections are tracked over a sequence of images, and associated to individual trees or farm rows, with the spatial distribution of the fruit counting forming a yield map over the farm. The results show the advantage of using multi-view imagery (instead of single view analysis) for fruit counting and yield mapping. This thesis includes extensive experimentation in almond, apple and mango orchards, with data captured by a UGV spanning a total of 5 hectares of farm area, over 30 km of vehicle traversal and more than 7,000 trees. The validation of the different processes is performed using manual annotations, which includes fruit and tree locations in image and LiDAR data respectively. Additional evaluation of yield mapping is performed by comparison against fruit counts on trees at the farm and counts made by the growers post-harvest. The framework developed in this thesis is demonstrated to be accurate compared to ground truth at all scales of the pipeline, including fruit detection and tree mapping, leading to accurate yield estimation, per tree and per row, for the different crops. Through the multitude of field experiments conducted over multiple seasons and years, the thesis presents key practical insights necessary for commercial development of an information gathering system in orchards

    Text Extraction From Natural Scene: Methodology And Application

    Full text link
    With the popularity of the Internet and the smart mobile device, there is an increasing demand for the techniques and applications of image/video-based analytics and information retrieval. Most of these applications can benefit from text information extraction in natural scene. However, scene text extraction is a challenging problem to be solved, due to cluttered background of natural scene and multiple patterns of scene text itself. To solve these problems, this dissertation proposes a framework of scene text extraction. Scene text extraction in our framework is divided into two components, detection and recognition. Scene text detection is to find out the regions containing text from camera captured images/videos. Text layout analysis based on gradient and color analysis is performed to extract candidates of text strings from cluttered background in natural scene. Then text structural analysis is performed to design effective text structural features for distinguishing text from non-text outliers among the candidates of text strings. Scene text recognition is to transform image-based text in detected regions into readable text codes. The most basic and significant step in text recognition is scene text character (STC) prediction, which is multi-class classification among a set of text character categories. We design robust and discriminative feature representations for STC structure, by integrating multiple feature descriptors, coding/pooling schemes, and learning models. Experimental results in benchmark datasets demonstrate the effectiveness and robustness of our proposed framework, which obtains better performance than previously published methods. Our proposed scene text extraction framework is applied to 4 scenarios, 1) reading print labels in grocery package for hand-held object recognition; 2) combining with car detection to localize license plate in camera captured natural scene image; 3) reading indicative signage for assistant navigation in indoor environments; and 4) combining with object tracking to perform scene text extraction in video-based natural scene. The proposed prototype systems and associated evaluation results show that our framework is able to solve the challenges in real applications

    A survey of visual preprocessing and shape representation techniques

    Get PDF
    Many recent theories and methods proposed for visual preprocessing and shape representation are summarized. The survey brings together research from the fields of biology, psychology, computer science, electrical engineering, and most recently, neural networks. It was motivated by the need to preprocess images for a sparse distributed memory (SDM), but the techniques presented may also prove useful for applying other associative memories to visual pattern recognition. The material of this survey is divided into three sections: an overview of biological visual processing; methods of preprocessing (extracting parts of shape, texture, motion, and depth); and shape representation and recognition (form invariance, primitives and structural descriptions, and theories of attention)

    Performance of Scattering Matrix Decomposition and Color Spaces for Synthetic Aperture Radar Imagery

    Get PDF
    Polarimetrc Synthetic Aperture Radar (SAR) has been shown to be a powerful tool in remote sensing because uses up to four simultaneous measurements giving additional degrees of freedom for processing. Typically, polarization decomposition techniques are applied to the polarization-dependent data to form colorful imagery that is easy for operators systems to interpret. Yet, the presumption is that the SAR system operates with maximum bandwidth which requires extensive processing for near- or real-time application. In this research, color space selection is investigated when processing sparse polarimetric SAR data as in the case of the publicly available \Gotcha Volumetric SAR Data Set, Version 1:0 . To improve information quality in resultant color imagery, three scattering matrix decompositions were investigated (linear, Pauli and Krogager) using two common color spaces (RGB, CMY) to determine the best combination for accurate feature extraction. A mathematical model is presented for each decomposition technique and color space to the Cramer-Rao lower bound (CRLB) and quantify the performance bounds from an estimation perspective for given SAR system and processing parameters. After a deep literature review in color science, the mathematical model for color spaces was not able to be computed together with the mathematical model for decomposition techniques. The color spaces used for this research were functions of variables that are out of the scope of electrical engineering research and include factors such as the way humans sense color, environment influences in the color stimulus and device technical characteristics used to display the SAR image. Hence, SAR imagery was computed for specific combinations of decomposition technique and color space and allow the reader to gain an abstract view of the performance differences

    Biologically Inspired Computer Vision/ Applications of Computational Models of Primate Visual Systems in Computer Vision and Image Processing

    Get PDF
    Biologically Inspired Computer VisionApplications of Computational Models of Primate Visual Systems in Computer Vision and Image Processing Reza Hojjaty Saeedy Abstract Biological vision systems are remarkable at extracting and analyzing the information that is essential for vital functional needs. They perform all these tasks with both high sensitivity and strong reliability. They can efficiently and quickly solve most of the difficult computa- tional problems that are still challenging for artificial systems, such as scene segmentation, 3D/depth perception, motion recognition, etc. So it is no surprise that biological vision systems have been a source of inspiration for computer vision problems. In this research, we aim to provide a computer vision task centric framework out of models primarily originating in biological vision studies. We try to address two specific tasks here: saliency detection and object classification. In both of these tasks we use features extracted from computational models of biological vision systems as a starting point for further processing. Saliency maps are 2D topographic maps that catch the most conspicuous regions of a scene, i.e. the pixels in an image that stand out against their neighboring pixels. So these maps can be thought of as representations of the human attention process and thus have a lot of applications in computer vision. We propose a cascade that combines two well- known computational models for perception of color and orientation in order to simulate the responses of the primary areas of the primate visual cortex. We use these responses as inputs to a spiking neural network(SNN) and finally the output of this SNN will serve as the input to our post-processing algorithm for saliency detection. Object classification/detection is the most studied task in computer vision and machine learning and it is interesting that while it looks trivial for humans it is a difficult problem for artificial systems. For this part of the thesis we also design a pipeline including feature extraction using biologically inspired systems, manifold learning for dimensionality reduction and self-organizing(vector quantization) neural network as a supervised method for prototype learning
    corecore