1,641 research outputs found

    Color and Shape Recognition

    Humans can easily distinguish the objects "car" and "cat", but how are these labels assigned? Grouping such images into categories is easy for a person but very difficult for a computer. An object recognition system therefore finds real-world objects in an image. Object recognition algorithms rely on matching, learning, or pattern recognition, using appearance-based or feature-based techniques. This thesis proposes color attributes and shape attributes as explicit color and shape representations, respectively, for object detection. Color attributes are dense and computationally efficient, and when combined with traditional shape features they give good results for object detection. Shape detection is a natural extension of edge detection at the pixel level to the problem of global contour detection. This filtering scheme provides a tool for the systematic analysis of edge-based shape detection, which enables us to distinguish between objects based on color and shape.

    An Image Understanding System for Detecting Indoor Features

    The ability to identify physical structures in an unknown environment is very important for vision-based robot navigation and scene understanding. Among the physical structures of indoor environments, corridor lines and doors are important visual landmarks for robot navigation because they reveal the topological structure of the environment and connect its different places or regions. They also provide clues for understanding the image. This thesis presents two algorithms, one to detect the vanishing point and corridor lines and one to detect doors, using a single digital video camera. In both algorithms, we use a hypothesis generation and verification method to detect corridor and door structures from low-level linear features. The proposed method consists of low-, intermediate-, and high-level processing stages, which correspond to the extraction of low-level features, the formation of hypotheses, and the verification of the hypotheses by actively seeking evidence. In particular, we extend this single-pass framework with a feedback strategy for more robust hypothesis generation and verification. We demonstrate the robustness of the proposed methods on a large number of real video images from a variety of corridor environments, acquired under different illumination and reflection conditions, at different moving speeds, and from different camera viewpoints. Experiments on the corridor line detection algorithm show that the method can locate corridor lines in the presence of many spurious line features in about one second. Experiments on the door detection algorithm show that the system can detect visually important doors in an image with a very high accuracy rate while a robot navigates along a corridor.

    Automatic Screening and Classification of Diabetic Retinopathy Eye Fundus Image

    Diabetic Retinopathy (DR) is a disorder of the retinal vasculature. It develops to some degree in nearly all patients with long-standing diabetes mellitus and can result in blindness. Screening for DR is essential for both early detection and early treatment. This thesis investigates automatic methods for diabetic retinopathy detection and develops an effective system for the detection and screening of diabetic retinopathy. The research involves three development stages. First, the thesis presents a preliminary classification and screening system for diabetic retinopathy using eye fundus images. The research then focuses on detecting the earliest signs of diabetic retinopathy, the microaneurysms. Detecting microaneurysms at an early stage is vital and is the first step in preventing diabetic retinopathy. Finally, the thesis presents decision support systems for the detection of diabetic retinopathy and maculopathy in eye fundus images. The detection of maculopathy, which appears as yellow lesions near the macula, is essential because it will eventually cause loss of vision if the affected macula is not treated in time. Accurate retinal screening is therefore required to help retinal screeners classify retinal images effectively, and highly efficient and accurate image processing techniques must be used to achieve it. In addition to the proposed detection systems, this thesis presents a new dataset and describes its collection, the expert diagnosis process, and its advantages over other publicly available eye fundus image datasets. The new dataset will be useful to researchers and practitioners working in retinal imaging and should encourage comparative studies in diabetic retinopathy research.
It is envisaged that the proposed decision support system for clinical screening would greatly contribute to and assist the management and the detection of diabetic retinopathy. It is also hoped that the developed automatic detection techniques will assist clinicians to diagnose diabetic retinopathy at an early stage

    Localizing Polygonal Objects in Man-Made Environments

    Object detection is a significant challenge in Computer Vision and has received a lot of attention in the field. One such challenge addressed in this thesis is the detection of polygonal objects, which are prevalent in man-made environments. Shape analysis is an important cue for detecting these objects. We propose a contour-based object detection framework to deal with the related challenges, including how to efficiently detect polygonal shapes and how to exploit them for object detection. First, we propose an efficient component tree segmentation framework for stable region extraction and a multi-resolution line segment detection algorithm, which form the basis of our detection framework. Our component tree segmentation algorithm explores the optimal threshold for each branch of the component tree. It achieves a significant improvement over image thresholding segmentation and performance comparable to more sophisticated methods at only a fraction of the computation time. Our line segment detector overcomes several inherent limitations of the Hough transform and achieves performance comparable to state-of-the-art line segment detectors, while better capturing dominant structures and remaining more stable under low-quality imaging conditions. Second, we propose a global shape analysis measurement for simple polygon detection and use it to develop an approach for real-time landing site detection in unconstrained man-made environments. Since the task of detecting landing sites must be performed in a few seconds or less, existing methods are often limited to simple local intensity and edge variation cues. By contrast, we show how to efficiently take into account the potential sites' global shape, which is a critical cue in man-made scenes. Our method relies on the component tree segmentation algorithm and a new shape regularity measure to look for polygonal regions in video sequences.
In this way we enforce both temporal consistency and geometric regularity, resulting in reliable and consistent detections. Third, we propose a generic object detection approach based on contour grouping, which explores promising cycles in a line fragment graph. Previous contour-based methods are limited to additive scoring functions. In this thesis, we propose an approximate search approach that eliminates this restriction. Given a weighted line fragment graph, we prune its cycle space by removing cycles containing weak nodes or weak edges until the upper bound of the cycle space is less than the threshold defined by the cyclomatic number. Object contours are then detected as maximally scoring elementary circuits in the pruned cycle space. Furthermore, we propose a second, more efficient algorithm, which reconstructs the graph by iteratively grouping the strongest edges until the number of cycles reaches the upper bound. Our approximate search approaches can be used with any cycle scoring function. Moreover, unlike other approaches based on contour grouping, our approach does not rely on a greedy strategy for finding multiple candidates and can find multiple candidates that share common line fragments. We demonstrate that our approach significantly outperforms the state of the art.
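
As a rough illustration of the graph quantities mentioned in this abstract, the sketch below computes the cyclomatic number E - V + C of a small weighted, undirected graph and shows how dropping weak edges lowers it. This is not the thesis's search algorithm: the edge representation, the function names, and the weight threshold are assumptions chosen for the example.

```python
def cyclomatic_number(num_nodes, edges):
    """Return E - V + C for an undirected graph with nodes 0..num_nodes-1.

    `edges` is a list of (u, v, weight) tuples (hypothetical format).
    C, the number of connected components, is found with union-find.
    """
    parent = list(range(num_nodes))

    def find(x):
        # Follow parent links to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = num_nodes
    for u, v, _weight in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    return len(edges) - num_nodes + components


def prune_weak_edges(edges, min_weight):
    """Keep only edges whose weight is at least `min_weight` (assumed threshold)."""
    return [e for e in edges if e[2] >= min_weight]


# Four line fragments as nodes; the weights stand in for grouping strength.
edges = [(0, 1, 0.9), (1, 2, 0.8), (2, 0, 0.7), (2, 3, 0.2), (3, 0, 0.1)]
print(cyclomatic_number(4, edges))
print(cyclomatic_number(4, prune_weak_edges(edges, 0.5)))
```

The cyclomatic number is the number of independent cycles, so removing weak edges shrinks the cycle space that has to be searched.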

    Generic object classification for autonomous robots

    One of the main problems in autonomous robot interaction is scene knowledge. Recognition is fundamental to solving this problem and to allowing robots to interact in uncontrolled environments. In this paper, we present a practical application for object fitting, normalization, and classification of triangular and circular signs. The system is integrated into the Sony Aibo robot to improve its interaction behaviour. The presented methodology has been tested in simulations and in real categorization problems, such as traffic sign classification, with very promising results. Note: This document originally contains other material and/or software that can only be consulted at the Biblioteca de Ciència i Tecnologia.

    Efficient identification, localization and quantification of grapevine inflorescences and flowers in unprepared field images using Fully Convolutional Networks

    Yield prediction is one of the most important tasks in grapevine breeding and vineyard management. Commonly, yield is estimated manually right before harvest by extrapolation, which is mostly labor-intensive, destructive, and inaccurate. In the present study, an automated image-based workflow was developed for quantifying inflorescences and single flowers in unprepared field images of grapevines, i.e. no artificial background or light was applied. It is a novel approach for non-invasive, inexpensive, and objective high-throughput phenotyping. First, image regions depicting inflorescences were identified and localized. This was done by segmenting the images into the classes "inflorescence" and "non-inflorescence" using a Fully Convolutional Network (FCN). Efficient image segmentation is the most challenging step because of the small geometry and dense distribution of single flowers (several hundred single flowers per inflorescence), the similar color of all plant organs in the foreground and background, and the fact that only approximately 5 % of an image shows inflorescences. The trained FCN achieved a mean Intersection Over Union (IOU) of 87.6 % on the test data set. Finally, single flowers were extracted from the "inflorescence" areas using the Circular Hough Transform. The flower extraction achieved a recall of 80.3 % and a precision of 70.7 % using the segmentation derived by the trained FCN model. In summary, the presented approach is a promising strategy for automatically predicting yield potential in the earliest stage of grapevine development, and it is applicable to objective monitoring and evaluation of breeding material, genetic repositories, or commercial vineyards.
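
For readers unfamiliar with the metrics quoted in this abstract, here is a minimal sketch of Intersection over Union for binary masks and of precision and recall from detection counts. The abstract does not say how the mean IOU was averaged; averaging over the two classes is an assumption here, and the function names and toy data are illustrative only.

```python
def iou(pred, truth):
    """Intersection over Union of two binary masks given as flat 0/1 sequences."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    # Two empty masks agree perfectly, so define their IoU as 1.
    return inter / union if union else 1.0


def mean_iou(pred, truth):
    """Average the IoU of the foreground class and of the background class.

    Assumption: a two-class mean, as for "inflorescence" vs "non-inflorescence".
    """
    foreground = iou(pred, truth)
    background = iou([1 - p for p in pred], [1 - t for t in truth])
    return (foreground + background) / 2


def precision_recall(tp, fp, fn):
    """Precision and recall from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy 4-pixel masks: one pixel is wrongly predicted as foreground.
pred = [1, 1, 0, 0]
truth = [1, 0, 0, 0]
print(iou(pred, truth), mean_iou(pred, truth))
print(precision_recall(tp=8, fp=2, fn=2))
```

Precision below recall, as reported for the flower extraction, means the detector produced more false detections than missed flowers.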

    Automatic Main Road Extraction from High Resolution Satellite Imagery

    Road information is essential for automatic GIS (geographical information system) data acquisition, transportation, and urban planning. Automatic road (network) detection from high resolution satellite imagery holds great potential for significantly reducing database development and updating costs and turnaround time. Many algorithms and methodologies have been presented for this purpose, ranging from low-level feature detection to high-level, context-supported grouping. However, no practical system can yet extract a road network from space imagery fully automatically for the purpose of automatic mapping. This paper presents a methodology for automatic main road detection from high resolution IKONOS satellite imagery. The strategies include a multiresolution (image pyramid) method, Gaussian blurring, a line finder using a 1-dimensional template correlation filter, line segment grouping, and multi-layer result integration. The multi-layer or multi-resolution method for road extraction is a very effective strategy for saving processing time and improving robustness. To realize the strategy, the original IKONOS image is downsampled to several resolutions so that an image pyramid is generated; then, after Gaussian blurring, the line finder with the 1-dimensional template correlation filter is applied to detect the road centerline. An extracted centerline segment may or may not belong to a road. There are two ways to identify the attributes of the segments. The first is to use segment grouping to form longer line segments and to assign each segment a likelihood that depends on its length and its other geometric and photometric attributes; for example, a longer segment is more likely to be a road. A perceptual-grouping-based method links road segments using a likelihood model that takes multiple sources of information into account, including the clues present in the gaps.
The other way to identify the segments is to repeat feature detection at the higher-resolution layers of the image pyramid.
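
The pyramid, blurring, and 1-dimensional template correlation steps described in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the paper's implementation: images are plain lists of rows, downsampling averages 2x2 blocks, the blur uses a [1, 2, 1]/4 kernel as a small Gaussian approximation, and the bright-bar template is hypothetical.

```python
def downsample(img):
    """Halve the resolution by averaging each 2x2 block of pixels."""
    h = len(img) // 2 * 2
    w = len(img[0]) // 2 * 2
    return [[(img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 4
             for c in range(0, w, 2)]
            for r in range(0, h, 2)]


def build_pyramid(img, levels):
    """Return [full resolution, half resolution, ...] with `levels` entries."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


def blur_row(row):
    """Smooth one row with a [1, 2, 1]/4 kernel, replicating the border pixels."""
    padded = [row[0]] + list(row) + [row[-1]]
    return [(padded[i] + 2 * padded[i + 1] + padded[i + 2]) / 4
            for i in range(len(row))]


def correlate_row(row, template):
    """Slide a zero-mean version of `template` along `row`; one score per offset."""
    mean_t = sum(template) / len(template)
    t = [v - mean_t for v in template]
    n = len(t)
    return [sum(row[i + k] * t[k] for k in range(n))
            for i in range(len(row) - n + 1)]


def line_center(row, template):
    """Column index where the blurred row best matches the template's center."""
    scores = correlate_row(blur_row(row), template)
    best = max(range(len(scores)), key=scores.__getitem__)
    return best + len(template) // 2


# A cross-section through a bright, 3-pixel-wide road on a dark background.
profile = [0, 0, 0, 0, 10, 10, 10, 0, 0, 0, 0]
print(line_center(profile, template=[0, 1, 1, 1, 0]))
```

Running the line finder on a coarse pyramid level first is cheaper and less sensitive to small clutter, which matches the abstract's stated motivation for the multi-resolution strategy.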

    Automated retinal analysis

    Diabetes is a chronic disease affecting over 2% of the population in the UK [1]. Long-term complications of diabetes can affect many different systems of the body, including the retina of the eye. In the retina, diabetes can lead to a disease called diabetic retinopathy, one of the leading causes of blindness in the working population of industrialised countries. The risk of visual loss from diabetic retinopathy can be reduced if treatment is given at the onset of sight-threatening retinopathy. To detect early indicators of the disease, the UK National Screening Committee have recommended that diabetic patients should receive annual screening by digital colour fundal photography [2]. Manually grading retinal images is a subjective and costly process requiring highly skilled staff. This thesis describes an automated diagnostic system based on image processing and neural network techniques, which analyses digital fundus images so that early signs of sight-threatening retinopathy can be identified. Within retinal analysis, this research has concentrated on the development of four algorithms: optic nerve head segmentation, lesion segmentation, image quality assessment, and vessel width measurement. This research amalgamated these four algorithms with two existing techniques to form an integrated diagnostic system. When used as a 'pre-filtering' tool, the diagnostic system successfully reduced the number of images requiring human grading by 74.3%; this was achieved by identifying images without sight-threatening maculopathy and excluding them from manual screening.