134 research outputs found

    Pathfinding and positioning in a labyrinth game using a wide-angle camera

    Alten AB has a technology demonstrator in the form of a motorized, camera-equipped, large-scale labyrinth game. The ball position is controlled by an ABB industrial PLC, with Android tablets providing the user interface and a camera acting as the sensor for the ball position. This thesis demonstrates the ability to place a wide-angle camera inside the cabinet, correct the lens distortion caused by the wide-angle lens, and detect the ball using a circular Hough transform. A path is also generated from the ball position to any position in the maze by capturing an image from the camera and generating a map for subsequent pathfinding, using an improvement of Dijkstra's pathfinding algorithm named Theta*. It further demonstrates the feasibility of using the computing power of the camera itself for both pathfinding and ball positioning.
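The pathfinding stage described above can be sketched as follows. This is a minimal, illustrative Theta* on a boolean occupancy grid (truthy cells are obstacles), not the thesis code: Theta* behaves like A*, but a node may inherit its parent's parent whenever an unobstructed straight line exists, producing any-angle paths.

```python
import heapq
import math

def line_of_sight(grid, a, b):
    # Bresenham walk from a to b; True if no cell on the line is occupied.
    (x0, y0), (x1, y1) = a, b
    dx, dy = abs(x1 - x0), abs(y1 - y0)
    sx = 1 if x1 > x0 else -1
    sy = 1 if y1 > y0 else -1
    err, x, y = dx - dy, x0, y0
    while (x, y) != (x1, y1):
        if grid[y][x]:
            return False
        e2 = 2 * err
        if e2 > -dy:
            err -= dy; x += sx
        if e2 < dx:
            err += dx; y += sy
    return not grid[y1][x1]

def theta_star(grid, start, goal):
    # Any-angle shortest path on a grid; nodes are (x, y) tuples.
    h = lambda p: math.dist(p, goal)
    g, parent = {start: 0.0}, {start: start}
    open_, closed = [(h(start), start)], set()
    W, H = len(grid[0]), len(grid)
    while open_:
        _, s = heapq.heappop(open_)
        if s == goal:  # rebuild path by following parents back to start
            path = [s]
            while path[-1] != start:
                path.append(parent[path[-1]])
            return path[::-1]
        if s in closed:
            continue
        closed.add(s)
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                if dx == dy == 0:
                    continue
                n = (s[0] + dx, s[1] + dy)
                if not (0 <= n[0] < W and 0 <= n[1] < H) or grid[n[1]][n[0]]:
                    continue
                p = parent[s]
                if line_of_sight(grid, p, n):      # path 2: shortcut via grandparent
                    cand_parent, cand_g = p, g[p] + math.dist(p, n)
                else:                              # path 1: ordinary A* edge
                    cand_parent, cand_g = s, g[s] + math.dist(s, n)
                if cand_g < g.get(n, float("inf")):
                    g[n], parent[n] = cand_g, cand_parent
                    heapq.heappush(open_, (cand_g + h(n), n))
    return None
```

On an empty grid the returned path collapses to a single straight segment from start to goal, which is the property that makes the technique attractive for steering a ball smoothly.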

    An investigation into common challenges of 3D scene understanding in visual surveillance

    Nowadays, video surveillance systems are ubiquitous. Most installations simply consist of CCTV cameras connected to a central control room and rely on human operators to interpret what they see on the screen in order to, for example, detect a crime (either during or after an event). Some modern computer vision systems aim to automate the process, at least to some degree, and various algorithms have been somewhat successful in certain limited areas. However, such systems remain inefficient in general circumstances and present real challenges yet to be solved. These challenges include the ability to recognise and ultimately predict and prevent abnormal behaviour, or even to reliably recognise objects, for example in order to detect left luggage or suspicious objects. This thesis first studies the state of the art and identifies the major challenges and possible requirements of future automated and semi-automated CCTV technology in the field. It then presents the application of a suite of 2D and highly novel 3D methodologies that go some way towards overcoming current limitations. The methods presented here are based on the analysis of object features directly extracted from the geometry of the scene. They start with a consideration of mainly existing techniques, such as the use of lines, vanishing points (VPs) and planes, applied to real scenes; an investigation is then presented into the use of richer 2.5D/3D surface-normal data. In all cases the aim is to combine 2D and 3D data to obtain a better understanding of the scene, ultimately capturing what is happening within it in order to move towards automated scene analysis. Although this thesis focuses on the widespread application of video surveillance, the railway station environment is used as an example of typical real-world challenges; the principles can be readily extended elsewhere, such as to airports, motorways, households and shopping malls.
    The context of this research work, together with an overall presentation of existing methods used in video surveillance and their challenges, is described in chapter 1. Common computer vision techniques, such as VP detection, camera calibration, 3D reconstruction and segmentation, can be applied in an effort to extract meaning in video surveillance applications. These methods have been well researched in the literature, and their use is assessed in the context of current surveillance requirements in chapter 2. While existing techniques can perform well in some contexts, such as an architectural environment composed of simple geometrical elements, their robustness and performance in feature extraction and object recognition tasks are not sufficient to solve the key challenges encountered in a general video surveillance context. This is largely due to issues such as variable lighting, weather conditions, shadows and, in general, the complexity of the real-world environment. Chapter 3 presents the research and contribution on those topics – methods to extract optimal features for a specific CCTV application – as well as their strengths and weaknesses, highlighting that the proposed algorithm obtains better results than most due to its specific design. The comparison of current surveillance systems and methods from the literature has shown that 2D data are nevertheless used in almost all applications. Indeed, both industrial systems and the research community have been intensively improving 2D feature extraction methods for as long as image analysis and scene understanding have been of interest. This constant progress, and the resulting large variety of techniques, makes 2D feature extraction almost effortless nowadays. Moreover, even if 2D data do not allow all challenges in video surveillance or other applications to be solved, they are still used as a starting stage towards scene understanding and image analysis.
    Chapter 4 then explores 2D feature extraction via vanishing point detection and segmentation methods. A combination of the most common techniques and a novel approach is proposed to extract vanishing points from video surveillance environments. Moreover, segmentation techniques are explored with the aim of determining how they can complement vanishing point detection and lead towards 3D data extraction and analysis. In spite of the contribution above, 2D data are insufficient for all but the simplest applications aimed at understanding a scene where the goal is robust detection of, say, left luggage or abnormal behaviour without significant a priori information about the scene geometry. Therefore, more information is required in order to design a more automated and intelligent algorithm that obtains richer information from the scene geometry, and so a better understanding of what is happening within it. This can be achieved by the use of 3D data (in addition to 2D data), allowing the opportunity for object classification and, from this, the inference of a map of functionality describing feasible and unfeasible object functionality in a given environment. Chapter 5 presents how 3D data can be beneficial for this task, the various solutions investigated to recover 3D data, and some preliminary work towards plane extraction. It is apparent that VPs and planes give useful information about a scene's perspective and can assist in 3D data recovery within a scene. However, neither VPs nor plane detection techniques alone allow the recovery of more complex generic object shapes – for example those composed of spheres, cylinders, etc. – and any simple model will suffer in the presence of non-Manhattan features, e.g. those introduced by the presence of an escalator. For this reason, a novel photometric stereo-based surface-normal retrieval methodology is introduced to capture the 3D geometry of the whole scene or part of it.
    Chapter 6 describes how photometric stereo allows the recovery of 3D information in order to obtain a better understanding of a scene, while also partially overcoming some current surveillance challenges, such as the difficulty of resolving fine detail, particularly at large standoff distances, and of isolating and recognising more complex objects in real scenes. Here, items of interest may be obscured by complex environmental factors that are subject to rapid change, making, for example, the detection of suspicious objects and behaviour highly problematic. Innovative use is made of an untapped latent capability offered within modern surveillance environments: a form of environmental structuring is introduced to good advantage in order to achieve a richer form of data acquisition. This chapter also explores the novel application of photometric stereo in such diverse applications, shows how our algorithm can be incorporated into an existing surveillance system, and considers a typical real commercial application. One of the most important aspects of this research work is its application. While most of the research literature has been based on relatively simple structured environments, the approach here has been designed for real surveillance environments, such as railway stations, airports and waiting rooms, where surveillance cameras may be fixed or may in the future form part of a mobile, free-roaming robotic surveillance device that must continually reinterpret its changing environment. So, as mentioned previously, while the main focus has been to apply this algorithm to railway station environments, the work has been approached in a way that allows adaptation to many other applications, such as autonomous robotics, and to motorway, shopping centre, street and home environments. All of these applications require a better understanding of the scene for security or safety purposes.
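The photometric stereo step underlying chapter 6 can be illustrated with the classic least-squares formulation: given k >= 3 images of a static Lambertian scene under known light directions, per-pixel surface normals and albedo follow from a single linear solve. This is a generic sketch of the standard technique, not the thesis implementation.

```python
import numpy as np

def photometric_stereo(images, lights):
    """Recover per-pixel surface normals from k >= 3 grayscale images of a
    static Lambertian scene lit from known directions.

    images : (k, h, w) array of intensities
    lights : (k, 3) array of unit light-direction vectors
    """
    k, h, w = images.shape
    I = images.reshape(k, -1)                     # (k, h*w) intensity matrix
    # Lambertian model: I = lights @ G, where each column of G is
    # albedo * normal for one pixel; solve in the least-squares sense.
    G, *_ = np.linalg.lstsq(lights, I, rcond=None)
    albedo = np.linalg.norm(G, axis=0)            # (h*w,) per-pixel albedo
    normals = G / np.maximum(albedo, 1e-12)       # normalize to unit vectors
    return normals.T.reshape(h, w, 3), albedo.reshape(h, w)
```

With more than three lights the least-squares solve averages out noise, which is one reason structured multi-light capture is attractive in a surveillance setting.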
    Finally, chapter 7 presents a global conclusion and outlines what remains to be achieved in the future.

    Detection and Classification of Diabetic Retinopathy Pathologies in Fundus Images

    Diabetic Retinopathy (DR) is a disease that affects up to 80% of diabetics around the world. It is the second greatest cause of blindness in the Western world, and one of the leading causes of blindness in the U.S. Many studies have demonstrated that early treatment can reduce the number of sight-threatening DR cases, mitigating the medical and economic impact of the disease. Accurate, early detection of eye disease is important because of its potential to reduce rates of blindness worldwide. Retinal photography for DR has been promoted for decades for its utility in both disease screening and clinical research studies. In recent years, several research centers have presented systems to detect pathology in retinal images. However, these approaches apply specialized algorithms to detect specific types of lesion in the retina. In order to detect multiple lesions, these systems generally implement multiple algorithms. Furthermore, some of these studies evaluate their algorithms on a single dataset, thus avoiding potential problems associated with the differences in fundus imaging devices, such as camera resolution. These methodologies primarily employ bottom-up approaches, in which the accurate segmentation of all the lesions in the retina is the basis for correct determination. A disadvantage of bottom-up approaches is that they rely on the accurate segmentation of all lesions in order to measure performance. On the other hand, top-down approaches do not depend on the segmentation of specific lesions. Thus, top-down methods can potentially detect abnormalities not explicitly used in their training phase. A disadvantage of these methods is that they cannot identify specific pathologies and require large datasets to build their training models. In this dissertation, I merged the advantages of the top-down and bottom-up approaches to detect DR with high accuracy. 
    First, I developed an algorithm based on a top-down approach to detect abnormalities in the retina due to DR. By doing so, I was able to evaluate DR pathologies other than microaneurysms and exudates, which are the main focus of most current approaches. In addition, I demonstrated the good generalization capacity of this algorithm by applying it to other eye diseases, such as age-related macular degeneration. Because high accuracy is required for sight-threatening conditions, I developed two bottom-up approaches, since it has been proven that bottom-up approaches produce more accurate results than top-down approaches for particular structures. Consequently, I developed an algorithm to detect exudates in the macula. The presence of this pathology is considered to be a surrogate for clinically significant macular edema (CSME), a sight-threatening condition of DR. The analysis of the optic disc is usually not taken into account in DR screening systems. However, a pathology called neovascularization is present in advanced stages of DR, making its detection of crucial clinical importance. To address this problem, I developed an algorithm to detect neovascularization in the optic disc. These algorithms are based on amplitude-modulation and frequency-modulation (AM-FM) representations, morphological image processing methods, and classification algorithms. The methods were tested on a diverse set of large databases and are considered to be the state of the art in this field.
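As a rough illustration of the morphological, bottom-up style of lesion detection mentioned above, a white top-hat transform (an image minus its grayscale opening) highlights small bright structures such as hard exudates. The function below is a hypothetical numpy-only simplification; the dissertation's actual methods combine AM-FM representations with trained classifiers.

```python
import numpy as np

def tophat_bright_lesions(img, radius=2, thresh=0.2):
    # White top-hat: subtract the grayscale opening (erosion followed by
    # dilation with a square element) to isolate bright spots smaller
    # than the element, then threshold into a candidate-lesion mask.
    pad = radius
    h, w = img.shape

    def filt(a, fn):
        # Sliding-window min/max over a (2*radius+1)^2 neighbourhood.
        out = np.empty((h, w))
        for y in range(h):
            for x in range(w):
                out[y, x] = fn(a[y:y + 2 * pad + 1, x:x + 2 * pad + 1])
        return out

    eroded = filt(np.pad(img, pad, mode="edge"), np.min)
    opened = filt(np.pad(eroded, pad, mode="edge"), np.max)
    tophat = img - opened
    return tophat > thresh
```

The candidate mask would then be pruned by a classifier in a full system; the threshold and element radius here are illustrative, not tuned values from the dissertation.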

    Novel Texture-based Probabilistic Object Recognition and Tracking Techniques for Food Intake Analysis and Traffic Monitoring

    More complex image understanding algorithms are increasingly practical in a host of emerging applications. Object tracking has value in surveillance and data farming, and object recognition has applications in surveillance, data management, and industrial automation. In this work we introduce an object recognition application in automated nutritional intake analysis and a tracking application intended for surveillance in low-quality videos. Automated food recognition is useful for personal health applications as well as for nutritional studies used to improve public health or inform lawmakers. We introduce a complete, end-to-end system for automated food intake measurement. Images taken by a digital camera are analyzed; plates and food are located; food type is determined by a neural network; the distance and angle of the food are determined and 3D volume estimated; the results are cross-referenced with a nutritional database; and before- and after-meal photos are compared to determine nutritional intake. We compare against contemporary systems and provide detailed experimental results of our system's performance. Our tracking systems consider the problem of car and human tracking in potentially very low-quality surveillance videos, from a fixed camera or a high-flying unmanned aerial vehicle (UAV). Our agile framework switches among different simple trackers to find the most applicable tracker based on the object and video properties. Our MAPTrack is an evolution of the agile tracker that uses soft switching to optimize between multiple pertinent trackers, and tracks objects based on motion, appearance, and positional data. In both cases we provide comparisons against trackers intended for similar applications, i.e., trackers that stress robustness in bad conditions, with competitive results.
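The soft-switching idea behind MAPTrack, as described, fuses several trackers rather than hard-selecting one. Below is a minimal sketch of a single fusion step, assuming each tracker reports a position estimate and a scalar confidence; the names and the simple confidence-weighted average are illustrative, not the paper's actual scheme.

```python
def soft_switch(estimates, confidences):
    # Fuse per-tracker (x, y) estimates by normalized confidence weights
    # instead of picking a single winner (hard switching).
    total = sum(confidences)
    if total == 0:
        return estimates[0]  # no tracker is confident; fall back to the first
    x = sum(c * e[0] for e, c in zip(estimates, confidences)) / total
    y = sum(c * e[1] for e, c in zip(estimates, confidences)) / total
    return (x, y)
```

A degenerate confidence vector (all weight on one tracker) reduces this to hard switching, which is why soft switching can be seen as a strict generalization of the agile framework.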

    Morphological Analysis for Object Recognition, Matching, and Applications

    This thesis deals with the detection and classification of objects in visual images and with the analysis of shape changes between object instances. Whereas the task of object recognition focuses on learning models which describe common properties between instances of a specific category, the analysis of the specific differences between instances is also relevant to understanding the objects and the categories themselves. This research is governed by the idea that important properties for the automatic perception and understanding of objects are transmitted through their geometry or shape. Therefore, models for object recognition and shape matching are devised which exploit the geometry and properties of the objects, using as little user supervision as possible. In order to learn object models for detection in a reliable manner, suitable object representations are required. The key idea in this work is to use a richer representation of the object shape within the object model in order to increase the description power and thus the performance of the whole system. For this purpose, we first investigate the integration of curvature information of shapes into the learned object model. Since natural objects intrinsically exhibit curved boundaries, an object is better described if this shape cue is integrated. This work extends the widely used object representation based on gradient orientation histograms by incorporating a robust histogram-based description of curvature. We show that integrating this information substantially improves detection results over descriptors that rely solely upon histograms of oriented gradients. The impact of using richer shape representations for object recognition is further investigated through a novel method which goes beyond traditional bounding-box representations for objects. Visual recognition requires learning object models from training data.
    Commonly, training samples are annotated by marking only the bounding box of objects, since this appears to be the best trade-off between labeling effort and effectiveness. However, objects are typically not box-shaped. Thus, the usual parametrization of objects using a bounding box seems inappropriate, since such a box contains a significant amount of background clutter. Therefore, the presented approach learns object models for detection while simultaneously learning to segregate objects from clutter and extracting their overall shape, without, however, requiring manual segmentation of the training samples. Shape equivalence is another interesting property related to shape. It refers to the ability to perceive two distinct objects as having the same or similar shape. This thesis also explores the usage of this ability to detect objects in unsupervised scenarios, that is, where no annotation of training data is available for learning a statistical model. For this purpose, a dataset of historical Chinese cartoons drawn during the Cultural Revolution and immediately thereafter is analyzed. Relevant objects in this dataset are emphasized through annuli of light rays. The idea of our method is to consider the different annuli as shape-equivalent objects, that is, as objects sharing the same shape, and to devise a method to detect them. Thereafter, it is possible to indirectly infer the position, size and scale of the emphasized objects using the annuli detections. Not only commonalities among objects, but also the specific differences between them are perceived by a visual system. These differences can be understood through the analysis of how objects and their shapes change. For this reason, this thesis also develops a novel methodology for analyzing the shape deformation between a single pair of images under missing correspondences.
    The key observation is that objects cannot deform arbitrarily; rather, the deformation itself follows the geometry and constraints imposed by the object. We describe the overall complex object deformation using a piecewise linear model, and are thereby able to identify each of the parts in the shape which share the same deformation, and thus to understand how an object and its parts were transformed. A remarkable property of the algorithm is its ability to automatically estimate the model complexity according to the overall complexity of the shape deformation. Specifically, the introduced methodology is used to analyze the deformation between original instances and reproductions of artworks. The nature of the analyzed alterations ranges from deliberate modifications by the artist to geometrical errors accumulated during the reproduction process of the image. The usage of this method within this application shows how productive the interaction between computer vision and the humanities is. The goal is not to supplant human expertise, but to enhance and deepen connoisseurship about a given problem.
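The curvature-histogram idea above (augmenting orientation histograms with a robust description of boundary curvature) can be approximated by histogramming the discrete turning angles along a contour. A toy sketch, not the thesis descriptor:

```python
import numpy as np

def curvature_histogram(contour, bins=8):
    # Toy curvature descriptor: histogram of discrete turning angles
    # along a closed polygonal contour, a curvature analogue of an
    # orientation histogram.
    pts = np.asarray(contour, dtype=float)
    edges = np.roll(pts, -1, axis=0) - pts          # edge vectors
    ang = np.arctan2(edges[:, 1], edges[:, 0])      # edge orientations
    turn = np.roll(ang, -1) - ang                   # turning angle at each vertex
    turn = (turn + np.pi) % (2 * np.pi) - np.pi     # wrap to [-pi, pi)
    hist, _ = np.histogram(turn, bins=bins, range=(-np.pi, np.pi))
    return hist / hist.sum()
```

For a convex polygon all the mass lands in positive-turning bins, while a boundary with concavities spreads across both signs, which is the kind of shape cue a gradient-only descriptor cannot express.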

    Fast catheter segmentation and tracking based on x-ray fluoroscopic and echocardiographic modalities for catheter-based cardiac minimally invasive interventions

    X-ray fluoroscopy and echocardiography (ultrasound, US) are two imaging modalities that are widely used in cardiac catheterization. For these modalities, a fast, accurate and stable algorithm for the detection and tracking of catheters is required to allow clinicians to observe the catheter location in real time. Currently, X-ray fluoroscopy is routinely used as the standard modality in catheter ablation interventions. However, it lacks the ability to visualize soft tissue and uses harmful radiation. US does not have these limitations but often contains acoustic artifacts and has a small field of view, which makes the detection and tracking of the catheter in US very challenging. The first contribution of this thesis is a framework which combines a Kalman filter and discrete optimization for multiple catheter segmentation and tracking in X-ray images. The Kalman filter is used to identify the whole catheter from a single point detected on the catheter in the first frame of a sequence of X-ray images. An energy-based formulation is developed that can be used to track the catheters in the following frames, and a discrete optimization is proposed for minimizing the energy function in each frame of the X-ray image sequence. Our approach is robust to tangential motion of the catheter and combines tubular and salient feature measurements into a single robust and efficient framework. The second contribution is an algorithm for catheter extraction in 3D ultrasound images based on (a) the registration between the X-ray and ultrasound images and (b) the segmentation of the catheter in X-ray images. The search space for catheter extraction in the ultrasound images is constrained to lie on or close to a curved surface in the ultrasound volume, corresponding to the back-projection of the extracted catheter from the X-ray image into the ultrasound volume. Blob-like features are detected in the US images and organized in a graphical model.
    The extracted catheter is modelled as the optimal path in this graphical model. Both contributions allow the use of ultrasound imaging for the improved visualization of soft tissue. However, X-ray imaging is still required for each ultrasound frame, and the amount of X-ray exposure is not reduced. The final contribution of this thesis is a system that can track the catheter in ultrasound volumes automatically, without the need for X-ray imaging during tracking. Instead, X-ray imaging is required only for system initialization and for recovery from tracking failures. This allows a significant reduction in the amount of X-ray exposure for patients and clinicians.
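The Kalman filtering component of the first contribution can be illustrated with a generic constant-velocity filter on a 2D tip position. This is a textbook sketch under assumed process and measurement noise parameters q and r, not the thesis framework:

```python
import numpy as np

def kalman_track(measurements, dt=1.0, q=1e-3, r=1.0):
    # Constant-velocity Kalman filter; state is [x, y, vx, vy],
    # measurements are noisy (x, y) positions.
    F = np.eye(4); F[0, 2] = F[1, 3] = dt          # state transition
    H = np.zeros((2, 4)); H[0, 0] = H[1, 1] = 1.0  # observe position only
    Q, R = q * np.eye(4), r * np.eye(2)            # assumed noise covariances
    x = np.array([*measurements[0], 0.0, 0.0])
    P = np.eye(4)
    out = []
    for z in measurements:
        x = F @ x                                  # predict state
        P = F @ P @ F.T + Q                        # predict covariance
        S = H @ P @ H.T + R                        # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)             # Kalman gain
        x = x + K @ (np.asarray(z) - H @ x)        # update with measurement
        P = (np.eye(4) - K @ H) @ P
        out.append(x[:2].copy())
    return np.array(out)
```

The smoothed position sequence is what would seed the energy-based, per-frame optimization in a full pipeline; in the thesis the filter additionally grows a single seed point into the whole catheter curve.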

    View generated database

    This document represents the final report for the View Generated Database (VGD) project, NAS7-1066. It documents the work done on the project up to the point at which all project work was terminated due to lack of project funds. The VGD was to provide the capability to accurately represent any real-world object or scene as a computer model. Such models include both an accurate spatial/geometric representation of the surfaces of the object or scene, as well as any surface detail present on the object. Applications of such models are numerous, including the acquisition and maintenance of work models for tele-autonomous systems, the generation of accurate 3D geometric/photometric models for various 3D vision systems, and graphical models for realistic rendering of 3D scenes via computer graphics.

    Three-flavour neutrino oscillations with MINOS and CHIPS

    MINOS was a long-baseline neutrino oscillation experiment comprising two functionally identical detectors that observed Fermilab's NuMI neutrino beam in its low-energy tune, at distances of 1 km and 735 km. When NuMI switched to a higher-power medium-energy tune in 2012, the MINOS detectors continued to operate as MINOS+. Since its commissioning in 2003, the MINOS Far Detector has also been able to detect atmospheric neutrinos. Atmospheric neutrino oscillations are sensitive to the mass splitting Δm²₃₂ and mixing angle θ₂₃, and are also subject to the matter effect as the neutrinos pass through the Earth, which affects neutrinos and antineutrinos differently in a way that depends upon the mass hierarchy. This thesis presents the first atmospheric neutrino analysis using data from the MINOS+ era, and the first dedicated MINOS atmospheric neutrino analysis to use a full three-flavour mixing model. It includes 10.79 kiloton-years of new data and encompasses almost an entire period of the 11-year solar cycle, from 2003 to 2014. The CHIPS experiment aims to reduce the construction costs of large water Cherenkov detectors to $200-300k per kiloton by submerging detectors with a lightweight structure in bodies of water on the surface of the Earth. Such detectors could reach masses of 1 Mton and would assist with the search for CP violation in the neutrino sector by measuring the rate of νe appearance in a νμ beam. A detailed reconstruction framework for CHIPS has been developed, incorporating a novel method based on the timing of PMT hits. This framework has been used to study the performance of different designs for a 10 kiloton CHIPS R&D module, and to demonstrate that νe events can be identified in a sparsely instrumented detector with a 6% coverage of 3" PMTs.
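The sensitivity described above is governed, in the standard two-flavour approximation, by the muon-neutrino survival probability:

```latex
P(\nu_\mu \rightarrow \nu_\mu) \simeq 1 - \sin^2(2\theta_{23})\,
    \sin^2\!\left( \frac{1.27\,\Delta m^2_{32}\,[\mathrm{eV}^2]\;
    L\,[\mathrm{km}]}{E\,[\mathrm{GeV}]} \right)
```

Atmospheric neutrinos sample a very wide range of baseline-to-energy ratios L/E, which is what gives such an analysis its reach in Δm²₃₂ and θ₂₃.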

    Mobile Robots Navigation

    Mobile robot navigation comprises several interrelated activities: (i) perception, obtaining and interpreting sensory information; (ii) exploration, the strategy that guides the robot in selecting the next direction to go; (iii) mapping, the construction of a spatial representation using the sensory information perceived; (iv) localization, the strategy for estimating the robot's position within the spatial map; (v) path planning, the strategy for finding a path towards a goal location, optimal or not; and (vi) path execution, where motor actions are determined and adapted to environmental changes. The book addresses these activities by integrating results from the research work of several authors around the world. Research cases are documented in 32 chapters organized within 7 categories, described next.
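As a small illustration of the localization and path-execution activities listed above, a unicycle-model odometry update integrates a commanded forward speed and turn rate into a new pose. This is a generic textbook sketch with illustrative parameter names, not taken from any chapter of the book.

```python
import math

def integrate_odometry(pose, v, w, dt):
    # Unicycle model: pose = (x, y, heading); v is forward speed,
    # w is turn rate, both assumed constant over the timestep dt.
    x, y, th = pose
    if abs(w) < 1e-9:
        # Straight-line motion when the turn rate is (near) zero.
        return (x + v * dt * math.cos(th), y + v * dt * math.sin(th), th)
    # Exact integration along a circular arc of radius v/w.
    return (x + v / w * (math.sin(th + w * dt) - math.sin(th)),
            y - v / w * (math.cos(th + w * dt) - math.cos(th)),
            th + w * dt)
```

Chaining this update over successive control commands gives a dead-reckoned trajectory, which mapping and localization modules then correct against sensor observations.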