
    LiDAR and Camera Detection Fusion in a Real Time Industrial Multi-Sensor Collision Avoidance System

    Collision avoidance is a critical task in many applications, such as ADAS (advanced driver-assistance systems), industrial automation and robotics. In an industrial automation setting, certain areas should be off limits to an automated vehicle for the protection of people and high-valued assets. These areas can be quarantined by mapping (e.g., GPS) or via beacons that delineate a no-entry area. We propose a delineation method in which the industrial vehicle uses a LiDAR (Light Detection and Ranging) sensor and a single color camera to detect passive beacons, and model-predictive control to stop the vehicle from entering a restricted space. The beacons are standard orange traffic cones with a highly reflective vertical pole attached. The LiDAR can readily detect these beacons, but suffers from false positives due to other reflective surfaces such as worker safety vests. Herein, we put forth a method for reducing false positive detections from the LiDAR by detecting the beacons in the camera imagery via a deep learning method and validating the detections using a neural network-learned projection from the camera to the LiDAR space. Experimental data collected at Mississippi State University's Center for Advanced Vehicular Systems (CAVS) show the effectiveness of the proposed system in retaining true detections while mitigating false positives.
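A minimal sketch of the cross-validation idea described above: LiDAR beacon candidates are kept only if they agree with a camera detection in image space. The projection matrix `P`, the helper `project_to_image`, and `camera_boxes` are illustrative assumptions; the paper instead learns the camera-to-LiDAR projection with a neural network.

```python
# Illustrative sketch (not the authors' code): filter LiDAR beacon candidates
# by checking agreement with camera detections in image space.
# Assumptions: P is a hypothetical 3x4 projection matrix standing in for the
# paper's learned mapping; camera_boxes are beacon boxes from a deep detector.
import numpy as np

def project_to_image(points_lidar: np.ndarray, P: np.ndarray) -> np.ndarray:
    """Project Nx3 LiDAR points to Nx2 pixel coordinates with a 3x4 matrix P."""
    homog = np.hstack([points_lidar, np.ones((len(points_lidar), 1))])  # Nx4
    uvw = homog @ P.T                                                   # Nx3
    return uvw[:, :2] / uvw[:, 2:3]

def filter_lidar_candidates(candidates_lidar, camera_boxes, P):
    """Keep LiDAR candidates whose projection falls inside some camera beacon box."""
    kept = []
    pixels = project_to_image(np.asarray(candidates_lidar), P)
    for pt3d, (u, v) in zip(candidates_lidar, pixels):
        for (x0, y0, x1, y1) in camera_boxes:
            if x0 <= u <= x1 and y0 <= v <= y1:
                kept.append(pt3d)
                break
    return kept
```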

    Visual Human Tracking and Group Activity Analysis: A Video Mining System for Retail Marketing

    Thesis (PhD) - Indiana University, Computer Sciences, 2007. In this thesis we present a system for automatic human tracking and activity recognition from video sequences. The problem of automated analysis of visual information in order to derive descriptors of high-level human activities has intrigued the computer vision community for decades and is considered to be largely unsolved. Part of this interest derives from the vast range of applications in which such a solution may be useful. We attempt to find efficient formulations of these tasks as applied to extracting customer behavior information in a retail marketing context. Based on these formulations, we present a system that visually tracks customers in a retail store and performs a number of activity analysis tasks based on the output from the tracker. In tracking, we introduce new techniques for pedestrian detection, initialization of the body model, and a formulation of temporal tracking as a global trans-dimensional optimization problem. Initial human detection is addressed by a novel method for head detection, which incorporates knowledge of the camera projection model. The initialization of the human body model is addressed by newly developed shape and appearance descriptors. Temporal tracking of customer trajectories is performed by employing a human body tracking system designed as a Bayesian jump-diffusion filter. This approach demonstrates the ability to overcome model dimensionality ambiguities as people leave and enter the scene. Following the tracking, we developed a two-stage group activity formulation based upon ideas from swarming research. For modeling purposes, all moving actors in the scene are viewed as simplistic agents in the swarm. This makes it possible to define a set of inter-agent interactions, which combine to derive a distance metric used in subsequent swarm clustering. In this way, in the first stage the shoppers that belong to the same group are identified by deterministically clustering bodies to detect short-term events, and in the second stage these events are post-processed to form clusters of group activities with fuzzy memberships. Quantitative analysis of the tracking subsystem shows an improvement over state-of-the-art methods when used under similar conditions. Finally, based on the output from the tracker, the activity recognition procedure achieves over 80% correct shopper group detection, as validated against human-generated ground truth.
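A rough sketch of the first-stage grouping idea, not the thesis code: the thesis derives its distance metric from a set of inter-agent interactions, whereas this sketch simply thresholds the mean distance between trajectories and clusters agents with a union-find pass. The threshold value is an assumption for illustration.

```python
# Minimal sketch of the group-clustering stage described above (illustrative only).
# Shoppers are treated as agents; a pairwise distance between their trajectories is
# thresholded and single-link clustering assigns agents to shopping groups.
import numpy as np

def trajectory_distance(traj_a: np.ndarray, traj_b: np.ndarray) -> float:
    """Mean Euclidean distance between two Tx2 trajectories sampled at the same times."""
    return float(np.mean(np.linalg.norm(traj_a - traj_b, axis=1)))

def cluster_into_groups(trajectories, threshold=1.5):
    """Greedy single-link clustering: agents closer than `threshold` share a group."""
    n = len(trajectories)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if trajectory_distance(trajectories[i], trajectories[j]) < threshold:
                parent[find(i)] = find(j)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())
```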

    Robot navigation control based on monocular images: An image processing algorithm for obstacle avoidance decisions

    This paper covers the use of monocular vision to control autonomous navigation for a robot in a dynamically changing environment. The solution focuses on using colour segmentation against a selected floor plane to distinctly separate obstacles from traversable space; this is then supplemented with Canny edge detection to separate boundaries whose colour is similar to the floor plane. The resulting binary map (where white identifies obstacle-free area and black identifies an obstacle) can then be processed by fuzzy logic or neural networks to control the robot's next movements. Findings show that the algorithm performed strongly on solid-coloured carpets and on wooden and concrete floors, but had difficulty separating colours on multi-coloured floor types such as patterned carpets.
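A minimal sketch of the binary-map construction described above, assuming an HSV colour range sampled from the floor and default Canny thresholds; the paper's actual parameters and segmentation details are not given here.

```python
# Illustrative sketch of the obstacle-map idea: segment pixels matching a sampled
# floor colour, then subtract dilated Canny edges so floor-coloured obstacle
# boundaries are not merged into traversable space.
import cv2
import numpy as np

def obstacle_free_map(frame_bgr: np.ndarray, floor_hsv_lo, floor_hsv_hi) -> np.ndarray:
    """Return a binary map: white = traversable floor, black = obstacle."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    floor_mask = cv2.inRange(hsv, np.array(floor_hsv_lo), np.array(floor_hsv_hi))
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)                        # assumed thresholds
    edges = cv2.dilate(edges, np.ones((3, 3), np.uint8))    # thicken boundaries
    return cv2.bitwise_and(floor_mask, cv2.bitwise_not(edges))
```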

    Intelligent computer vision processing techniques for fall detection in enclosed environments

    Detecting unusual movement (falls) for elderly people in enclosed environments is receiving increasing attention and is likely to have massive potential social and economic impact. In this thesis, new intelligent computer vision processing based techniques are proposed to detect falls in indoor environments for senior citizens living independently, such as in intelligent homes. Different types of features extracted from video-camera recordings are exploited together with both background subtraction analysis and machine learning techniques. Initially, an improved background subtraction method is used to extract the region of a person in the recording of a room environment. A selective updating technique is introduced for adapting the background model to change, ensuring that the human body region is not absorbed into the background model when it is static for prolonged periods of time. Since two-dimensional features can generate false alarms and are not invariant to different directions, more robust three-dimensional features are next extracted from a three-dimensional person representation formed from the measurements of multiple calibrated video-cameras. The extracted three-dimensional features are used to construct a single Gaussian model using the maximum likelihood technique, which can distinguish falls from non-fall activity by comparing the model output with a single threshold. In the final work, new fall detection schemes which use only one uncalibrated video-camera are tested in a real elderly person's home environment. These approaches are based on two-dimensional features which describe different human body postures. The extracted features are applied to construct a supervised method of posture classification for abnormal posture detection. Rules set according to the characteristics of fall activities are then used to build a robust fall detection model.
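A sketch of the selective-updating idea only, not the thesis method: a running-average background model is blended with the new frame everywhere except inside the detected person region, so a motionless person is not absorbed into the background. The learning rate and the HxWx3 image layout are assumptions.

```python
# Selective background update sketch: skip updating pixels covered by the person.
# Assumptions: background and frame are HxWx3 arrays, person_mask is HxW (>0 where
# the person is), alpha is an assumed learning rate.
import numpy as np

def update_background(background: np.ndarray, frame: np.ndarray,
                      person_mask: np.ndarray, alpha: float = 0.05) -> np.ndarray:
    """Blend the new frame into the background everywhere except the person region."""
    blended = (1.0 - alpha) * background + alpha * frame.astype(np.float64)
    return np.where(person_mask[..., None] > 0, background, blended)
```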

    An investigation into common challenges of 3D scene understanding in visual surveillance

    Nowadays, video surveillance systems are ubiquitous. Most installations simply consist of CCTV cameras connected to a central control room and rely on human operators to interpret what they see on the screen in order to, for example, detect a crime (either during or after an event). Some modern computer vision systems aim to automate the process, at least to some degree, and various algorithms have been somewhat successful in certain limited areas. However, such systems remain inefficient in general circumstances and present real challenges yet to be solved. These challenges include the ability to recognise and ultimately predict and prevent abnormal behaviour, or even to reliably recognise objects, for example in order to detect left luggage or suspicious objects. This thesis first aims to study the state of the art and identify the major challenges and possible requirements of future automated and semi-automated CCTV technology in the field. It then presents the application of a suite of 2D and highly novel 3D methodologies that go some way towards overcoming current limitations.

The methods presented here are based on the analysis of object features directly extracted from the geometry of the scene. They start with a consideration of mainly existing techniques, such as the use of lines, vanishing points (VPs) and planes, applied to real scenes, followed by an investigation into the use of richer 2.5D/3D surface normal data. In all cases the aim is to combine 2D and 3D data to obtain a better understanding of the scene, ultimately capturing what is happening within it in order to move towards automated scene analysis. Although this thesis focuses on the widespread application of video surveillance, the railway station environment is used as an example case to represent typical real-world challenges; the principles can be readily extended elsewhere, for example to airports, motorways, households and shopping malls. The context of this research work, together with an overall presentation of existing methods used in video surveillance and their challenges, is described in chapter 1.

Common computer vision techniques such as VP detection, camera calibration, 3D reconstruction and segmentation can be applied in an effort to extract meaning for video surveillance applications. According to the literature, these methods have been well researched, and their use is assessed in the context of current surveillance requirements in chapter 2. While existing techniques can perform well in some contexts, such as an architectural environment composed of simple geometrical elements, their robustness and performance in feature extraction and object recognition tasks are not sufficient to solve the key challenges encountered in a general video surveillance context. This is largely due to issues such as variable lighting, weather conditions, shadows, and the general complexity of real-world environments. Chapter 3 presents the research and contribution on those topics (methods to extract optimal features for a specific CCTV application), as well as their strengths and weaknesses, highlighting that the proposed algorithm obtains better results than most due to its specific design. The comparison of current surveillance systems and methods from the literature shows, however, that 2D data are used almost universally across many applications.
Indeed, both industrial systems and the research community have been intensively improving 2D feature extraction methods for as long as image analysis and scene understanding have been of interest. The constant progress on 2D feature extraction over the years makes it almost effortless nowadays, owing to the large variety of available techniques. Moreover, even if 2D data do not solve all the challenges in video surveillance or other applications, they are still used as a starting stage towards scene understanding and image analysis. Chapter 4 therefore explores 2D feature extraction via vanishing point detection and segmentation methods. A combination of the most common techniques and a novel approach is proposed to extract vanishing points from video surveillance environments. Moreover, segmentation techniques are explored with the aim of determining how they can complement vanishing point detection and lead towards 3D data extraction and analysis.

In spite of the contribution above, 2D data are insufficient for all but the simplest applications aimed at obtaining an understanding of a scene, where the aim is robust detection of, say, left luggage or abnormal behaviour without significant a priori information about the scene geometry. Therefore, more information is required in order to design a more automated and intelligent algorithm that obtains richer information from the scene geometry and so a better understanding of what is happening within it. This can be achieved by the use of 3D data (in addition to 2D data), allowing the opportunity for object "classification" and, from this, the inference of a map of functionality describing feasible and unfeasible object functionality in a given environment. Chapter 5 presents how 3D data can be beneficial for this task and the various solutions investigated to recover 3D data, as well as some preliminary work towards plane extraction.

It is apparent that VPs and planes give useful information about a scene's perspective and can assist in 3D data recovery within a scene. However, neither VPs nor plane detection techniques alone allow the recovery of more complex generic object shapes (for example, shapes composed of spheres, cylinders and so on), and any simple model will suffer in the presence of non-Manhattan features, such as those introduced by an escalator. For this reason, a novel photometric-stereo-based surface normal retrieval methodology is introduced to capture the 3D geometry of the whole scene or part of it. Chapter 6 describes how photometric stereo allows the recovery of 3D information in order to obtain a better understanding of a scene, while also partially overcoming some current surveillance challenges, such as difficulty in resolving fine detail, particularly at large standoff distances, and in isolating and recognising more complex objects in real scenes. Here, items of interest may be obscured by complex environmental factors that are subject to rapid change, making, for example, the detection of suspicious objects and behaviour highly problematic. Innovative use is made of an untapped latent capability offered within modern surveillance environments to introduce a form of environmental structuring to good advantage, in order to achieve a richer form of data acquisition.
This chapter also explores the novel application of photometric stereo in such diverse applications, shows how the algorithm can be incorporated into an existing surveillance system, and considers a typical real commercial application.

One of the most important aspects of this research work is its application. While most of the research literature has been based on relatively simple structured environments, the approach here has been designed to be applied to real surveillance environments, such as railway stations, airports and waiting rooms, where surveillance cameras may be fixed or may in the future form part of a mobile, free-roaming robotic surveillance device that must continually reinterpret its changing environment. So, as mentioned previously, while the main focus has been to apply this algorithm to railway station environments, the work has been approached in a way that allows adaptation to many other applications, such as autonomous robotics, and to motorway, shopping centre, street and home environments. All of these applications require a better understanding of the scene for security or safety purposes. Finally, chapter 7 presents a global conclusion and directions for future work.
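For reference, a minimal sketch of the standard Lambertian photometric stereo formulation that such surface-normal recovery builds on, not the thesis implementation: given K >= 3 grayscale images of the scene under known lighting directions, per-pixel albedo-scaled normals are recovered by least squares.

```python
# Standard photometric stereo sketch: solve L @ G = I for albedo-scaled normals G.
# Assumptions: images is a KxHxW stack of intensities, light_dirs is Kx3 with one
# known unit lighting direction per image.
import numpy as np

def photometric_stereo(images: np.ndarray, light_dirs: np.ndarray) -> np.ndarray:
    """Return HxWx3 unit surface normals from K images under known lighting."""
    K, H, W = images.shape
    I = images.reshape(K, -1)                                  # K x (H*W)
    G, _, _, _ = np.linalg.lstsq(light_dirs, I, rcond=None)    # 3 x (H*W)
    albedo = np.linalg.norm(G, axis=0) + 1e-8                  # per-pixel albedo
    return (G / albedo).T.reshape(H, W, 3)                     # unit normals
```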

    Robust Modular Feature-Based Terrain-Aided Visual Navigation and Mapping

    The visual feature-based Terrain-Aided Navigation (TAN) system presented in this thesis addresses the problem of constraining the inertial drift introduced into the location estimate of Unmanned Aerial Vehicles (UAVs) in GPS-denied environments. The presented TAN system utilises salient visual features representing semantic or human-interpretable objects (roads, forest and water boundaries) from onboard aerial imagery and associates them with a database of reference features created a priori by applying the same feature detection algorithms to satellite imagery. Correlation of the detected features with the reference features via a series of robust data association steps allows a localisation solution to be achieved with a finite absolute precision bound defined by the certainty of the reference dataset. The feature-based Visual Navigation System (VNS) presented in this thesis was originally developed for a navigation application using simulated multi-year satellite image datasets. The extension of the system into the mapping domain, in turn, has been based on real (not simulated) flight data and imagery. The mapping study demonstrated the full potential of the system as a versatile tool for enhancing the accuracy of information derived from aerial imagery. Not only have visual features such as road networks, shorelines and water bodies been used to obtain a position 'fix'; they have also been used in reverse for accurate mapping of vehicles detected on the roads into an inertial space with improved precision. Combined correction of geo-coding errors and improved aircraft localisation formed a robust solution to the defense mapping application. A system of the proposed design will provide a complete independent navigation solution to an autonomous UAV and additionally give it object tracking capability.
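An illustrative sketch of the position-fix idea only: the thesis uses a series of robust data association steps, whereas the sketch below reduces this to nearest-neighbour gating between detected and reference feature points and a mean-offset drift correction. The gate value is an assumption.

```python
# Sketch: associate detected feature points to the nearest reference features from
# the a-priori database and estimate a 2D drift correction (illustrative only).
import numpy as np

def estimate_drift(detected_xy: np.ndarray, reference_xy: np.ndarray,
                   gate: float = 30.0) -> np.ndarray:
    """Return the 2D correction that aligns detected features to the reference map."""
    offsets = []
    for p in detected_xy:
        d = np.linalg.norm(reference_xy - p, axis=1)
        j = int(np.argmin(d))
        if d[j] < gate:                      # simple gating to reject outliers
            offsets.append(reference_xy[j] - p)
    return np.mean(offsets, axis=0) if offsets else np.zeros(2)
```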

    Advances in automated tongue diagnosis techniques

    This paper reviews recent advances in a significant constituent of traditional oriental medicinal technology called tongue diagnosis. Tongue diagnosis can be an effective, noninvasive method to perform an auxiliary diagnosis anytime, anywhere, supporting the global need in the primary healthcare system. This work explores the literature to evaluate work on the various aspects of computerized tongue diagnosis, namely preprocessing, tongue detection, segmentation, feature extraction and tongue analysis, especially in traditional Chinese medicine (TCM). In spite of the huge volume of work on automatic tongue diagnosis (ATD), there is a lack of an adequate survey, especially one combining it with current diagnosis trends. This paper studies the merits, capabilities, and associated research gaps of current ATD systems. After exploring the algorithms used in tongue diagnosis, the current trends and global requirements in the health domain motivate us to propose a conceptual framework for an automated tongue diagnostic system on a mobile-enabled platform. This framework will be able to connect tongue diagnosis with the future point-of-care health system.
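A schematic skeleton of the ATD stages the review enumerates (preprocessing, detection/segmentation, feature extraction, analysis). All function bodies are hypothetical placeholders for illustration; each stage would be replaced by a concrete algorithm from the surveyed literature.

```python
# Hypothetical ATD pipeline skeleton (placeholder stages, not a surveyed algorithm).
import numpy as np

def preprocess(image: np.ndarray) -> np.ndarray:
    """Colour correction / noise reduction; here just intensity normalisation."""
    return image.astype(np.float32) / 255.0

def detect_and_segment(image: np.ndarray) -> np.ndarray:
    """Return a binary mask of the tongue region (placeholder: brighter-than-mean)."""
    return (image.mean(axis=2) > image.mean()).astype(np.uint8)

def extract_features(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Simple colour statistics over the tongue region as an example feature vector."""
    pixels = image[mask > 0]
    return np.concatenate([pixels.mean(axis=0), pixels.std(axis=0)])

def diagnose(image: np.ndarray) -> np.ndarray:
    img = preprocess(image)
    mask = detect_and_segment(img)
    return extract_features(img, mask)   # fed to a classifier in a complete system
```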

    Automated freeform assembly of threaded fasteners

    Over the past two decades, a major part of the manufacturing and assembly market has been driven by customer requirements. Increasing customer demand for personalised products creates demand for smaller batch sizes, shorter production times, lower costs, and the flexibility to produce families of products - or different parts - with the same sets of equipment. Consequently, manufacturing companies have deployed various automation systems and production strategies to improve their resource efficiency and move towards right-first-time production. However, many of these automated systems, which rely on robot-based, repeatable assembly automation, require component-specific fixtures for accurate positioning and extensive robot programming to achieve flexibility in production. Threaded fastening operations are widely used in assembly. In high-volume production, fastening processes are commonly automated using jigs, fixtures, and semi-automated tools. This form of automation delivers reliable assembly results at the expense of flexibility and requires component variability to be adequately controlled. In low-volume, high-value manufacturing, on the other hand, fastening processes are typically carried out manually by skilled workers. This research aims to address these issues by developing a freeform automated threaded fastener assembly system that uses 3D visual guidance. The proof-of-concept system developed focuses on picking up fasteners from clutter, identifying a hole feature in an imprecisely positioned target component, and carrying out torque-controlled fastening. This approach achieves flexibility and adaptability without the use of dedicated fixtures or robot programming. The research also investigates and evaluates different 3D imaging technologies to identify those suitable for fastener assembly in an unstructured industrial environment. The proposed solution utilises commercially available technologies to enhance the precision and speed of component identification for assembly processes, thereby improving and validating the possibility of reliably implementing this solution in industrial applications. As part of this research, a number of novel algorithms were developed to robustly identify assembly components located in a random environment by enhancing existing methods and technologies within the domain of fastening processes. A bolt identification algorithm was developed to identify bolts located in random clutter by enhancing an existing surface-based matching algorithm. A novel hole feature identification algorithm was developed to detect threaded holes and identify their size and location in 3D. The developed bolt and feature identification algorithms are robust and have the sub-millimetre accuracy required to perform successful fastener assembly in industrial conditions. In addition, the processing time required by these identification algorithms to identify and localise bolts and hole features is less than a second, thereby increasing the speed of fastener assembly.
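The thesis's hole feature identification works on 3D data; as a small related illustration only (not the thesis algorithm), the sketch below shows a standard algebraic (Kasa) circle fit of the kind such a step could use to estimate a hole's centre and radius, assuming boundary points have already been projected onto the component's surface plane.

```python
# Standard least-squares circle fit (Kasa method) for hole centre/radius estimation.
# Assumption: points_xy are hole-boundary points in the component's surface plane.
import numpy as np

def fit_circle(points_xy: np.ndarray):
    """points_xy: Nx2 boundary points. Returns (cx, cy, radius)."""
    x, y = points_xy[:, 0], points_xy[:, 1]
    A = np.column_stack([2 * x, 2 * y, np.ones_like(x)])
    b = x ** 2 + y ** 2
    (cx, cy, c), _, _, _ = np.linalg.lstsq(A, b, rcond=None)
    radius = np.sqrt(c + cx ** 2 + cy ** 2)
    return cx, cy, radius
```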

    A Framework for Optical Inspection Applications in Life-Science Automation

    This thesis presents possible applications for Computer Vision based systems in the field of Laboratory Automation, together with applicable camera-based, multi-camera-based or flatbed-scanner-based imaging devices. A concept for a software framework for CV applications is developed with respect to hardware compatibility, data processing and user interfaces. An application is implemented using the framework; it aims at the detection of low-volume liquids in microtiter plates, a labware standard. Using this algorithm, it is possible to cover a wide range of labware on different imaging hardware.
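A rough sketch of the kind of per-well check such an application performs, not the framework's algorithm: wells are located as circles in a grayscale plate image and flagged when their interior intensity suggests little or no liquid. All thresholds and Hough parameters are assumptions for illustration.

```python
# Illustrative well-level check: locate wells via Hough circles and flag low-volume
# wells by mean interior intensity (assumed parameters, not the thesis method).
import cv2
import numpy as np

def detect_low_volume_wells(gray: np.ndarray, intensity_threshold: float = 60.0):
    """Return (x, y, r) of wells whose mean interior intensity is below the threshold."""
    circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, dp=1.5, minDist=20,
                               param1=100, param2=30, minRadius=8, maxRadius=25)
    low = []
    if circles is not None:
        for x, y, r in np.round(circles[0]).astype(int):
            mask = np.zeros_like(gray)
            cv2.circle(mask, (int(x), int(y)), int(r), 255, -1)   # well interior
            if cv2.mean(gray, mask=mask)[0] < intensity_threshold:
                low.append((int(x), int(y), int(r)))
    return low
```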