128 research outputs found

    Connectivity-Enforcing Hough Transform for the Robust Extraction of Line Segments

    Global voting schemes based on the Hough transform (HT) have been widely used to robustly detect lines in images. However, since the votes do not take line connectivity into account, these methods do not cope well with cluttered images. In contrast, the so-called local methods enforce connectivity but lack the robustness to deal with challenging situations that occur in many realistic scenarios, e.g., when line segments cross or when long segments are corrupted. In this paper, we address the critical limitations of the HT as a line segment extractor by incorporating connectivity into the voting process. This is done by accounting only for the contributions of edge points that lie in increasingly larger neighborhoods and whose position and directional content agree with potential line segments. As a result, our method, which we call STRAIGHT (Segment exTRAction by connectivity-enforcInG HT), extracts the longest connected segments at each location of the image, thus also integrating into the HT voting process the usually separate step of individual segment extraction. The use of the Hough space mapping and a corresponding hierarchical implementation make our approach computationally feasible. We present experiments that illustrate, with synthetic and real images, how STRAIGHT succeeds in extracting complete segments in several situations where current methods fail.
    Comment: Submitted for publication.
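    A minimal sketch of the voting idea, assuming a NumPy edge map: a classical Hough accumulator in which an edge pixel only votes when at least one neighboring edge pixel exists. This is a crude stand-in for the paper's connectivity-enforcing vote, not the actual STRAIGHT algorithm; the function name and parameters are illustrative.

        import numpy as np

        def connectivity_hough(edges, n_theta=180, neighborhood=1):
            # Toy Hough line accumulator: an edge pixel votes only if it is
            # locally connected (some other edge pixel in its neighborhood),
            # approximating the connectivity constraint described above.
            h, w = edges.shape
            diag = int(np.ceil(np.hypot(h, w)))
            thetas = np.deg2rad(np.arange(n_theta))
            cos_t, sin_t = np.cos(thetas), np.sin(thetas)
            acc = np.zeros((2 * diag + 1, n_theta), dtype=np.int32)
            ys, xs = np.nonzero(edges)
            for x, y in zip(xs, ys):
                patch = edges[max(0, y - neighborhood):y + neighborhood + 1,
                              max(0, x - neighborhood):x + neighborhood + 1]
                if patch.sum() <= 1:  # isolated pixel: likely clutter, skip it
                    continue
                rhos = np.round(x * cos_t + y * sin_t).astype(int) + diag
                acc[rhos, np.arange(n_theta)] += 1
            return acc  # peaks correspond to dominant connected lines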

    Robust Lane Detection through Self Pre-training with Masked Sequential Autoencoders and Fine-tuning with Customized PolyLoss

    Lane detection is crucial for vehicle localization, which makes it the foundation of automated driving and many intelligent and advanced driver assistance systems. Available vision-based lane detection methods do not make full use of valuable features and aggregated contextual information, especially the interrelationships between lane lines and other regions of the images in continuous frames. To fill this research gap and improve lane detection performance, this paper proposes a pipeline consisting of self pre-training with masked sequential autoencoders and fine-tuning with a customized PolyLoss for end-to-end neural network models using multiple continuous image frames. The masked sequential autoencoders are adopted to pre-train the neural network models, with reconstructing the missing pixels of a randomly masked image as the objective. Then, in the fine-tuning segmentation phase, where lane detection segmentation is performed, the continuous image frames serve as the inputs, and the pre-trained model weights are transferred and further updated via backpropagation, with the customized PolyLoss calculating the weighted errors between the output lane detection results and the labeled ground truth. Extensive experimental results demonstrate that, with the proposed pipeline, lane detection model performance on both normal and challenging scenes can be advanced beyond the state of the art, delivering the best testing accuracy (98.38%), precision (0.937), and F1-measure (0.924) on the normal scene testing set, together with the best overall accuracy (98.36%) and precision (0.844) on the challenging scene test set, while the training time can be substantially shortened.
    Comment: 12 pages, 8 figures; under review at IEEE Transactions on Intelligent Transportation Systems.
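    For reference, the generic Poly-1 form of PolyLoss (cross-entropy plus an epsilon-weighted (1 - p_t) term, from Leng et al., 2022) can be written in a few lines of PyTorch. The paper's customized variant presumably weights this differently, so the sketch below only shows the base loss for (N, C) classification logits.

        import torch
        import torch.nn.functional as F

        def poly1_cross_entropy(logits, targets, epsilon=1.0):
            # Standard cross-entropy per sample, kept unreduced.
            ce = F.cross_entropy(logits, targets, reduction="none")
            # p_t: predicted probability of the true class for each sample.
            pt = torch.gather(F.softmax(logits, dim=1), 1,
                              targets.unsqueeze(1)).squeeze(1)
            # Poly-1: add the first polynomial correction term.
            return (ce + epsilon * (1.0 - pt)).mean()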

    System Integration and Intelligence Improvements for WPI’s UGV - Prometheus

    This project focuses on realizing a series of operational improvements for WPI's unmanned ground vehicle Prometheus, with the end goal of a prize-winning entry to the Intelligent Ground Vehicle Challenge. Operational improvements include a practical implementation of stereo vision on an NVIDIA GPU, a more reliable implementation of line detection, a better approach to mapping and path planning, and a modified system architecture realized through an easier-to-use GPIO implementation. The end result of these improvements is better autonomy, accessibility, robustness, reliability, and usability for Prometheus.
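    The report's stereo pipeline runs on an NVIDIA GPU and is not reproduced here; as a CPU-side illustration of the underlying disparity computation, OpenCV's semi-global matcher can be called as follows (the file names are placeholders).

        import cv2

        # Block-matching disparity on a rectified grayscale stereo pair.
        left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)    # placeholder files
        right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)
        matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64,
                                        blockSize=9)
        # SGBM returns fixed-point disparities scaled by 16.
        disparity = matcher.compute(left, right).astype("float32") / 16.0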

    Optical polarimetry studies of Seyfert galaxies

    Optical imaging polarimetry has been performed on seven nearby Seyfert galaxies, three with face-on and four with edge-on host galaxies of various morphological classifications. Observations in the V, R, B, and Hα wavebands are presented as maps of total intensity and of polarized intensity, overlaid with polarization vectors. Independent determinations of the interstellar polarization (ISP) contribution from our own Galaxy are made where possible and are used to produce ISP-corrected maps. The polarization patterns seen in the maps show evidence of either dichroic extinction, which indicates the presence of non-spherical dust grains in large-scale galactic magnetic fields, or scattering, which is due to the illumination of regions of dust grains or electrons. The polarization features observed at the different wavebands are then compared to recent models of polarization in external galaxies. Estimates of the intrinsic Seyfert nuclear polarization are made where possible by correcting for ISP and for an approximation of the dilution due to the host galaxy flux, using values from previous studies. Both the measured and the corrected nuclear polarizations are compared with previously published values and are discussed in the context of the standard models of Seyfert galaxies. Most of the observed galaxies show evidence of polarization, both from the host galaxy and from the intrinsic Seyfert nucleus. In particular, distinct polarization features have been identified in several of the observed galaxies: bands of polarization consistent with extended dusty disks aligned with the dusty tori proposed in Seyferts, and regions of polarization corresponding to scattering of the nuclear continuum along the biconical extended Seyfert emission-line regions.
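    The polarization maps described above follow from the standard Stokes-parameter relations; a minimal sketch, assuming float-valued Stokes images I, Q, and U (these are the textbook formulas, not the authors' reduction code).

        import numpy as np

        def polarization_maps(I, Q, U):
            # Polarized intensity from the linear Stokes parameters.
            p_int = np.sqrt(Q**2 + U**2)
            # Degree of polarization, guarding against division by zero.
            p_deg = np.divide(p_int, I, out=np.zeros_like(I), where=I > 0)
            # Position angle of the polarization vector, in radians.
            theta = 0.5 * np.arctan2(U, Q)
            return p_int, p_deg, theta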

    Lane estimation for autonomous vehicles using vision and LIDAR

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010. Includes bibliographical references (p. 109-114).
    Autonomous ground vehicles, or self-driving cars, require a high level of situational awareness in order to operate safely and efficiently in real-world conditions. A system able to quickly and reliably estimate the location of the roadway and its lanes based upon local sensor data would be a valuable asset both to fully autonomous vehicles and to driver assistance technologies. To be most useful, the system must accommodate a variety of roadways, a range of weather and lighting conditions, and highly dynamic scenes with other vehicles and moving objects. Lane estimation can be modeled as a curve estimation problem, where sensor data provides partial and noisy observations of curves. The number of curves to estimate may be initially unknown, and many of the observations may be outliers and false detections (e.g., due to tree shadows or lens flare). The challenge is to detect lanes when and where they exist, and to update the lane estimates as new observations are received. This thesis describes algorithms for feature detection and curve estimation, as well as a novel curve representation that permits fast and efficient estimation while rejecting outliers. Locally observed road paint and curb features are fused together in a lane estimation framework that detects and estimates all nearby travel lanes. The system handles roads with complex geometries and makes no assumptions about the position and orientation of the vehicle with respect to the roadway. Early versions of these algorithms successfully guided a fully autonomous Land Rover LR3 through the 2007 DARPA Urban Challenge, a 90 km urban race course, at speeds up to 40 km/h amidst moving traffic. We evaluate these and subsequent versions with a ground truth dataset containing manually labeled lane geometries for every moment of vehicle travel in two large and diverse datasets that include more than 300,000 images and 44 km of roadway. The results illustrate the capabilities of our algorithms for robust lane estimation in the face of challenging conditions and unknown roadways. By Albert S. Huang.
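    The thesis's curve representation is its own contribution; as a generic illustration of fitting a lane-boundary curve while rejecting outlier detections, a RANSAC polynomial fit over synthetic data looks like this (all values below are made up for the example).

        import numpy as np
        from sklearn.linear_model import RANSACRegressor
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import PolynomialFeatures

        # Synthetic road-paint detections along a gently curving lane.
        x = np.random.uniform(0, 40, 200)                  # along-road distance (m)
        y = 0.002 * x**2 + 0.1 * x + np.random.normal(0, 0.05, x.size)
        y[::10] += np.random.uniform(2, 4, y[::10].size)   # inject outlier detections
        # RANSAC over a quadratic model keeps the curve estimate robust
        # to the injected outliers (e.g. tree shadows, lens flare).
        model = make_pipeline(PolynomialFeatures(degree=2), RANSACRegressor())
        model.fit(x.reshape(-1, 1), y)
        lane = model.predict(np.linspace(0, 40, 50).reshape(-1, 1))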

    Semi-Supervised Pattern Recognition and Machine Learning for Eye-Tracking

    The first step in monitoring an observer's eye gaze is identifying and locating the image of their pupils in video recordings of their eyes. Current systems work under a range of conditions, but fail in bright sunlight and under rapidly varying illumination. A computer vision system was developed to assist with the recognition of the pupil in every frame of a video, in spite of the presence of strong first-surface reflections off the cornea. A modified Hough circle detector was developed that incorporates the knowledge that the pupil is darker than the surrounding iris of the eye and is able to detect imperfect circles, partial circles, and ellipses. As part of processing, the image is modified to compensate for the distortion of the pupil caused by the out-of-plane rotation of the eye. A sophisticated noise-cleaning technique was developed to mitigate first-surface reflections, enhance edge contrast, and reduce image flare. Semi-supervised human input and validation are used to train the algorithm. The final results are comparable to those achieved by a human analyst, but require only a tenth of the human interaction.
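    A simplified version of the "pupil is darker than the iris" constraint can be grafted onto OpenCV's stock circle detector by rescoring candidates on interior-versus-surround contrast. This sketch approximates the idea; it is not the modified detector itself, and the Hough parameters are illustrative.

        import cv2
        import numpy as np

        def detect_pupil(gray):
            blurred = cv2.medianBlur(gray, 5)
            circles = cv2.HoughCircles(blurred, cv2.HOUGH_GRADIENT, dp=1,
                                       minDist=30, param1=80, param2=20,
                                       minRadius=8, maxRadius=60)
            if circles is None:
                return None
            best, best_contrast = None, -np.inf
            for x, y, r in np.round(circles[0]).astype(int).tolist():
                inner = np.zeros_like(gray, dtype=np.uint8)
                cv2.circle(inner, (x, y), r, 255, -1)
                ring = np.zeros_like(inner)
                cv2.circle(ring, (x, y), int(1.5 * r), 255, -1)
                ring[inner > 0] = 0
                # Surround-minus-interior brightness: large for a dark pupil.
                contrast = gray[ring > 0].mean() - gray[inner > 0].mean()
                if contrast > best_contrast:
                    best, best_contrast = (x, y, r), contrast
            return best  # (x, y, radius) of the darkest-interior candidate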

    Hierarchical Models for the Visual Recognition and Learning of Objects, Scenes, and Activities

    In many computer vision applications, objects have to be learned and recognized in images or image sequences. Most of these objects have a hierarchical structure. For example, 3D objects can be decomposed into object parts, and object parts, in turn, into geometric primitives. Furthermore, scenes are composed of objects. Activities or behaviors can likewise be divided hierarchically into actions, these into individual movements, and so on. Hierarchical models are therefore ideally suited for the representation of a wide range of objects used in applications such as object recognition, human pose estimation, or activity recognition. In this work, new probabilistic hierarchical models are presented that allow an efficient representation of multiple objects of different categories, scales, rotations, and views. The idea is to exploit similarities between objects, object parts, or actions and movements in order to share calculations and avoid redundant information. We introduce online and offline learning methods that enable the creation of efficient hierarchies from small or large training datasets, in which poses or articulated structures are given by instances. Furthermore, we present inference approaches for fast and robust detection. These new approaches combine the ideas of compositional and similarity hierarchies and overcome limitations of previous methods. They are used in a unified hierarchical framework, spatially for object recognition as well as spatiotemporally for activity recognition. The unified generic hierarchical framework allows us to apply the proposed models in different projects. Besides classical object recognition, it is used for the detection of human poses in a project on gait analysis. The activity detection is used in a project on the design of environments for ageing, to identify activities and behavior patterns in smart homes. In a project on parking spot detection using an intelligent vehicle, the proposed approaches are used to hierarchically model the environment of the vehicle for an efficient and robust interpretation of the scene in real time.
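    The part-sharing idea can be made concrete with a toy compositional hierarchy in which shared parts are scored once and reused by every parent object. This is a deliberately simplified illustration of the sharing principle, not the thesis's probabilistic model; all names and scores are invented.

        from functools import lru_cache

        HIERARCHY = {
            "car":   ("wheel", "body"),
            "truck": ("wheel", "body", "trailer"),  # "wheel" and "body" are shared
        }

        @lru_cache(maxsize=None)  # caching realizes the shared computation
        def score(node, evidence):
            children = HIERARCHY.get(node, ())
            if not children:      # primitive part: match directly against evidence
                return float(node in evidence)
            return sum(score(child, evidence) for child in children) / len(children)

        evidence = frozenset({"wheel", "body"})
        print(score("car", evidence), score("truck", evidence))  # 1.0  0.666...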

    Mobile Robots Navigation

    Mobile robot navigation comprises several interrelated activities: (i) perception, obtaining and interpreting sensory information; (ii) exploration, the strategy that guides the robot in selecting the next direction to go; (iii) mapping, the construction of a spatial representation using the sensory information perceived; (iv) localization, the strategy for estimating the robot's position within the spatial map; (v) path planning, the strategy for finding a path towards a goal location, optimal or not; and (vi) path execution, where motor actions are determined and adapted to environmental changes. The book addresses these activities by integrating results from the research work of several authors around the world. Research cases are documented in 32 chapters organized into seven categories.

    Road Scene Interpretation for Autonomous Navigation Fusing Stereo Vision and Digital Maps

    This thesis presents a road detection method based on stereoscopic vision. Machine learning is used to solve computer vision problems of very different kinds; the technique used here is boosting, which uses decision trees to classify each pixel of the image as belonging to the road or not. The feature vector includes information provided by digital maps, stereo vision, and color and grayscale cameras. The grayscale image is used to detect road markings, Local Binary Patterns (LBP), and Histograms of Oriented Gradients (HOG). The color cameras are used to compute an illumination-invariant image and to detect the shadows present in the image. In addition, a method based on the HSV color space has been developed to detect the vegetation regions present in the scene. The stereo cameras play an important role because they provide the 3D information to the system; some of the features that use this information are the normal vectors and the curvature values. A new curb detection method has been developed. This novel curb detector is based on curvature analysis, because curvature describes the variation of the road shape even in the presence of small curbs. It is able to detect curbs 3 cm high at distances of up to 20 meters, as long as the pixels belonging to the curb are connected to one another in the curvature image. Other obstacles such as vehicles, walls, or trees are also detected using stereoscopic vision. A new way to convert features that describe road boundaries into features that describe road areas is described in this thesis: it uses road markings, curbs, obstacles, and vegetation regions as inputs and, after adding further information from the map, generates a road model. The originality of this system lies in the point from which free space is detected: other methods cast rays from the midpoint of the lower image boundary until a ray reaches an obstacle, whereas our proposal starts its rays from the vanishing point and accumulates the feature values along each ray. Another very important feature is obtained from the digital maps. The goal is to obtain an a priori image of the road shape based on the current position of the vehicle and the street information provided by the map. The uncertainty of the positioning errors is taken into account during the process, and the road width is correctly detected using the proposed radial model. Multiple tests have been carried out with different decision-tree-based classifiers and parameters in order to choose the classifier that performs best at road detection. The classification result is fed into a CRF to filter the response and obtain a smoother result. The metric used to evaluate the classifiers is the F-score. The system is evaluated in the image plane, which is the most common method in the literature. However, in an autonomous driving scenario, control is usually performed on a bird's-eye view of the scene. The same evaluation method used in the international KITTI benchmark has been adopted so that our results can be compared with other algorithms.
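    In spirit, the per-pixel classification stage reduces to boosted decision trees over stacked feature maps; a minimal sketch with synthetic data follows. The real feature vector stacks map priors, LBP/HOG, the illumination-invariant image, and 3D cues, whereas the matrix below is random placeholder data.

        import numpy as np
        from sklearn.ensemble import GradientBoostingClassifier

        # Placeholder per-pixel feature vectors and road/non-road labels.
        n_pixels, n_features = 5000, 12
        X = np.random.randn(n_pixels, n_features)
        y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(int)
        # Boosted decision trees classify each pixel as road or not.
        clf = GradientBoostingClassifier(n_estimators=100, max_depth=3).fit(X, y)
        road_prob = clf.predict_proba(X)[:, 1]  # per-pixel P(road), before CRF smoothing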