9 research outputs found

    Image-based recognition, 3D localization, and retro-reflectivity evaluation of high-quantity low-cost roadway assets for enhanced condition assessment

    Get PDF
    Systematic condition assessment of high-quantity low-cost roadway assets such as traffic signs, guardrails, and pavement markings requires frequent reporting on the location and up-to-date status of these assets. Today, most Departments of Transportation (DOTs) in the US collect data using camera-mounted vehicles and then filter, annotate, organize, and present the data necessary for these assessments. However, the cost and complexity of collecting, analyzing, and reporting as-is conditions result in sparse and infrequent monitoring, so some of the gains in efficiency are consumed by monitoring costs. This dissertation proposes to improve the frequency, detail, and applicability of image-based condition assessment by automating the detection, classification, and 3D localization of multiple types of high-quantity low-cost roadway assets, using both images collected by the DOTs and online databases such as Google Street View images. To address the new requirements of the US Federal Highway Administration (FHWA), a new method is also developed that simulates nighttime visibility of traffic signs from images taken during daytime and measures their retro-reflectivity condition. To initiate detection and classification of high-quantity low-cost roadway assets from street-level images, a number of algorithms are proposed that automatically segment and localize high-level asset categories in 3D. The first set of algorithms focuses on detecting and segmenting assets at the level of high-level categories. More specifically, a method based on Semantic Texton Forest classifiers segments each geo-registered 2D video frame at the pixel level based on shape, texture, and color. A Structure from Motion (SfM) procedure reconstructs the road and its assets in 3D. Next, a voting scheme assigns the most frequently observed asset category to each 3D point.
The experimental results from applying this method are promising; nevertheless, because the method relies on supervised ground-truth pixel labels for training, scaling it to various types of assets is challenging. To address this issue, a non-parametric image parsing method is proposed that leverages a lazy learning scheme for segmentation and recognition of roadway assets. The semi-supervised technique used in this method does not require training, provides ground-truth data more efficiently, and easily scales to thousands of video frames captured during data collection. Once the high-level asset categories are detected, specific techniques need to be applied to detect and classify the assets at a finer level of granularity. To this end, the performance of three computer vision algorithms is evaluated for classification of traffic signs in the presence of cluttered backgrounds and static and dynamic occlusions. Without making any prior assumptions about the location of traffic signs in 2D, the best-performing method uses histograms of oriented gradients and color together with multiple one-vs-all Support Vector Machines, and classifies these assets into warning, regulatory, stop, and yield sign categories. To minimize reliance on visual data collected by the DOTs and to improve the frequency and applicability of condition assessment, a new end-to-end procedure is presented that applies the above algorithms to create a comprehensive inventory of traffic signs from Google Street View images. By processing images extracted using the Google Street View API and combining discriminative classification scores from all images that see a sign, the most probable 3D location of each traffic sign is derived and shown on Google Earth as a dynamic heat map. A data card containing information about the location, type, and condition of each detected traffic sign is also created.
Finally, a computer vision-based algorithm is proposed that measures the retro-reflectivity of traffic signs during daytime using a vehicle-mounted device. The algorithm simulates nighttime visibility of traffic signs from images taken during daytime and measures their retro-reflectivity. The technique is faster, cheaper, and safer than the state of the art, as it requires neither nighttime operation nor manual sign inspection. It also satisfies the measurement guidelines set forth by the FHWA in terms of both granularity and accuracy. To validate the techniques, new detailed video datasets and their ground truth were generated from a 2.2-mile smart-road research facility and two interstate highways in the US. The comprehensive dataset contains over 11,000 annotated US traffic sign images and exhibits large variations in sign pose, scale, background, illumination, and occlusion conditions. The performance of all algorithms was examined using these datasets. For retro-reflectivity measurement of traffic signs, experiments were conducted at different times of day and at different distances, and results were compared with a method recommended by ASTM standards. The experimental results show promise in scaling these methods to reduce the time and effort required for developing road inventories, especially for assets such as guardrails and traffic lights that are not typically considered in 2D asset recognition methods, as well as for multiple categories of traffic signs. The applicability of Google Street View images for inventory management, together with the daytime retro-reflectivity measurement technique, demonstrates strong potential for lowering inspection costs and improving safety in practical applications.
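The one-vs-all classification step described above can be illustrated with a minimal sketch. This is not the dissertation's actual pipeline (which uses HOG and color features with Support Vector Machines on real imagery); here, a simple perceptron-style one-vs-all linear classifier is trained on synthetic "feature histogram" vectors standing in for the four sign categories, purely to show the decision scheme.

```python
import numpy as np

def train_one_vs_all(X, y, n_classes, epochs=20, lr=0.1):
    """Train one linear scorer per class; each treats its class as +1, the rest as -1."""
    W = np.zeros((n_classes, X.shape[1]))
    b = np.zeros(n_classes)
    for c in range(n_classes):
        t = np.where(y == c, 1.0, -1.0)
        for _ in range(epochs):
            for xi, ti in zip(X, t):
                if ti * (W[c] @ xi + b[c]) <= 0:   # misclassified -> perceptron update
                    W[c] += lr * ti * xi
                    b[c] += lr * ti
    return W, b

def predict(W, b, X):
    """Assign each sample to the class whose scorer responds most strongly."""
    return np.argmax(X @ W.T + b, axis=1)

# Synthetic stand-in features for 4 categories (warning, regulatory, stop, yield).
rng = np.random.default_rng(0)
n_per_class, dim, n_classes = 40, 8, 4
means = np.zeros((n_classes, dim))
for c in range(n_classes):
    means[c, c] = 5.0                      # well-separated class means
X = np.vstack([means[c] + 0.5 * rng.standard_normal((n_per_class, dim))
               for c in range(n_classes)])
y = np.repeat(np.arange(n_classes), n_per_class)

W, b = train_one_vs_all(X, y, n_classes)
accuracy = np.mean(predict(W, b, X) == y)
```

In the dissertation's setting, each per-class scorer would be a trained SVM over HOG-and-color descriptors; the argmax over per-class scores is the same final step.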

    Semantic multimedia analysis using knowledge and context

    Get PDF
    The difficulty of semantic multimedia analysis can be attributed to the extended diversity in form and appearance exhibited by the majority of semantic concepts and to the difficulty of expressing them using a finite number of patterns. In meeting this challenge there has been a scientific debate on whether the problem should be addressed from the perspective of using overwhelming amounts of training data to capture all possible instantiations of a concept, or from the perspective of using explicit knowledge about the concepts' relations to infer their presence. In this thesis we address three problems of pattern recognition and propose solutions that combine the knowledge extracted implicitly from training data with the knowledge provided explicitly in structured form. First, we propose a Bayesian network (BN) modeling approach that defines a conceptual space where both domain-related evidence and evidence derived from content analysis can be jointly considered to support or disprove a hypothesis. The use of this space leads to significant gains in performance compared to analysis methods that cannot handle combined knowledge. Then, we present an unsupervised method that exploits the collective nature of social media to automatically obtain large amounts of annotated image regions. By proving that the quality of the obtained samples can be almost as good as that of manually annotated images when working with large datasets, we contribute significantly towards scalable object detection. Finally, we introduce a method that treats images, visual features, and tags as the three observable variables of an aspect model and extracts a set of latent topics that incorporates the semantics of both the visual and the tag information space. By showing that the cross-modal dependencies of tagged images can be exploited to increase the semantic capacity of the resulting space, we advocate the use of all existing information facets in the semantic analysis of social media.
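The aspect model mentioned above is in the pLSA family: observables are explained through latent topics, and parameters are fit by EM. The following sketch runs EM for a plain two-variable (document-word) aspect model on a toy count matrix; the thesis's three-variable image/feature/tag extension is omitted, and the toy data is invented for illustration. Each EM iteration is guaranteed not to decrease the data log-likelihood.

```python
import numpy as np

def plsa_em_step(n_dw, p_z_d, p_w_z):
    """One EM iteration of pLSA. n_dw: counts (D, W); p_z_d = P(z|d); p_w_z = P(w|z)."""
    # E-step: posterior P(z|d,w) proportional to P(z|d) * P(w|z)
    joint = p_z_d[:, :, None] * p_w_z[None, :, :]          # shape (D, Z, W)
    post = joint / joint.sum(axis=1, keepdims=True)
    # M-step: re-estimate parameters from expected counts
    ez = n_dw[:, None, :] * post                            # expected counts (D, Z, W)
    p_w_z = ez.sum(axis=0)
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    p_z_d = ez.sum(axis=2)
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    return p_z_d, p_w_z

def log_likelihood(n_dw, p_z_d, p_w_z):
    p_w_given_d = p_z_d @ p_w_z                             # P(w|d), shape (D, W)
    return float((n_dw * np.log(p_w_given_d + 1e-12)).sum())

# Toy data: 6 "documents", 10 "words", 2 latent topics.
rng = np.random.default_rng(1)
D, V, Z = 6, 10, 2
n_dw = rng.integers(0, 5, size=(D, V)).astype(float)
p_z_d = rng.dirichlet(np.ones(Z), size=D)
p_w_z = rng.dirichlet(np.ones(V), size=Z)

ll_before = log_likelihood(n_dw, p_z_d, p_w_z)
for _ in range(10):
    p_z_d, p_w_z = plsa_em_step(n_dw, p_z_d, p_w_z)
ll_after = log_likelihood(n_dw, p_z_d, p_w_z)
```

Extending this to tagged images adds a second conditional distribution (e.g. P(tag|z) alongside P(visual feature|z)) sharing the same latent topics, which is what lets the cross-modal dependencies be exploited.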

    Semantic Segmentation with Neural Networks in Environment Monitoring

    Get PDF
    The Finnish Environment Institute (SYKE) has at least two missions which require surveying large land areas: finding invasive alien species and monitoring the state of Finnish lakes. Various methods to accomplish these tasks exist, but they traditionally rely on manual labor by experts or citizen activism, and as such do not scale well. This thesis explores the use of computer vision to dramatically improve the scaling of these tasks. Specifically, the aim is to fly a drone over selected areas and use a convolutional neural network architecture (U-net) to create segmentations of the images. The method performs well on selected biomass estimation task classes thanks to large enough datasets and easy-to-distinguish core features of the classes. Furthermore, a qualitative study of datasets was performed, yielding an estimated lower bound on the number of examples needed for a useful dataset. ACM Computing Classification System (CCS): Computing methodologies → Machine learning → Machine learning approaches → Neural networks.
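Segmentation results like the U-net outputs described above are commonly scored with per-class intersection-over-union (IoU). The thesis's actual metric is not stated here, so this is a generic sketch over small integer label masks, with a hypothetical two-class (background/vegetation) example.

```python
import numpy as np

def per_class_iou(pred, target, n_classes):
    """Intersection-over-union for each class, given integer label masks."""
    ious = []
    for c in range(n_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        ious.append(inter / union if union > 0 else float("nan"))
    return ious

# Toy 4x4 masks: class 0 = background, class 1 = vegetation.
target = np.array([[0, 0, 1, 1],
                   [0, 0, 1, 1],
                   [0, 0, 0, 0],
                   [0, 0, 0, 0]])
pred = np.array([[0, 1, 1, 1],     # one false-positive vegetation pixel
                 [0, 0, 1, 1],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])
ious = per_class_iou(pred, target, 2)   # class 1: 4 hits / 5 in union -> 0.8
```

Averaging the per-class values gives mean IoU, which is less forgiving than pixel accuracy when the class of interest covers only a small fraction of the image, as vegetation patches often do in aerial surveys.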

    Abstracts on Radio Direction Finding (1899 - 1995)

    Get PDF
    The files on this record represent the various databases that originally composed the CD-ROM issue of the "Abstracts on Radio Direction Finding" database, which is now part of the Dudley Knox Library's Abstracts and Selected Full Text Documents on Radio Direction Finding (1899 - 1995) Collection. (See Calhoun record https://calhoun.nps.edu/handle/10945/57364 for further information on this collection and the bibliography.) Because technological obsolescence prevents current and future audiences from accessing the bibliography, DKL exported the various databases contained in the CD-ROM and converted them into the three files on this record. The contents of these files are: 1) RDFA_CompleteBibliography_xls.zip [RDFA_CompleteBibliography.xls: metadata for the complete bibliography, in Excel 97-2003 Workbook format; RDFA_Glossary.xls: glossary of terms, in Excel 97-2003 Workbook format; RDFA_Biographies.xls: biographies of leading figures, in Excel 97-2003 Workbook format]; 2) RDFA_CompleteBibliography_csv.zip [RDFA_CompleteBibliography.TXT: metadata for the complete bibliography, in CSV format; RDFA_Glossary.TXT: glossary of terms, in CSV format; RDFA_Biographies.TXT: biographies of leading figures, in CSV format]; 3) RDFA_CompleteBibliography.pdf: a human-readable display of the bibliographic data, as a means of double-checking any possible deviations due to conversion.

    Efficient Semantic Segmentation for Resource-Constrained Applications with Lightweight Neural Networks

    Get PDF
    This thesis focuses on developing lightweight semantic segmentation models tailored for resource-constrained applications, effectively balancing accuracy and computational efficiency. It introduces several novel concepts, including knowledge sharing, dense bottlenecks, and feature re-usability, which enhance the feature hierarchy by capturing fine-grained details, long-range dependencies, and diverse geometrical objects within the scene. To achieve precise object localization and improved semantic representations in real-time environments, the thesis introduces multi-stage feature aggregation, feature scaling, and hybrid-path attention methods.

    Object detection in dynamic environments for video surveillance

    Get PDF
    Automated video surveillance is a very active research field, driven by the need for security and control. Certain situations, however, hinder the correct operation of existing algorithms. This thesis focuses on motion detection and addresses several of the usual problems, proposing new approaches that, in the vast majority of cases, outperform other state-of-the-art proposals. In particular, we study:
    - the importance of the color space for motion detection;
    - the effects of noise in the input video;
    - a new background model, called MFBM, that accepts any number and type of input features;
    - a method to mitigate the difficulties caused by illumination changes;
    - a non-panoramic method for detecting motion with non-static cameras.
    Throughout the thesis, several public repositories widely used in the motion detection field were employed, and the results obtained were compared with those of other existing proposals. All the code used has been published openly on the Web. The thesis reaches the following conclusions:
    - The color space in which the input video is encoded has a notable impact on the performance of detection methods; the RGB model is not always the best option. Weighting the color channels of the input video has also been shown to improve the performance of the methods.
    - Noise in the input video is a factor to take into account in motion detection, since it conditions the performance of the methods. Strikingly, although noise is usually harmful, it can occasionally improve detection.
    - The MFBM model outperforms the other competing methods studied, all of them belonging to the state of the art.
    - The problems caused by illumination changes are significantly reduced when using the proposed method.
    - The proposed method for detecting motion with non-static cameras outperforms other existing proposals in the vast majority of cases.
    A total of 280 bibliographic entries were consulted; among them, the following stand out:
    - C. Wren, A. Azarbayejani, T. Darrell, and A. Pentland, "Pfinder: real-time tracking of the human body," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 19, pp. 780-785, 1997.
    - C. Stauffer and W. Grimson, "Adaptive background mixture models for real-time tracking," in Proc. IEEE Intl. Conf. on Computer Vision and Pattern Recognition, 1999.
    - L. Li, W. Huang, I.-H. Gu, and Q. Tian, "Statistical modeling of complex backgrounds for foreground object detection," IEEE Transactions on Image Processing, vol. 13, pp. 1459-1472, 2004.
    - T. Bouwmans, "Traditional and recent approaches in background modeling for foreground detection: An overview," Computer Science Review, vol. 11-12, pp. 31-66, 2014.
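The background-modeling idea underlying the cited work (e.g. Wren et al.'s Pfinder) can be sketched in its simplest form: maintain a per-pixel running-average background and flag pixels that deviate strongly from it. This is a deliberately minimal single-Gaussian-style baseline, not the thesis's MFBM model; the frame data and thresholds are invented for illustration.

```python
import numpy as np

def update_background(background, frame, alpha=0.05):
    """Running-average background model: slowly absorb each new frame."""
    return (1.0 - alpha) * background + alpha * frame

def foreground_mask(background, frame, threshold=30.0):
    """Pixels deviating strongly from the background model are foreground."""
    return np.abs(frame - background) > threshold

# Synthetic grayscale scene: flat background with a bright 3x3 moving object.
background = np.full((10, 10), 50.0)   # learned background (flat gray)
frame = background.copy()
frame[2:5, 4:7] = 200.0                # object covers 9 pixels this frame

mask = foreground_mask(background, frame)
background = update_background(background, frame)
```

More robust models, such as mixtures of Gaussians (Stauffer and Grimson) or the MFBM model proposed in the thesis, replace the single running average with richer per-pixel statistics so that multi-modal backgrounds and gradual illumination changes are handled.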

    Roadside vegetation segmentation with Adaptive Texton Clustering Model

    No full text
    Automatic roadside vegetation segmentation is important for various real-world applications and one main challenge is to design algorithms that are capable of representing discriminative characteristics of vegetation while maintaining robustness against environmental effects. This paper presents an Adaptive Texton Clustering Model (ATCM) that combines pixel-level supervised prediction and cluster-level unsupervised texton occurrence frequencies into superpixel-level majority voting for adaptive roadside vegetation segmentation. The ATCM learns generic characteristics of vegetation from training data using class-specific neural networks with color and texture features, and adaptively incorporates local properties of vegetation in every test image using texton based adaptive K-means clustering. The adaptive clustering groups test pixels into local clusters, accumulates texton frequencies in every cluster and calculates cluster-level class probabilities. The pixel- and cluster-level probabilities are integrated via superpixel-level voting to determine the category of every superpixel. We evaluate the ATCM on three real-world datasets, including the Queensland Department of Transport and Main Roads, the Croatia, and the Stanford background datasets, showing very competitive performance to state-of-the-art approaches. © 2018 Elsevier Ltd.
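The final fusion step described in the abstract, integrating pixel-level and cluster-level class probabilities through superpixel-level voting, can be sketched as follows. The equal weighting of the two probability maps is an assumption for illustration (the paper's exact integration scheme is not given here), and the tiny image, class probabilities, and superpixel labels are invented.

```python
import numpy as np

def superpixel_vote(pixel_probs, cluster_probs, superpixels):
    """Fuse two per-pixel class-probability maps, then vote per superpixel.

    pixel_probs, cluster_probs: (H, W, C) probability arrays.
    superpixels: (H, W) integer superpixel labels.
    Returns a dict mapping superpixel id -> winning class index.
    """
    fused = 0.5 * pixel_probs + 0.5 * cluster_probs   # assumed equal weighting
    labels = {}
    for sp in np.unique(superpixels):
        mask = superpixels == sp
        # Sum fused class scores over the superpixel's pixels, take the argmax.
        labels[int(sp)] = int(np.argmax(fused[mask].sum(axis=0)))
    return labels

# Toy 2x4 image with 2 classes and two superpixels (left half 0, right half 1).
pixel_probs = np.zeros((2, 4, 2))
pixel_probs[:, :2, 0] = 0.9; pixel_probs[:, :2, 1] = 0.1   # left leans class 0
pixel_probs[:, 2:, 0] = 0.4; pixel_probs[:, 2:, 1] = 0.6   # right leans class 1
cluster_probs = np.full((2, 4, 2), 0.5)                    # uninformative clusters
superpixels = np.array([[0, 0, 1, 1],
                        [0, 0, 1, 1]])
labels = superpixel_vote(pixel_probs, cluster_probs, superpixels)
```

Voting at the superpixel level smooths out isolated per-pixel errors, since the whole superpixel is assigned the class with the strongest aggregate evidence.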

    Roadside vegetation segmentation with Adaptive Texton Clustering Model

    No full text
    Verma, B (ORCID: 0000-0002-4618-0479); Zhang, L (ORCID: 0000-0001-6925-9086). Duplicate record of the preceding output.