3,390 research outputs found

    Video content analysis for intelligent forensics

    The networks of surveillance cameras installed in public places and private territories continuously record video data with the aim of detecting and preventing unlawful activities. This enhances the importance of video content analysis applications, either for real-time (i.e. analytic) or post-event (i.e. forensic) analysis. In this thesis, the primary focus is on four key aspects of video content analysis, namely: (1) moving object detection and recognition; (2) correction of colours in the video frames and recognition of colours of moving objects; (3) make and model recognition of vehicles and identification of their type; and (4) detection and recognition of text information in outdoor scenes. To address the first issue, a framework is presented in the first part of the thesis that efficiently detects and recognizes moving objects in videos. The framework targets the problem of object detection in the presence of complex background. The object detection part of the framework relies on a background modelling technique and a novel post-processing step where the contours of the foreground regions (i.e. moving objects) are refined by the classification of edge segments as belonging either to the background or to the foreground region. Further, a novel feature descriptor is devised for the classification of moving objects into humans, vehicles and background. The proposed feature descriptor captures the texture information present in the silhouette of foreground objects. To address the second issue, a framework for the correction and recognition of true colours of objects in videos is presented with novel noise reduction, colour enhancement and colour recognition stages. The colour recognition stage makes use of temporal information to reliably recognize the true colours of moving objects in multiple frames. 
The proposed framework is specifically designed to perform robustly on videos that have poor quality because of surrounding illumination, camera sensor imperfection and artefacts due to high compression. In the third part of the thesis, a framework for vehicle make and model recognition and type identification is presented. As a part of this work, a novel feature representation technique for distinctive representation of vehicle images has emerged. The feature representation technique uses dense feature description and a mid-level feature encoding scheme to capture the texture in the frontal view of the vehicles. The proposed method is insensitive to minor in-plane rotation and skew within the image. The proposed framework can be extended to any number of vehicle classes without re-training. Another important contribution of this work is the publication of a comprehensive, up-to-date dataset of vehicle images to support future research in this domain. The problem of text detection and recognition in images is addressed in the last part of the thesis. A novel technique is proposed that exploits the colour information in the image for the identification of text regions. Apart from detection, the colour information is also used to segment characters from the words. The recognition of identified characters is performed using shape features and supervised learning. Finally, a lexicon-based alignment procedure is adopted to finalize the recognition of strings present in word images. Extensive experiments have been conducted on benchmark datasets to analyse the performance of the proposed algorithms. The results show that the proposed moving object detection and recognition technique outperformed well-known baseline techniques. The proposed framework for the correction and recognition of object colours in video frames achieved all the aforementioned goals. 
The performance analysis of the vehicle make and model recognition framework on multiple datasets has shown the strength and reliability of the technique when used within various scenarios. Finally, the experimental results for the text detection and recognition framework on benchmark datasets have revealed the potential of the proposed scheme for accurate detection and recognition of text in the wild
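
The abstract above mentions that moving object detection relies on a background modelling technique, without naming which one. As a rough illustration only, the sketch below uses an exponential running average as the background model and a fixed intensity threshold for the foreground mask. Both choices, along with the names `update_background`, `foreground_mask`, `alpha` and `threshold`, are assumptions made here and are not taken from the thesis.

```python
def update_background(background, frame, alpha=0.1):
    """Blend the new frame into the background model (exponential running average)."""
    return [
        [(1 - alpha) * b + alpha * f for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


def foreground_mask(background, frame, threshold=30):
    """Mark pixels whose intensity differs from the background by more than threshold."""
    return [
        [abs(f - b) > threshold for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


# Toy 3x3 grayscale frames: a static scene, then a bright object at the centre.
static = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
moving = [[10, 10, 10], [10, 200, 10], [10, 10, 10]]

# Initialise the model from the static frame, detect, then update the model.
background = [[float(v) for v in row] for row in static]
mask = foreground_mask(background, moving)
background = update_background(background, moving)
```

In this toy case only the centre pixel is flagged as foreground. A post-processing step such as the contour refinement described in the abstract would operate on a mask of this kind.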

    Handcrafted and Transfer Learned Feature Techniques for Vehicle Make and Model Recognition on Nigerian Road

    Vehicle make and model recognition (VMMR) is a challenging task due to the wide range of vehicle categories and the similarities between different classes. Previous studies have recognized vehicle makes and models from different countries. Popular vehicles on Nigerian roads include brands such as Toyota, Honda, Peugeot, Benz, and Innoson Vehicle Manufacturing (IVM). VMMR is important in intelligent transport systems; hence, this paper presents a handcrafted and transfer learning model to detect stationary vehicles and classify them based on brand, make, and model. A new dataset was introduced consisting of selected images of popular brands of vehicles driven on Nigerian roads. A framework for vehicle make and model recognition was developed by extracting features using EfficientNet and HOG models and evaluated on the locally gathered dataset. For classification, a linear Support Vector Machine (SVM) was used. Experimental results showed accuracies of 94.5% with HOG, 97% with EfficientNet, and 98.1% when HOG and EfficientNet features were concatenated. The proposed concatenated model outperformed the individual HOG and EfficientNet features by providing higher accuracy and a confusion matrix with the highest number of classified images. The study shows the advantage of the proposed model in terms of its accuracy in identifying the vehicle make and model
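
The feature concatenation described in this abstract can be sketched in a few lines. The example below is a minimal illustration, not the paper's implementation: the two short vectors stand in for a HOG descriptor and an EfficientNet embedding, and the per-vector L2 normalisation before concatenation is an assumption added here so that neither feature type dominates by scale. The names `l2_normalize` and `concatenate_features` are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def concatenate_features(handcrafted, learned):
    """Normalise each feature vector separately, then join them end to end."""
    return l2_normalize(handcrafted) + l2_normalize(learned)


# Placeholder vectors: real HOG and CNN features have far more dimensions.
hog_like = [3.0, 4.0]
cnn_like = [0.0, 5.0, 12.0]
fused = concatenate_features(hog_like, cnn_like)
```

The fused vector would then be passed to a classifier such as the linear SVM mentioned in the abstract.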

    Vehicle make and model recognition for intelligent transportation monitoring and surveillance.

    Vehicle Make and Model Recognition (VMMR) has evolved into a significant subject of study due to its importance in numerous Intelligent Transportation Systems (ITS), such as autonomous navigation, traffic analysis, traffic surveillance and security systems. A highly accurate and real-time VMMR system significantly reduces the overhead cost of resources otherwise required. The VMMR problem is a multi-class classification task with a peculiar set of issues and challenges, such as multiplicity and inter- and intra-make ambiguity among various vehicle makes and models, which need to be solved in an efficient and reliable manner to achieve a highly robust VMMR system. In this dissertation, given the growing importance of make and model recognition of vehicles, we present a VMMR system that provides very high accuracy rates and is robust to several challenges. We demonstrate that the VMMR problem can be addressed by locating discriminative parts where the most significant appearance variations occur in each category, and learning expressive appearance descriptors. Given these insights, we consider two data-driven frameworks: a Multiple-Instance Learning-based (MIL) system using hand-crafted features and an extended application of deep neural networks using MIL. Our approach requires only image-level class labels, and the discriminative parts of each target class are selected in a fully unsupervised manner without any use of part annotations or segmentation masks, which may be costly to obtain. This advantage makes our system more intelligent, scalable, and applicable to other fine-grained recognition tasks. We constructed a dataset with 291,752 images representing 9,170 different vehicles to validate and evaluate our approach. Experimental results demonstrate that the localization of parts and distinguishing their discriminative powers for categorization improve the performance of fine-grained categorization. 
Extensive experiments conducted using our approaches yield superior results for images that were occluded, under low illumination, partial camera views, or even non-frontal views, available in our real-world VMMR dataset. The approaches presented herewith provide a highly accurate VMMR system for real-time applications in realistic environments. We also validate our system with a significant application of VMMR to ITS that involves automated vehicular surveillance. We show that our application can provide law enforcement agencies with efficient tools to search for a specific vehicle type, make, or model, and to track the path of a given vehicle using the position of multiple cameras

    Vehicle Keypoint Detection and Fine-Grained Classification using Deep Learning

    Vehicle keypoint detection and fine-grained classification systems have seen their capabilities evolve at an unprecedented rate, from poor performance to incredible results in a matter of a few years. The advent of convolutional neural networks and the availability of large amounts of data and progress in computational capabilities have allowed these and many other problems to be tackled and solved with very different approaches using increasingly complex models. This thesis focuses on the problems of keypoint detection and fine-grained classification of vehicles with a deep learning approach. After the analysis of the existing datasets to tackle both tasks, three new datasets have been built. The first one, oriented to the detection of keypoints in vehicles, is an improvement and extension of the famous PASCAL3D+ dataset, re-labelling part of it and adding new keypoints and images to provide more variability. The second is a vehicle make and model classification test set based on the PREVENTION dataset, a vehicle trajectory prediction dataset recorded in real-world driving scenarios. The third is a cross-dataset composed of the makes and models common to three major vehicle classification databases: CompCars, VMMR-db and Frontal-103. The keypoint detection system is based on a human pose detection method that, by using convolutional neural networks and deconvolutional layers, generates a heat map for each keypoint from an input image. The network has been modified to fit the problem of keypoint detection in vehicles, obtaining results that improve the state of the art without using complex architectures or methodologies. Additionally, the suitability of the PASCAL3D+ keypoints has been analysed, validating the proposal of new keypoints as a better alternative. The vehicle make and model classification system is based on networks pre-trained on ImageNet and fine-tuned for the vehicle classification problem. One of the problems detected in the state of the art is the saturation of the results in the existing datasets, which, moreover, are biased, limiting the generalisation capacity of the models trained with them. Multiple learning and data weighting techniques have been used to try to alleviate the impact of dataset bias. In order to assess the generalisation capabilities of the trained models in real situations, the PREVENTION test set has been used. Additionally, the cross-dataset has been used to evaluate the complexity of the existing datasets and the generalisation capabilities of the models trained with them. 
It is shown that competitive results can be achieved without the use of complex architectures and that a high quality dataset that adequately reflects the real world is needed in order to properly address the vehicle classification problem
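
The abstract states that the network produces one heat map per keypoint. A common way to turn such a heat map into a location is to take the position of its maximum response; the sketch below shows that step only, and is not the thesis's decoding procedure. The keypoint names, the `min_score` confidence cut-off and the function names are hypothetical.

```python
def decode_keypoint(heatmap):
    """Return (row, col, score) of the highest response in a 2D heat map."""
    best = (0, 0, heatmap[0][0])
    for r, row in enumerate(heatmap):
        for c, score in enumerate(row):
            if score > best[2]:
                best = (r, c, score)
    return best


def decode_all(heatmaps, min_score=0.5):
    """Decode one location per keypoint; low-confidence keypoints become None."""
    result = {}
    for name, heatmap in heatmaps.items():
        r, c, score = decode_keypoint(heatmap)
        result[name] = (r, c) if score >= min_score else None
    return result


# Toy 3x3 heat maps: one confident keypoint, one with no clear peak.
heatmaps = {
    "left_headlight": [[0.0, 0.1, 0.0], [0.2, 0.9, 0.1], [0.0, 0.1, 0.0]],
    "right_mirror": [[0.1, 0.0, 0.2], [0.0, 0.1, 0.0], [0.0, 0.0, 0.1]],
}
keypoints = decode_all(heatmaps)
```

Here the first keypoint decodes to the centre cell, while the second is treated as not visible because its strongest response falls below the cut-off.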

    An Exploration of Recent Intelligent Image Analysis Techniques for Visual Pavement Surface Condition Assessment.

    Road pavement condition assessment is essential for maintenance, asset management, and budgeting for pavement infrastructure. Countries allocate a substantial annual budget to maintain and improve local, regional, and national highways. Pavement condition is assessed by measuring several pavement characteristics such as roughness, surface skid resistance, pavement strength, deflection, and visual surface distresses. Visual inspection identifies and quantifies surface distresses, and the condition is assessed using standard rating scales. This paper critically analyzes the research trends in the academic literature, professional practices and current commercial solutions for surface condition ratings by civil authorities. We observe that various surface condition rating systems exist, and each uses its own defined subset of pavement characteristics to evaluate pavement conditions. It is noted that automated visual sensing systems using intelligent algorithms can help reduce the cost and time required for assessing the condition of pavement infrastructure, especially for local and regional road networks. However, environmental factors, pavement types, and image collection devices are significant in this domain and lead to challenging variations. Commercial solutions for automatic pavement assessment exist, with certain limitations. The topic is also a focus of academic research. More recently, academic research has pivoted toward deep learning, given that image data is now available in some form. However, research to automate pavement distress assessment often focuses on the regional pavement condition assessment standard that a country or state follows. We observe that the criteria a region adopts to make the evaluation depend on factors such as pavement construction type, type of road network in the area, flow and traffic, environmental conditions, and the region's economic situation. 
We summarized a list of publicly available datasets for distress detection and pavement condition assessment. We listed approaches focusing on crack segmentation and methods concentrating on distress detection and identification using object detection and classification. We segregated the recent academic literature in terms of the camera's view and the dataset used, the year and country in which the work was published, the F1 score, and the architecture type. It is observed that the literature tends to focus more on distress identification (presence/absence detection) but less on distress quantification, which is essential for developing approaches for automated pavement rating

    Use of Coherent Point Drift in computer vision applications

    This thesis presents the novel use of Coherent Point Drift (CPD) in improving the robustness of a number of computer vision applications. The CPD approach includes two methods for registering two images - rigid and non-rigid point set approaches, which differ in the transformation model used. The key characteristic of a rigid transformation is that the distance between points is preserved, which means it can be used in the presence of translation, rotation, and scaling. Non-rigid transformations - or affine transforms - provide the opportunity of registering under non-uniform scaling and skew. The idea is to move one point set coherently to align with the second point set. The CPD method finds both the non-rigid transformation and the correspondence distance between two point sets at the same time without requiring an a priori declaration of the transformation model used. The first part of this thesis is focused on speaker identification in video conferencing. A real-time, audio-coupled video-based approach is presented, which focuses more on the video analysis side than on the audio analysis, which is known to be prone to errors. CPD is effectively utilised for lip movement detection, and a temporal face detection approach is used to minimise false positives if the face detection algorithm fails to perform. The second part of the thesis is focused on multi-exposure and multi-focus image fusion with compensation for camera shake. Scale Invariant Feature Transforms (SIFT) are first used to detect keypoints in the images being fused. Subsequently this point set is reduced to remove outliers using RANSAC (RANdom SAmple Consensus), and finally the point sets are registered using CPD with non-rigid transformations. The registered images are then fused with a Contourlet-based image fusion algorithm that makes use of a novel alpha blending and filtering technique to minimise artefacts. 
The thesis evaluates the performance of the algorithm in comparison to a number of state-of-the-art approaches, including the key commercial products available in the market at present, showing significantly improved subjective quality in the fused images. The final part of the thesis presents a novel approach to Vehicle Make & Model Recognition in CCTV video footage. CPD is used to effectively remove skew of vehicles detected as CCTV cameras are not specifically configured for the VMMR task and may capture vehicles at different approaching angles. A LESH (Local Energy Shape Histogram) feature based approach is used for vehicle make and model recognition with the novelty that temporal processing is used to improve reliability. A number of further algorithms are used to maximise the reliability of the final outcome. Experimental results are provided to prove that the proposed system demonstrates an accuracy in excess of 95% when tested on real CCTV footage with no prior camera calibration
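
The abstract describes using RANSAC to discard outlier keypoints before CPD registration. The sketch below illustrates the RANSAC idea on matched 2D points. It is a simplification, not the thesis's method: the motion model is assumed to be a pure translation so that a single correspondence suffices as a hypothesis. The function name `ransac_translation`, the tolerance `tol`, the iteration count and the toy coordinates are all hypothetical.

```python
import random


def ransac_translation(src, dst, tol=2.0, iterations=50, seed=0):
    """Return indices of correspondences consistent with the best translation found."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        # Hypothesise a translation from one randomly chosen correspondence.
        i = rng.randrange(len(src))
        dx, dy = dst[i][0] - src[i][0], dst[i][1] - src[i][1]
        # Count the correspondences that agree with it within the tolerance.
        inliers = [
            k for k, (s, d) in enumerate(zip(src, dst))
            if abs(s[0] + dx - d[0]) <= tol and abs(s[1] + dy - d[1]) <= tol
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return best_inliers


# Five matches shifted by (5, 3) plus two spurious matches.
src = [(0, 0), (10, 0), (0, 10), (10, 10), (5, 5), (3, 7), (8, 2)]
dst = [(5, 3), (15, 3), (5, 13), (15, 13), (10, 8), (40, 40), (-20, 9)]
inliers = ransac_translation(src, dst)
```

Only the correspondences kept in `inliers` would be passed on to a registration step such as CPD. A richer motion model would need more than one correspondence per hypothesis.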