44 research outputs found

    Computer Vision Based Structural Identification Framework for Bridge Health Mornitoring

    Get PDF
    The objective of this dissertation is to develop a comprehensive Structural Identification (St-Id) framework with damage for bridge type structures by using cameras and computer vision technologies. The traditional St-Id frameworks rely on using conventional sensors. In this study, the collected input and output data employed in the St-Id system are acquired by series of vision-based measurements. The following novelties are proposed, developed and demonstrated in this project: a) vehicle load (input) modeling using computer vision, b) bridge response (output) using full non-contact approach using video/image processing, c) image-based structural identification using input-output measurements and new damage indicators. The input (loading) data due vehicles such as vehicle weights and vehicle locations on the bridges, are estimated by employing computer vision algorithms (detection, classification, and localization of objects) based on the video images of vehicles. Meanwhile, the output data as structural displacements are also obtained by defining and tracking image key-points of measurement locations. Subsequently, the input and output data sets are analyzed to construct novel types of damage indicators, named Unit Influence Surface (UIS). Finally, the new damage detection and localization framework is introduced that does not require a network of sensors, but much less number of sensors. The main research significance is the first time development of algorithms that transform the measured video images into a form that is highly damage-sensitive/change-sensitive for bridge assessment within the context of Structural Identification with input and output characterization. The study exploits the unique attributes of computer vision systems, where the signal is continuous in space. This requires new adaptations and transformations that can handle computer vision data/signals for structural engineering applications. This research will significantly advance current sensor-based structural health monitoring with computer-vision techniques, leading to practical applications for damage detection of complex structures with a novel approach. By using computer vision algorithms and cameras as special sensors for structural health monitoring, this study proposes an advance approach in bridge monitoring through which certain type of data that could not be collected by conventional sensors such as vehicle loads and location, can be obtained practically and accurately

    Text-detection and -recognition from natural images

    Get PDF
    Text detection and recognition from images could have numerous functional applications for document analysis, such as assistance for visually impaired people; recognition of vehicle license plates; evaluation of articles containing tables, street signs, maps, and diagrams; keyword-based image exploration; document retrieval; recognition of parts within industrial automation; content-based extraction; object recognition; address block location; and text-based video indexing. This research exploited the advantages of artificial intelligence (AI) to detect and recognise text from natural images. Machine learning and deep learning were used to accomplish this task.In this research, we conducted an in-depth literature review on the current detection and recognition methods used by researchers to identify the existing challenges, wherein the differences in text resulting from disparity in alignment, style, size, and orientation combined with low image contrast and a complex background make automatic text extraction a considerably challenging and problematic task. Therefore, the state-of-the-art suggested approaches obtain low detection rates (often less than 80%) and recognition rates (often less than 60%). This has led to the development of new approaches. The aim of the study was to develop a robust text detection and recognition method from natural images with high accuracy and recall, which would be used as the target of the experiments. This method could detect all the text in the scene images, despite certain specific features associated with the text pattern. Furthermore, we aimed to find a solution to the two main problems concerning arbitrarily shaped text (horizontal, multi-oriented, and curved text) detection and recognition in a low-resolution scene and with various scales and of different sizes.In this research, we propose a methodology to handle the problem of text detection by using novel combination and selection features to deal with the classification algorithms of the text/non-text regions. The text-region candidates were extracted from the grey-scale images by using the MSER technique. A machine learning-based method was then applied to refine and validate the initial detection. The effectiveness of the features based on the aspect ratio, GLCM, LBP, and HOG descriptors was investigated. The text-region classifiers of MLP, SVM, and RF were trained using selections of these features and their combinations. The publicly available datasets ICDAR 2003 and ICDAR 2011 were used to evaluate the proposed method. This method achieved the state-of-the-art performance by using machine learning methodologies on both databases, and the improvements were significant in terms of Precision, Recall, and F-measure. The F-measure for ICDAR 2003 and ICDAR 2011 was 81% and 84%, respectively. The results showed that the use of a suitable feature combination and selection approach could significantly increase the accuracy of the algorithms.A new dataset has been proposed to fill the gap of character-level annotation and the availability of text in different orientations and of curved text. The proposed dataset was created particularly for deep learning methods which require a massive completed and varying range of training data. The proposed dataset includes 2,100 images annotated at the character and word levels to obtain 38,500 samples of English characters and 12,500 words. Furthermore, an augmentation tool has been proposed to support the proposed dataset. The missing of object detection augmentation tool encroach to proposed tool which has the ability to update the position of bounding boxes after applying transformations on images. This technique helps to increase the number of samples in the dataset and reduce the time of annotations where no annotation is required. The final part of the thesis presents a novel approach for text spotting, which is a new framework for an end-to-end character detection and recognition system designed using an improved SSD convolutional neural network, wherein layers are added to the SSD networks and the aspect ratio of the characters is considered because it is different from that of the other objects. Compared with the other methods considered, the proposed method could detect and recognise characters by training the end-to-end model completely. The performance of the proposed method was better on the proposed dataset; it was 90.34. Furthermore, the F-measure of the method’s accuracy on ICDAR 2015, ICDAR 2013, and SVT was 84.5, 91.9, and 54.8, respectively. On ICDAR13, the method achieved the second-best accuracy. The proposed method could spot text in arbitrarily shaped (horizontal, oriented, and curved) scene text.</div

    Video content analysis for intelligent forensics

    Get PDF
    The networks of surveillance cameras installed in public places and private territories continuously record video data with the aim of detecting and preventing unlawful activities. This enhances the importance of video content analysis applications, either for real time (i.e. analytic) or post-event (i.e. forensic) analysis. In this thesis, the primary focus is on four key aspects of video content analysis, namely; 1. Moving object detection and recognition, 2. Correction of colours in the video frames and recognition of colours of moving objects, 3. Make and model recognition of vehicles and identification of their type, 4. Detection and recognition of text information in outdoor scenes. To address the first issue, a framework is presented in the first part of the thesis that efficiently detects and recognizes moving objects in videos. The framework targets the problem of object detection in the presence of complex background. The object detection part of the framework relies on background modelling technique and a novel post processing step where the contours of the foreground regions (i.e. moving object) are refined by the classification of edge segments as belonging either to the background or to the foreground region. Further, a novel feature descriptor is devised for the classification of moving objects into humans, vehicles and background. The proposed feature descriptor captures the texture information present in the silhouette of foreground objects. To address the second issue, a framework for the correction and recognition of true colours of objects in videos is presented with novel noise reduction, colour enhancement and colour recognition stages. The colour recognition stage makes use of temporal information to reliably recognize the true colours of moving objects in multiple frames. The proposed framework is specifically designed to perform robustly on videos that have poor quality because of surrounding illumination, camera sensor imperfection and artefacts due to high compression. In the third part of the thesis, a framework for vehicle make and model recognition and type identification is presented. As a part of this work, a novel feature representation technique for distinctive representation of vehicle images has emerged. The feature representation technique uses dense feature description and mid-level feature encoding scheme to capture the texture in the frontal view of the vehicles. The proposed method is insensitive to minor in-plane rotation and skew within the image. The capability of the proposed framework can be enhanced to any number of vehicle classes without re-training. Another important contribution of this work is the publication of a comprehensive up to date dataset of vehicle images to support future research in this domain. The problem of text detection and recognition in images is addressed in the last part of the thesis. A novel technique is proposed that exploits the colour information in the image for the identification of text regions. Apart from detection, the colour information is also used to segment characters from the words. The recognition of identified characters is performed using shape features and supervised learning. Finally, a lexicon based alignment procedure is adopted to finalize the recognition of strings present in word images. Extensive experiments have been conducted on benchmark datasets to analyse the performance of proposed algorithms. The results show that the proposed moving object detection and recognition technique superseded well-know baseline techniques. The proposed framework for the correction and recognition of object colours in video frames achieved all the aforementioned goals. The performance analysis of the vehicle make and model recognition framework on multiple datasets has shown the strength and reliability of the technique when used within various scenarios. Finally, the experimental results for the text detection and recognition framework on benchmark datasets have revealed the potential of the proposed scheme for accurate detection and recognition of text in the wild

    Object detection, recognition and re-identification in video footage

    Get PDF
    There has been a significant number of security concerns in recent times; as a result, security cameras have been installed to monitor activities and to prevent crimes in most public places. These analysis are done either through video analytic or forensic analysis operations on human observations. To this end, within the research context of this thesis, a proactive machine vision based military recognition system has been developed to help monitor activities in the military environment. The proposed object detection, recognition and re-identification systems have been presented in this thesis. A novel technique for military personnel recognition is presented in this thesis. Initially the detected camouflaged personnel are segmented using a grabcut segmentation algorithm. Since in general a camouflaged personnel's uniform appears to be similar both at the top and the bottom of the body, an image patch is initially extracted from the segmented foreground image and used as the region of interest. Subsequently the colour and texture features are extracted from each patch and used for classification. A second approach for personnel recognition is proposed through the recognition of the badge on the cap of a military person. A feature matching metric based on the extracted Speed Up Robust Features (SURF) from the badge on a personnel's cap enabled the recognition of the personnel's arm of service. A state-of-the-art technique for recognising vehicle types irrespective of their view angle is also presented in this thesis. Vehicles are initially detected and segmented using a Gaussian Mixture Model (GMM) based foreground/background segmentation algorithm. A Canny Edge Detection (CED) stage, followed by morphological operations are used as pre-processing stage to help enhance foreground vehicular object detection and segmentation. Subsequently, Region, Histogram Oriented Gradient (HOG) and Local Binary Pattern (LBP) features are extracted from the refined foreground vehicle object and used as features for vehicle type recognition. Two different datasets with variant views of front/rear and angle are used and combined for testing the proposed technique. For night-time video analytics and forensics, the thesis presents a novel approach to pedestrian detection and vehicle type recognition. A novel feature acquisition technique named, CENTROG, is proposed for pedestrian detection and vehicle type recognition in this thesis. Thermal images containing pedestrians and vehicular objects are used to analyse the performance of the proposed algorithms. The video is initially segmented using a GMM based foreground object segmentation algorithm. A CED based pre-processing step is used to enhance segmentation accuracy prior using Census Transforms for initial feature extraction. HOG features are then extracted from the Census transformed images and used for detection and recognition respectively of human and vehicular objects in thermal images. Finally, a novel technique for people re-identification is proposed in this thesis based on using low-level colour features and mid-level attributes. The low-level colour histogram bin values were normalised to 0 and 1. A publicly available dataset (VIPeR) and a self constructed dataset have been used in the experiments conducted with 7 clothing attributes and low-level colour histogram features. These 7 attributes are detected using features extracted from 5 different regions of a detected human object using an SVM classifier. The low-level colour features were extracted from the regions of a detected human object. These 5 regions are obtained by human object segmentation and subsequent body part sub-division. People are re-identified by computing the Euclidean distance between a probe and the gallery image sets. The experiments conducted using SVM classifier and Euclidean distance has proven that the proposed techniques attained all of the aforementioned goals. The colour and texture features proposed for camouflage military personnel recognition surpasses the state-of-the-art methods. Similarly, experiments prove that combining features performed best when recognising vehicles in different views subsequent to initial training based on multi-views. In the same vein, the proposed CENTROG technique performed better than the state-of-the-art CENTRIST technique for both pedestrian detection and vehicle type recognition at night-time using thermal images. Finally, we show that the proposed 7 mid-level attributes and the low-level features results in improved performance accuracy for people re-identification

    Optical-based ATR algorithms for applications in swarmed UAVs

    Get PDF
    Swarmed Unmanned Aerial Vehicles (UAVs) with Automatic Target Recognition (ATR) technology are becoming an important element of electronic warfare. The advantages of UAVs over piloted vehicles have increased the need to develop robust and reliable ATR algorithms. Various issues like changing weather, camouflage, low contrast and resolution, clutter, inadequate databases place a limit on the performance capabilities of a typical ATR algorithm. In an effort to deal with these issues, a correlation-based algorithm is proposed in this thesis. This algorithm calculates the correlation between the input image and a target template which is created by projecting a 3-D model from the perspective of the UAV. The locations of correlation peaks are then declared to be the locations of the targets. We apply this algorithm to images with one object of a known class and move on to the more general case of images with an unknown number of targets from one or more classes. We provide an analysis of the performance of this correlation-based algorithm.;We compare the performance of the proposed correlation-based approach with that of a training-based approach. To provide a concrete example of an off-the-shelf training-based ATR algorithm, the open source IntelCV library was used. In the training-based method, a sample set is created and trained (Haar-like features are used for training) to produce results for comparison purpose. We further develop and analyze a method of correlating across multiple frames that have been preprocessed using the correlation-based approach. This method is shown to be useful in detecting true targets and suppressing false alarms in cases where a single image is not sufficient for classification

    Vehicle make and model recognition for intelligent transportation monitoring and surveillance.

    Get PDF
    Vehicle Make and Model Recognition (VMMR) has evolved into a significant subject of study due to its importance in numerous Intelligent Transportation Systems (ITS), such as autonomous navigation, traffic analysis, traffic surveillance and security systems. A highly accurate and real-time VMMR system significantly reduces the overhead cost of resources otherwise required. The VMMR problem is a multi-class classification task with a peculiar set of issues and challenges like multiplicity, inter- and intra-make ambiguity among various vehicles makes and models, which need to be solved in an efficient and reliable manner to achieve a highly robust VMMR system. In this dissertation, facing the growing importance of make and model recognition of vehicles, we present a VMMR system that provides very high accuracy rates and is robust to several challenges. We demonstrate that the VMMR problem can be addressed by locating discriminative parts where the most significant appearance variations occur in each category, and learning expressive appearance descriptors. Given these insights, we consider two data driven frameworks: a Multiple-Instance Learning-based (MIL) system using hand-crafted features and an extended application of deep neural networks using MIL. Our approach requires only image level class labels, and the discriminative parts of each target class are selected in a fully unsupervised manner without any use of part annotations or segmentation masks, which may be costly to obtain. This advantage makes our system more intelligent, scalable, and applicable to other fine-grained recognition tasks. We constructed a dataset with 291,752 images representing 9,170 different vehicles to validate and evaluate our approach. Experimental results demonstrate that the localization of parts and distinguishing their discriminative powers for categorization improve the performance of fine-grained categorization. Extensive experiments conducted using our approaches yield superior results for images that were occluded, under low illumination, partial camera views, or even non-frontal views, available in our real-world VMMR dataset. The approaches presented herewith provide a highly accurate VMMR system for rea-ltime applications in realistic environments.\\ We also validate our system with a significant application of VMMR to ITS that involves automated vehicular surveillance. We show that our application can provide law inforcement agencies with efficient tools to search for a specific vehicle type, make, or model, and to track the path of a given vehicle using the position of multiple cameras

    Tracking moving objects in surveillance video

    Get PDF
    The thesis looks at approaches to the detection and tracking of potential objects of interest in surveillance video. The aim was to investigate and develop methods that might be suitable for eventual application through embedded software, running on a fixed-point processor, in analytics capable cameras. The work considers common approaches to object detection and representation, seeking out those that offer the necessary computational economy and the potential to be able to cope with constraints such as low frame rate due to possible limited processor time, or weak chromatic content that can occur in some typical surveillance contexts. The aim is for probabilistic tracking of objects rather than simple concatenation of frame by frame detections. This involves using recursive Bayesian estimation. The particle filter is a technique for implementing such a recursion and so it is examined in the context of both single target and combined multi-target tracking. A detailed examination of the operation of the single target tracking particle filter shows that objects can be tracked successfully using a relatively simple structured grey-scale histogram representation. It is shown that basic components of the particle filter can be simplified without loss in tracking quality. An analysis brings out the relationships between commonly used target representation distance measures and shows that in the context of the particle filter there is little to choose between them. With the correct choice of parameters, the simplest and computationally economic distance measure performs well. The work shows how to make that correct choice. Similarly, it is shown that a simple measurement likelihood function can be used in place of the more ubiquitous Gaussian. The important step of target state estimation is examined. The standard weighted mean approach is rejected, a recently proposed maximum a posteriori approach is shown to be not suitable in the context of the work, and a practical alternative is developed. Two methods are presented for tracker initialization. One of them is a simplification of an existing published method, the other is a novel approach. The aim is to detect trackable objects as they enter the scene, extract trackable features, then actively follow those features through subsequent frames. The multi-target tracking problem is then posed as one of management of multiple independent trackers

    Detecting head movement using gyroscope data collected via in-ear wearables

    Get PDF
    Abstract. Head movement is considered as an effective, natural, and simple method to determine the pointing towards an object. Head movement detection technology has significant potentiality in diverse field of applications and studies in this field verify such claim. The application includes fields like users interaction with computers, controlling many devices externally, power wheelchair operation, detecting drivers’ drowsiness while they drive, video surveillance system, and many more. Due to the diversity in application, the method of detecting head movement is also wide-ranging. A number of approaches such as acoustic-based, video-based, computer-vision based, inertial sensor data based head movement detection methods have been introduced by researchers over the years. In order to generate inertial sensor data, various types of wearables are available for example wrist band, smart watch, head-mounted device, and so on. For this thesis, eSense — a representative earable device — that has built-in inertial sensor to generate gyroscope data is employed. This eSense device is a True Wireless Stereo (TWS) earbud. It is augmented with some key equipment such as a 6-axis inertial motion unit, a microphone, and dual mode Bluetooth (Bluetooth Classic and Bluetooth Low Energy). Features are extracted from gyroscope data collected via eSense device. Subsequently, four machine learning models — Random Forest (RF), Support Vector Machine (SVM), Naïve Bayes, and Perceptron — are applied aiming to detect head movement. The performance of these models is evaluated by four different evaluation metrics such as Accuracy, Precision, Recall, and F1 score. Result shows that machine learning models that have been applied in this thesis are able to detect head movement. Comparing the performance of all these machine learning models, Random Forest performs better than others, it is able to detect head movement with approximately 77% accuracy. The accuracy rate of other three models such as Support Vector Machine, Naïve Bayes, and Perceptron is close to each other, where these models detect head movement with about 42%, 40%, and 39% accuracy, respectively. Besides, the result of other evaluation metrics like Precision, Recall, and F1 score verifies that using these machine learning models, different head direction such as left, right, or straight can be detected

    Object Tracking

    Get PDF
    Object tracking consists in estimation of trajectory of moving objects in the sequence of images. Automation of the computer object tracking is a difficult task. Dynamics of multiple parameters changes representing features and motion of the objects, and temporary partial or full occlusion of the tracked objects have to be considered. This monograph presents the development of object tracking algorithms, methods and systems. Both, state of the art of object tracking methods and also the new trends in research are described in this book. Fourteen chapters are split into two sections. Section 1 presents new theoretical ideas whereas Section 2 presents real-life applications. Despite the variety of topics contained in this monograph it constitutes a consisted knowledge in the field of computer object tracking. The intention of editor was to follow up the very quick progress in the developing of methods as well as extension of the application

    Deep Learning Methods for Remote Sensing

    Get PDF
    Remote sensing is a field where important physical characteristics of an area are exacted using emitted radiation generally captured by satellite cameras, sensors onboard aerial vehicles, etc. Captured data help researchers develop solutions to sense and detect various characteristics such as forest fires, flooding, changes in urban areas, crop diseases, soil moisture, etc. The recent impressive progress in artificial intelligence (AI) and deep learning has sparked innovations in technologies, algorithms, and approaches and led to results that were unachievable until recently in multiple areas, among them remote sensing. This book consists of sixteen peer-reviewed papers covering new advances in the use of AI for remote sensing
    corecore