
    An adaptive and integrated multimodal sensing and processing framework for long-range moving object detection and classification

    In applications such as surveillance, inspection and traffic monitoring, long-range detection and classification of targets (vehicles, humans, etc.) is a highly desired capability for a sensing system. A single modality can no longer provide the required performance, because of the low resolutions, noisy sensor signals, and various environmental factors introduced by large sensing distances. Multimodal sensing and processing, on the other hand, can provide complementary information from heterogeneous sensor modalities, such as audio, visual and range sensors. However, effective sensing mechanisms and systematic approaches for multimodal sensing and processing are lacking. In this thesis, a systematic framework is proposed for Adaptive and Integrated Multimodal Sensing and Processing (AIM-SP) that integrates novel multimodal long-range sensors, adaptive feature selection and learning-based object detection and classification. Based on the AIM-SP framework, we make three unique contributions. First, we have designed a novel multimodal sensor system, Vision-Aided Automated Vibrometry (VAAV), consisting of a laser Doppler vibrometer (LDV) and a pair of pan-tilt-zoom (PTZ) cameras, which automatically obtains visual, range and acoustic signatures for moving object detection at large distances. It provides closed-loop adaptive sensing that determines good surface points and quickly focuses the LDV's laser beam, based on target detection, surface selection and distance measurements by the PTZ pair, together with acoustic signal feedback from the LDV. 
Second, multimodal data of vehicles on both local roads and highways, acquired from multiple sensing sources, are integrated and represented in a Multimodal Temporal Panorama (MTP) for easy alignment and fast labelling of the visual, audio and range data. Accuracy of target detection can be improved using multiple modalities, and a visual reconstruction method is developed to remove occlusions, motion blur and perspective distortions of moving vehicles, so that scale- and perspective-invariant visual vehicle features are obtained. The concept of the MTP is not limited to visual and audio information; it is applicable whenever other modalities can be presented on the same time axis. With various types of features extracted from aligned multimodal samples, we make our third contribution: feature modality selection using two approaches, multi-branch sequential-based feature searching (MBSF) and boosting-based feature learning (BBFL). In our implementations, three types of visual features are used: aspect ratio and size (ARS), histograms of oriented gradients (HOGs) and shape profile (SP), representing simple global scale features, statistical features and global structure features, respectively. The audio features include short-time energy (STE), spectral features (SPECs), consisting of spectral energy, entropy, flux and centroid, and perceptual features (PERCs), namely Mel-frequency cepstral coefficients (MFCCs). The effectiveness of multimodal feature selection is thoroughly examined through empirical studies. The performance of MBSF and BBFL is compared on our own dataset, which contains over 3000 samples of mainly four types of moving vehicles, sedans, pickup trucks, vans and buses, under various conditions. From this dataset, a subset of 667 samples of multimodal vehicle data is made publicly available at http://www.cse.ohio-state.edu/otcbvs-bench/. 
A number of important observations on the strengths and weaknesses of these features and their combinations are made as well.
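The audio features named above (short-time energy and the spectral energy/entropy/flux/centroid set) can be sketched in a few lines of NumPy. This is an illustrative implementation only, not the thesis's code; the frame length and hop size are arbitrary assumptions:

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames."""
    n = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n)])

def audio_features(x, sr, frame_len=400, hop=200):
    """Per-frame short-time energy and spectral energy/entropy/flux/centroid."""
    frames = frame_signal(x, frame_len, hop)
    ste = np.mean(frames ** 2, axis=1)                    # short-time energy
    spec = np.abs(np.fft.rfft(frames * np.hanning(frame_len), axis=1))
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sr)
    energy = np.sum(spec ** 2, axis=1)                    # spectral energy
    p = spec ** 2 / (energy[:, None] + 1e-12)             # per-frame spectral pmf
    entropy = -np.sum(p * np.log2(p + 1e-12), axis=1)     # spectral entropy
    flux = np.sum(np.diff(spec, axis=0) ** 2, axis=1)     # frame-to-frame change
    centroid = np.sum(freqs * spec, axis=1) / (np.sum(spec, axis=1) + 1e-12)
    return ste, energy, entropy, flux, centroid

# Toy example: one second of a 440 Hz tone sampled at 8 kHz
sr = 8000
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 440 * t)
ste, energy, entropy, flux, centroid = audio_features(x, sr)
```

For a pure tone the spectral centroid sits near the tone frequency, while for broadband engine noise it would drift with the energy distribution, which is what makes these features discriminative.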

    Vehicle Detection and Tracking Techniques: A Concise Review

    Vehicle detection and tracking play an important role in civilian and military applications such as highway traffic surveillance, control and management, and urban traffic planning. Vehicle detection on roads is used for vehicle tracking, counting, estimating the average speed of each vehicle, traffic analysis and vehicle categorization, and may be deployed under varying environmental conditions. In this review, we present a concise overview of the image processing methods and analysis tools used to build the aforementioned traffic surveillance systems. More precisely, and in contrast with other reviews, we classify the processing methods into three categories to explain the traffic systems more clearly.
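As a minimal illustration of the kind of image-processing building block such systems rely on (not a method taken from this review), frame differencing can flag a vehicle entering an otherwise static camera view. The threshold values below are arbitrary assumptions:

```python
import numpy as np

def detect_motion(prev, curr, thresh=25, min_pixels=50):
    """Flag motion between two grayscale frames by absolute differencing.
    Returns a boolean foreground mask and whether enough pixels changed
    to count as a moving object (e.g., a vehicle entering the scene)."""
    diff = np.abs(curr.astype(np.int16) - prev.astype(np.int16))
    mask = diff > thresh
    return mask, bool(mask.sum() >= min_pixels)

# Toy frames: a bright 10x10 "vehicle" appears in the second frame
prev = np.zeros((64, 64), dtype=np.uint8)
curr = prev.copy()
curr[20:30, 20:30] = 200
mask, moving = detect_motion(prev, curr)
```

Real systems replace this with adaptive background models and add blob tracking for counting and speed estimation, but the differencing step above is the common starting point.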

    Vehicle Engine Classification Using Laser Vibrometry Feature Extraction

    As a non-invasive remote sensor, the laser Doppler vibrometer (LDV) has been used in many different applications, such as inspection of aircraft, bridges and other structures, and remote voice acquisition. However, using the LDV as a vehicle surveillance device has not been feasible, due to the lack of systematic investigation of its behavioural properties. In this thesis, LDV data from different vehicles are examined and features are extracted. A tone-pitch indexing (TPI) scheme is developed to classify different vehicles by exploiting the engine's periodic vibrations, which are transferred throughout the vehicle's body. Using the TPI with a two-layer feed-forward neural network with 20 hidden nodes to classify vehicle engines, the results are encouraging, consistently achieving accuracies over 96%. However, the TPI requires 1.25 seconds of vibration signal, which is a drawback: vehicles are generally moving, so signals of that length are often unavailable. Building on the success of the TPI, a normalized tone-pitch indexing (nTPI) scheme is further developed that uses the same periodic engine vibrations but shortens the required time window from 1.25 seconds to a more practical 0.2 seconds. Keywords: LDV, Machine Learning, Neural network, Deep learning, Vehicle classification
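The core of a tone-pitch index can be illustrated by a spectral-peak estimate of the engine's fundamental vibration frequency. This toy NumPy sketch uses a synthetic signal and is not the thesis's actual TPI scheme; the 30 Hz fundamental and harmonic amplitudes are invented for the example:

```python
import numpy as np

def dominant_pitch(x, sr):
    """Estimate the dominant vibration tone (Hz) from an LDV-style signal
    as the peak of the magnitude spectrum (DC bin excluded)."""
    spec = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    spec[0] = 0.0
    return float(np.fft.rfftfreq(len(x), 1.0 / sr)[np.argmax(spec)])

# Toy engine signature: 30 Hz fundamental plus weaker harmonics,
# over the 1.25 s window the TPI scheme requires
sr, dur = 4000, 1.25
t = np.arange(int(sr * dur)) / sr
x = (np.sin(2 * np.pi * 30 * t)
     + 0.4 * np.sin(2 * np.pi * 60 * t)
     + 0.2 * np.sin(2 * np.pi * 90 * t))
f0 = dominant_pitch(x, sr)
index = round(f0)   # a crude tone-pitch index a classifier could consume
```

Shortening the window (as the nTPI does, from 1.25 s to 0.2 s) coarsens the frequency resolution of this FFT, which is why normalization of the pitch index becomes necessary.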

    Generative Models for Novelty Detection: Applications in abnormal event and situational change detection from data series

    Novelty detection is the process of distinguishing observations that differ in some respect from those the model was trained on. It is one of the fundamental requirements of a good classification or identification system, since the test data sometimes contain observations that were unknown at training time. In other words, the novelty class is often not present during the training phase, or not well defined. In light of the above, one-class classifiers and generative methods can model such problems efficiently. However, due to the unavailability of data from the novelty class, training an end-to-end model is itself a challenging task. Therefore, detecting novel classes in unsupervised and semi-supervised settings is a crucial step in such tasks. In this thesis, we propose several methods to model the novelty detection problem in unsupervised and semi-supervised fashion. The proposed frameworks are applied to related anomaly and outlier detection tasks. The results show the superiority of our proposed methods compared to the baselines and state-of-the-art methods.
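A minimal instance of this idea, a one-class model trained only on "normal" data that scores novelty by reconstruction error, can be sketched with PCA in NumPy. This is a linear stand-in for the generative models the thesis actually proposes, with synthetic data:

```python
import numpy as np

rng = np.random.default_rng(0)

# "Normal" training data lives near a 2-D subspace of R^5
basis = rng.normal(size=(2, 5))
train = rng.normal(size=(500, 2)) @ basis + 0.01 * rng.normal(size=(500, 5))

# Fit a linear model of normality: PCA via SVD on centered data
mean = train.mean(axis=0)
_, _, vt = np.linalg.svd(train - mean, full_matrices=False)
components = vt[:2]                      # top-2 principal directions

def novelty_score(x):
    """Reconstruction error: distance from x to the learned normal subspace.
    Points the model cannot reconstruct well are flagged as novel."""
    z = (x - mean) @ components.T        # project onto the normal subspace
    recon = z @ components + mean        # reconstruct from the projection
    return float(np.linalg.norm(x - recon))

normal_pt = rng.normal(size=2) @ basis               # lies in the subspace
novel_pt = normal_pt + 5.0 * rng.normal(size=5)      # off-subspace outlier
```

Autoencoders and generative models generalize this exactly: replace the linear projection with a learned encoder/decoder and threshold the reconstruction (or likelihood) score.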

    Seamless Multimodal Biometrics for Continuous Personalised Wellbeing Monitoring

    Artificially intelligent perception is increasingly present in the lives of every one of us. Vehicles are no exception, (...) In the near future, pattern recognition will have an even stronger role in vehicles, as self-driving cars will require automated ways to understand what is happening around (and within) them and act accordingly. (...) This doctoral work focused on advancing in-vehicle sensing through the research of novel computer vision and pattern recognition methodologies for both biometrics and wellbeing monitoring. The main focus has been on electrocardiogram (ECG) biometrics, a trait well known for its potential for seamless driver monitoring. Major efforts were devoted to achieving improved performance in identification and identity verification in off-the-person scenarios, which are known for increased noise and variability. Here, end-to-end deep learning ECG biometric solutions were proposed, and important topics were addressed such as cross-database and long-term performance, waveform relevance through explainability, and interlead conversion. Face biometrics, a natural complement to the ECG in seamless unconstrained scenarios, was also studied in this work. The open challenges of masked face recognition and interpretability in biometrics were tackled in an effort to evolve towards algorithms that are more transparent, trustworthy, and robust to significant occlusions. Within the topic of wellbeing monitoring, improved solutions were proposed for multimodal emotion recognition in groups of people and for activity/violence recognition in in-vehicle scenarios. Lastly, we also proposed a novel way to learn template security within end-to-end models, dispensing with separate encryption processes, and a self-supervised learning approach tailored to sequential data, in order to ensure data security and optimal performance. (...) Comment: Doctoral thesis presented and approved on the 21st of December 2022 at the University of Porto
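Identity verification in biometric systems of this kind typically reduces to comparing a probe embedding against an enrolled template under a decision threshold. The sketch below is a generic illustration, with random vectors standing in for learned ECG/face embeddings; the 128-dimensional size and the 0.8 threshold are arbitrary assumptions, not values from this work:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def verify(probe, enrolled, threshold=0.8):
    """Accept the claimed identity if the probe embedding is close
    enough to the enrolled template."""
    return cosine_similarity(probe, enrolled) >= threshold

rng = np.random.default_rng(1)
enrolled = rng.normal(size=128)                  # template from enrollment
genuine = enrolled + 0.1 * rng.normal(size=128)  # same subject, session noise
impostor = rng.normal(size=128)                  # a different subject
```

The threshold trades off false accepts against false rejects; in deployed systems it is tuned on a validation set, and the embeddings come from a trained network rather than random draws.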

    Smart environment monitoring through micro unmanned aerial vehicles

    In recent years, improvements in small-scale Unmanned Aerial Vehicles (UAVs) in terms of flight time, automatic control, and remote transmission have promoted the development of a wide range of practical applications. In aerial video surveillance, monitoring broad areas still presents many challenges, since several tasks must be achieved in real time, including mosaicking, change detection, and object detection. In this thesis, a small-scale-UAV-based vision system for maintaining regular surveillance over target areas is proposed. The system works in two modes. The first mode monitors an area of interest over several flights. During the first flight, it creates an incremental geo-referenced mosaic of the area of interest and classifies all known elements (e.g., persons) found on the ground using a previously trained, improved Faster R-CNN architecture. In subsequent reconnaissance flights, the system searches for any changes (e.g., disappearance of persons) that may have occurred in the mosaic, using an algorithm based on histogram equalization and RGB Local Binary Patterns (RGB-LBP); if changes are present, the mosaic is updated. The second mode performs real-time classification, again using our improved Faster R-CNN model, which is useful for time-critical operations. Thanks to several design features, the system works in real time and performs mosaicking and change detection at low altitude, thus allowing the classification even of small objects. The proposed system was tested using the full set of challenging video sequences in the UAV Mosaicking and Change Detection (UMCD) dataset and other public datasets. Evaluation with well-known performance metrics has shown remarkable results in mosaic creation and updating, as well as in change detection and object detection.
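An LBP-based change test of the kind mentioned above can be sketched by comparing Local Binary Pattern histograms of corresponding frames. This single-channel toy version, with an arbitrary L1-distance threshold, only illustrates the idea and is not the thesis's RGB-LBP algorithm (which also applies histogram equalization and works per RGB channel):

```python
import numpy as np

def lbp(image):
    """8-neighbour Local Binary Pattern code for each interior pixel:
    each neighbour >= centre contributes one bit of an 8-bit code."""
    h, w = image.shape
    c = image[1:-1, 1:-1]
    codes = np.zeros((h - 2, w - 2), dtype=np.uint8)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        nb = image[1 + dy : h - 1 + dy, 1 + dx : w - 1 + dx]
        codes |= (nb >= c).astype(np.uint8) << bit
    return codes

def lbp_histogram(image):
    """Normalized 256-bin histogram of LBP codes (a texture signature)."""
    hist = np.bincount(lbp(image).ravel(), minlength=256).astype(float)
    return hist / hist.sum()

def changed(img_a, img_b, threshold=0.2):
    """Flag a change when the L1 distance between LBP histograms is large."""
    return bool(np.abs(lbp_histogram(img_a) - lbp_histogram(img_b)).sum()
                > threshold)

# Toy mosaic tiles: a flat patch vs. the same patch with new texture in it
rng = np.random.default_rng(2)
before = np.full((32, 32), 100, dtype=np.uint8)
after = before.copy()
after[8:24, 8:24] = rng.integers(0, 256, size=(16, 16), dtype=np.uint8)
```

Because LBP codes depend only on local intensity ordering, this signature is fairly robust to the global illumination shifts that plague direct pixel differencing between flights.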