
    Audio-visual multi-modality driven hybrid feature learning model for crowd analysis and classification

    The rapid emergence of advanced software systems, low-cost hardware and decentralized cloud computing technologies has broadened the horizon for vision-based surveillance, monitoring and control. However, complex and inferior feature learning over visual artefacts or video streams, especially under extreme conditions, confines the majority of existing vision-based crowd analysis and classification systems. Retrieving event-sensitive or crowd-type-sensitive spatio-temporal features for different crowd types under extreme conditions is a highly complex task. Consequently, accuracy and reliability suffer, which limits existing methods for real-time crowd analysis. Despite numerous efforts in vision-based approaches, the lack of acoustic cues often creates ambiguity in crowd classification. On the other hand, the strategic amalgamation of audio-visual features can enable accurate and reliable crowd analysis and classification. Motivated by this, a novel audio-visual multi-modality driven hybrid feature learning model is developed in this research for crowd analysis and classification. In this work, a hybrid feature extraction model was applied to extract deep spatio-temporal features using the Gray-Level Co-occurrence Matrix (GLCM) and an AlexNet transfer learning model. After extracting the GLCM features and AlexNet deep features, horizontal concatenation was used to fuse the different feature sets. Similarly, for acoustic feature extraction, the audio samples (from the input video) were processed with static (fixed-size) sampling, pre-emphasis, block framing and Hann windowing, followed by extraction of acoustic features such as GTCC, GTCC-Delta, GTCC-Delta-Delta, MFCC, Spectral Entropy, Spectral Flux, Spectral Slope and Harmonics-to-Noise Ratio (HNR). Finally, the extracted audio-visual features were fused to yield a composite multi-modal feature set, which was classified using a random forest ensemble classifier. The multi-class classification yields a crowd-classification accuracy of 98.26%, precision of 98.89%, sensitivity of 94.82%, specificity of 95.57%, and an F-measure of 98.84%. The robustness of the proposed multi-modality-based crowd analysis model confirms its suitability for real-world crowd detection and classification tasks.
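    To make the fusion step concrete, here is a minimal, hypothetical sketch (not the authors' implementation): GLCM texture features and pre-extracted deep features are horizontally concatenated with MFCC-style audio features and fed to a random forest. The library calls (scikit-image, librosa, scikit-learn) and parameters are assumptions, and scikit-image spells the GLCM functions graycomatrix/graycoprops in recent versions.

```python
# Hypothetical sketch of audio-visual feature fusion; not the paper's exact pipeline.
import numpy as np
from skimage.feature import graycomatrix, graycoprops   # GLCM texture features
import librosa                                           # acoustic features
from sklearn.ensemble import RandomForestClassifier

def glcm_features(gray_frame):
    # gray_frame: 2-D uint8 array (one video frame in grey levels)
    glcm = graycomatrix(gray_frame, distances=[1], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    props = ["contrast", "correlation", "energy", "homogeneity"]
    return np.hstack([graycoprops(glcm, p).ravel() for p in props])

def audio_features(y, sr):
    # 13 MFCCs averaged over time; GTCC and spectral features would be appended similarly
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).mean(axis=1)

def fuse(gray_frame, deep_feats, y, sr):
    # Horizontal concatenation of visual and acoustic feature sets;
    # deep_feats would come from a pretrained AlexNet (omitted here).
    return np.hstack([glcm_features(gray_frame), deep_feats, audio_features(y, sr)])

# X = np.vstack([fuse(frame, deep, audio, sr) for frame, deep, audio, sr in samples])
clf = RandomForestClassifier(n_estimators=200, random_state=0)
# clf.fit(X, labels)
```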

    Signal Processing of Electroencephalogram for the Detection of Attentiveness towards Short Training Videos

    This research developed a novel method which uses an easy-to-deploy, single dry-electrode wireless electroencephalogram (EEG) collection device as the input to an automated system that measures indicators of a participant's attentiveness while they are watching a short training video. The results are promising, including 85% or better accuracy in identifying whether a participant is watching a segment of video from a boring scene or lecture, versus a segment of video from an attentiveness-inducing active lesson or memory quiz. In addition, the final system produces an ensemble average of attentiveness across many participants, pinpointing areas in the training videos that induce peak attentiveness. Qualitative analysis of the results of this research is also very promising. The system produces attentiveness graphs for individual participants, and these triangulate well with the thoughts and feelings those participants had during different parts of the videos, as described in their own words. As distance learning and computer-based training become more popular, it is of great interest to measure whether students are attentive to recorded lessons and short training videos. This research was motivated by this interest, as well as by recent advances in electronic and computer engineering's use of biometric signal analysis for the detection of affective (emotional) response. Signal processing of EEG has proven useful in measuring alertness and emotional state, and even in very specific applications such as whether or not participants will recall television commercials days after they have seen them. This research extended these advances by creating an automated system that measures attentiveness towards short training videos. The bulk of the research focused on electrical and computer engineering, specifically the optimization of signal processing algorithms for this particular application. A review of existing EEG signal processing and feature extraction methods shows a common subdivision of the steps used in different EEG applications. These steps include hardware sensing, filtering and digitizing; noise removal; chopping the continuous EEG data into windows for processing; normalization; transformation to extract frequency or scale information; treatment of phase or shift information; and additional post-transformation noise reduction techniques. A large degree of variation exists in most of these steps within the currently documented state of the art. This research connected these varied methods into a single holistic model that allows for comparison and selection of optimal algorithms for this application, providing a structured and orderly comparison of individual signal analysis and feature extraction methods. The study created a concise algorithmic approach examining all the aforementioned steps, and in doing so provided the framework for a systematic approach with rigorous participant cross-validation so that options could be tested, compared and optimized. Novel signal analysis methods were also developed, using new techniques to choose parameters, which greatly improved performance. The research also utilizes machine learning to automatically categorize extracted features into measures of attentiveness, improving on existing machine learning with novel methods, including a method of using per-participant baselines with kNN. This provided an optimal solution that extends current EEG signal analysis methods used in other applications and refines them for measuring attentiveness towards short training videos. These algorithms were shown to be optimal through selection of signal analysis and machine learning steps via both n-fold and participant cross-validation. The creation of this new system, which uses signal processing of EEG for the detection of attentiveness towards short training videos, represents a significant advance in the field.
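    The per-participant baseline idea combined with kNN can be illustrated with a short, hedged sketch; the normalisation scheme, feature layout and parameters below are assumptions rather than the thesis's exact pipeline.

```python
# Illustrative sketch: per-participant baseline normalisation of EEG features,
# kNN classification, and participant-wise cross-validation.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import GroupKFold

def baseline_normalise(X, participants, baseline_mask):
    # Scale each participant's features against their own baseline segment
    Xn = X.astype(float).copy()
    for p in np.unique(participants):
        base = X[(participants == p) & baseline_mask]
        mu, sd = base.mean(axis=0), base.std(axis=0) + 1e-9
        Xn[participants == p] = (X[participants == p] - mu) / sd
    return Xn

# X: windowed EEG features, y: attentive vs. boring labels, groups: participant ids
# Xn = baseline_normalise(X, groups, baseline_mask)
knn = KNeighborsClassifier(n_neighbors=5)
cv = GroupKFold(n_splits=5)   # hold out whole participants, as in participant cross-validation
# scores = [knn.fit(Xn[tr], y[tr]).score(Xn[te], y[te]) for tr, te in cv.split(Xn, y, groups)]
```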

    Detection of anatomical structures in medical datasets

    Detection and localisation of anatomical structures are extremely helpful for many image analysis algorithms. This thesis is concerned with the automatic identification of landmark points, anatomical regions and vessel centre lines in three-dimensional medical datasets. We examine how machine learning and atlas-based ideas may be combined to produce efficient, context-aware algorithms. For the problem of anatomical landmark detection, we develop an analogue of the idea of autocontext, termed atlas location autocontext, whereby spatial context is iteratively learnt by the machine learning algorithm as part of a feedback loop. We then extend our anatomical landmark detection algorithm from Computed Tomography to Magnetic Resonance images, using image features based on histograms of oriented gradients. A cross-modality landmark detector is demonstrated using unsigned gradient orientations. The problem of brain parcellation is approached by independently training a random forest and a multi-atlas segmentation algorithm, then combining them by a simple Bayesian product operation. It is shown that, given classifiers providing complementary information, the hybrid classifier provides a superior result; the Bayesian product method of combination outperforms simple averaging where the classifiers are sufficiently independent. Finally, we present a system for identifying and tracking major arteries in Magnetic Resonance Angiography datasets, using automatically detected vascular landmarks to seed the tracking. Knowledge of individual vessel characteristics is employed to guide the tracking algorithm in two ways. Firstly, the data are pre-processed using a top-hat transform whose size corresponds to the vessel diameter. Secondly, a vascular atlas is generated to inform the cost function employed in the minimum-path algorithm. Fully automatic tracking of the major arteries of the body is satisfactorily demonstrated.
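    The Bayesian product combination described above can be sketched in a few lines; the array names and shapes are illustrative assumptions.

```python
# Minimal sketch: multiply per-voxel class posteriors from two independent
# classifiers (e.g. a random forest and a multi-atlas segmentation) and renormalise.
import numpy as np

def bayesian_product(p_forest, p_atlas, eps=1e-12):
    # p_forest, p_atlas: arrays of shape (n_voxels, n_classes), rows summing to 1
    fused = p_forest * p_atlas
    return fused / (fused.sum(axis=1, keepdims=True) + eps)

# labels = bayesian_product(p_forest, p_atlas).argmax(axis=1)
```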

    Detection of Driver Cognitive Distraction Using Machine Learning Methods

    Autonomous vehicles appear to be arriving sooner than expected. However, manual and semi-autonomous vehicles will remain on the road for at least another decade before fully automated vehicles become commonplace. Hence, the number of deaths due to driving accidents will take a while to drop, and we require alternative ways to prevent them. Driver distraction is one of the primary causes of accidents and has posed a significant problem since the first car appeared on our roadways. According to WHO findings, 1.25 million people lose their lives every year due to road traffic crashes, and distracted driving is one of the major causes. As a result, there is a profound need to continuously observe driver state and provide appropriately informed alerts to distracted drivers. As defined by the National Highway Traffic Safety Administration (NHTSA), there are several types of distraction, including cognitive, visual and manual distraction, which may be distinguished from each other based upon the resources required to perform the task. Cognitive distraction refers to "look but not see" situations, when the driver's eyes are focused on the forward roadway but his or her mind is not. Typically, cognitive distraction can result from fatigue, conversation with a co-passenger, listening to the radio, or other similarly loading secondary tasks that do not necessarily take a driver's eyes off the roadway. This makes it one of the hardest distractions to detect, as there are no visible clues as to whether the driver is distracted. In this thesis, we have identified features from different sources, such as pupil size, heart rate and acceleration, that are relevant to classifying distracted and non-distracted drivers, through collection and analysis of driving data gathered from participants over multiple driving scenarios. The machine learning methods used for classification included, but were not limited to, Random Forest, Decision Trees and SVM. A reduced feature set, including pupil area and pupil vertical and horizontal motion, was found while maintaining an average accuracy of 90% across different road types. The impact of road type on driver behaviour is also identified. Information about the dominant features affecting the classification would aid early detection of distracted driving and its mitigation through the development of effective warning systems. The algorithm could be personalized to a specific driver depending on their reaction to driving situations, enabling a safer and more comfortable driving experience.
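    As a hedged sketch of the classification step described above, window-level features (the names below are assumptions) could feed a random forest, with feature importances used to identify the reduced feature set.

```python
# Illustrative sketch only; feature names and parameters are assumed, not the thesis's.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

features = ["pupil_area", "pupil_dx", "pupil_dy", "heart_rate", "acceleration"]
clf = RandomForestClassifier(n_estimators=300, random_state=0)

# df: one row per time window with a binary 'distracted' label
# X, y = df[features].values, df["distracted"].values
# print(cross_val_score(clf, X, y, cv=5).mean())                     # accuracy estimate
# print(sorted(zip(clf.fit(X, y).feature_importances_, features),    # dominant features
#              reverse=True))
```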

    Understanding Video Transformers for Segmentation: A Survey of Application and Interpretability

    Video segmentation encompasses a wide range of problem formulations, e.g., object, scene, actor-action and multimodal video segmentation, for delineating task-specific scene components with pixel-level masks. Recently, approaches in this research area have shifted from ConvNet-based to transformer-based models. In addition, various interpretability approaches have appeared for transformer models and video temporal dynamics, motivated by the growing interest in basic scientific understanding, model diagnostics and the societal implications of real-world deployment. Previous surveys mainly focused on ConvNet models for a subset of video segmentation tasks, or on transformers for classification tasks. Moreover, component-wise discussion of transformer-based video segmentation models has not yet received due focus, and previous reviews of interpretability methods concentrated on transformers for classification, while analysis of the video temporal dynamics modelling capabilities of video models has received less attention. In this survey, we address the above with a thorough discussion of the various categories of video segmentation, a component-wise discussion of state-of-the-art transformer-based models, and a review of related interpretability methods. We first present an introduction to the different video segmentation task categories, their objectives, specific challenges and benchmark datasets. Next, we provide a component-wise review of recent transformer-based models and document the state of the art on different video segmentation tasks. Subsequently, we discuss post-hoc and ante-hoc interpretability methods for transformer models, as well as interpretability methods for understanding the role of the temporal dimension in video models. Finally, we conclude our discussion with future research directions.

    Adaptive Learning and Mining for Data Streams and Frequent Patterns

    This thesis is devoted to the design of data mining algorithms for evolving data streams and for the extraction of closed frequent trees. First, we deal with each of these tasks separately, and then we deal with them together, developing classification methods for data streams containing items that are trees. In the data stream model, data arrive at high speed, and the algorithms that must process them have very strict constraints of space and time. In the first part of this thesis we propose and illustrate a framework for developing algorithms that can adaptively learn from data streams that change over time. Our methods are based on using change detectors and estimator modules at the right places. We propose ADWIN, an adaptive sliding window algorithm, for detecting change and keeping updated statistics from a data stream, and we use it as a black box in place of counters or accumulators in algorithms not initially designed for drifting data. Since ADWIN has rigorous performance guarantees, this opens the possibility of extending such guarantees to learning and mining algorithms. We test our methodology with several learning methods such as Naïve Bayes, clustering, decision trees and ensemble methods. We build an experimental framework for data stream mining with concept drift, based on the MOA framework and similar to WEKA, so that it will be easy for researchers to run experimental data stream benchmarks. Trees are connected acyclic graphs and in many cases are studied as link-based structures. In the second part of this thesis, we describe a formal study of trees from the point of view of closure-based mining. Moreover, we present efficient algorithms for subtree testing and for mining ordered and unordered frequent closed trees. We include an analysis of the extraction of association rules of full confidence out of the closed sets of trees, where we found an interesting phenomenon: rules whose propositional counterpart is nontrivial are nevertheless always implicitly true in trees, due to the peculiar combinatorics of the structures. Finally, using these results on evolving data stream mining and closed frequent tree mining, we present high-performance algorithms for mining closed unlabeled rooted trees adaptively from data streams that change over time. We introduce a general methodology to identify closed patterns in a data stream, using Galois lattice theory. Using this methodology, we then develop an incremental algorithm, a sliding-window-based one, and finally one that mines closed trees adaptively from data streams. We use these methods to develop classification methods for tree data streams.
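    To make the sliding-window idea concrete, below is a simplified illustration of the ADWIN principle, not the bucket-based algorithm with its exact guarantees: every split of the current window is tested for a significant difference in means, and the older sub-window is dropped when a change is detected.

```python
# Simplified, illustrative ADWIN-style change detector (linear-time per update;
# the real ADWIN uses an exponential bucket structure and a variance-aware bound).
import math
from collections import deque

class SimpleAdwin:
    def __init__(self, delta=0.002, max_window=1000):
        self.delta = delta
        self.window = deque(maxlen=max_window)

    def update(self, x):
        """Add one observation; return True if a change was detected."""
        self.window.append(x)
        w = list(self.window)
        for cut in range(5, len(w) - 5):              # require a few points per side
            left, right = w[:cut], w[cut:]
            n0, n1 = len(left), len(right)
            m = 1.0 / (1.0 / n0 + 1.0 / n1)           # harmonic mean of sub-window sizes
            eps = math.sqrt(math.log(4.0 / self.delta) / (2.0 * m))
            if abs(sum(left) / n0 - sum(right) / n1) > eps:
                self.window = deque(right, maxlen=self.window.maxlen)
                return True                            # drift: drop the older sub-window
        return False

# detector = SimpleAdwin()
# drifts = [i for i, x in enumerate(stream) if detector.update(x)]
```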

    Similarity search applications in medical images


    Advances in Data Mining Knowledge Discovery and Applications

    Advances in Data Mining Knowledge Discovery and Applications aims to help data miners, researchers, scholars, and PhD students who wish to apply data mining techniques. The primary contribution of this book is to highlight frontier fields and implementations of knowledge discovery and data mining. Although similar themes recur, the same approaches and techniques can prove useful across different fields and areas of expertise. Data mining covers areas of statistics, machine learning, data management and databases, pattern recognition, artificial intelligence, and others, and most of these areas are covered by the data mining applications presented here. The eighteen chapters are organized into two parts: Knowledge Discovery and Data Mining Applications.

    Extraction and representation of semantic information in digital media


    Advanced Sensing and Image Processing Techniques for Healthcare Applications

    This Special Issue aims to attract the latest research and findings in the design, development and experimental evaluation of healthcare-related technologies. This includes, but is not limited to, the use of novel sensing, imaging, data processing, machine learning, and artificially intelligent devices and algorithms to assist and monitor the elderly, patients, and the disabled population.