32 research outputs found

    Machine Learning Methods for Image Analysis in Medical Applications, from Alzheimer's Disease, Brain Tumors, to Assisted Living

    Healthcare has progressed greatly owing to technological advances, and machine learning plays an important role in processing and analyzing large amounts of medical data. This thesis investigates four healthcare-related issues (Alzheimer's disease detection, glioma classification, human fall detection, and obstacle avoidance in prosthetic vision), where the underlying methodologies are associated with machine learning and computer vision. For Alzheimer’s disease (AD) diagnosis, apart from symptoms of patients, Magnetic Resonance Images (MRIs) also play an important role. Inspired by the success of deep learning, a new multi-stream multi-scale Convolutional Neural Network (CNN) architecture is proposed for AD detection from MRIs, where AD features are characterized at both the tissue level and the scale level for improved feature learning. Good classification performance is obtained for AD/NC (normal control) classification, with a test accuracy of 94.74%. In glioma subtype classification, biopsies are usually needed for determining different molecular-based glioma subtypes. We investigate non-invasive glioma subtype prediction from MRIs by using deep learning. A 2D multi-stream CNN architecture is used to learn the features of gliomas from multi-modal MRIs, where the training dataset is enlarged with synthetic brain MRIs generated by pairwise Generative Adversarial Networks (GANs). A test accuracy of 88.82% is achieved for IDH mutation (a molecular-based subtype) prediction. A new deep semi-supervised learning method is also proposed to tackle the problem of missing molecular-related labels in training datasets, in order to improve the performance of glioma classification. In the other two applications, we address video-based human fall detection by using co-saliency-enhanced Recurrent Convolutional Networks (RCNs), as well as obstacle avoidance in prosthetic vision by characterizing obstacle-related video features using a Spiking Neural Network (SNN).
These investigations can benefit future research, where artificial intelligence/deep learning may open a new way for real medical applications.

    Tracking the Temporal-Evolution of Supernova Bubbles in Numerical Simulations

    The study of low-dimensional, noisy manifolds embedded in a higher dimensional space has been extremely useful in many applications, from the chemical analysis of multi-phase flows to simulations of galactic mergers. Building a probabilistic model of the manifolds has helped in describing their essential properties and how they vary in space. However, when the manifold is evolving through time, a joint spatio-temporal modelling is needed, in order to fully comprehend its nature. We propose a first-order Markovian process that propagates the spatial probabilistic model of a manifold at fixed time, to its adjacent temporal stages. The proposed methodology is demonstrated using a particle simulation of an interacting dwarf galaxy to describe the evolution of a cavity generated by a Supernov

    Gaze-Based Human-Robot Interaction by the Brunswick Model

    We present a new paradigm for human-robot interaction based on social signal processing, and in particular on the Brunswick model. Originally, the Brunswick model addresses face-to-face dyadic interaction, assuming that the interactants communicate through a continuous exchange of nonverbal social signals in addition to the spoken messages. Social signals have to be interpreted through a proper recognition phase that considers visual and audio information. The Brunswick model makes it possible to quantitatively evaluate the quality of the interaction, using statistical tools that measure how effective the recognition phase is. In this paper we cast this theory for the case in which one of the interactants is a robot; in this case, the recognition phases performed by the robot and by the human have to be revised with respect to the original model. The model is applied to Berrick, a recent open-source low-cost robotic head platform, where gazing is the social signal considered.

    3D face morphology classification for medical applications

    Classification of facial morphology traits is an important problem for many medical applications, especially with regard to determining associations between facial morphological traits or facial abnormalities and genetic variants. A modern approach to the classification of facial characteristics (traits) is to use three-dimensional facial images. In clinical practice, classification is usually performed manually, which makes the process very tedious, time-consuming and prone to operator error. Also, using simple landmark-to-landmark facial measurements may not accurately represent the underlying complex three-dimensional facial shape. This thesis presents the first automatic approach for classification and categorisation of facial morphological traits, with application to lip and nose traits. It also introduces new 3D geodesic curvature features obtained along the geodesic paths between 3D facial anthropometric landmarks. These geometric features were used for classification and categorisation of lip and nose traits. Finally, the influence of the discovered categories on the physical appearance of the face is analysed using a new visualisation method, in order to gain insight into the suitability of the categories for describing the underlying facial traits. The proposed approach was tested on the ALSPAC (Avon Longitudinal Study of Parents and Children) dataset consisting of 4747 3D full face meshes. The classification accuracy obtained using expert manual categories was not very high, in the region of 72%-79%, indicating that the manual categories may be unreliable. In an attempt to improve these accuracies, an automatic categorisation method was applied. In general, the classification accuracies based on the automatic lip categories were higher than those obtained using the manual categories by at least 8%, and the automatic categories were found to be statistically more significant in the lip area than the manual categories.
The same approach was used to categorise the nose traits, the result indicating that the proposed categorisation approach was capable of categorising any facial morphological trait without ground truth about its trait categories. Additionally, to test the robustness of the proposed features, they were used in the popular problem of gender classification and analysis. The results demonstrated superior classification accuracy to that of comparable methods. Finally, a discovery phase of a genome-wide association study (GWAS) was carried out for 11 automatic lip and nose trait categories. As a result, statistically significant associations were found between four traits and six single nucleotide polymorphisms (SNPs). This is a very good result considering that, for the 27 manual lip trait categories provided by a medical expert, associations were found between only two traits and two SNPs. This result indicates that the method proposed in this thesis for automatic categorisation of 3D facial morphology has considerable potential for application to GWAS.

    WiFi-Based Human Activity Recognition Using Attention-Based BiLSTM

    Recently, significant efforts have been made to explore human activity recognition (HAR) techniques that use information gathered by existing indoor wireless infrastructures through WiFi signals, without requiring the monitored subject to carry a dedicated device. The key intuition is that different activities introduce different multi-paths in WiFi signals and generate different patterns in the time series of channel state information (CSI). In this paper, we propose and evaluate a full pipeline for a CSI-based human activity recognition framework for 12 activities in three different spatial environments using two deep learning models: ABiLSTM and CNN-ABiLSTM. Evaluation experiments have demonstrated that the proposed models outperform state-of-the-art models. The experiments also show that the proposed models can be applied to other environments with different configurations, albeit with some caveats. The proposed ABiLSTM model achieves an overall accuracy of 94.03%, 91.96%, and 92.59% across the three target environments, while the proposed CNN-ABiLSTM model reaches an accuracy of 98.54%, 94.25%, and 95.09% across those same environments.

    Activity related biometrics for person authentication

    One of the major challenges in human-machine interaction has always been the development of techniques that provide accurate human recognition, so as either to offer personalized services or to protect critical infrastructures from unauthorized access. In this direction, a series of well-stated and efficient methods have been proposed, mainly based on biometric characteristics of the user. Despite the significant progress that has been achieved recently, there are still many open issues in the area, concerning not only the performance of the systems but also the intrusiveness of the collecting methods. The current thesis deals with the investigation of novel, activity-related biometric traits and their potential for multiple and unobtrusive authentication based on the spatiotemporal analysis of human activities. In particular, it starts with an extensive literature review of the most important works in the area of biometrics, presenting and justifying the transition from classic biometrics to the new concept of behavioural biometrics. Based on previous works related to human physiology and human motion, and motivated by the intuitive assumption that different body types and different characters would produce activity-related traits that are distinguishable and thus valuable for biometric verification, a new type of biometrics, the so-called prehension biometrics (i.e. the combined movement of reaching and grasping activities), is introduced and thoroughly studied herein. The analysis is performed via the so-called Activity hyper-Surfaces, which form a dynamic movement-related manifold for the extraction of a series of behavioural features. Thereafter, the focus is placed on the extraction of continuous soft biometric features and their efficient combination with state-of-the-art biometric approaches towards increased authentication performance and enhanced security in template storage via Soft biometric Keys.
In this context, a novel and generic probabilistic framework is proposed that produces an enhanced matching probability based on the modelling of the systematic error induced during the estimation of the aforementioned soft biometrics and the efficient clustering of the soft biometric feature space. Next, an extensive experimental evaluation of the proposed methodologies follows, which illustrates the increased authentication potential of the prehension-related biometrics and the significant advances in recognition performance achieved by the probabilistic framework. In particular, the prehension-related biometrics are applied to several databases of ~100 different subjects in total, performing a great variety of movements. The experiments simulate both episodic and multiple authentication scenarios, while contextual parameters (i.e. the ergonomic-based quality factors of the human body) are also taken into account. Furthermore, the probabilistic framework for augmenting biometric recognition via soft biometrics is applied on top of two state-of-the-art biometric systems, i.e. a gait recognition-based one (> 100 subjects) and a 3D face recognition-based one (~55 subjects), exhibiting significant advances in their performance. The thesis concludes with an in-depth discussion summarizing the major achievements of the current work, as well as some possible drawbacks and other open issues of the proposed approaches that could be addressed in future works.

    Image-set, Temporal and Spatiotemporal Representations of Videos for Recognizing, Localizing and Quantifying Actions

    This dissertation addresses the problem of learning video representations, which is defined here as transforming the video so that its essential structure is made more visible or accessible for action recognition and quantification. In the literature, a video can be represented by a set of images, by modeling motion or temporal dynamics, and by a 3D graph with pixels as nodes. This dissertation contributes a set of models to localize, track, segment, recognize and assess actions: (1) image-set models via aggregating subset features given by regularizing normalized CNNs, (2) image-set models via inter-frame principal recovery and sparsely coding residual actions, (3) temporally local models with spatially global motion estimated by robust feature matching and local motion estimated by action detection with a motion model added, (4) spatiotemporal models, namely a 3D graph and a 3D CNN, to model time as a space dimension, and (5) supervised hashing by jointly learning embedding and quantization. State-of-the-art performances are achieved for tasks such as quantifying facial pain and human diving.
Primary conclusions of this dissertation are categorized as follows: (i) Image sets can capture facial actions that are about collective representation; (ii) Sparse and low-rank representations can have the expression, identity and pose cues untangled and can be learned via an image-set model and also a linear model; (iii) Norm is related to recognizability; similarity metrics and loss functions matter; (iv) Combining the MIL-based boosting tracker with the Particle Filter motion model induces a good trade-off between appearance similarity and motion consistency; (v) Segmenting an object locally makes it amenable to assigning shape priors; it is feasible to learn knowledge such as shape priors online from Web data with weak supervision; (vi) Representing videos as 3D graphs works locally in both space and time; 3D CNNs work effectively when given temporally meaningful clips as input; (vii) Richly labeled images or videos help to learn better hash functions, after learning binary embedded codes, than random projections do. In addition, models proposed for videos can be adapted to other sequential images such as volumetric medical images, which are not included in this dissertation.

    Time-Dependent Bag of Words on Manifolds for Geodesic-Based Classification of Video Activities towards Assisted Living and Healthcare

    In this paper, we address the problem of classifying activities of daily living (ADL) in video. The basic idea of the proposed method is to treat each human activity in the video as a temporal sequence of points on a Riemannian manifold, and to classify such time series with a geodesic-based kernel. The main novelties of this paper are summarized as follows: (a) for each frame of a video, low-level features of body pose and human-object interaction are unified by a covariance matrix, i.e., a manifold point in the space of symmetric positive definite (SPD) matrices Sym_+^d; (b) a time-dependent bag-of-words (BoW+T) model is built, where its codebook is generated by clustering per-frame covariance matrices on Sym_+^d; (c) for each video, high-level BoW+T features are extracted from its corresponding sequence of per-frame covariance matrices; (d) for activity classification, a positive definite kernel is formulated, taking into account the underlying geometry of our BoW+T features, i.e., the unit n-sphere. Experiments were conducted on two video datasets. The first dataset contains 8 activity classes with a total of 943 videos, and the second one contains 7 activity classes with a total of 224 videos. The proposed method achieved high accuracy (average 89.66%) and a low false alarm rate (average 1.43%) on the first dataset. Comparison with 6 existing methods on the second dataset provided further evidence of the effectiveness of the proposed method.
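
    The two geometric ingredients named in this abstract, a per-frame covariance descriptor and a geodesic on the unit sphere, can be illustrated with a minimal pure-Python sketch. This is not the paper's implementation: the function names are hypothetical, and the square-root mapping used to place a bag-of-words histogram on the unit sphere is an assumption made for illustration, since the abstract does not state how the BoW+T features are normalised.

```python
import math


def covariance_descriptor(features):
    """Sample covariance (d x d) of a list of d-dimensional feature vectors.

    `features` stands in for the low-level per-frame features; at least
    two vectors are needed for the (n - 1) normalisation.
    """
    n = len(features)
    d = len(features[0])
    mean = [sum(f[j] for f in features) / n for j in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for f in features:
        for a in range(d):
            for b in range(d):
                cov[a][b] += (f[a] - mean[a]) * (f[b] - mean[b])
    return [[c / (n - 1) for c in row] for row in cov]


def to_unit_sphere(histogram):
    """Map a non-negative histogram to a unit vector.

    ASSUMPTION (not from the abstract): L1-normalise, then take the
    element-wise square root, so the squared entries sum to one.
    """
    total = sum(histogram)
    return [math.sqrt(h / total) for h in histogram]


def sphere_geodesic(u, v):
    """Great-circle (geodesic) distance between two unit vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    return math.acos(max(-1.0, min(1.0, dot)))
```

    A kernel for classification would then be built on top of `sphere_geodesic`; which kernel the authors use, and how its positive definiteness is ensured, is not specified in the abstract and is left out of this sketch.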

    Deep Learning in Medical Image Analysis

    The accelerating power of deep learning in diagnosing diseases will empower physicians and speed up decision making in clinical environments. Applications of modern medical instruments and the digitalization of medical care have generated enormous amounts of medical images in recent years. In this big data arena, new deep learning methods and computational models for efficient data processing, analysis, and modeling of the generated data are crucially important for clinical applications and for understanding the underlying biological processes. This book presents and highlights novel algorithms, architectures, techniques, and applications of deep learning for medical image analysis.