
    Intelligent Biosignal Processing in Wearable and Implantable Sensors

    This reprint provides a collection of papers illustrating the state-of-the-art of smart processing of data coming from wearable, implantable or portable sensors. Each paper presents the design, databases used, methodological background, obtained results, and their interpretation for biomedical applications. Illustrative examples include brain–machine interfaces for medical rehabilitation, the evaluation of sympathetic nerve activity, a novel automated diagnostic tool based on ECG data to diagnose COVID-19, machine learning-based hypertension risk assessment by means of photoplethysmography and electrocardiography signals, Parkinsonian gait assessment using machine learning tools, thorough analysis of compressive sensing of ECG signals, development of a nanotechnology application for decoding vagus-nerve activity, detection of liver dysfunction using a wearable electronic nose system, prosthetic hand control using surface electromyography, epileptic seizure detection using a CNN, and premature ventricular contraction detection using deep metric learning. Thus, this reprint presents significant clinical applications as well as valuable new research issues, providing current illustrations of this new field of research by addressing the promises, challenges, and hurdles associated with the synergy of biosignal processing and AI through 16 different pertinent studies. Covering a wide range of research and application areas, this book is an excellent resource for researchers, physicians, academics, and PhD or master's students working on (bio)signal and image processing, AI, biomaterials, biomechanics, and biotechnology with applications in medicine.

    Attention Mechanism for Recognition in Computer Vision

    It has been proven that humans do not focus their attention on an entire scene at once when they perform a recognition task. Instead, they pay attention to the most important parts of the scene to extract the most discriminative information. Inspired by this observation, this dissertation studies the importance of the attention mechanism in recognition tasks in computer vision by designing novel attention-based models. Specifically, four scenarios are investigated that represent the most important aspects of the attention mechanism. First, an attention-based model is designed to reduce the dimensionality of visual features by selectively processing only a small subset of the data. We study this aspect of the attention mechanism in a framework based on object recognition in distributed camera networks. Second, an attention-based image retrieval system (i.e., person re-identification) is proposed which learns to focus on the most discriminative regions of a person's image and process those regions with higher computational power using a deep convolutional neural network. Furthermore, we show how visualizing the attention maps can make deep neural networks more interpretable: by visualizing the attention maps, we can observe the regions of the input image that the neural network relies on in order to make a decision. Third, a model for estimating the importance of the objects in a scene given a task is proposed. More specifically, the proposed model estimates the importance of the road users that a driver (or an autonomous vehicle) should pay attention to in a driving scenario in order to navigate safely. In this scenario, the attention estimate is the final output of the model. Fourth, an attention-based module and a new loss function in a meta-learning-based few-shot learning system are proposed in order to incorporate the context of the task into the feature representations of the samples and increase the few-shot recognition accuracy. In this dissertation, we showed that attention can be multi-faceted, and we studied the attention mechanism from the perspectives of feature selection, reducing the computational cost, interpretable deep learning models, task-driven importance estimation, and context incorporation. Through the study of these four scenarios, we further advanced the field in which "attention is all you need".
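The attention maps discussed above can be illustrated with a minimal soft-attention sketch in NumPy. This is purely illustrative: the function, shapes, and the scaled dot-product scoring rule are assumptions, not the dissertation's actual models.

```python
import numpy as np

def soft_attention(features, query):
    """Soft attention over a set of region descriptors.

    features: (n_regions, d) array of per-region descriptors
    query:    (d,) task/query vector
    Returns the attention map (n_regions,) and the attended feature (d,).
    """
    scores = features @ query / np.sqrt(features.shape[1])  # scaled dot-product
    scores -= scores.max()                                  # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()         # softmax -> attention map
    attended = weights @ features                           # weighted sum of regions
    return weights, attended

rng = np.random.default_rng(0)
feats = rng.normal(size=(6, 4))            # six image regions, 4-D descriptors each
weights, context = soft_attention(feats, feats[2])
print(weights.shape, context.shape)        # (6,) (4,)
```

Rendering `weights` back over the image grid is the kind of attention-map visualization that makes the network's decision regions inspectable.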

    Signal processing and machine learning techniques for human verification based on finger textures

    PhD Thesis. In recent years, Finger Textures (FTs) have attracted considerable attention as potential biometric characteristics. They can provide robust recognition performance as they have various human-specific features, such as wrinkles and apparent lines distributed along the inner surface of all fingers. The main topic of this thesis is verifying people according to their unique FT patterns by exploiting signal processing and machine learning techniques. A Robust Finger Segmentation (RFS) method is first proposed to isolate finger images from a hand area. It is able to detect the fingers as objects in a hand image. An efficient adaptive finger segmentation method, called the Adaptive and Robust Finger Segmentation (ARFS) method, is also suggested to address the problem of alignment variations in the hand image. A new Multi-scale Sobel Angles Local Binary Pattern (MSALBP) feature extraction method is proposed which combines the Sobel direction angles with the Multi-Scale Local Binary Pattern (MSLBP). Moreover, an enhanced method called the Enhanced Local Line Binary Pattern (ELLBP) is designed to efficiently analyse the FT patterns. As a result, a powerful human verification scheme based on finger Feature Level Fusion with a Probabilistic Neural Network (FLFPNN) is proposed. A multi-object fusion method, termed the Finger Contribution Fusion Neural Network (FCFNN), combines the contribution scores of the finger objects. The verification performance is examined in the case of missing FT areas. Consequently, to overcome poorly imaged finger regions, a method is suggested to salvage missing FT elements by exploiting the information embedded within the trained Probabilistic Neural Network (PNN). Finally, a novel method to produce a Receiver Operating Characteristic (ROC) curve from a PNN is suggested, and a further development of this method is applied to generate the ROC graph from the FCFNN. Three databases are employed for evaluation: the Hong Kong Polytechnic University Contact-free 3D/2D (PolyU3D2D) database, the Indian Institute of Technology (IIT) Delhi database, and the Spectral 460nm (S460) database from the CASIA Multi-Spectral (CASIAMS) databases. Comparative simulation studies confirm the efficiency of the proposed methods for human verification. The main advantage of both segmentation approaches, the RFS and the ARFS, is that they can collect all the FT features. The best results were benchmarked for the ELLBP feature extraction with the FCFNN, where the best Equal Error Rate (EER) values achieved for the three databases PolyU3D2D, IIT Delhi and CASIAMS (S460) were 0.11%, 1.35% and 0%, respectively. The proposed salvage approach for the missing feature elements has the capability to enhance the verification performance of the FLFPNN. Moreover, ROC graphs have been successfully established from the PNN and the FCFNN. Acknowledgements: the Ministry of Higher Education and Scientific Research in Iraq (MOHESR); the Technical College of Mosul; the Iraqi Cultural Attaché; the active people in the MOHESR, who strongly supported Iraqi students.
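Since several of the descriptors above (MSALBP, ELLBP) build on the Local Binary Pattern, a minimal textbook 3x3 LBP operator may help fix ideas. This is the generic single-scale operator only, not the thesis' multi-scale or line-based variants.

```python
import numpy as np

def lbp_3x3(img):
    """Basic 8-neighbour Local Binary Pattern on a 2-D grayscale image.

    Each interior pixel is encoded as an 8-bit number in which every
    neighbour that is >= the centre contributes a 1 bit.
    """
    # offsets of the 8 neighbours, clockwise from the top-left
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    c = img[1:-1, 1:-1]                       # centre pixels
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offs):
        nb = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        code |= ((nb >= c).astype(np.uint8) << bit)  # set bit when neighbour >= centre
    return code

img = np.array([[5, 5, 5],
                [5, 1, 5],
                [5, 5, 5]], dtype=np.uint8)
print(lbp_3x3(img))   # all neighbours exceed the centre -> [[255]]
```

Histograms of such codes over image blocks form the texture descriptor that the fusion networks (FLFPNN, FCFNN) would then consume.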

    Covariate-invariant gait recognition using random subspace method and its extensions

    Compared with other biometric traits like fingerprint or iris, the most significant advantage of gait is that it can be used for remote human identification without cooperation from the subjects. The technology of gait recognition may play an important role in crime prevention, law enforcement, etc. Yet the performance of automatic gait recognition may be affected by covariate factors such as speed, carrying condition, elapsed time, shoe, walking surface, clothing, camera viewpoint, video quality, etc. In this thesis, we propose a random subspace method (RSM) based classifier ensemble framework and its extensions for robust gait recognition. Covariates change the human gait appearance in different ways. For example, speed may change the appearance of human arms or legs; camera viewpoint alters the human visual appearance in a global manner; carrying condition and clothing may change the appearance of any part of the human body (depending on what is being carried or worn). Due to the unpredictable nature of covariates, it is difficult to collect all the representative training data. We claim that overfitting may be the main problem hampering the performance of gait recognition algorithms that rely on learning. First, for speed-invariant gait recognition, we employ a basic RSM model, which can reduce generalisation errors by combining a large number of weak classifiers at the decision level (i.e., by majority voting). We find that the performance of RSM decreases when the intra-class variations are large. In RSM, although weak classifiers with lower dimensionality tend to have better generalisation ability, they may have to contend with the underfitting problem if the dimensionality is too low. We thus enhance the RSM-based weak classifiers by extending RSM to multimodal-RSM. In tackling the elapsed time covariate, we use face information to enhance the RSM-based gait classifiers before the decision-level fusion.
We find that a significant performance gain can be achieved when lower weight is assigned to the face information. We also employ a weak form of multimodal-RSM for gait recognition from low-quality videos (with low resolution and low frame-rate) when other modalities are unavailable. In this case, model-based information is used to enhance the RSM-based weak classifiers. We then point out the relationship between base classifier accuracy, classifier ensemble accuracy, and diversity among the base classifiers. By incorporating the model-based information (with lower weight) into the RSM-based weak classifiers, the diversity of the classifiers, which is positively correlated with the ensemble accuracy, can be enhanced. In contrast to multimodal systems, large intra-class variations may have a significant impact on unimodal systems. We model the effect of various unknown covariates as a partial feature corruption problem with unknown locations in the spatial domain. By making some assumptions in an ideal-case analysis, we provide the theoretical basis of the RSM-based classifier ensemble in the application of covariate-invariant gait recognition. However, in real cases these assumptions may not hold precisely, and the performance may be affected when the intra-class variations are large. We propose a criterion to address this issue: in the decision-level fusion stage, for a query gait with unknown covariates, we need to dynamically suppress the ratio of false votes to true votes before the majority voting. Two strategies are employed, i.e., local enhancing (LE), which can increase true votes, and the proposed hybrid decision-level fusion (HDF), which can decrease false votes. Based on this criterion, the proposed RSM-based HDF (RSM-HDF) framework achieves very competitive performance in tackling covariates such as walking surface, clothing, and elapsed time, which were deemed open questions. The factor of camera viewpoint is different from the other covariates.
It alters the human appearance in a global manner. By employing unitary projection (UP), we form a new space in which samples of the same subject taken from different views lie closer together. However, UP may also give rise to a large number of feature distortions. We treat these distortions as corrupted features with unknown locations in the new space (after UP), and use the RSM-HDF framework to address this issue. Robust view-invariant gait recognition can be achieved using the UP-RSM-HDF framework. In this thesis, we propose an RSM-based classifier ensemble framework and its extensions to realise covariate-invariant gait recognition. It is less sensitive to most of the covariate factors, such as speed, shoe, carrying condition, walking surface, video quality, clothing, elapsed time, camera viewpoint, etc., and it outperforms other state-of-the-art algorithms significantly on all the major public gait databases. Specifically, our method can achieve very competitive performance against (large changes in) view, clothing, walking surface, elapsed time, etc., which were deemed the most difficult covariate factors.
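The core RSM idea, many weak classifiers trained on random feature subsets and fused by majority voting at the decision level, can be sketched as follows. The 1-NN weak learner, subspace size, and toy data here are assumptions for illustration; the thesis' gait features and weak classifiers are far more elaborate.

```python
import numpy as np

def rsm_predict(X_train, y_train, X_test, n_classifiers=50, subdim=2, seed=0):
    """Random subspace method with decision-level majority voting.

    Each weak classifier is a 1-nearest-neighbour rule restricted to a
    random subset of `subdim` feature dimensions; the ensemble prediction
    is the majority label over all weak classifiers.
    """
    rng = np.random.default_rng(seed)
    n_classes = y_train.max() + 1
    votes = np.zeros((len(X_test), n_classes), dtype=int)
    for _ in range(n_classifiers):
        dims = rng.choice(X_train.shape[1], size=subdim, replace=False)
        d = np.linalg.norm(X_test[:, None, dims] - X_train[None, :, dims], axis=2)
        pred = y_train[d.argmin(axis=1)]            # 1-NN in the random subspace
        votes[np.arange(len(X_test)), pred] += 1    # decision-level fusion
    return votes.argmax(axis=1)                     # majority vote

# two well-separated Gaussian classes in 8-D
rng = np.random.default_rng(1)
X0 = rng.normal(0.0, 0.3, size=(20, 8))
X1 = rng.normal(3.0, 0.3, size=(20, 8))
X = np.vstack([X0, X1]); y = np.array([0] * 20 + [1] * 20)
Xt = np.vstack([rng.normal(0.0, 0.3, (5, 8)), rng.normal(3.0, 0.3, (5, 8))])
print(rsm_predict(X, y, Xt))   # [0 0 0 0 0 1 1 1 1 1]
```

Because each weak classifier sees only a couple of dimensions, corrupting a few features (the thesis' model of unknown covariates) flips only a minority of votes, which is what majority voting then absorbs.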

    Contribution to supervised representation learning: algorithms and applications.

    278 p. In this thesis, we focus on supervised learning methods for pattern categorization. In this context, it remains a major challenge to establish efficient relationships between the discriminant properties of the extracted features and the inter-class sparsity structure. Our first attempt to address this problem was to develop a method called "Robust Discriminant Analysis with Feature Selection and Inter-class Sparsity" (RDA_FSIS). This method performs feature selection and extraction simultaneously. The targeted projection transformation focuses on the most discriminative original features while guaranteeing that the extracted (or transformed) features belonging to the same class share a common sparse structure, which contributes to small intra-class distances. In a further study on this approach, some improvements were introduced in terms of the optimization criterion and the applied optimization process. In fact, we proposed an improved version of the original RDA_FSIS called "Enhanced Discriminant Analysis with Class Sparsity using Gradient Method" (EDA_CS). The basic improvement is twofold: on the one hand, in the alternating optimization, we update the linear transformation and tune it with the gradient descent method, resulting in a more efficient and less complex solution than the closed form adopted in RDA_FSIS. On the other hand, the method can be used as a fine-tuning technique for many feature extraction methods. The main feature of this approach lies in the fact that it is a gradient-descent-based refinement applied to a closed-form solution. This makes it suitable for combining several extraction methods and can thus improve the performance of the classification process. In accordance with the above methods, we proposed a hybrid linear feature extraction scheme called "feature extraction using gradient descent with hybrid initialization" (FE_GD_HI). This method, based on a unified criterion, is able to take advantage of several powerful linear discriminant methods. The linear transformation is computed using a gradient descent method. The strength of this approach is that it is generic, in the sense that it allows fine-tuning of the hybrid solution provided by different methods. Finally, we proposed a new efficient ensemble learning approach that aims to estimate an improved data representation. The proposed method is called "ICS Based Ensemble Learning for Image Classification" (EM_ICS). Instead of using multiple classifiers on the transformed features, we aim to estimate multiple extracted feature subsets. These were obtained by multiple learned linear embeddings. Multiple feature subsets were used to estimate the transformations, which were ranked using multiple feature selection techniques. The derived extracted feature subsets were concatenated into a single data representation vector with strong discriminative properties. Experiments conducted on various benchmark datasets ranging from face images, handwritten digit images, and object images to text datasets showed promising results that outperformed the existing state-of-the-art and competing methods.
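The gradient-descent refinement of a linear transformation can be sketched with a deliberately reduced criterion that keeps only the within-class scatter term. The EDA_CS criterion also enforces inter-class sparsity; the learning rate, step count, and column re-normalisation below are assumptions made for the sketch.

```python
import numpy as np

def within_scatter(X, y, W):
    """Total within-class scatter of the data projected by W."""
    s = 0.0
    for c in np.unique(y):
        P = (X[y == c] - X[y == c].mean(axis=0)) @ W
        s += (P ** 2).sum()
    return s

def refine_projection(X, y, W, lr=0.005, steps=100):
    """Fine-tune a linear transformation W by gradient descent.

    Minimises tr(W^T Sw W), the within-class scatter of the projection,
    re-normalising W's columns to unit length after every step.
    """
    Sw = np.zeros((X.shape[1], X.shape[1]))
    for c in np.unique(y):
        Xc = X[y == c] - X[y == c].mean(axis=0)
        Sw += Xc.T @ Xc                        # within-class scatter matrix
    for _ in range(steps):
        W = W - lr * 2 * Sw @ W                # gradient of tr(W^T Sw W)
        W = W / np.linalg.norm(W, axis=0)      # keep unit-norm columns
    return W

rng = np.random.default_rng(2)
X = rng.normal(size=(40, 5))
y = np.repeat([0, 1], 20)
W0 = rng.normal(size=(5, 2))
W0 = W0 / np.linalg.norm(W0, axis=0)           # closed-form stand-in: random init
W = refine_projection(X, y, W0)
print(within_scatter(X, y, W) < within_scatter(X, y, W0))   # True: scatter shrinks
```

In the thesis' setting, `W0` would be the closed-form solution of some discriminant method, and the descent loop is the generic fine-tuning stage applied on top of it.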

    Hyperspectral Data Acquisition and Its Application for Face Recognition

    Current face recognition systems are rife with serious challenges in uncontrolled conditions: e.g., unrestrained lighting, pose variations, accessories, etc. Hyperspectral imaging (HI) is typically employed to counter many of those challenges by incorporating the spectral information within different bands. Although numerous methods based on hyperspectral imaging have been developed for face recognition with promising results, three fundamental challenges remain: 1) low signal-to-noise ratios and low intensity values in the bands of the hyperspectral image, especially near the blue bands; 2) the high dimensionality of hyperspectral data; and 3) inter-band misalignment (IBM) associated with subject motion during data acquisition. This dissertation concentrates mainly on addressing these challenges in HI. First, to address the low quality of the bands of the hyperspectral image, we utilize a custom light source that has more radiant power at shorter wavelengths and properly adjust camera exposure times to compensate for the lower transmittance of the filter and the lower radiant power of our light source. Second, the high dimensionality of spectral data imposes limitations on numerical analysis, so there is an emerging demand for robust data compression techniques that discard less relevant information to manage real spectral data. To cope with these challenging problems, we describe a reduced-order data modeling technique based on local proper orthogonal decomposition, which computes low-dimensional models by projecting high-dimensional clusters onto subspaces spanned by local reduced-order bases. Third, we investigate 11 leading alignment approaches to address IBM associated with subject motion during data acquisition. To overcome the limitations of the considered alignment approaches, we propose an accurate alignment approach (A3) by incorporating the strengths of point correspondence and a low-rank model. In addition, we develop two qualitative prediction models to assess the alignment quality of hyperspectral images and determine the best-performing of the considered alignment approaches. Finally, we show that the proposed alignment approach leads to a promising improvement in the face recognition performance of a probabilistic linear discriminant analysis approach.
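The reduced-order modelling step can be illustrated with a plain proper orthogonal decomposition computed via the SVD. The dissertation builds local bases per cluster, so this single global basis and the synthetic data are simplifications for the sketch.

```python
import numpy as np

def pod_basis(snapshots, k):
    """Proper orthogonal decomposition: rank-k basis for a snapshot matrix.

    `snapshots` holds one high-dimensional sample per column; the returned
    orthonormal basis spans the k directions carrying the most energy, and
    `energy` is the fraction of total energy those k modes retain.
    """
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    energy = (s[:k] ** 2).sum() / (s ** 2).sum()
    return U[:, :k], energy

rng = np.random.default_rng(3)
# synthetic "spectra": 100-D samples living near a 3-D subspace plus noise
basis_true = rng.normal(size=(100, 3))
data = basis_true @ rng.normal(size=(3, 50)) + 0.01 * rng.normal(size=(100, 50))
U, kept = pod_basis(data, 3)
coeffs = U.T @ data                  # 3-D reduced-order representation
recon = U @ coeffs                   # back-projection to full dimension
err = np.linalg.norm(data - recon) / np.linalg.norm(data)
print(U.shape, kept > 0.99)          # (100, 3) True
```

The "local" variant would first cluster the columns of `data` and call `pod_basis` once per cluster, projecting each cluster onto its own reduced-order basis.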

    Sparse and low rank approximations for action recognition

    Action recognition is a crucial area of research in computer vision, with a wide range of applications in surveillance, patient-monitoring systems, video indexing, Human-Computer Interaction and many more. These applications require automated action recognition, and robust classification methods are still sought after despite influential research in this field over the past decade. Data resources have grown tremendously owing to the digital revolution and cannot be compared to the meagre resources of the past. The main limitation on a system dealing with video data is the computational burden due to large dimensions and data redundancy. Sparse and low-rank approximation methods, which aim at concise and meaningful representations of data, have evolved recently. This thesis explores the application of sparse and low-rank approximation methods in the context of video data classification, with the following contributions: 1. an approach for solving the problem of action and gesture classification is proposed within the sparse representation domain, effectively dealing with large feature dimensions; 2. a low-rank matrix completion approach is proposed to jointly classify more than one action; 3. deep features are proposed for robust classification of multiple actions within a matrix completion framework which can handle data deficiencies. This thesis starts with the applicability of sparse-representation-based classification methods to the problem of action and gesture recognition. Random projection is used to reduce the dimensionality of the features; these are referred to as compressed features in this thesis. The dictionary formed with compressed features has proved to be efficient for the classification task, achieving results comparable to the state of the art. Next, this thesis addresses the more challenging problem of simultaneous classification of multiple actions. This is treated as a matrix completion problem under a transductive setting. Matrix completion methods can be considered a generic extension of sparse representation methods from a compressed sensing point of view. The features and corresponding labels of the training and test data are concatenated and placed as columns of a matrix; the unknown test labels are the missing entries in that matrix. This is solved using rank minimization techniques, based on the assumption that the underlying complete matrix is a low-rank one. This approach has achieved results better than the state of the art on datasets of varying complexity. The thesis then extends the matrix completion framework for joint classification of actions to handle missing features besides missing test labels. In this context, deep features from a convolutional neural network are proposed: a convolutional neural network is trained on the training data, and features are extracted for the train and test data from the trained network. The performance of the deep features has proved to be promising when compared to state-of-the-art hand-crafted features.
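The transductive label-completion setup described above can be sketched as follows. The thesis uses nuclear-norm rank minimisation; this toy version substitutes repeated hard rank truncation as a simpler stand-in, and the stacked feature/label layout and synthetic data are assumptions for the sketch.

```python
import numpy as np

def complete_labels(F, y_train, rank=2, iters=200):
    """Transductive label completion via low-rank matrix recovery.

    Features F (d x n, training columns first) and a label row are stacked
    into one matrix; the test labels are the missing entries. Alternating
    between a rank-`rank` SVD truncation and re-imposing the known entries
    fills them in, assuming the stacked matrix is close to low rank.
    """
    n_train = len(y_train)
    M = np.vstack([F, np.zeros(F.shape[1])])
    M[-1, :n_train] = y_train                      # known training labels
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        M = (U[:, :rank] * s[:rank]) @ Vt[:rank]   # project onto rank-r matrices
        M[:-1] = F                                 # re-impose known features
        M[-1, :n_train] = y_train                  # re-impose known labels
    return M[-1, n_train:]                         # recovered test-label entries

rng = np.random.default_rng(4)
# two classes with labels -1/+1, linearly tied to the features
X0 = rng.normal(-1.0, 0.2, size=(3, 15)); X1 = rng.normal(1.0, 0.2, size=(3, 15))
F = np.hstack([X0[:, :10], X1[:, :10], X0[:, 10:], X1[:, 10:]])
y_train = np.array([-1.0] * 10 + [1.0] * 10)
scores = complete_labels(F, y_train)
print(np.sign(scores))   # five -1s then five +1s
```

The sign of each recovered entry is the predicted class, which is how one SVD-based completion jointly labels every test column at once.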

    Physical Activity Recognition and Identification System

    Background: It is well established that physical activity is beneficial to health. Less is known about how the characteristics of physical activity impact health independently of the total amount, owing to the inability to measure these characteristics in an objective way that can be applied to large population groups. Accelerometry allows for objective monitoring of physical activity but is currently unable to identify the type of physical activity accurately. Methods: This thesis details the creation of an activity classifier that can identify activity type from accelerometer data. The current research in activity classification was reviewed and methodological challenges were identified. The main challenge was the inability of classifiers to generalise to unseen data; creating methods to mitigate this lack of generalisation represents the bulk of this thesis. Using the review, a classification pipeline was synthesised, representing the sequence of steps that all activity classifiers use: 1. determination of device location and setting (Chapter 4); 2. pre-processing (Chapter 5); 3. segmenting into windows (Chapter 6); 4. extracting features (Chapters 7, 8); 5. creating the classifier (Chapter 9); 6. post-processing (Chapter 5). For each of these steps, methods were created and tested that allowed for a high level of generalisability without sacrificing overall performance. Results: The work in this thesis results in an activity classifier with a good ability to generalise to unseen data. The classifier achieved F1-scores of 0.916 and 0.826 on data similar to its training data, which is statistically equivalent to the performance of current state-of-the-art models (0.898, 0.765). On data dissimilar to its training data, the classifier achieved significantly higher performance than current state-of-the-art methods (0.759, 0.897 versus 0.352, 0.415). This shows that the classifier created in this work has a significantly greater ability to generalise to unseen data than current methods. Conclusion: This thesis details the creation of an activity classifier with an improved ability to generalise to unseen data, thus allowing for the identification of activity type from acceleration data. This should allow for more detailed investigation into the specific health effects of activity type in large population studies utilising accelerometers.
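The segmentation and feature-extraction steps of such a pipeline can be sketched as follows. The 50 Hz sampling rate, 2 s windows, 50% overlap, and the two magnitude features are common defaults from the activity-recognition literature, not the thesis' actual settings.

```python
import numpy as np

def window_features(signal, fs=50, win_s=2.0, overlap=0.5):
    """Segment a tri-axial accelerometer trace into overlapping windows and
    extract simple per-window features (mean and std of the magnitude).

    signal: (n_samples, 3) array of x/y/z acceleration.
    Returns an (n_windows, 2) feature matrix for a downstream classifier.
    """
    mag = np.linalg.norm(signal, axis=1)      # acceleration magnitude per sample
    win = int(fs * win_s)                     # samples per window
    step = int(win * (1 - overlap))           # hop size between window starts
    feats = []
    for start in range(0, len(mag) - win + 1, step):
        w = mag[start:start + win]
        feats.append([w.mean(), w.std()])     # per-window feature vector
    return np.array(feats)

rng = np.random.default_rng(5)
sig = rng.normal(0, 0.1, size=(500, 3)) + np.array([0, 0, 1.0])  # ~1 g at rest
F = window_features(sig)
print(F.shape)   # (9, 2): nine overlapping 2-second windows, two features each
```

Each row of `F` would then be fed to the classifier (step 5 of the pipeline), and the post-processing step would smooth the resulting per-window labels.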

    Re-identifying people in the crowd

    Developing an automated surveillance system is of great interest for various reasons including forensic and security applications. In the case of a network of surveillance cameras with non-overlapping fields of view, person detection and tracking alone are insufficient to track a subject of interest across the network. In this case, instances of a person captured in one camera view need to be retrieved among a gallery of different people, in other camera views. This vision problem is commonly known as person re-identification (re-id). Cross-view instances of pedestrians exhibit varied levels of illumination, viewpoint, and pose variations which makes the problem very challenging. Despite recent progress towards improving accuracy, existing systems suffer from low applicability to real-world scenarios. This is mainly caused by the need for large amounts of annotated data from pairwise camera views to be available for training. Given the difficulty of obtaining such data and annotating it, this thesis aims to bring the person re-id problem a step closer to real-world deployment. In the first contribution, the single-shot protocol, where each individual is represented by a pair of images that need to be matched, is considered. Following the extensive annotation of four datasets for six attributes, an evaluation of the most widely used feature extraction schemes is conducted. The results reveal two high-performing descriptors among those evaluated, and show illumination variation to have the most impact on re-id accuracy. Motivated by the wide availability of videos from surveillance cameras and the additional visual and temporal information they provide, video-based person re-id is then investigated, and a supervised system is developed. This is achieved by improving and extending the best performing image-based person descriptor into three dimensions and combining it with distance metric learning.
The resulting system achieves state-of-the-art results on two widely used datasets. Given the cost and difficulty of obtaining labelled data from pairwise cameras in a network to train the model, an unsupervised video-based person re-id method is also developed. It is based on a set-based distance measure that leverages rank vectors to estimate similarity scores between person tracklets. The proposed system outperforms other unsupervised methods by a large margin on two datasets while competing with deep learning methods on another large-scale dataset.
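The rank-vector idea behind the set-based distance measure can be sketched minimally: two descriptors are compared through how similarly they rank a common reference gallery. The thesis operates on whole tracklets with a more refined measure; the single-descriptor comparison and the L1 rank-disagreement score below are assumptions for the sketch.

```python
import numpy as np

def rank_vector(x, gallery):
    """Rank of each gallery element by its distance to x (0 = nearest)."""
    d = np.linalg.norm(gallery - x, axis=1)
    return d.argsort().argsort()

def rank_similarity(a, b, gallery):
    """Similarity of two descriptors via agreement of their rank vectors
    over a shared reference gallery; identical rankings give the top score 0.
    """
    ra, rb = rank_vector(a, gallery), rank_vector(b, gallery)
    return -np.abs(ra - rb).sum()     # negated L1 rank disagreement

rng = np.random.default_rng(6)
gallery = rng.normal(size=(30, 16))           # reference descriptors
p = rng.normal(size=16)                       # a person's descriptor
same = p + 0.01 * rng.normal(size=16)         # same person, slight variation
other = rng.normal(size=16)                   # a different person
print(rank_similarity(p, same, gallery),
      rank_similarity(p, other, gallery))     # same-person score is higher
```

Because the score depends only on relative orderings, it needs no labels, which is what makes this style of measure attractive for the unsupervised setting.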