691 research outputs found

    Key body pose detection and movement assessment of fitness performances

    Get PDF
    Motion segmentation plays an important role in human motion analysis. Understanding the intrinsic features of human activities represents a challenge for modern science. Current solutions usually involve computationally demanding processing and achieve the best results using expensive, intrusive motion capture devices. In this thesis, research has been carried out to develop a series of methods for affordable and effective human motion assessment in the context of stand-up physical exercises. The objective of the research was to tackle the needs for an autonomous system that could be deployed in nursing homes or elderly people's houses, as well as rehabilitation of high profile sport performers. Firstly, it has to be designed so that instructions on physical exercises, especially in the case of elderly people, can be delivered in an understandable way. Secondly, it has to deal with the problem that some individuals may find it difficult to keep up with the programme due to physical impediments. They may also be discouraged because the activities are not stimulating or the instructions are hard to follow. In this thesis, a series of methods for automatic assessment production, as a combination of worded feedback and motion visualisation, is presented. The methods comprise two major steps. First, a series of key body poses are identified upon a model built by a multi-class classifier from a set of frame-wise features extracted from the motion data. Second, motion alignment (or synchronisation) with a reference performance (the tutor) is established in order to produce a second assessment model. Numerical assessment, first, and textual feedback, after, are delivered to the user along with a 3D skeletal animation to enrich the assessment experience. This animation is produced after the demonstration of the expert is transformed to the current level of performance of the user, in order to help encourage them to engage with the programme. The key body pose identification stage follows a two-step approach: first, the principal components of the input motion data are calculated in order to reduce the dimensionality of the input. Then, candidates of key body poses are inferred using multi-class, supervised machine learning techniques from a set of training samples. Finally, cluster analysis is used to refine the result. Key body pose identification is guaranteed to be invariant to the repetitiveness and symmetry of the performance. Results show the effectiveness of the proposed approach by comparing it against Dynamic Time Warping and Hierarchical Aligned Cluster Analysis. The synchronisation sub-system takes advantage of the cyclic nature of the stretches that are part of the stand-up exercises subject to study in order to remove out-of-sequence identified key body poses (i.e., false positives). Two approaches are considered for performing cycle analysis: a sequential, trivial algorithm and a proposed Genetic Algorithm, with and without prior knowledge on cyclic sequence patterns. These two approaches are compared and the Genetic Algorithm with prior knowledge shows a lower rate of false positives, but also a higher false negative rate. The GAs are also evaluated with randomly generated periodic string sequences. The automatic assessment follows a similar approach to that of key body pose identification. A multi-class, multi-target machine learning classifier is trained with features extracted from previous motion alignment. The inferred numerical assessment levels (one per identified key body pose and involved body joint) are translated into human-understandable language via a highly-customisable, context-free grammar. Finally, visual feedback is produced in the form of a synchronised skeletal animation of both the user's performance and the tutor's. If the user's performance is well below a standard then an affine offset transformation of the skeletal motion data series to an in-between performance is performed, in order to prevent dis-encouragement from the user and still provide a reference for improvement. At the end of this thesis, a study of the limitations of the methods in real circumstances is explored. Issues like the gimbal lock in the angular motion data, lack of accuracy of the motion capture system and the escalation of the training set are discussed. Finally, some conclusions are drawn and future work is discussed

    Editorial

    Get PDF
    calls & calendarEDITORIA

    Interactive Evolutionary Algorithms for Image Enhancement and Creation

    Get PDF
    Image enhancement and creation, particularly for aesthetic purposes, are tasks for which the use of interactive evolutionary algorithms would seem to be well suited. Previous work has concentrated on the development of various aspects of the interactive evolutionary algorithms and their application to various image enhancement and creation problems. Robust evaluation of algorithmic design options in interactive evolutionary algorithms and the comparison of interactive evolutionary algorithms to alternative approaches to achieving the same goals is generally less well addressed. The work presented in this thesis is primarily concerned with different interactive evolutionary algorithms, search spaces, and operators for setting the input values required by image processing and image creation tasks. A secondary concern is determining when the use of the interactive evolutionary algorithm approach to image enhancement problems is warranted and how it compares with alternative approaches. Various interactive evolutionary algorithms were implemented and compared in a number of specifically devised experiments using tasks of varying complexity. A novel aspect of this thesis, with regards to other work in the study of interactive evolutionary algorithms, was that statistical analysis of the data gathered from the experiments was performed. This analysis demonstrated, contrary to popular assumption, that the choice of algorithm parameters, operators, search spaces, and even the underlying evolutionary algorithm has little effect on the quality of the resulting images or the time it takes to develop them. It was found that the interaction methods chosen when implementing the user interface of the interactive evolutionary algorithms had a greater influence on the performances of the algorithms

    Artificial Intelligence Tools for Facial Expression Analysis.

    Get PDF
    Inner emotions show visibly upon the human face and are understood as a basic guide to an individual’s inner world. It is, therefore, possible to determine a person’s attitudes and the effects of others’ behaviour on their deeper feelings through examining facial expressions. In real world applications, machines that interact with people need strong facial expression recognition. This recognition is seen to hold advantages for varied applications in affective computing, advanced human-computer interaction, security, stress and depression analysis, robotic systems, and machine learning. This thesis starts by proposing a benchmark of dynamic versus static methods for facial Action Unit (AU) detection. AU activation is a set of local individual facial muscle parts that occur in unison constituting a natural facial expression event. Detecting AUs automatically can provide explicit benefits since it considers both static and dynamic facial features. For this research, AU occurrence activation detection was conducted by extracting features (static and dynamic) of both nominal hand-crafted and deep learning representation from each static image of a video. This confirmed the superior ability of a pretrained model that leaps in performance. Next, temporal modelling was investigated to detect the underlying temporal variation phases using supervised and unsupervised methods from dynamic sequences. During these processes, the importance of stacking dynamic on top of static was discovered in encoding deep features for learning temporal information when combining the spatial and temporal schemes simultaneously. Also, this study found that fusing both temporal and temporal features will give more long term temporal pattern information. Moreover, we hypothesised that using an unsupervised method would enable the leaching of invariant information from dynamic textures. Recently, fresh cutting-edge developments have been created by approaches based on Generative Adversarial Networks (GANs). In the second section of this thesis, we propose a model based on the adoption of an unsupervised DCGAN for the facial features’ extraction and classification to achieve the following: the creation of facial expression images under different arbitrary poses (frontal, multi-view, and in the wild), and the recognition of emotion categories and AUs, in an attempt to resolve the problem of recognising the static seven classes of emotion in the wild. Thorough experimentation with the proposed cross-database performance demonstrates that this approach can improve the generalization results. Additionally, we showed that the features learnt by the DCGAN process are poorly suited to encoding facial expressions when observed under multiple views, or when trained from a limited number of positive examples. Finally, this research focuses on disentangling identity from expression for facial expression recognition. A novel technique was implemented for emotion recognition from a single monocular image. A large-scale dataset (Face vid) was created from facial image videos which were rich in variations and distribution of facial dynamics, appearance, identities, expressions, and 3D poses. This dataset was used to train a DCNN (ResNet) to regress the expression parameters from a 3D Morphable Model jointly with a back-end classifier

    Emotion and Stress Recognition Related Sensors and Machine Learning Technologies

    Get PDF
    This book includes impactful chapters which present scientific concepts, frameworks, architectures and ideas on sensing technologies and machine learning techniques. These are relevant in tackling the following challenges: (i) the field readiness and use of intrusive sensor systems and devices for capturing biosignals, including EEG sensor systems, ECG sensor systems and electrodermal activity sensor systems; (ii) the quality assessment and management of sensor data; (iii) data preprocessing, noise filtering and calibration concepts for biosignals; (iv) the field readiness and use of nonintrusive sensor technologies, including visual sensors, acoustic sensors, vibration sensors and piezoelectric sensors; (v) emotion recognition using mobile phones and smartwatches; (vi) body area sensor networks for emotion and stress studies; (vii) the use of experimental datasets in emotion recognition, including dataset generation principles and concepts, quality insurance and emotion elicitation material and concepts; (viii) machine learning techniques for robust emotion recognition, including graphical models, neural network methods, deep learning methods, statistical learning and multivariate empirical mode decomposition; (ix) subject-independent emotion and stress recognition concepts and systems, including facial expression-based systems, speech-based systems, EEG-based systems, ECG-based systems, electrodermal activity-based systems, multimodal recognition systems and sensor fusion concepts and (x) emotion and stress estimation and forecasting from a nonlinear dynamical system perspective

    Audio-coupled video content understanding of unconstrained video sequences

    Get PDF
    Unconstrained video understanding is a difficult task. The main aim of this thesis is to recognise the nature of objects, activities and environment in a given video clip using both audio and video information. Traditionally, audio and video information has not been applied together for solving such complex task, and for the first time we propose, develop, implement and test a new framework of multi-modal (audio and video) data analysis for context understanding and labelling of unconstrained videos. The framework relies on feature selection techniques and introduces a novel algorithm (PCFS) that is faster than the well-established SFFS algorithm. We use the framework for studying the benefits of combining audio and video information in a number of different problems. We begin by developing two independent content recognition modules. The first one is based on image sequence analysis alone, and uses a range of colour, shape, texture and statistical features from image regions with a trained classifier to recognise the identity of objects, activities and environment present. The second module uses audio information only, and recognises activities and environment. Both of these approaches are preceded by detailed pre-processing to ensure that correct video segments containing both audio and video content are present, and that the developed system can be made robust to changes in camera movement, illumination, random object behaviour etc. For both audio and video analysis, we use a hierarchical approach of multi-stage classification such that difficult classification tasks can be decomposed into simpler and smaller tasks. When combining both modalities, we compare fusion techniques at different levels of integration and propose a novel algorithm that combines advantages of both feature and decision-level fusion. The analysis is evaluated on a large amount of test data comprising unconstrained videos collected for this work. We finally, propose a decision correction algorithm which shows that further steps towards combining multi-modal classification information effectively with semantic knowledge generates the best possible results

    Understanding Optimisation Processes with Biologically-Inspired Visualisations

    Get PDF
    Evolutionary algorithms (EAs) constitute a branch of artificial intelligence utilised to evolve solutions to solve optimisation problems abound in industry and research. EAs often generate many solutions and visualisation has been a primary strategy to display EA solutions, given that visualisation is a multi-domain well-evaluated medium to comprehend extensive data. The endeavour of visualising solutions is inherent with challenges resulting from high dimensional phenomenons and the large number of solutions to display. Recently, scholars have produced methods to mitigate some of these known issues when illustrating solutions. However, one key consideration is that displaying the final subset of solutions exclusively (rather than the whole population) discards most of the informativeness of the search, creating inadequate insight into the black-box EA. There is an unequivocal knowledge gap and requirement for methods which can visualise the whole population of solutions from an optimiser and subjugate the high-dimensional problems and scaling issues to create interpretability of the EA search process. Furthermore, a requirement for explainability in evolutionary computing has been demanded by the evolutionary computing community, which could take the form of visualisations, to support EA comprehension much like the support explainable artificial intelligence has brought to artificial intelligence. In this thesis, we report novel visualisation methods that can be used to visualise large and high-dimensional optimiser populations with the aim of creating greater interpretability during a search. We consider the nascent intersection of visualisation and explainability in evolutionary computing. The potential high informativeness of a visualisation method from an early chapter of this work forms an effective platform to develop an explainability visualisation method, namely the population dynamics plot, to attempt to inject explainability into the inner workings of the search process. We further support the visualisation of populations using machine learning to construct models which can capture the characteristics of an EA search and develop intelligent visualisations which use artificial intelligence to potentially enhance and support visualisation for a more informative search process. The methods developed in this thesis are evaluated both quantitatively and qualitatively. We use multi-feature benchmark problems to show the method’s ability to reveal specific problem characteristics such as disconnected fronts, local optima and bias, as well as potentially creating a better understanding of the problem landscape and optimiser search for evaluating and comparing algorithm performance (we show the visualisation method to be more insightful than conventional metrics like hypervolume alone). One of the most insightful methods developed in this thesis can produce a visualisation requiring less than 1% of the time and memory necessary to produce a visualisation of the same objective space solutions using existing methods. This allows for greater scalability and the use in short compile time applications such as online visualisations. Predicated by an existing visualisation method in this thesis, we then develop and apply an explainability method to a real-world problem and evaluate it to show the method to be highly effective at explaining the search via solutions in the objective spaces, solution lineage and solution variation operators to compactly comprehend, evaluate and communicate the search of an optimiser, although we note the explainability properties are only evaluated against the author’s ability and could be evaluated further in future work with a usability study. The work is then supported by the development of intelligent visualisation models that may allow one to predict solutions in optima (importantly local optima) in unseen problems by using a machine learning model. The results are effective, with some models able to predict and visualise solution optima with a balanced F1 accuracy metric of 96%. The results of this thesis provide a suite of visualisations which aims to provide greater informativeness of the search and scalability than previously existing literature. The work develops one of the first explainability methods aiming to create greater insight into the search space, solution lineage and reproductive operators. The work applies machine learning to potentially enhance EA understanding via visualisation. These models could also be used for a number of applications outside visualisation. Ultimately, the work provides novel methods for all EA stakeholders which aims to support understanding, evaluation and communication of EA processes with visualisation

    Brief history of natural sciences for nature-inspired computing in engineering

    Get PDF
    The goal of the authors is adroit integration of three mainstream disciplines of the natural sciences, physics, chemistry and biology to create novel problem solving paradigms. This paper presents a brief history of the develop­ment of the natural sciences and highlights some milestones which subsequently influenced many branches of science, engineering and computing as a prelude to nature-inspired computing which has captured the imagination of computing researchers in the past three decades. The idea is to summarize the massive body of knowledge in a single paper suc­cinctly. The paper is organised into three main sections: developments in physics, developments in chemistry, and de­velopments in biology. Examples of recently-proposed computing approaches inspired by the three branches of natural sciences are provided
    • …
    corecore