128 research outputs found

    Action recognition from RGB-D data

    In recent years, action recognition based on RGB-D data has attracted increasing attention. Unlike traditional 2D action recognition, RGB-D data provides additional depth and skeleton modalities, each with its own characteristics. This thesis presents seven novel methods that take advantage of the three modalities for action recognition. First, effective handcrafted features are designed, and a frequent pattern mining method is employed to mine the most discriminative, representative and non-redundant features for skeleton-based action recognition. Second, to take advantage of powerful Convolutional Neural Networks (ConvNets), it is proposed to represent the spatio-temporal information carried in 3D skeleton sequences in three 2D images by encoding the joint trajectories and their dynamics into the color distribution of the images, and ConvNets are adopted to learn the discriminative features for human action recognition. Third, for depth-based action recognition, three data augmentation strategies are proposed so that ConvNets can be applied to small training datasets. Fourth, to take full advantage of the 3D structural information offered by the depth modality and its insensitivity to illumination variations, three simple, compact yet effective image-based representations are proposed, and ConvNets are adopted for feature extraction and classification. However, both of the previous two methods are sensitive to noise and cannot differentiate fine-grained actions well. Fifth, to deal with this issue, it is proposed to represent a depth map sequence as three pairs of structured dynamic images at the body, part and joint levels respectively through bidirectional rank pooling. The structured dynamic image preserves the spatial-temporal information, enhances the structure information across both body parts/joints and different temporal scales, and takes advantage of ConvNets for action recognition. 
Sixth, it is proposed to extract and use scene flow for action recognition from RGB and depth data. Last, to exploit the joint information in multi-modal features arising from heterogeneous sources (RGB, depth), it is proposed to cooperatively train a single ConvNet (referred to as c-ConvNet) on both RGB features and depth features, and deeply aggregate the two modalities to achieve robust action recognition.

    An Intelligent Robot and Augmented Reality Instruction System

    Human-Centered Robotics (HCR) is a research area that focuses on how robots can empower people to live safer, simpler, and more independent lives. In this dissertation, I present a combination of two technologies to deliver human-centric solutions to an important population. The first nascent area that I investigate is the creation of an Intelligent Robot Instructor (IRI) as a learning and instruction tool for human pupils. The second technology is the use of augmented reality (AR) to create an Augmented Reality Instruction (ARI) system to provide instruction via a wearable interface. To function in an intelligent and context-aware manner, both systems require the ability to reason about their perception of the environment and make appropriate decisions. In this work, I construct a novel formulation of several education methodologies, particularly those known as response prompting, as part of a cognitive framework to create a system for intelligent instruction, and compare these methodologies in the context of intelligent decision making using both technologies. The IRI system is demonstrated through experiments with a humanoid robot that uses object recognition and localization for perception and interacts with students through speech, gestures, and object interaction. The ARI system uses augmented reality, computer vision, and machine learning methods to create an intelligent, contextually aware instructional system. By using AR to teach prerequisite skills that lend themselves well to visual, augmented reality instruction prior to a robot instructor teaching skills that lend themselves to embodied interaction, I am able to demonstrate the potential of each system independently as well as in combination to facilitate students' learning. 
I identify people with intellectual and developmental disabilities (I/DD) as a particularly significant use case and show that IRI and ARI systems can help fulfill the compelling need to develop tools and strategies for people with I/DD. I present results that demonstrate both systems can be used independently by students with I/DD to quickly and easily acquire the skills required for performance of relevant vocational tasks. This is the first successful real-world application of response-prompting for decision making in a robotic and augmented reality intelligent instruction system.

    Exploration of closing-in behaviour in dementia, development and healthy adulthood

    Closing-in Behaviour (CIB) is the tendency observed in copying tasks, both graphic and gestural, in which the copy is made inappropriately close to or on top of the model. It is classically considered a manifestation of Constructional Apraxia (CA) and it is often observed in patients with dementia. CIB is not only a symptom of pathology, but it is also observed in children’s first attempts at graphic copying. However, CIB shows an inverse pattern in development and dementia: while its frequency increases in severe dementia, CIB progressively decreases with development. The cognitive origins of CIB are still unclear. Two main interpretations dominate the CIB literature: the compensation and the attraction hypotheses. The first hypothesis interprets CIB as a strategy specific to copying tasks that the patient adopts to overcome visuospatial and working memory deficits. In contrast, the attraction hypothesis considers CIB a primitive behaviour, not specific to copying, and characterized by the default tendency to perform an action toward the focus of attention. This thesis aimed to study the characteristics and the cognitive origins of CIB in dementia, development and healthy adulthood. It has three main sections. The first and second sections explore CIB in patients (with Alzheimer’s disease (AD) and frontotemporal dementia) and in pre-school children, using survey and experimental studies, to investigate whether CIB might have common characteristics and cognitive substrates in these different populations. The results provided converging evidence for the similar nature of CIB in development and dementia. For instance, survey studies in patients with dementia (Chapter 3) and pre-school children (Chapter 6) showed that performance in attentional tasks predicted the appearance of CIB. In a similar vein, experimental studies showed support for the attraction hypothesis of CIB in a single patient with AD (Chapter 4) and pre-school children (Chapters 7 and 8). 
These results were not, however, replicated in a larger cohort of patients with AD due to practical reasons (Chapter 5). The last section was devoted to modelling CIB in normal participants, using complex graphic copying (Chapter 9) and dual task paradigms (Chapter 10). The results showed further support for the attraction hypothesis of CIB and underlined the difficulties of eliciting this default bias in normal adults. To conclude, this thesis radically changes the classical consideration of CIB as a manifestation of CA and demonstrates that CIB is a general default tendency, not specific to copying tasks. This work indicates avenues for new studies, which might consider the possible expression and consequences of this behaviour in patients’ daily lives.

    Machine Learning Research Trends in Africa: A 30 Years Overview with Bibliometric Analysis Review

    In this paper, a critical bibliometric analysis study is conducted, coupled with an extensive literature survey on recent developments and associated applications in machine learning research with a perspective on Africa. The presented bibliometric analysis study consists of 2761 machine learning-related documents, of which 98% were articles with at least 482 citations published in 903 journals during the past 30 years. Furthermore, the collated documents were retrieved from the Science Citation Index Expanded, comprising research publications from 54 African countries between 1993 and 2021. The bibliometric study visualizes the current landscape and future trends in machine learning research and its applications, to facilitate future collaborative research and knowledge exchange among authors from different research institutions across the African continent.

    De-identification for privacy protection in multimedia content : A survey

    This document is the Accepted Manuscript version of the following article: Slobodan Ribaric, Aladdin Ariyaeeinia, and Nikola Pavesic, ‘De-identification for privacy protection in multimedia content: A survey’, Signal Processing: Image Communication, Vol. 47, pp. 131-151, September 2016, doi: https://doi.org/10.1016/j.image.2016.05.020. This manuscript version is distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives License CC BY NC-ND 4.0 (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited, and is not altered, transformed, or built upon in any way. Privacy is one of the most important social and political issues in our information society, characterized by a growing range of enabling and supporting technologies and services. Amongst these are communications, multimedia, biometrics, big data, cloud computing, data mining, internet, social networks, and audio-video surveillance. Each of these can potentially provide the means for privacy intrusion. De-identification is one of the main approaches to privacy protection in multimedia contents (text, still images, audio and video sequences and their combinations). It is a process for concealing or removing personal identifiers, or replacing them by surrogate personal identifiers in personal information in order to prevent the disclosure and use of data for purposes unrelated to the purpose for which the information was originally obtained. Based on the proposed taxonomy inspired by the Safe Harbour approach, the personal identifiers, i.e., the personal identifiable information, are classified as non-biometric, physiological and behavioural biometric, and soft biometric identifiers. In order to protect the privacy of an individual, all of the above identifiers will have to be de-identified in multimedia content. 
This paper presents a review of the concepts of privacy and the linkage among privacy, privacy protection, and the methods and technologies designed specifically for privacy protection in multimedia contents. The study provides an overview of de-identification approaches for non-biometric identifiers (text, hairstyle, dressing style, license plates), as well as for the physiological (face, fingerprint, iris, ear), behavioural (voice, gait, gesture) and soft-biometric (body silhouette, gender, age, race, tattoo) identifiers in multimedia documents. Peer reviewed

    Understanding egocentric human actions with temporal decision forests

    Understanding human actions is a fundamental task in computer vision with a wide range of applications including pervasive health-care, robotics and game control. This thesis focuses on the problem of egocentric action recognition from RGB-D data, wherein the world is viewed through the eyes of the actor whose hands describe the actions. The main contributions of this work are its findings regarding egocentric actions as described by hands in two application scenarios and a proposal of a new technique that is based on temporal decision forests. The thesis first introduces a novel framework to recognise fingertip writing in mid-air in the context of human-computer interaction. This framework detects whether the user is writing and tracks the fingertip over time to generate spatio-temporal trajectories that are recognised by using a Hough forest variant that encourages temporal consistency in prediction. A problem with using such a forest approach for action recognition is that the learning of temporal dynamics is limited to hand-crafted temporal features and temporal regression, which may break the temporal continuity and lead to inconsistent predictions. To overcome this limitation, the thesis proposes transition forests. Besides any temporal information that is encoded in the feature space, the forest automatically learns the temporal dynamics during training, which are exploited at inference in an online and efficient manner, achieving state-of-the-art results. The last contribution of this thesis is its introduction of the first RGB-D benchmark to allow for the study of egocentric hand-object actions with both hand and object pose annotations. This study conducts an extensive evaluation of different baselines, state-of-the-art approaches and temporal decision forest models using colour, depth and hand pose features. 
Furthermore, it extends the transition forest model to incorporate data from different modalities and demonstrates the benefit of using hand pose features to recognise egocentric human actions. The thesis concludes by discussing and analysing the contributions and proposing a few ideas for future work. Open Access

    The Ecology of Cultural Space: Towards an Understanding of the Contemporary Artist-led Collective

    The importance of friendship has been under-researched in relation to artistic discourse. This lack of research becomes particularly acute when considering ambiguous formations of collective artistic activity. My thesis draws upon friendship as a socio-cultural phenomenon in order to situate the artist-led collective both historically and within the contemporary art continuum. Tracing an historiography of the personal relationships which blurred the boundaries between art and politics, from the re-imagining of the medieval artisanal guild in the nineteenth century to the development of Futurism in the early twentieth century, I argue that the contemporary artist-led collective is haunted by these ‘collectivisms past’ and the spectre of autonomy. Further, the contradictions located within the ideological notions of individualism, which pervade the neo-liberal capitalist hegemony, both deny collective agency and yet accept collective praxis in the guise of enterprise culture. It is this contradictory character that frames my thesis and provides the context for understanding the complex role which friendship plays in the genesis of the contemporary artist-led collective. In order to understand the implications of friendship as a vital component of the artist-led collective, I utilise Relational Dialectics Theory (RDT) developed by Leslie Baxter and Barbara Montgomery, as a conceptual framework. I employ in-depth case studies of the artist-led collective duo The Cool Couple and architecture collective Assemble, in order to explore how friendship informs artist-led collectives throughout their life cycles. I question how and why these social bonds, which constitute relationships and thus shape the collectives, interrelate with a multiplicity of forces in their specific cultural ecology. These interrelations are further explored through a mapping study of artist-led collective activity in Leeds, UK. 
This study problematises the dualistic perspective of resistance and co-option between artist-led collectives and institutions. I argue that the evolution of the artist-led collective is implicitly interrelated with the institution and thus the binary opposition of resistance and co-option becomes a dialectical knot of ever-changing relationships. Finally, I situate myself in the research through an auto-ethnographic study of the artist-led collective The Retro Bar at the End of the Universe, of which I am a founding member. This case study enables an internal view of the social bonds which formed The Retro Bar at the End of the Universe and provides an insight that would otherwise be impossible from an external perspective.