
    On the security of mobile sensors

    PhD Thesis. The age of sensor technology is upon us. Sensor-rich mobile devices are ubiquitous. Smartphones, tablets, and wearables are increasingly equipped with sensors such as GPS, accelerometer, Near Field Communication (NFC), and ambient sensors. Data provided by such sensors, combined with the fast-growing computational capabilities of mobile platforms, offer richer and more personalised apps. However, these sensors introduce new security challenges for users and make sensor management more complicated. In this PhD thesis, we contribute to the field of mobile sensor security by investigating a wide spectrum of open problems covering attacks and defences, standardisation and industrial approaches, and human dimensions. We study the problems in detail and propose solutions. First, we propose “Tap-Tap and Pay” (TTP), a sensor-based protocol to prevent the Mafia attack in NFC payment. The Mafia attack is a special type of Man-In-The-Middle attack which charges the user for something more expensive than what she intends to pay by relaying transactions to a remote payment terminal. In TTP, a user initiates the payment by physically tapping her mobile phone against the reader. We observe that this tapping causes transient vibrations at both devices which are measurable by the embedded accelerometers. Our observations indicate that these sensor measurements are closely correlated within the same tapping event, and differ when obtained from different tapping events. By comparing the similarity between the two measurements, the bank can distinguish Mafia fraud from a legitimate NFC transaction. The experimental results and the user feedback suggest the practical feasibility of TTP. Compared with previous sensor-based solutions, ours is the only one that works even when the attacker and the user are in nearby locations or share similar ambient environments.
Second, we demonstrate an in-app attack based on a real world problem in contactless payment known as the card collision or card clash. A card collision happens when more than one card (or NFC-enabled device) are presented to the payment terminal’s field, and the terminal does not know which card to choose. By performing experiments, we observe that the implementation of contactless terminals in practice matches neither EMV nor ISO standards (the two primary standards for smart card payment) on card collision. Based on this inconsistency, we propose “NFC Payment Spy”, a malicious app that tracks the user’s contactless payment transactions. This app, running on a smart phone, simulates a card which requests the payment information (amount, time, etc.) from the terminal. When the phone and the card are both presented to a contactless terminal (given that many people use mobile case wallets to travel light and keep wallet essentials close to hand), our app can effectively win the race condition over the card. This attack is the first privacy attack on contactless payments based on the problem of card collision. By showing the feasibility of this attack, we raise awareness of privacy and security issues in contactless payment protocols and implementation, specifically in the presence of new technologies for payment such as mobile platforms. Third, we show that, apart from attacking mobile devices by having access to the sensors through native apps, we can also perform sensor-based attacks via mobile browsers. We examine multiple browsers on Android and iOS platforms and study their policies in granting permissions to JavaScript code with respect to access to motion and orientation sensor data. Based on our observations, we identify multiple vulnerabilities, and propose “TouchSignatures” and “PINLogger.js”, two novel attacks in which malicious JavaScript code listens to such sensor data measurements. 
We demonstrate that, despite the much lower sampling rate (compared to a native app), a remote attacker is able to learn sensitive user information such as physical activities, phone call timing, touch actions (tap, scroll, hold, zoom), and PINs based on these sensor data. This is the first report of such a JavaScript-based attack. We disclosed the above vulnerability to the community, and major mobile browser vendors classified the problem as high-risk and fixed it accordingly. Finally, we investigate human dimensions in the problem of sensor management. Although different types of attacks via sensors have been known for many years, the problem of data leakage caused by sensors has remained unsolved. While working with W3C and browser vendors to fix the identified problem, we came to appreciate the complexity of this problem in practice and the challenge of balancing security, usability, and functionality. We believe a major reason for this is that users are not fully aware of these sensors and the associated risks to their privacy and security. Therefore, we study user understanding of mobile sensors, specifically their risk perceptions. This is the only research to date that studies risk perceptions for a comprehensive list of mobile sensors (25 in total). We interview multiple participants from a range of backgrounds by providing them with multiple self-declared questionnaires. The results indicate that people in general do not have a good understanding of the complexities of these sensors; hence, making security judgements about these sensors is not easy for them. We discuss how this observation, along with other factors, renders many academic and industry solutions ineffective. This makes the security and privacy issues of mobile sensors and other sensor-enabled technologies an important topic to be investigated further.
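The similarity check at the heart of TTP can be illustrated with a minimal sketch. The abstract does not say which similarity measure or decision threshold the thesis uses, so the Pearson correlation, the `same_tap` helper, and the 0.8 threshold below are all assumptions chosen purely for illustration.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat trace carries no information about the tap
    return cov / math.sqrt(var_a * var_b)


def same_tap(phone_trace, reader_trace, threshold=0.8):
    """Hypothetical decision rule: accept the transaction only if the two
    accelerometer traces are strongly correlated (threshold is an assumption)."""
    return pearson(phone_trace, reader_trace) >= threshold


# Illustrative, made-up vibration samples recorded during one tap.
phone = [0.0, 0.9, -0.7, 0.4, -0.2, 0.1]
reader = [2 * x + 0.1 for x in phone]   # same tap, different gain and offset
relayed = [-x for x in phone]           # unrelated event at a remote terminal

print(same_tap(phone, reader))   # traces from the same tap
print(same_tap(phone, relayed))  # traces from different events
```

In a relay scenario the phone and the remote terminal would not share the same physical tap, so their traces would be expected to correlate poorly and the check would reject the transaction.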

    Applied Cognitive Sciences

    Cognitive science is an interdisciplinary field in the study of the mind and intelligence. The term cognition refers to a variety of mental processes, including perception, problem solving, learning, decision making, language use, and emotional experience. The basis of the cognitive sciences is the contribution of philosophy and computing to the study of cognition. Computing is very important in the study of cognition because computer-aided research helps to develop mental processes, and computers are used to test scientific hypotheses about mental organization and functioning. This book provides a platform for reviewing these disciplines and presenting cognitive research as a separate discipline

    A Framework For Abstracting, Designing And Building Tangible Gesture Interactive Systems

    This thesis discusses tangible gesture interaction, a novel paradigm for interacting with computers that blends concepts from the more popular fields of tangible interaction and gesture interaction. Taking advantage of innate human abilities to manipulate physical objects and to communicate through gestures, tangible gesture interaction is particularly interesting for interacting in smart environments, bringing interaction with computers beyond the screen and back to the real world. Since tangible gesture interaction is a relatively new field of research, this thesis presents a conceptual framework that aims at supporting future work in this field. The Tangible Gesture Interaction Framework provides support on three levels. First, it supports theoretical reflection on the different types of tangible gestures that can be designed: physically, through a taxonomy based on three components (move, hold and touch) and additional attributes, and semantically, through a taxonomy of the semantic constructs that can be used to associate meaning with tangible gestures. Second, it helps in conceiving new tangible gesture interactive systems and designing new interactions based on gestures with objects, through dedicated guidelines for tangible gesture definition and common practices for different application domains. Third, it helps in building new tangible gesture interactive systems by supporting the choice between four different technological approaches (embedded and embodied, wearable, environmental or hybrid) and providing general guidance for the different approaches. As an application of this framework, this thesis also presents seven tangible gesture interactive systems for three different application domains: interaction with the In-Vehicle Infotainment System (IVIS) of a car, emotional and interpersonal communication, and interaction in a smart home.
For the first application domain, four different systems that use gestures on the steering wheel as a means of interacting with the IVIS have been designed, developed and evaluated. For the second application domain, an anthropomorphic lamp able to recognize gestures that humans typically perform for interpersonal communication has been conceived and developed. A second system, based on smart t-shirts, recognizes when two people hug and rewards the gesture with an exchange of digital information. Finally, a smart watch for recognizing gestures performed with objects held in the hand in the context of the smart home has been investigated. The analysis of existing systems found in the literature and of the systems developed during this thesis shows that the framework has good descriptive and evaluative power. The applications developed during this thesis show that the proposed framework also has good generative power.

    Robust Audio and WiFi Sensing via Domain Adaptation and Knowledge Sharing From External Domains

    Recent advancements in machine learning have initiated a revolution in embedded sensing and inference systems. Acoustic and WiFi-based sensing and inference systems have enabled a wide variety of applications ranging from home activity detection to health vitals monitoring. While many existing solutions paved the way for acoustic event recognition and WiFi-based activity detection, the diverse characteristics of the sensors, systems, and environments used for data capture cause a shift in the distribution of data and thus result in sub-optimal classification performance when the sensors and environments differ between the training and inference stages. Moreover, large-scale acoustic and WiFi data collection is non-trivial and cumbersome. Therefore, current acoustic and WiFi-based sensing systems suffer when there is a lack of labeled samples, as they rely only on the provided training data. In this thesis, we aim to address the performance loss of machine learning-based classifiers for acoustic and WiFi-based sensing systems due to sensor and environment heterogeneity and lack of labeled examples. We show that discovering latent domains (sensor type, environment, etc.) and removing domain bias from machine learning classifiers makes acoustic and WiFi-based sensing more robust and generalizable. We also propose a few-shot domain adaptation method that requires only one labeled sample for a new domain, relieving users and developers of the painstaking task of collecting data in each new domain. Furthermore, to address the lack of labeled examples, we propose to exploit the information or learned knowledge from sources where data already exists in large volumes, such as textual descriptions and the visual domain. We implemented our algorithms on mobile and embedded platforms and collected data from participants to evaluate our proposed algorithms and frameworks extensively. Doctor of Philosophy

    Contributions to Pen & Touch Human-Computer Interaction

    [EN] Computers are now present everywhere, but their potential is not fully exploited due to some lack of acceptance. In this thesis, the pen computer paradigm is adopted, whose main idea is to replace all input devices by a pen and/or the fingers, given that the rejection stems from unfriendly interaction devices that must be replaced by something easier for the user. This paradigm, which was proposed several years ago, has only recently been fully implemented in products such as smartphones. But computers are effectively illiterate: they do not understand gestures or handwriting, so a recognition step is required to "translate" the meaning of these interactions into a computer-understandable language. For this input modality to be actually usable, its recognition accuracy must be high enough. In order to think realistically about the broader deployment of pen computing, it is necessary to improve the accuracy of handwriting and gesture recognizers. This thesis is devoted to studying different approaches to improve the recognition accuracy of those systems. First, we investigate how to take advantage of interaction-derived information to improve the accuracy of the recognizer. In particular, we focus on interactive transcription of text images. Here the system initially proposes an automatic transcript. If necessary, the user can make some corrections, implicitly validating a correct part of the transcript. The system must then take this validated prefix into account to suggest a suitable new hypothesis. Given that in such an application the user is constantly interacting with the system, it makes sense to adapt this interactive application for use on a pen computer. User corrections are provided by means of pen strokes, and therefore it is necessary to introduce a recognizer in charge of decoding this kind of nondeterministic user feedback.
However, this recognizer's performance can be boosted by taking advantage of interaction-derived information, such as the user-validated prefix. Next, this thesis focuses on the study of human movements, in particular hand movements, from a generation point of view by tapping into the kinematic theory of rapid human movements and the Sigma-Lognormal model. Understanding how the human body generates movements, and particularly understanding the origin of human movement variability, is important in the development of a recognition system. The contribution of this thesis to this topic is a new technique, which improves on previous results, to extract the Sigma-Lognormal model parameters. Closely related to the previous work, this thesis studies the benefits of using synthetic data for training. The easiest way to train a recognizer is to provide "infinite" data, representing all possible variations. In general, the more training data, the smaller the error. But usually it is not possible to increase the size of a training set indefinitely. Recruiting participants, data collection, labeling, etc., necessary for achieving this goal can be time-consuming and expensive. One way to overcome this problem is to create and use synthetically generated data that resembles human-produced data. We study how to create these synthetic data and explore different approaches to using them, both for handwriting and gesture recognition. The different contributions of this thesis have obtained good results, producing several publications in international conferences and journals. Finally, three applications related to the work of this thesis are presented. First, we created Escritorie, a digital desk prototype based on the pen computer paradigm for transcribing handwritten text images. Second, we developed "Gestures à Go Go", a web application for bootstrapping gestures. Finally, we studied another interactive application under the pen computer paradigm.
In this case, we study how translation reviewing can be done more ergonomically using a pen. Martín-Albo Simón, D. (2016). Contributions to Pen & Touch Human-Computer Interaction [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/68482

    Psychophysiological analysis of a pedagogical agent and robotic peer for individuals with autism spectrum disorders.

    Autism spectrum disorder (ASD) is a neurodevelopmental disorder characterized by ongoing problems in social interaction and communication, and engagement in repetitive behaviors. According to the Centers for Disease Control and Prevention, an estimated 1 in 68 children in the United States has ASD. Mounting evidence shows that many of these individuals display an interest in social interaction with computers and robots and, in general, feel comfortable spending time in such environments. It is known that the subtlety and unpredictability of people’s social behavior are intimidating and confusing for many individuals with ASD. Computerized learning environments and robots, however, provide a predictable, dependable, and less complicated environment, where the interaction complexity can be adjusted to account for these individuals’ needs. The first phase of this dissertation presents an artificial-intelligence-based tutoring system which uses an interactive computer character as a pedagogical agent (PA) that simulates a human tutor teaching sight word reading to individuals with ASD. This phase examines the efficacy of an instructional package comprising an autonomous pedagogical agent, automatic speech recognition, and an evidence-based instructional procedure referred to as constant time delay (CTD). A concurrent multiple-baseline across-participants design is used to evaluate the efficacy of the intervention. Additionally, post-treatment probes are conducted to assess maintenance and generalization. The results suggest that all three participants acquired and maintained new sight words and demonstrated generalized responding. The second phase of this dissertation describes the augmentation of the tutoring system developed in the first phase with an autonomous humanoid robot which serves the instructional role of a peer for the student.
With the introduction of the robotic peer (RP), the traditional dyadic interaction in tutoring systems is augmented to a novel triadic interaction in order to enhance the social richness of the tutoring system, and to facilitate learning through peer observation. This phase evaluates the feasibility and effects of using PA-delivered sight word instruction, based on a CTD procedure, within a small-group arrangement including a student with ASD and the robotic peer. A multiple-probe design across word sets, replicated across three participants, is used to evaluate the efficacy of the intervention. The findings illustrate that all three participants acquired, maintained, and generalized all the words targeted for instruction. Furthermore, they learned a high percentage (94.44% on average) of the non-target words exclusively instructed to the RP. The data show that not only did the participants learn non-target words by observing the instruction to the RP, but they also acquired their target words more efficiently and with fewer errors when an observational component was added to the direct instruction. The third and fourth phases of this dissertation focus on physiology-based modeling of the participants’ affective experiences during naturalistic interaction with the developed tutoring system. While computers and robots have begun to co-exist with humans and cooperatively share various tasks, they are still deficient in interpreting and responding to humans as emotional beings. Wearable biosensors that can be used for computerized emotion recognition offer great potential for addressing this issue. The third phase presents a Bluetooth-enabled eyewear, EmotiGO, for unobtrusive acquisition of a set of physiological signals, i.e., skin conductivity, photoplethysmography, and skin temperature, which can be used as autonomic readouts of emotions.
EmotiGO is unobtrusive and sufficiently lightweight to be worn comfortably without interfering with the users’ usual activities. This phase presents the architecture of the device and results from testing that verify its effectiveness against an FDA-approved system for physiological measurement. The fourth and final phase attempts to model the students’ engagement levels using their physiological signals collected with EmotiGO during naturalistic interaction with the tutoring system developed in the second phase. Several physiological indices are extracted from each of the signals. The students’ engagement levels during the interaction with the tutoring system are rated by two trained coders using the video recordings of the instructional sessions. Supervised pattern recognition algorithms are subsequently used to map the physiological indices to the engagement scores. The results indicate that the trained models are successful at classifying participants’ engagement levels with the mean classification accuracy of 86.50%. These models are an important step toward an intelligent tutoring system that can dynamically adapt its pedagogical strategies to the affective needs of learners with ASD

    An Intelligent Robot and Augmented Reality Instruction System

    Human-Centered Robotics (HCR) is a research area that focuses on how robots can empower people to live safer, simpler, and more independent lives. In this dissertation, I present a combination of two technologies to deliver human-centric solutions to an important population. The first nascent area that I investigate is the creation of an Intelligent Robot Instructor (IRI) as a learning and instruction tool for human pupils. The second technology is the use of augmented reality (AR) to create an Augmented Reality Instruction (ARI) system to provide instruction via a wearable interface. To function in an intelligent and context-aware manner, both systems require the ability to reason about their perception of the environment and make appropriate decisions. In this work, I construct a novel formulation of several education methodologies, particularly those known as response prompting, as part of a cognitive framework to create a system for intelligent instruction, and compare these methodologies in the context of intelligent decision making using both technologies. The IRI system is demonstrated through experiments with a humanoid robot that uses object recognition and localization for perception and interacts with students through speech, gestures, and object interaction. The ARI system uses augmented reality, computer vision, and machine learning methods to create an intelligent, contextually aware instructional system. By using AR to teach prerequisite skills that lend themselves well to visual, augmented reality instruction prior to a robot instructor teaching skills that lend themselves to embodied interaction, I am able to demonstrate the potential of each system independently as well as in combination to facilitate students' learning.
I identify people with intellectual and developmental disabilities (I/DD) as a particularly significant use case and show that IRI and ARI systems can help fulfill the compelling need to develop tools and strategies for people with I/DD. I present results that demonstrate both systems can be used independently by students with I/DD to quickly and easily acquire the skills required for performance of relevant vocational tasks. This is the first successful real-world application of response-prompting for decision making in a robotic and augmented reality intelligent instruction system

    Online learning of personalised human activity recognition models from user-provided annotations

    PhD Thesis. In Human Activity Recognition (HAR), supervised and semi-supervised training are important tools for devising parametric activity models. For the best modelling performance, large amounts of annotated personalised sample data are typically required. Annotating often represents the bottleneck in the overall modelling process as it usually involves retrospective analysis of experimental ground truth, like video footage. These approaches typically neglect that prospective users of HAR systems are themselves key sources of ground truth for their own activities. This research therefore involves the users of HAR monitors in the annotation process. The process relies solely on users' short-term memory and engages with them to parsimoniously provide annotations for their own activities as they unfold. Effects of user input are optimised by using Online Active Learning (OAL) to identify the most critical annotations, which are expected to lead to the greatest HAR model performance gains. Personalised HAR models are trained from user-provided annotations as part of the evaluation, focusing mainly on objective model accuracy. The OAL approach is contrasted with Random Selection (RS), a naive method which makes uninformed annotation requests. A range of simulation-based annotation scenarios demonstrate that using OAL brings benefits in terms of HAR model performance over RS. Additionally, a mobile application is implemented and deployed in a naturalistic context to collect annotations from a panel of human participants. The deployment is proof that the method can truly run in online mode and it also shows that considerable HAR model performance gains can be registered even under realistic conditions. The findings from this research point to the conclusion that online learning from user-provided annotations is a valid solution to the problem of constructing personalised HAR models.

    Guided Autonomy for Quadcopter Photography

    Photographing small objects with a quadcopter is non-trivial to perform with many common user interfaces, especially when it requires maneuvering an Unmanned Aerial Vehicle (UAV) to difficult angles in order to shoot high perspectives. The aim of this research is to employ machine learning to support better user interfaces for quadcopter photography. Human Robot Interaction (HRI) is supported by visual servoing, a specialized vision system for real-time object detection, and control policies acquired through reinforcement learning (RL). Two investigations of guided autonomy were conducted. In the first, the user directed the quadcopter with a sketch-based interface, and periods of user direction were interspersed with periods of autonomous flight. In the second, the user directed the quadcopter by taking a single photo with a handheld mobile device, and the quadcopter autonomously flew to the requested vantage point. This dissertation focuses on the following problems: 1) evaluating different user interface paradigms for dynamic photography in a GPS-denied environment; 2) learning better Convolutional Neural Network (CNN) object detection models to assure a higher precision in detecting human subjects than the currently available state-of-the-art fast models; 3) transferring learning from the Gazebo simulation into the real world; 4) learning robust control policies using deep reinforcement learning to maneuver the quadcopter to multiple shooting positions with minimal human interaction.