
    Pinching sweaters on your phone – iShoogle : multi-gesture touchscreen fabric simulator using natural on-fabric gestures to communicate textile qualities

    The inability to touch fabrics online frustrates consumers, who are used to evaluating physical textiles by engaging in complex, natural gestural interactions. When customers interact with physical fabrics, they combine cross-modal information about the fabric's look, sound and handle to build an impression of its physical qualities. But whenever an interaction with a fabric is limited (i.e. when watching clothes online) there is a perceptual gap between the fabric qualities perceived digitally and the actual fabric qualities that a person would perceive when interacting with the physical fabric. The goal of this thesis was to create a fabric simulator that minimized this perceptual gap, enabling accurate perception of the qualities of fabrics presented digitally. We designed iShoogle, a multi-gesture touch-screen sound-enabled fabric simulator that aimed to create an accurate representation of fabric qualities without the need for touching the physical fabric swatch. iShoogle uses on-screen gestures (inspired by natural on-fabric movements e.g. Crunching) to control pre-recorded videos and audio of fabrics being deformed (e.g. being Crunched). iShoogle creates an illusion of direct video manipulation and also direct manipulation of the displayed fabric. This thesis describes the results of nine studies leading towards the development and evaluation of iShoogle. In the first three studies, we combined expert and non-expert textile-descriptive words and grouped them into eight dimensions labelled with terms Crisp, Hard, Soft, Textured, Flexible, Furry, Rough and Smooth. These terms were used to rate fabric qualities throughout the thesis. We observed natural on-fabric gestures during a fabric handling study (Study 4) and used the results to design iShoogle's on-screen gestures. In Study 5 we examined iShoogle's performance and speed in a fabric handling task and in Study 6 we investigated users' preferences for sound playback interactivity. 
iShoogle's accuracy was then evaluated in the last three studies by comparing participants’ ratings of textile qualities when using iShoogle with ratings produced when handling physical swatches. We also described the recording and processing techniques for the video and audio content that iShoogle used. Finally, we described the iShoogle iPhone app that was released to the general public. Our evaluation studies showed that iShoogle significantly improved the accuracy of fabric perception in at least some cases. Further research could investigate which fabric qualities and which fabrics are particularly suited to being represented with iShoogle.

    Augmented Reality

    Augmented Reality (AR) is a natural development from virtual reality (VR), which was developed several decades earlier. AR complements VR in many ways. Because the user can see real and virtual objects simultaneously, AR is far more intuitive, but it is not completely detached from human factors and other restrictions. AR applications also take less time and effort to build, because the entire virtual scene and environment do not have to be constructed. In this book, several new and emerging application areas of AR are presented and divided into three sections. The first section contains applications in outdoor and mobile AR, such as construction, restoration, security and surveillance. The second section deals with AR in medicine, biology, and the human body. The third and final section contains a number of new and useful applications in daily living and learning.

    A Comprehensive Review of Data-Driven Co-Speech Gesture Generation

    Gestures that accompany speech are an essential part of natural and efficient embodied human communication. The automatic generation of such co-speech gestures is a long-standing problem in computer animation and is considered an enabling technology in film, games, virtual social spaces, and for interaction with social robots. The problem is made challenging by the idiosyncratic and non-periodic nature of human co-speech gesture motion, and by the great diversity of communicative functions that gestures encompass. Gesture generation has seen surging interest recently, owing to the emergence of more and larger datasets of human gesture motion, combined with strides in deep-learning-based generative models that benefit from the growing availability of data. This review article summarizes co-speech gesture generation research, with a particular focus on deep generative models. First, we articulate the theory describing human gesticulation and how it complements speech. Next, we briefly discuss rule-based and classical statistical gesture synthesis, before delving into deep learning approaches. We employ the choice of input modalities as an organizing principle, examining systems that generate gestures from audio, text, and non-linguistic input. We also chronicle the evolution of the related training data sets in terms of size, diversity, motion quality, and collection method. Finally, we identify key research challenges in gesture generation, including data availability and quality; producing human-like motion; grounding the gesture in the co-occurring speech, in interaction with other speakers, and in the environment; performing gesture evaluation; and integration of gesture synthesis into applications. We highlight recent approaches to tackling the various key challenges, as well as the limitations of these approaches, and point toward areas of future development. Comment: Accepted for EUROGRAPHICS 202

    Continuous Interaction with a Virtual Human

    Attentive Speaking and Active Listening require that a Virtual Human be capable of simultaneous perception/interpretation and production of communicative behavior. A Virtual Human should be able to signal its attitude and attention while it is listening to its interaction partner, to attend to its interaction partner while it is speaking, and to modify its communicative behavior on the fly based on what it perceives from its partner. This report presents the results of a four-week summer project that was part of eNTERFACE’10. The project resulted in progress on several aspects of continuous interaction, such as scheduling and interrupting multimodal behavior, automatic classification of listener responses, generation of response-eliciting behavior, and models for appropriate reactions to listener responses. A pilot user study was conducted with ten participants. In addition, the project yielded a number of deliverables that are released for public access.

    Predicting and Reducing the Impact of Errors in Character-Based Text Entry

    This dissertation focuses on the effect of errors in character-based text entry techniques. The effect of errors is targeted from theoretical, behavioral, and practical standpoints. This document starts with a review of the existing literature. It then presents results of a user study that investigated the effect of different error correction conditions on popular text entry performance metrics. Results showed that the way errors are handled has a significant effect on all frequently used error metrics. The outcomes also provided an understanding of how users notice and correct errors. Building on this, the dissertation then presents a new high-level and method-agnostic model for predicting the cost of error correction with a given text entry technique. Unlike the existing models, it accounts for both human and system factors and is general enough to be used with most character-based techniques. A user study verified the model through measuring the effects of a faulty keyboard on text entry performance. The work then explores potential user adaptation to a gesture recognizer’s misrecognitions in two user studies. Results revealed that users gradually adapt to misrecognition errors by replacing the erroneous gestures with alternative ones, if available. Also, users adapt to a frequently misrecognized gesture faster if it occurs more frequently than the other error-prone gestures. Finally, this work presents a new hybrid approach to simulate pressure detection on standard touchscreens. The new approach combines the existing touch-point- and time-based methods. Results of two user studies showed that it can simulate pressure detection more reliably for at least two pressure levels: regular (~1 N) and extra (~3 N). Then, a new pressure-based text entry technique is presented that does not require tapping outside the virtual keyboard to reject an incorrect or unwanted prediction. 
Instead, the technique requires users to apply extra pressure for the tap on the next target key. The performance of the new technique was compared with the conventional technique in a user study. Results showed that for inputting short English phrases with 10% non-dictionary words, the new technique increases entry speed by 9% and decreases error rates by 25%. Also, most users (83%) favor the new technique over the conventional one. Together, the research presented in this dissertation gives more insight into how errors affect text entry and also presents improved text entry methods.

    Contributions to Pen & Touch Human-Computer Interaction

    [EN] Computers are now present everywhere, but their potential is not fully exploited due to some lack of acceptance. In this thesis, the pen computer paradigm is adopted, whose main idea is to replace all input devices with a pen and/or the fingers, given that the rejection originates from using unfriendly interaction devices that must be replaced by something easier for the user. This paradigm, which was proposed several years ago, has only recently been fully implemented in products such as smartphones. But computers are effectively illiterate: they do not understand gestures or handwriting, so a recognition step is required to "translate" the meaning of these interactions into computer-understandable language. And for this input modality to be actually usable, its recognition accuracy must be high enough. In order to think realistically about the broader deployment of pen computing, it is necessary to improve the accuracy of handwriting and gesture recognizers. This thesis is devoted to studying different approaches to improve the recognition accuracy of those systems. First, we investigate how to take advantage of interaction-derived information to improve the accuracy of the recognizer. In particular, we focus on interactive transcription of text images. Here the system initially proposes an automatic transcript. If necessary, the user can make some corrections, implicitly validating a correct part of the transcript. Then the system must take this validated prefix into account to suggest a suitable new hypothesis. Given that in such an application the user is constantly interacting with the system, it makes sense to adapt this interactive application for use on a pen computer. User corrections are provided by means of pen strokes, and it is therefore necessary to introduce a recognizer in charge of decoding this kind of nondeterministic user feedback. 
However, this recognizer's performance can be boosted by taking advantage of interaction-derived information, such as the user-validated prefix. Then, this thesis focuses on the study of human movements, in particular hand movements, from a generation point of view, by tapping into the kinematic theory of rapid human movements and the Sigma-Lognormal model. Understanding how the human body generates movements, and particularly understanding the origin of human movement variability, is important in the development of a recognition system. The contribution of this thesis to this topic is important, since a new technique (which improves on previous results) for extracting the Sigma-Lognormal model parameters is presented. Closely related to the previous work, this thesis studies the benefits of using synthetic data for training. The easiest way to train a recognizer is to provide "infinite" data, representing all possible variations. In general, the more training data, the smaller the error. But usually it is not possible to increase the size of a training set indefinitely. Recruiting participants, collecting data, labeling, and the other steps needed to achieve this goal can be time-consuming and expensive. One way to overcome this problem is to create and use synthetically generated data that resembles human-produced data. We study how to create these synthetic data and explore different approaches to using them, both for handwriting and gesture recognition. The different contributions of this thesis have obtained good results, producing several publications in international conferences and journals. Finally, three applications related to the work of this thesis are presented. First, we created Escritorie, a digital desk prototype based on the pen computer paradigm for transcribing handwritten text images. Second, we developed "Gestures à Go Go", a web application for bootstrapping gestures. Finally, we studied another interactive application under the pen computer paradigm. 
In this case, we study how translation reviewing can be done more ergonomically using a pen.
[ES] Nowadays, computers are present everywhere, but their potential is not exploited because of the "fear" people have of them. This thesis adopts the pen computer paradigm, whose fundamental idea is to replace all input devices with an electronic pen or, directly, with the fingers. The rejection of computers stems from the use of interfaces that are unfriendly to humans. This paradigm originated more than 40 years ago, but only recently has it begun to be implemented in mobile devices. The slow and late adoption is probably due to the need to include a recognizer that "translates" the user's strokes (handwritten text or gestures) into something the computer can understand. To think realistically about deploying the pen computer, the accuracy of text and gesture recognition must be improved. The aim of this thesis is to study different strategies for improving this accuracy. First, this thesis investigates how to exploit interaction-derived information to improve recognition, specifically in the interactive transcription of images of handwritten text. In interactive transcription, the system and the user work "side by side" to produce the transcript. The user validates the system output by providing corrections, as handwritten text, which the system must take into account to provide a better transcript. This handwritten text must be recognized before it can be used. This thesis proposes exploiting contextual information, such as the user-validated prefix, to improve the quality of the recognition of the interaction. After this, the thesis focuses on the study of human movement, in particular hand movement, using the Kinematic Theory and its Sigma-Lognormal model. Understanding how the hands move when writing, and in particular understanding the origin of handwriting variability, is important for developing a recognition system. The contribution of this thesis to this topic is important, since it presents a new technique (which improves on previous results) for extracting the Sigma-Lognormal model from handwritten strokes. Closely related to the previous work, the benefit of using synthetic data for training is studied. The easiest way to train a recognizer is to provide an "infinite" data set representing all possible variations. In general, the more training data, the smaller the recognizer's error. However, it is often impossible to provide more data, or doing so is very expensive. For this reason, we studied how to create and use synthetic data that resemble real data. The different contributions of this thesis have obtained good results, producing several publications in international conferences and journals. Finally, three applications related to the work of this thesis have also been explored. First, Escritorie was created, a digital desk prototype based on the pen computer paradigm for interactive transcription of handwritten documents. Second, "Gestures à Go Go" was developed, a web application for generating synthetic data and packaging them with a recognizer quickly and easily. Lastly, a real interactive system under the pen computer paradigm is presented. In this case, we study how the review of machine translations can be done more ergonomically.
[CA] Nowadays, computers are present everywhere, and it is commonly accepted that their use provides benefits. However, their potential is often not fully exploited. This thesis adopts the pen computer paradigm, whose fundamental idea is to replace all input devices with an electronic pen or, directly, with the fingers. This paradigm postulates that the rejection of computers stems from the use of interfaces that are unfriendly to humans, which must be replaced by something more familiar. Interaction with the computer under this paradigm therefore takes place through handwritten text and/or gestures. This paradigm originated more than 40 years ago, but only recently has it begun to be implemented in mobile devices. The slow and late adoption is probably due to the need to include a recognizer that "translates" the user's strokes (handwritten text or gestures) into something the computer can understand, and the result of this recognition is currently far from optimal. To think realistically about deploying the pen computer, the accuracy of text and gesture recognition must be improved. The aim of this thesis is to study different strategies for improving this accuracy. First, this thesis investigates how to exploit interaction-derived information to improve recognition, specifically in the interactive transcription of images of handwritten text. In interactive transcription, the system and the user work "arm in arm" to produce the transcript. The user validates the system output by giving corrections, which the system must use to improve the transcript. This thesis proposes using handwritten corrections, which the system must first recognize. The quality of the recognition of this interaction is improved by taking into account contextual information, such as the user-validated prefix. After this, the thesis focuses on the study of human movement, in particular hand movement, from a generative point of view, using the Kinematic Theory and the Sigma-Lognormal model. Understanding how the hands move when writing is important for developing a recognition system, in particular for understanding the origin of handwriting variability. The contribution of this thesis to this topic is important, since it presents a new technique (which improves on previous results) for extracting the Sigma-Lognormal model from handwritten strokes. Closely related to the previous work, the benefit of using synthetic data for training is studied. The easiest way to train a recognizer is to provide an "infinite" data set representing all possible variations. In general, the more training data, the smaller the recognizer's error. However, it is often impossible to provide more data, or doing so is very expensive. For this reason, we studied how to create and use synthetic data that resemble real data. The different contributions of this thesis have obtained good results, producing several publications in international conferences and journals. Finally, three applications related to the work of this thesis have also been explored. First, Escritorie was created, a digital desk prototype based on the pen computer paradigm for interactive transcription of handwritten documents. Second, "Gestures à Go Go" was developed, a web application for generating synthetic data and packaging them with a recognizer quickly and easily. Finally, another interactive system under the pen computer paradigm is presented. In this case, we study how the review of machine translations can be done more ergonomically.
Martín-Albo Simón, D. (2016). Contributions to Pen & Touch Human-Computer Interaction [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/68482

    Human or Robot?: Investigating voice, appearance and gesture motion realism of conversational social agents

    Research on creation of virtual humans enables increasing automatization of their behavior, including synthesis of verbal and nonverbal behavior. As the achievable realism of different aspects of agent design evolves asynchronously, it is important to understand if and how divergence in realism between behavioral channels can elicit negative user responses. Specifically, in this work, we investigate the question of whether autonomous virtual agents relying on synthetic text-to-speech voices should portray a corresponding level of realism in the non-verbal channels of motion and visual appearance, or if, alternatively, the best available realism of each channel should be used. In two perceptual studies, we assess how realism of voice, motion, and appearance influence the perceived match of speech and gesture motion, as well as the agent's likability and human-likeness. Our results suggest that maximizing realism of voice and motion is preferable even when this leads to realism mismatches, but for visual appearance, lower realism may be preferable. (A video abstract can be found at https://youtu.be/arfZZ-hxD1Y.)

    Physical Diagnosis and Rehabilitation Technologies

    The book focuses on the diagnosis, evaluation, and assistance of gait disorders; all the papers have been contributed by research groups working on assistive robotics, instrumentation, and augmentative devices.