251 research outputs found

    Contributions to Pen & Touch Human-Computer Interaction

    Full text link
    Computers are now present everywhere, but their potential is not fully exploited due to a lack of acceptance. In this thesis, the pen computer paradigm is adopted, whose main idea is to replace all input devices with a pen and/or the fingers, given that the rejection originates in unfriendly interaction devices, which should be replaced by something easier for the user. This paradigm, which was proposed several years ago, has only recently been fully implemented in products such as smartphones. But computers are effectively illiterate: they do not understand gestures or handwriting, so a recognition step is required to "translate" the meaning of these interactions into computer-understandable language. And for this input modality to be actually usable, its recognition accuracy must be high enough. To realistically think about the broader deployment of pen computing, it is necessary to improve the accuracy of handwriting and gesture recognizers. This thesis is devoted to studying different approaches to improve the recognition accuracy of those systems. First, we investigate how to take advantage of interaction-derived information to improve the accuracy of the recognizer. In particular, we focus on interactive transcription of text images. Here the system initially proposes an automatic transcript. If necessary, the user can make some corrections, implicitly validating a correct part of the transcript. The system must then take this validated prefix into account to suggest a suitable new hypothesis. Given that in such an application the user is constantly interacting with the system, it makes sense to adapt this interactive application for use on a pen computer. User corrections are provided by means of pen strokes, so it is necessary to introduce a recognizer in charge of decoding this kind of nondeterministic user feedback. However, this recognizer's performance can be boosted by taking advantage of interaction-derived information, such as the user-validated prefix. This thesis then focuses on the study of human movements, in particular hand movements, from a generation point of view, by tapping into the kinematic theory of rapid human movements and the Sigma-Lognormal model. Understanding how the human body generates movements, and particularly understanding the origin of human movement variability, is important in the development of a recognition system. The contribution of this thesis to this topic is significant, since a new technique for extracting the Sigma-Lognormal model parameters, which improves on previous results, is presented. Closely related to the previous work, this thesis studies the benefits of using synthetic data for training. The easiest way to train a recognizer is to provide "infinite" data representing all possible variations. In general, the more training data, the smaller the error. But it is usually not possible to increase the size of a training set indefinitely: recruiting participants, collecting data, labeling, and so on can be time-consuming and expensive. One way to overcome this problem is to create and use synthetically generated data that resembles human-produced data. We study how to create such synthetic data and explore different approaches to using it, both for handwriting and for gesture recognition. The different contributions of this thesis have obtained good results, producing several publications in international conferences and journals.
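    The Sigma-Lognormal model referenced above has a standard closed form: the speed profile of a rapid stroke is a superposition of lognormal components. The following is a minimal sketch of synthesizing such a profile; the parameter values are illustrative, and the thesis's parameter extraction technique itself is not reproduced here.

```python
# Sigma-Lognormal speed profile (kinematic theory of rapid human movements):
# |v(t)| = sum_i D_i / (sigma_i * sqrt(2*pi) * (t - t0_i))
#              * exp(-(ln(t - t0_i) - mu_i)^2 / (2 * sigma_i^2))
# Parameter values below are illustrative examples only.
import math

def lognormal_speed(t, D, t0, mu, sigma):
    # Speed contribution of one stroke component; zero before its onset t0.
    if t <= t0:
        return 0.0
    x = t - t0
    return (D / (sigma * math.sqrt(2 * math.pi) * x)
            * math.exp(-((math.log(x) - mu) ** 2) / (2 * sigma ** 2)))

def sigma_lognormal_speed(t, components):
    # Full model: superposition of all (possibly overlapping) components.
    return sum(lognormal_speed(t, *c) for c in components)

# Two overlapping components (D, t0, mu, sigma): a fast stroke followed by
# a slower corrective one, a typical pattern in handwriting velocity data.
components = [(0.8, 0.00, -1.8, 0.25), (0.4, 0.15, -1.5, 0.35)]
for t in (0.1, 0.2, 0.3, 0.5):
    print(f"t={t:.1f}s  |v|={sigma_lognormal_speed(t, components):.3f}")
```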
Finally, three applications related to the work of this thesis are presented. First, we created Escritorie, a digital desk prototype based on the pen computer paradigm for transcribing handwritten text images. Second, we developed "Gestures à Go Go", a web application for bootstrapping gestures. Finally, we studied another interactive application under the pen computer paradigm, examining how translation reviewing can be done more ergonomically using a pen.
Martín-Albo Simón, D. (2016). Contributions to Pen & Touch Human-Computer Interaction [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/68482

    The Dollar General: Continuous Custom Gesture Recognition Techniques At Everyday Low Prices

    Get PDF
    Humans use gestures to emphasize ideas and disseminate information. Their importance is apparent in how we continuously augment social interactions with motion—gesticulating in harmony with nearly every utterance to ensure observers understand that which we wish to communicate, and their relevance has not escaped the HCI community's attention. For almost as long as computers have been able to sample human motion at the user interface boundary, software systems have been made to understand gestures as command metaphors. Customization, in particular, has great potential to improve user experience, whereby users map specific gestures to specific software functions. However, custom gesture recognition remains a challenging problem, especially when training data is limited, input is continuous, and designers who wish to use customization in their software are limited by mathematical attainment, machine learning experience, domain knowledge, or a combination thereof. Data collection, filtering, segmentation, pattern matching, synthesis, and rejection analysis are all non-trivial problems a gesture recognition system must solve. To address these issues, we introduce The Dollar General (TDG), a complete pipeline composed of several novel continuous custom gesture recognition techniques. Specifically, TDG comprises an automatic low-pass filter tuner that we use to improve signal quality, a segmenter for identifying gesture candidates in a continuous input stream, a classifier for discriminating gesture candidates from non-gesture motions, and a synthetic data generation module we use to train the classifier. Our system achieves high recognition accuracy with as little as one or two training samples per gesture class, is largely input device agnostic, and does not require advanced mathematical knowledge to understand and implement. In this dissertation, we motivate the importance of gestures and customization, describe each pipeline component in detail, and introduce strategies for data collection and prototype selection.
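    Since TDG is designed to work from one or two templates per class without advanced mathematics, a $1-style template matcher conveys the flavor of such a pipeline's classification stage. This is a generic sketch under that assumption, not TDG's actual code: rotation normalization and the filtering, segmentation, and rejection stages are omitted, and all names and parameters are illustrative.

```python
# Minimal $1-style matcher: resample, normalize, nearest-template classify.
import math

def path_length(pts):
    return sum(math.dist(pts[i - 1], pts[i]) for i in range(1, len(pts)))

def resample(pts, n=64):
    # Resample the stroke to n equidistant points so templates and
    # candidates can be compared point-to-point.
    interval = path_length(pts) / (n - 1)
    pts = list(pts)
    out, acc, i = [pts[0]], 0.0, 1
    while i < len(pts):
        d = math.dist(pts[i - 1], pts[i])
        if acc + d >= interval and d > 0:
            t = (interval - acc) / d
            q = (pts[i - 1][0] + t * (pts[i][0] - pts[i - 1][0]),
                 pts[i - 1][1] + t * (pts[i][1] - pts[i - 1][1]))
            out.append(q)
            pts.insert(i, q)  # continue measuring from the new point
            acc = 0.0
        else:
            acc += d
        i += 1
    while len(out) < n:  # guard against floating-point shortfall
        out.append(pts[-1])
    return out

def normalize(pts, n=64, size=250.0):
    # Scale to a reference square and center on the origin, giving
    # scale- and position-invariant comparison.
    pts = resample(pts, n)
    xs, ys = [p[0] for p in pts], [p[1] for p in pts]
    w, h = (max(xs) - min(xs)) or 1.0, (max(ys) - min(ys)) or 1.0
    cx, cy = sum(xs) / n, sum(ys) / n
    return [((p[0] - cx) * size / w, (p[1] - cy) * size / h) for p in pts]

def recognize(candidate, templates):
    # templates: {label: [stroke, ...]} with as little as one sample each.
    c = normalize(candidate)
    best, best_d = None, float("inf")
    for label, strokes in templates.items():
        for t in strokes:
            d = sum(map(math.dist, c, normalize(t))) / len(c)
            if d < best_d:
                best, best_d = label, d
    return best, best_d  # best_d can feed a downstream rejection threshold
```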

    Enabling Techniques to support Reliable Smartphone-Based Motion Gesture Interaction

    Get PDF
    When using motion gestures - 3D movements of a mobile phone - as an input modality, one significant challenge is how to teach end users the movement parameters necessary to successfully issue a command. Is a simple video or image depicting movement of a smartphone sufficient? Or do we need three-dimensional depictions of movement on external screens to train users? In this thesis, we explore mechanisms to teach end users motion gestures and analyze users' perceived reliability of motion gesture recognition. Regarding teaching motion gestures, two factors were examined. The first factor is how to represent motion gestures: as icons that describe movement, as video that depicts movement using the smartphone screen, or via a Kinect-based teaching mechanism that captures and depicts the gesture on an external display in three-dimensional space. The second factor is recognizer feedback, i.e. a simple representation of the proximity of a motion gesture to the desired motion gesture, based on a distance metric extracted from the recognizer. Our results show that, by combining video with recognizer feedback, participants master motion gestures as quickly as end users who learn using a Kinect. These results demonstrate the viability of training end users to perform motion gestures using only the smartphone display. Regarding users' perceived reliability of the gesture recognizer, we examined the effects of bi-level thresholding on the workload and acceptance of end users. Bi-level thresholding is a motion gesture recognition technique that mediates between false positives and false negatives by using two threshold levels: a tighter threshold that limits false positives and recognition errors, and a looser threshold that prevents repeated errors (false negatives) by analyzing movements in sequence. By holding recognition rates constant but adjusting for fixed versus bi-level thresholding, we show that systems using bi-level thresholding result in significantly lower workload scores on the NASA-TLX. Overall, these results argue for the viability of bi-level thresholding as an effective technique for balancing between different types of recognizer errors.
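    A minimal sketch of the bi-level thresholding logic described above, assuming an underlying recognizer that returns a distance score. The threshold values and the retry heuristic (relax only immediately after a failed attempt) are illustrative assumptions.

```python
# Bi-level thresholding: a tight threshold guards first attempts against
# false positives; a looser threshold applies when a failed attempt is
# immediately retried, so users are not repeatedly rejected.
TIGHT, LOOSE = 0.25, 0.40  # recognizer distance thresholds (assumed units)

class BiLevelRecognizer:
    def __init__(self, classify):
        # classify(motion) -> (label, distance) from any underlying recognizer
        self.classify = classify
        self.last_failed = False

    def recognize(self, motion):
        label, dist = self.classify(motion)
        threshold = LOOSE if self.last_failed else TIGHT
        if dist <= threshold:
            self.last_failed = False
            return label         # accept
        self.last_failed = True  # likely a false negative; relax next attempt
        return None              # reject
```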

    BRAILLESHAPES: efficient text input on smartwatches for blind people

    Get PDF
    Master's thesis, Engenharia Informática, 2023, Universidade de Lisboa, Faculdade de Ciências. Mobile touchscreen devices like smartphones or smartwatches are a predominant part of our lives. They have evolved, and so have their applications. Due to the constant growth and advancement of technology, using such devices to accomplish a vast range of tasks has become common practice. Nonetheless, relying on touch-based interactions that require good spatial ability and memorization, while lacking sufficient tactile cues, makes these devices visually demanding, and thus a strenuous interaction modality for visually impaired people. This is even more apparent in movement-based contexts or where one-handed use is required. We believe devices like smartwatches can provide numerous advantages in addressing such scenarios. However, they lack accessible solutions for several tasks, and most existing solutions for mobile touchscreen devices target smartphones. Communication being of the utmost importance and intrinsic to humankind, text entry is one task for which it is imperative to provide solutions that address its accessibility concerns. Since Braille is a reading standard for blind people and has yielded positive results in prior work on accessible text entry, we believe that using it as the basis for an accessible text entry solution can help solidify a standard for this type of interaction modality. It also allows users to leverage previous knowledge, reducing possible extra cognitive load. Yet, even though Braille-based chording solutions have achieved good results, a tapping approach is not the most feasible given the reduced space of the smartwatch's touchscreen. Hence, we found the best option to be a gesture-based solution. Therefore, in this thesis we explore and validate the concept and feasibility of Braille-based shapes as the foundation for an accessible gesture-based smartwatch text entry method for visually impaired people.
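    To make the idea concrete, here is a small sketch of deriving a gesture path from a letter's standard six-dot Braille cell. The connect-the-raised-dots construction and all names are our illustrative assumptions, not necessarily the scheme BRAILLESHAPES actually uses; only the Braille dot patterns themselves are standard.

```python
# Standard Braille dot numbering: 1-3 down the left column, 4-6 down the
# right. We map each dot to grid coordinates (column, row).
DOT_POS = {1: (0, 0), 2: (0, 1), 3: (0, 2), 4: (1, 0), 5: (1, 1), 6: (1, 2)}

# Raised dots for a few letters (standard Braille encodings).
LETTERS = {"a": [1], "b": [1, 2], "c": [1, 4], "d": [1, 4, 5], "l": [1, 2, 3]}

def letter_to_shape(letter, cell_size=100):
    """Return touchscreen points tracing the letter's raised dots in order
    (our assumed shape construction, for illustration only)."""
    dots = LETTERS[letter.lower()]
    return [(col * cell_size, row * cell_size)
            for col, row in (DOT_POS[d] for d in dots)]

print(letter_to_shape("b"))  # [(0, 0), (0, 100)]: a short downward stroke
```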

    WatchTrace: Design and Evaluation of an At-Your-Side Gesture Paradigm

    Get PDF
    In this thesis, we present the exploration and evaluation of a gesture interaction paradigm performed with arms at rest at the side of one's body. This gesture stance is informed by persisting challenges in mid-air arm gesture interaction relating to fatigue and social acceptability. The proposed arms-down posture reduces physical effort by minimizing the shoulder torque placed on the user. While this interaction posture has been explored before, the gesture vocabulary in previous research has been small and limited. The design of this gesture interaction is motivated by the ability to provide rich and expressive input: the user performs gestures by moving the whole arm at the side of the body to create two-dimensional visual traces, as in hand-drawing on a bounded plane parallel to the ground. Within this space, we present the results of two studies that investigate the use of side-gesture input for interaction. First, we explore users' mental models of this interaction by conducting an elicitation study on a set of everyday tasks one would perform on a large display in public to semi-public contexts. The takeaway from this study is the need for a dynamic and expressive gesture vocabulary, including ideographic and alphanumeric gesture constructs that can be combined or chained together. We then explore the feasibility of designing such a gesture recognition system using commodity hardware and recognition techniques, dubbed WatchTrace, which supports alphanumeric gestures of up to length three, providing a vibrant, dynamic, and feasible gestural vocabulary. Finally, we explore potential approaches to improve recognition through the use of adaptive thresholds, n-best lists, and changing reject rates, among other conventional techniques in the field of gesture classification.
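    A minimal sketch of the rejection machinery named above (an n-best list plus a distance threshold with an ambiguity margin), assuming an underlying recognizer that returns per-label distances. The threshold and margin values are illustrative and would be tuned per deployment.

```python
# Rejection via n-best list: reject when the best match is too weak or
# too ambiguous relative to the runner-up.
def nbest(scores, n=3):
    # scores: {label: distance}; lower is better.
    return sorted(scores.items(), key=lambda kv: kv[1])[:n]

def decide(scores, accept_dist=0.35, margin=0.05):
    ranked = nbest(scores)
    (best, d1), *rest = ranked
    if d1 > accept_dist:
        return None  # too far from every template: reject as non-gesture
    if rest and rest[0][1] - d1 < margin:
        return None  # top two labels too close: ambiguous, reject
    return best

print(decide({"a": 0.20, "b": 0.50, "c": 0.60}))  # "a"
print(decide({"a": 0.20, "b": 0.22, "c": 0.60}))  # None (ambiguous)
```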

    Sketch Recognition on Mobile Devices

    Get PDF
    Sketch recognition allows computers to understand and model hand-drawn sketches and diagrams. Traditionally, sketch recognition systems required a pen-based PC interface, but powerful mobile devices such as tablets and smartphones can provide a new platform for sketch recognition systems. We describe a new sketch recognition library, Strontium (SrL), that combines several existing sketch recognition libraries modified to run both on personal computers and on the Android platform. We analyzed the recognition speed and accuracy implications of performing low-level shape recognition on smartphones with touch screens. We found that there is a large gap in recognition speed on mobile devices between recognizing simple shapes and more complex ones, suggesting that mobile sketch interface designers should limit the complexity of their sketch domains. We also found that a low sampling rate on mobile devices can affect recognition accuracy for complex and curved shapes. Despite this, we found no evidence to suggest that using a finger as an input implement leads to a decrease in simple shape recognition accuracy. These results show that the same geometric shape recognizers developed for pen applications can be used in mobile applications, provided that developers keep shape domains simple and ensure that the input sampling rate is kept as high as possible.
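    A sketch of the kind of low-level geometric shape recognition discussed here: classify a stroke as a line or a circle by comparing simple fit errors. This is a generic illustration of how geometric recognizers work, not SrL's implementation; the decision rule and function names are our assumptions.

```python
# Geometric shape classification by comparing normalized fit residuals.
import math

def line_error(pts):
    # Mean distance from each point to the line through the endpoints.
    (x0, y0), (x1, y1) = pts[0], pts[-1]
    L = math.hypot(x1 - x0, y1 - y0) or 1.0
    return sum(abs((y1 - y0) * px - (x1 - x0) * py + x1 * y0 - y1 * x0)
               for px, py in pts) / (L * len(pts))

def circle_error(pts):
    # Mean deviation from the average radius around the centroid.
    cx = sum(p[0] for p in pts) / len(pts)
    cy = sum(p[1] for p in pts) / len(pts)
    radii = [math.hypot(px - cx, py - cy) for px, py in pts]
    r = sum(radii) / len(radii)
    return sum(abs(ri - r) for ri in radii) / len(radii)

def classify(pts):
    # Lower residual wins; real systems would also normalize for scale
    # and consider more shape classes (arcs, ellipses, polylines, ...).
    return "line" if line_error(pts) < circle_error(pts) else "circle"
```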

    Exploring At-Your-Side Gestural Interaction for Ubiquitous Environments

    Get PDF
    Free-space gestural systems are faced with two major issues: a lack of subtlety due to explicit mid-air arm movements, and the highly effortful nature of such interactions. With an ever-growing ubiquity of interactive devices, displays, and appliances with non-standard interfaces, lower-effort and more socially acceptable interaction paradigms are essential. To address these issues, we explore at-one's-side gestural input. Within this space, we present the results of two studies that investigate the use of side-gesture input for interaction. First, we investigate end-user preference through a gesture elicitation study, present a gesture set, and validate the need for dynamic, diverse, and variable-length gestures. We then explore the feasibility of designing such a gesture recognition system, dubbed WatchTrace, which supports alphanumeric gestures of up to length three with an average accuracy of up to 82%, providing a rich, dynamic, and feasible gestural vocabulary.
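    To illustrate how variable-length gestures of up to length three might be decoded, here is a sketch that exhaustively tries one- to three-segment splits of a trace and scores each with a per-segment classifier. The brute-force segmentation is our illustrative assumption (a deployed system would more plausibly segment by pauses or motion features), and classify_segment is a stand-in for any unistroke recognizer.

```python
# Decode a trace as a chained gesture of up to max_parts symbols by
# searching over segmentations and keeping the best mean-distance split.
from itertools import combinations

def decode(trace, classify_segment, max_parts=3):
    # trace: list of (x, y) points.
    # classify_segment(points) -> (label, distance), lower distance is better.
    best_score, best_string = float("inf"), None
    n = len(trace)
    for k in range(1, max_parts + 1):
        for cuts in combinations(range(1, n), k - 1):
            bounds = [0, *cuts, n]
            segs = [trace[a:b] for a, b in zip(bounds, bounds[1:])]
            if any(len(s) < 2 for s in segs):
                continue  # skip degenerate segments
            labels, dists = zip(*(classify_segment(s) for s in segs))
            score = sum(dists) / k  # mean per-segment distance
            if score < best_score:
                best_score, best_string = score, "".join(labels)
    return best_string
```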

    Building and evaluating an inconspicuous smartphone authentication method

    Get PDF
    Master's thesis in Engenharia Informática, presented to the Universidade de Lisboa through the Faculdade de Ciências, 2013. The smartphones we carry with us are increasingly entwined in our intimate lives. These devices enable new ways of working, socializing, and even entertaining ourselves, but they have also created new risks to our privacy. A common way to mitigate these risks is to configure the device to lock after a period of inactivity, requiring an authentication barrier to be overcome before further use. That way, if the device falls into someone else's hands, it cannot be used in a way that constitutes a threat. Unlocking with authentication is thus the mechanism that commonly guards smartphone users' privacy. However, the authentication methods in use today are largely a legacy of desktop computers. Passwords and personal identification numbers are made less secure by the fact that people create mechanisms to memorize them more easily. Moreover, entering these codes is inconvenient, especially in the mobile context, where interactions tend to be short and the need to authenticate gets in the way of other tasks. Recently, Android smartphones started to offer another authentication method, which has gained notable adoption: the user's secret code is a sequence of strokes drawn over a 3-by-3 grid of dots shown on the touchscreen. However, both textual/numeric codes and Android patterns are susceptible to rudimentary attacks. In both cases the input channel is touch on the screen and the output channel is visual, which allows other people to directly observe the code being entered, or to later make out the marks left by the fingers on the touch surface. Furthermore, these methods are not accessible to some classes of users, notably blind people. This dissertation proposes that smartphone authentication methods can be better adapted to the mobile context. Namely, the ability to interact with the device inconspicuously could offer users a greater degree of control and the ability to protect themselves against observation of their secret code. To that end, an input modality that does not require the visual channel was identified: sequences of location-independent taps on the touchscreen. These patterns can resemble (but are not limited to) rhythms or Morse code. The first contribution of this work is an algorithmic technique for detecting these tap sequences, or tap phrases, as authentication keys. The recognizer requires only a single demonstration for setup, which distinguishes it from approaches that need several examples to train the algorithm; it was evaluated and shown to be accurate and computationally efficient. This contribution was complemented with an Android application that demonstrates the concept. The second contribution is an exploration of the human factors involved in using tap phrases for authentication, grounded in three user studies in which the proposed authentication method is compared with the most common alternatives: PIN and the Android pattern.
The first study (N=30) compares the three methods with respect to observation resistance and to usability in a broad sense that includes user experience (UX). The results suggest that the usability of the three approaches is comparable and that, under perfect observation conditions, an attacker has a high chance of success against all three. The second study (N=19) again compares the three methods, but this time in an inconspicuous authentication scenario: participants entered their codes with the device held under a table, out of sight. Tap-phrase authentication is shown to remain usable in this setting, whereas the usability measures of the other alternatives decrease substantially. This suggests that tap-phrase authentication supports inconspicuous interaction, giving users the ability to protect themselves against potential attackers. The third study (N=16) evaluates the usability and acceptance of the method with blind users, and also elicits concealment strategies supported by tap-phrase authentication. The results suggest that the technique is suitable for these users as well.
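    A minimal sketch in the spirit of the tap-phrase recognizer described above: enrollment from a single demonstration, with matching based on normalized inter-tap intervals so that the phrase is location-independent and tolerant of tempo changes. The distance measure and tolerance value are our illustrative assumptions, not the dissertation's actual algorithm.

```python
# Match a tap-phrase attempt against one enrolled demonstration by
# comparing the rhythm (relative gaps between taps), not tap locations.
def intervals(tap_times):
    # Gaps between consecutive taps, normalized by total duration so that
    # tapping the same phrase faster or slower still matches.
    gaps = [b - a for a, b in zip(tap_times, tap_times[1:])]
    total = sum(gaps) or 1.0
    return [g / total for g in gaps]

def matches(enrolled_times, attempt_times, tolerance=0.1):
    if len(enrolled_times) != len(attempt_times):
        return False  # wrong number of taps
    e, a = intervals(enrolled_times), intervals(attempt_times)
    return all(abs(x - y) <= tolerance for x, y in zip(e, a))

# Example: a 5-tap phrase enrolled once, then attempted slightly slower.
enrolled = [0.00, 0.20, 0.40, 1.00, 1.20]
attempt  = [0.00, 0.25, 0.50, 1.20, 1.45]
print(matches(enrolled, attempt))  # True
```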