22 research outputs found

    Human-Computer Interaction

    In this book the reader will find a collection of 31 papers presenting different facets of Human-Computer Interaction: the results of research projects and experiments, as well as new approaches to designing user interfaces. The book is organized sequentially around the following main topics: new interaction paradigms, multimodality, usability studies of several interaction mechanisms, human factors, universal design, and development methodologies and tools.

    Autonomous interactive intermediaries : social intelligence for mobile communication agents

    Thesis (Ph. D.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2005. Includes bibliographical references (p. 151-167). Today's cellphones are passive communication portals. They are neither aware of our conversational settings nor of the relationship between caller and callee, and often interrupt us at inappropriate times. This thesis is about adding elements of human-style social intelligence to our mobile communication devices in order to make them more socially acceptable to both the user and others nearby. I suggest the concept of an Autonomous Interactive Intermediary that assumes the role of an actively mediating party between caller, callee, and co-located people. In order to behave in a socially appropriate way, the Intermediary interrupts with non-verbal cues and attempts to harvest 'residual social intelligence' from the calling party, the called person, the people close by, and its current location. For example, the Intermediary obtains the user's conversational status from a decentralized network of autonomous body-worn sensor nodes. These nodes detect conversational groupings in real time and provide the Intermediary with the user's conversation size and talk-to-listen ratio. The Intermediary can 'poll' all participants of a face-to-face conversation about the appropriateness of a possible interruption by slightly vibrating their wirelessly actuated finger rings. Although the alerted people do not know whether it is their own cellphone that is about to interrupt, each of them can veto the interruption anonymously by touching his or her ring. If no one vetoes, the Intermediary may interrupt. A user study showed significantly more vetoes during a collaborative, group-focused setting than during a less group-oriented setting. The Intermediary is implemented as both a conversational agent and an animatronic device. The animatronic is a small wireless robotic stuffed animal in the form of a squirrel, bunny, or parrot. The purpose of the embodiment is to employ intuitive non-verbal cues such as gaze and gestures to attract attention, instead of ringing or vibration. Evidence suggests that such subtle yet public alerting by animatronics evokes significantly different reactions than ordinary telephones and is seen as less invasive by others present when we receive phone calls. The Intermediary is also a dual conversational agent that can whisper and listen to the user, and converse with a caller, mediating between them in real time. The Intermediary modifies its conversational script depending on caller identity, caller and user choices, and the conversational status of the user. It interrupts and communicates with the user when it is socially appropriate, and may break down a synchronous phone call into chunks of voice instant messages. by Stefan Johannes Walter Marti. Ph.D.
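The poll-and-veto protocol described in this abstract can be sketched in a few lines. The class and method names below are hypothetical stand-ins for the thesis's wireless ring hardware, not its actual API:

```python
class FingerRing:
    """Hypothetical stand-in for a wirelessly actuated finger ring."""
    def __init__(self):
        self.vibrating = False
        self._touched = False

    def vibrate(self):
        # Subtle, private alert: only the wearer feels it.
        self.vibrating = True

    def touch(self):
        # The wearer taps the ring to veto anonymously.
        self._touched = True

    def vetoed(self):
        return self._touched


def may_interrupt(rings):
    """Alert every participant of the face-to-face conversation, then
    allow the interruption only if no one vetoes. No participant learns
    who (if anyone) vetoed, nor whose phone was about to ring."""
    for ring in rings:
        ring.vibrate()
    return not any(ring.vetoed() for ring in rings)
```

In the actual system the Intermediary would wait a short window for touches before deciding; the sketch collapses that window into a single check.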

    A Body-and-Mind-Centric Approach to Wearable Personal Assistants


    Designing gaze-based interaction for pervasive public displays

    The last decade witnessed an increasing adoption of public interactive displays. Displays can now be seen in many public areas, such as shopping malls and train stations, and there is a growing trend towards using large public displays especially in airports, urban areas, universities, and libraries. Meanwhile, advances in eye tracking and visual computing promise straightforward integration of eye tracking on these displays for two purposes: 1) monitoring the user's visual behavior to evaluate different aspects of the display, such as measuring the visual attention of passersby, and 2) enabling interaction, such as allowing users to provide input, retrieve content, or transfer data using their eye movements. Gaze is particularly useful for pervasive public displays. In addition to being natural and intuitive, eye gaze can be detected from a distance, bringing interactivity to displays that are otherwise physically unreachable. Gaze reflects the user's intention and visual interests, and its subtle nature makes it well suited to public interactions where social embarrassment and privacy concerns might hinder the experience. On the downside, eye tracking technologies have traditionally been developed for desktop settings, where a user interacts from a stationary position and for a relatively long period of time. Interaction with public displays is fundamentally different and hence poses unique challenges for eye tracking. First, users of public displays are dynamic; they may approach the display from different directions and interact from different positions, or even while moving. This means that gaze-enabled displays should not expect users to be stationary at a specific position, but should instead adapt to the user's ever-changing position in front of the display. Second, users of public displays typically interact for short durations, often for only a few seconds. This means that, contrary to desktop settings, public displays cannot afford to require users to perform a time-consuming calibration prior to interaction. In this publication-based dissertation, we first report on a review of the challenges of interactive public displays and discuss the potential of gaze in addressing these challenges. We then showcase the implementation and in-depth evaluation of two applications in which gaze is leveraged to address core problems of today's public displays. The first is an eye-based solution, EyePACT, that tackles the parallax effect often experienced on today's touch-based public displays. We found that EyePACT significantly improves accuracy even with varying degrees of parallax. The second is a novel multimodal system, GTmoPass, that combines gaze and touch input for secure user authentication on public displays. GTmoPass was found to be highly resilient to shoulder surfing, thermal attacks, and smudge attacks, thereby offering a secure solution to an important problem on public displays. The second part of the dissertation explores specific challenges of gaze-based interaction with public displays. First, we address the user-positioning problem by means of active eye tracking: we built a novel prototype, EyeScout, that dynamically moves the eye tracker based on the user's position without instrumenting the user. This, in turn, allowed us to study and understand gaze-based interaction with public displays while walking and when approaching the display from different positions. An evaluation revealed that EyeScout is well received by users and reduces the time needed to initiate gaze interaction by 62% compared to the state of the art. Second, we propose a system, Read2Calibrate, for calibrating eye trackers implicitly while users read text on displays.
We found that although text-based calibration is less accurate than traditional methods, it integrates smoothly into reading and is thereby better suited to public displays. Finally, through our prototype system, EyeVote, we show how to allow users to select textual options on public displays via gaze without calibration. In a field deployment of EyeVote, we studied the trade-off between accuracy and selection speed when using calibration-free selection techniques. We found that users of public displays value faster interactions over accurate ones, and are willing to correct system errors in case of inaccuracies. We conclude by discussing the implications of our findings for the design of gaze-based interaction for public displays, and how our work can be adapted to other domains beyond public displays, such as handheld mobile devices.
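Implicit calibration of the Read2Calibrate kind can be illustrated with a toy version: pair raw gaze estimates with the known on-screen positions of the words being read, and fit a linear correction. The one-dimensional least-squares fit below is a simplified sketch under that assumption, not the system's actual model:

```python
def fit_linear(raw, target):
    """Ordinary least squares for target ≈ a * raw + b."""
    n = len(raw)
    mean_r = sum(raw) / n
    mean_t = sum(target) / n
    cov = sum((r - mean_r) * (t - mean_t) for r, t in zip(raw, target))
    var = sum((r - mean_r) ** 2 for r in raw)
    a = cov / var
    b = mean_t - a * mean_r
    return a, b


def calibrate_while_reading(raw_gaze_x, word_x):
    """Map raw horizontal gaze estimates to screen coordinates using
    the known x-positions of words the user has just read."""
    a, b = fit_linear(raw_gaze_x, word_x)
    return lambda x: a * x + b
```

A real system would fit both axes, filter fixations, and accumulate samples gradually as reading proceeds; the point here is only that reading supplies ground-truth targets for free.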

    Gaze estimation and interaction in real-world environments

    Human eye gaze has been widely used in human-computer interaction, as it is a promising modality for natural, fast, pervasive, and non-verbal interaction between humans and computers. As the foundation of gaze-related interactions, gaze estimation has been a hot research topic in recent decades. In this thesis, we focus on developing appearance-based gaze estimation methods and corresponding attentive user interfaces with a single webcam for challenging real-world environments. First, we collect a large-scale gaze estimation dataset, MPIIGaze, the first of its kind collected outside of controlled laboratory conditions. Second, we propose an appearance-based method that, in stark contrast to a long-standing tradition in gaze estimation, takes only the full face image as input. Third, we study data normalisation for the first time in a principled way, and propose a modification that yields significant performance improvements. Fourth, we contribute an unsupervised detector for human-human and human-object eye contact. Finally, we study personal gaze estimation with multiple personal devices, such as mobile phones, tablets, and laptops.
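Data normalisation in appearance-based gaze estimation typically re-renders each sample as if seen from a canonical camera pointed at the face, cancelling out variation in head position. Its geometric core, building the rotation whose z-axis points at the face centre, can be sketched as follows; the image warp itself is omitted, and this is an illustration of the general idea rather than the thesis's exact formulation:

```python
import math


def _norm(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalization_rotation(face_center):
    """Rotation (row-major 3x3) whose z-axis points from the camera
    origin to the face centre, so every sample is viewed from a
    canonical direction."""
    z = _norm(face_center)
    x = _norm(_cross((0.0, 1.0, 0.0), z))  # orthogonal to z and camera "down"
    y = _cross(z, x)
    return (x, y, z)


def rotate(R, v):
    """Apply a row-major rotation matrix to a 3-vector."""
    return tuple(sum(R[i][j] * v[j] for j in range(3)) for i in range(3))
```

Applying the rotation to the face centre itself moves it onto the z-axis of the normalized camera, which is exactly the invariant normalisation is after.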

    Affect-based information retrieval

    One of the main challenges Information Retrieval (IR) systems face nowadays originates from the semantic gap problem: the semantic difference between a user's query representation and the internal representation of an information item in a collection. The gap is further widened when the user is driven by an ill-defined information need, often the result of an anomaly in his or her current state of knowledge. The formulated search queries, which are submitted to the retrieval system to locate relevant items, then produce poor results that do not address the user's information needs. To deal with information-need uncertainty, IR systems have in the past employed a range of feedback techniques, varying from explicit to implicit. The first category necessitates the communication of explicit relevance judgments in return for better query reformulations and recommendations of relevant results. However, this comes at the expense of users' cognitive resources and, furthermore, introduces an additional layer of complexity to the search process. Implicit feedback techniques, on the other hand, make inferences about what is relevant based on observations of user search behaviour, thereby relieving users of the cognitive burden of document rating and relevance assessment. However, both categories of relevance feedback (RF) techniques determine topical relevance with respect to the cognitive and situational levels of interaction, failing to acknowledge the importance of emotions in cognition and decision making. In this thesis I investigate the role of emotions in the information-seeking process and develop affective feedback techniques for interactive IR. This novel feedback framework aims to aid the search process and facilitate a more natural and meaningful interaction. I develop affective models that determine topical relevance based on information gathered from various sensory channels, and enhance their performance using personalisation techniques. Furthermore, I present an operational video retrieval system that employs affective feedback to enrich user profiles and offer meaningful recommendations of unseen videos. The use of affective feedback as a surrogate for the information need is formalised as the Affective Model of Browsing: a cognitive model that motivates the use of evidence extracted from the psycho-somatic mobilisation that occurs during cognitive appraisal. Finally, I address some of the ethical and privacy issues that arise from the social-emotional interaction between users and computer systems. This study involves questionnaire data gathered over three user studies, from 74 participants of different educational backgrounds, ethnicities, and search experience. The results show that affective feedback is a promising area of research that can improve many aspects of the information-seeking process, such as indexing, ranking, and recommendation. Eventually, relevance inferences obtained from affective models may provide a more robust and personalised form of feedback, allowing us to deal more effectively with issues such as the semantic gap.
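As a concrete illustration of affect as implicit feedback, a retrieval system can blend each document's topical score with an affect-derived relevance estimate observed while the user viewed it. The weighting scheme and names below are illustrative only, not the thesis's actual models:

```python
def affective_rerank(results, affect_signal, alpha=0.7):
    """Re-rank (doc_id, topical_score) pairs by blending the topical
    score with a per-document affective relevance estimate in [0, 1]
    (e.g. inferred from sensory channels during viewing).
    alpha controls how much weight topical relevance retains."""
    def blended(item):
        doc_id, topical = item
        return alpha * topical + (1 - alpha) * affect_signal.get(doc_id, 0.0)
    return sorted(results, key=blended, reverse=True)
```

Documents without any affective observation simply keep their (down-weighted) topical score, so the affect channel can only be additive evidence, never a hard filter.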

    Eye Gaze Tracking for Human Computer Interaction

    With a growing number of computing devices around us, and the increasing time we spend interacting with such devices, we are strongly interested in finding new interaction methods which ease the use of computers or increase interaction efficiency. Eye tracking seems to be a promising technology to achieve this goal. This thesis researches interaction methods based on eye-tracking technology. After a discussion of the limitations of the eyes regarding accuracy and speed, including a general discussion of Fitts' law, the thesis follows three different approaches to utilizing eye tracking for computer input. The first approach researches eye gaze as a pointing device in combination with a touch sensor for multimodal input, and presents a method using a touch-sensitive mouse. The second approach examines people's ability to perform gestures with the eyes for computer input, and the separation of gaze gestures from natural eye movements. The third approach deals with the information inherent in the movement of the eyes and its application to assisting the user. The thesis presents a usability tool for recording interaction and gaze activity, and describes algorithms for reading detection. All approaches present results based on user studies conducted with prototypes developed for this purpose.
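The Fitts' law discussion can be made concrete with the Shannon formulation, MT = a + b · log2(D/W + 1), which predicts movement time from target distance D and width W; whether gaze pointing obeys this model is exactly what such analyses probe. The constants below are illustrative placeholders, not fitted values from the thesis:

```python
import math


def fitts_movement_time(distance, width, a=0.0, b=0.1):
    """Shannon formulation of Fitts' law. a (intercept, seconds) and
    b (seconds per bit) are device- and modality-specific constants
    normally fitted from pointing data; the defaults are placeholders."""
    index_of_difficulty = math.log2(distance / width + 1.0)  # in bits
    return a + b * index_of_difficulty
```

Note the logarithm: doubling the distance to a target raises the predicted movement time only by a constant number of bits, which is why small, far targets rather than merely far targets dominate pointing cost.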

    Learning in mobile context-aware applications

    This thesis explores and proposes solutions to the challenges in deploying context-aware systems that make decisions or take actions based on the predictions of a machine learner over long periods of time. In particular, this work focuses on mobile context-aware applications, which are intrinsically personal, requiring a specific solution for each individual that takes into account user preferences and changes in user behaviour as time passes. While there is an abundance of research on mobile context-aware applications that employ machine learning, most does not address the three core challenges a system must meet to be deployable over indefinite periods of time: (1) user-friendly, longitudinal collection and labelling of data, (2) measuring the user's experienced performance, and (3) adaptation to changes in the user's behaviour, also known as concept drift. This thesis addresses these challenges by introducing (1) an infer-and-confirm data collection strategy, which passively collects data and infers data labels from the user's natural response to target events, (2) a weighted accuracy measure Aw as the objective function for the underlying machine learners in mobile context-aware applications, and (3) two training-instance selection algorithms, Training Grid and Training Clusters, which forget data points only in areas of the data space where newer evidence is available, moving away from traditional time-window-based techniques. We also propose a new way of measuring concept drift that indicates which type of concept-drift adaptation strategy is likely to be beneficial for a given dataset. This thesis also shows the extent to which the requirements posed by the use of machine learning in deployable mobile context-aware applications influence the overall design, by evaluating a mobile context-aware application prototype called RingLearn, which was developed to mitigate disruptive incoming calls. Finally, we benchmark our training-instance selection algorithms over eight data corpora, including the RingLearn corpus, collected over 16 weeks, and the Device Analyzer corpus, which logs several years of smartphone usage for a large set of users. Results show that our algorithms perform at least as well as state-of-the-art solutions, and often significantly better, with performance deltas ranging from -0.2% to +11.3% compared to the best existing solutions in our experiments.
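The idea behind Training Grid, forgetting old instances only where newer evidence exists, can be sketched for a two-dimensional feature space. Cell size, per-cell capacity, and the data layout below are illustrative choices, not the thesis's parameters:

```python
from collections import defaultdict, deque


def training_grid(stream, cell_size=1.0, per_cell=3):
    """Partition the feature space into a grid and keep only the newest
    `per_cell` instances in each cell. Old points are displaced only in
    cells where newer evidence has arrived, unlike a sliding time window
    which forgets everywhere uniformly."""
    grid = defaultdict(deque)
    for x, y, label in stream:  # stream is time-ordered
        cell = (int(x // cell_size), int(y // cell_size))
        bucket = grid[cell]
        bucket.append((x, y, label))
        if len(bucket) > per_cell:
            bucket.popleft()  # forget the oldest point in this cell only
    return [point for bucket in grid.values() for point in bucket]
```

The contrast with a time window shows up when behaviour changes in one region of the space: a lone old point in an untouched cell survives indefinitely, while a cell flooded with fresh contradicting evidence sheds its stale points.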

    Understanding receptivity to interruptions in mobile human-computer interaction

    Interruptions have a profound impact on our attentional orientation in everyday life. Recent advances in mobile information technology and the increasing availability of services multiply the number of potentially disruptive notifications on mobile devices. Understanding the contextual intricacies that make us receptive to these interruptions is paramount to devising technology that supports interruption management. This thesis makes a number of contributions to the methodology of studying mobile experiences in situ, to understanding receptivity to interruptions, and to designing context-sensitive systems. It presents a series of real-world studies that investigate opportune moments for interruptions in mobile settings. To facilitate the study of the multi-faceted ways in which opportune moments surface from participants' involvement in the world, this thesis develops a model of the contextual factors that interact to guide receptivity to interruptions, and an adaptation of the Experience-Sampling Method (ESM) to capture behavioural responses to interruptions in situ. In two naturalistic experiments, participants' experiences of being interrupted on a mobile phone are sampled as they go about their everyday lives. In a field study, participants' experiences are observed and recorded as they use a notification-driven mobile application to create photo-stories in a theme park. Experiment 1 explores the effects of the content and time of delivery of the interruption. The results show that receptivity to text messages is significantly affected by message content, while scheduling one's own interruption times in advance does not improve receptivity over randomly timed interruptions. Experiment 2 investigates the hypothesis that opportune moments to deliver notifications are located at the endings of episodes of mobile interaction such as texting and calling. This notification strategy is supported by significant effects in behavioural measures of receptivity, while self-reports and interviews reveal complexities in the subjective experience of the interruption. By employing a mixed-methods approach of interviews, observations, and an analysis of system logs in the field study, it is shown that participants appreciated location-based notifications as prompts to foreground the application during relative 'downtimes' from other activities. However, an unexpected quantity of redundant notifications meant that visitors soon habituated to and eventually ignored them, which suggests careful, sparing use of notifications in interactive experiences. Overall, the studies showed that contextual mediation of the timing of interruptions (e.g. by phone activity in Experiment 2 and by opportune places in the field study) is more likely to lead to interruptions at opportune moments than letting participants schedule their own interruptions. However, momentary receptivity and responsiveness to an interruption are determined by the complex and situated interactions of local and relational contextual factors. These contextual factors are captured in a model of receptivity that underlies the interruption process. The studies highlight implications for the design of systems that seek to manage interruptions by adapting their timing to the user's situation. In particular, applications to manage interruptions in personal communication and pervasive experiences are considered.
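Experiment 2's strategy of delivering notifications at the endings of mobile interaction episodes can be sketched as a scan over a device's event log. The event names and log format are hypothetical, chosen only to illustrate the idea:

```python
def opportune_moments(events):
    """Given a time-ordered list of (timestamp, event) pairs, where
    events like 'call_start'/'call_end' bracket episodes of mobile
    interaction, return the timestamps at which an episode just ended,
    i.e. the candidate opportune moments for a deferred notification."""
    moments = []
    in_episode = False
    for timestamp, event in events:
        if event.endswith("_start"):
            in_episode = True
        elif event.endswith("_end") and in_episode:
            in_episode = False
            moments.append(timestamp)
    return moments
```

A notification manager built on this would hold a pending notification until the next returned moment instead of firing immediately, trading latency for a better-timed interruption.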
