10 research outputs found

    Jotmail: A voicemail interface that enables you to see what was said

    stevew/julia/urs @ research.att.com. Voicemail is a pervasive but under-researched tool for workplace communication. Despite potential advantages of voicemail over email, current phone-based voicemail UIs are highly problematic for users. We present a novel, Web-based voicemail interface, Jotmail. The design was based on data from several studies of voicemail tasks and user strategies. The GUI has two main elements: (a) personal annotations that serve as a visual analogue to underlying speech; (b) automatically derived message header information. We evaluated Jotmail in an 8-week field trial, where people used it as their only means for accessing voicemail. Jotmail was successful in supporting most key voicemail tasks, although users' electronic annotation and archiving behaviors were different from our initial predictions. Our results argue for the utility of a combination of annotation-based indexing and automatically derived information as a general technique for accessing speech archives.

    Enabling large-scale asynchronous audio discussions on mobile devices

    Thesis (S.M.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2006. Includes bibliographical references (p. 61-63). Current mobile technology works well to connect individuals together at any time or place. However, general focus on one-to-one conversations has overlooked the potential of always-on group and community links. I hypothesize that asynchronous persistent audio is a superior medium to support scalable always-on group communication for mobile devices. To evaluate this claim, one must first have an adequate interaction design before it is possible to investigate the qualities and usage patterns over the long term. This design does not exist for mobile devices. This thesis takes the first step in this direction by creating and evaluating an initial design called RadioActive. RadioActive is a technological and interaction design for persistent mobile audio chat spaces, focusing on the key issue of navigating asynchronous audio. If RadioActive is shown to be a good design in the long term, I hope to prove with additional studies the value of asynchronous persistent audio. In this thesis I examine related work, describe RadioActive from a methodologically constrained bottom-up approach, discuss the theoretical rationale behind the design, what seems to work, what doesn't, and suggestions for the future. Aaron Zinman. S.M.

    Navigating a spatialized speech environment through simultaneous listening within a hallway metaphor

    Thesis (M.S.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 1998. Includes bibliographical references (leaves 69-71). By Brenden Courtney Maher. M.S.

    NewsComm--a hand-held device for interactive access to structured audio

    Thesis (M.S.)--Massachusetts Institute of Technology, Program in Media Arts & Sciences, 1995. Includes bibliographical references (leaves 74-76). Deb Kumar Roy. M.S.

    AudioStreamer--leveraging the cocktail party effect for efficient listening

    Thesis (M.S.)--Massachusetts Institute of Technology, Program in Media Arts & Sciences, 1996. Includes bibliographical references (p. 89-94). By Atty Thomas Mullins. M.S.

    The annotation of continuous media

    In principle, the presentation of continuous media is time-dependent. Examples of continuous media are audio, video and graphics animation. This work is on the support for the annotation of continuous media, or the integration of voice comments with continuous-media documents like music and video clips. This application has strict synchronisation requirements, both with respect to the media involved and to user interaction. The application involves functions such as storage, management, control of GUIs, and of continuous-medium devices. These are realised by components which can be distributed across a network. New models and architectures have been defined to enable open distributed processing of applications, that is, distributed processing independent of operating systems. Abstractions are provided, which facilitate the development of applications, and these execute supported by platforms that implement such open architectures. These architectures have been based on an object-based client/server model. Our work aims at exploring object-orientation, open distributed processing and some characteristics of continuous media, through the development and use of the proposed application. The application is designed as a set of objects with well-defined functions and which interact between themselves. A distinguishing feature of the application is that it involves reusable components and mechanisms. For example, a mechanism, which enables components to control logical clocks and synchronise them, is incorporated in the application in response to its synchronisation requirements. The implementation is based on ANSAware, a platform that supports open distributed processing and allows distributed objects to bind to each other, to interact with one another, and to exhibit concurrent activities. The performance of the implementation is examined with respect to the application's response to user requests. Response times of operations such as play, pause, etc., are measured, and the final results are better than a defined maximum tolerance. An analysis of the development approach is made with respect to support for real-time activities in the application, and to software reuse in the model proposed. This thesis concludes by reviewing the suitability of the object-oriented approach for the development of distributed continuous media applications.

    Interactively skimming recorded speech

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Program in Media Arts & Sciences, 1994. Includes bibliographical references (p. 143-156). Barry Michael Arons. Ph.D.

    Capturing, structuring and representing ubiquitous audio

    Although talking is an integral part of collaboration, there has been little computer support for acquiring and accessing the contents of conversations. Our approach has focused on ubiquitous audio, or the unobtrusive capture of speech interactions in everyday work environments. Speech recognition technology cannot yet transcribe fluent conversational speech, so the words themselves are not available for organizing the captured interactions. Instead, the structure of an interaction is derived from acoustical information inherent in the stored speech and augmented by user interaction during or after capture. This article describes applications for capturing and structuring audio from office discussions and telephone calls, and mechanisms for later retrieval of these stored interactions. An important aspect of retrieval is choosing an appropriate visual representation, and this article describes the evolution of a family of representations across a range of applications. Finally, this work is placed within the broader context of desktop audio, mobile audio applications, and social implications.

    The audio notebook : paper and pen interaction with structured speech

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Program in Media Arts & Sciences, 1997. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Includes bibliographical references (leaves 143-150). Lisa Joy Stifelman. Ph.D.

    Speaking on the record

    Thesis (Ph. D.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2005. Includes bibliographical references (p. 258-273). Reading and writing have become the predominant way of acquiring and expressing intellect in Western culture. Somewhere along the way, the ability to write has become completely identified with intellectual power, creating a graphocentric myopia concerning the very nature and transfer of knowledge. One of the effects of graphocentrism is a conflation of concepts proper to knowledge in general with concepts specific to written expression. The words 'literate' and 'literacy' themselves are a simple case: their connotations sometimes focus on the process of reading text and sometimes on the kinds of knowledge that happen to be associated in our culture with people who read many books. This thesis has a conceptual and an empirical component. On the conceptual side a central task is to disengage certain concepts that have become conflated by defining new terms. Our vocabulary is insufficient to describe alternatives that serve some or all of the functions of writing and reading in a different modality. As a first step, I introduce a new word to provide a counterpart to writing in a spoken modality: speak + write = sprite. Spriting in its general form is the activity of speaking 'on the record' that yields a technologically-supported representation of oral speech with essential properties of writing such as permanence of record, possibilities of editing, indexing, and scanning, but without the difficult transition to a deeply different form of representation such as writing itself. This thesis considers a particular (still primitive compared with what might come in the future) version of spriting in the form of two technology-supported representations of speech: (1) the speech in audible form, and (2) the speech in visible form. The product of spriting is a kind of 'spoken' document, or talkument. As one reads a text, one may likewise aude a talkument. In contrast, I use the word writing for the manual activity of making marks, while text refers to the marks made. Making these distinctions is a small step towards envisioning a deep change in the world that might go beyond graphocentrism and come to appreciate spriting as the first step--but just the first--towards developing ways of manipulating spoken language, exemplified by turning it into a permanent record, permitting editing, indexing, searching and more. The empirical side of the thesis is confined to exploring implications of spriting in educational settings. I study one group of urban adults who are at elementary levels of reading and writing, and two groups of urban elementary school children who are of different ages, cultures and socioeconomic status, and who have appropriated writing as a tool for thought and expression to greater or lesser extents. One effect of graphocentrism in our culture is the very limited and constrained developmental path of literacy and learning. This has not always been the case. And it does not need to be so in the future. This thesis discusses some small ways in which we might re-value modes of expression in education closer to oral language than to writing. This thesis recognizes three ways in which spriting is relevant to education: (1) spriting can serve as a stepping stone to writing skills, (2) it can in some circumstances serve as a substitute for writing, and (3) it provides a window onto cognitive processes that are present but less apparent in the context of producing text. Tara Michelle Rosenberger Shankar. Ph.D.