88 research outputs found

    Challenges and Opportunities for the Design of Smart Speakers

    Advances in voice technology and voice user interfaces (VUIs) -- such as Alexa, Siri, and Google Home -- have opened up the potential for many new types of interaction. However, despite the potential of these devices reflected by the growing market and body of VUI research, there is a lingering sense that the technology is still underused. In this paper, we conducted a systematic literature review of 35 papers to identify and synthesize 127 VUI design guidelines into five themes. Additionally, we conducted semi-structured interviews with 15 smart speaker users to understand their use and non-use of the technology. From the interviews, we distill four design challenges that contribute the most to non-use. Based on their (non-)use, we identify four opportunity spaces for designers to explore, such as focusing on information support while multitasking (cooking, driving, childcare, etc.), incorporating users' mental models for smart speakers, and integrating calm design principles. Comment: 15 pages, 7 figures

    PopNet: a Pop Culture Knowledge Association Network for Supporting Creative Connections

    Pop culture is a pervasive and important aspect of communication and self-expression. When people wish to communicate using pop culture references, they need to find connections between their message and the things, people, locations, and actions of a movie, TV series, or other pop culture domain. However, finding an appropriate match from memory is challenging, and search engines are not specific enough to the task. Often domain-specific knowledge graphs provide the structure, specificity, and search capabilities that people need. We introduce PopNet, a Pop Culture Knowledge Association Network automatically created from plain text using state-of-the-art NLP methods to extract entities and actions from text summaries of movies and TV shows. The interface allows people to browse and search the entries to find connections. We conduct a study showing that this system is accurate and helpful for finding multiple connections between a message and a pop culture domain.

    Design Guidelines for Prompt Engineering Text-to-Image Generative Models

    Text-to-image generative models are a new and powerful way to generate visual artwork. However, the open-ended nature of text as interaction is double-edged; while users can input anything and have access to an infinite range of generations, they also must engage in brute-force trial and error with the text prompt when the result quality is poor. We conduct a study exploring what prompt keywords and model hyperparameters can help produce coherent outputs. In particular, we study prompts structured to include subject and style keywords and investigate success and failure modes of these prompts. Our evaluation of 5493 generations over the course of five experiments spans 51 abstract and concrete subjects as well as 51 abstract and figurative styles. From this evaluation, we present design guidelines that can help people produce better outcomes from text-to-image generative models.

    Eliciting Topic Hierarchies from Large Language Models

    Finding topics to write about can be a mentally demanding process. However, topic hierarchies can help writers explore topics of varying levels of specificity. In this paper, we use large language models (LLMs) to help construct topic hierarchies. Although LLMs have access to such knowledge, it can be difficult to elicit due to issues of specificity, scope, and repetition. We designed and tested three different prompting techniques to find one that maximized accuracy. We found that prepending the general topic area to a prompt yielded the most accurate results with 85% accuracy. We discuss applications of this research including STEM writing, education, and content creation. Comment: 4 pages, 4 figures

    Generative Disco: Text-to-Video Generation for Music Visualization

    Visuals are a core part of our experience of music, owing to the way they can amplify the emotions and messages conveyed through the music. However, creating music visualization is a complex, time-consuming, and resource-intensive process. We introduce Generative Disco, a generative AI system that helps generate music visualizations with large language models and text-to-image models. Users select intervals of music to visualize and then parameterize that visualization by defining start and end prompts. These prompts are warped between and generated according to the beat of the music for audioreactive video. We introduce design patterns for improving generated videos: "transitions", which express shifts in color, time, subject, or style, and "holds", which encourage visual emphasis and consistency. A study with professionals showed that the system was enjoyable, easy to explore, and highly expressive. We conclude on use cases of Generative Disco for professionals and how AI-generated content is changing the landscape of creative work.

    TurKit: Tools for Iterative Tasks on Mechanical Turk

    Mechanical Turk (MTurk) is an increasingly popular web service for paying people small rewards to do human computation tasks. Current uses of MTurk typically post independent parallel tasks. We are exploring an alternative iterative paradigm, in which workers build on or evaluate each other's work. We describe TurKit, a new toolkit for deploying iterative tasks to MTurk, with a familiar imperative programming paradigm that effectively uses MTurk workers as subroutines. National Science Foundation (U.S.) (Grant number IIS-0447800); Quanta Computer (Firm); Massachusetts Institute of Technology, Center for Collective Intelligence
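The iterative paradigm described in this abstract can be pictured with a small loop. The sketch below is an illustration only and is not TurKit's API: `iterate`, `improve`, and `prefer` are hypothetical names, and the two worker functions are local stand-ins for tasks that would really be posted to human workers.

```python
def iterate(initial, improve, prefer, rounds):
    """Alternate 'build on' and 'evaluate' steps for a fixed number of rounds."""
    best = initial
    for _ in range(rounds):
        candidate = improve(best)    # one worker builds on the current best
        if prefer(best, candidate):  # other workers evaluate the change
            best = candidate
    return best


# Stand-ins for human workers (assumptions for illustration only).
def improve(text):
    return text + "!"


def prefer(old, new):
    # Accept an edit only while the text stays short.
    return len(new) <= 8


final = iterate("draft", improve, prefer, rounds=5)
print(final)
```

Each round treats a worker as a subroutine call, in contrast to posting all tasks independently and in parallel.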

    Task search in a human computation market

    In order to understand how a labor market for human computation functions, it is important to know how workers search for tasks. This paper uses two complementary methods to gain insight into how workers search for tasks on Mechanical Turk. First, we perform a high-frequency scrape of 36 pages of search results and analyze it by looking at the rate of disappearance of tasks across key ways Mechanical Turk allows workers to sort tasks. Second, we present the results of a survey in which we paid workers for self-reported information about how they search for tasks. Our main findings are that on a large scale, workers sort by which tasks are most recently posted and which have the largest number of tasks available. Furthermore, we find that workers look mostly at the first page of the most recently posted tasks and the first two pages of the tasks with the most available instances, but in both categories the position on the result page is unimportant to workers. We observe that at least some employers try to manipulate the position of their task in the search results to exploit the tendency to search for recently posted tasks. On an individual level, we observed workers searching by almost all the possible categories and looking more than 10 pages deep. For a task we posted to Mechanical Turk, we confirmed that a favorable position in the search results does matter: our task with favorable positioning was completed 30 times faster and for less money than when its position was unfavorable. National Science Foundation (U.S.). Integrative Graduate Education and Research Traineeship (Multidisciplinary Program in Inequality & Social Policy) (Grant Number 033340

    TurKit: Human Computation Algorithms on Mechanical Turk

    Mechanical Turk (MTurk) provides an on-demand source of human computation. This provides a tremendous opportunity to explore algorithms which incorporate human computation as a function call. However, various systems challenges make this difficult in practice, and most uses of MTurk post large numbers of independent tasks. TurKit is a toolkit for prototyping and exploring algorithmic human computation, while maintaining a straightforward imperative programming style. We present the crash-and-rerun programming model that makes TurKit possible, along with a variety of applications for human computation algorithms. We also present case studies of TurKit used for real experiments across different fields. Xerox Corporation; National Science Foundation (U.S.) (Grant No. IIS-0447800); Quanta Computer; Massachusetts Institute of Technology, Center for Collective Intelligence
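The abstract names the crash-and-rerun model without detailing it. One way to picture the general idea is a script that can be rerun from the top at any time, because the results of costly steps are recorded and replayed instead of being recomputed. The sketch below is a minimal illustration under that assumption and is not TurKit's implementation: the `CrashAndRerun` class, its `once` method, and the `ask_worker` stand-in are all hypothetical.

```python
import json
import os
import tempfile


class CrashAndRerun:
    """Hypothetical helper: record step results in a JSON file across reruns."""

    def __init__(self, path):
        self.path = path
        self.index = 0  # position of the next `once` call in this run
        self.log = []
        if os.path.exists(path):
            with open(path) as f:
                self.log = json.load(f)

    def once(self, fn, *args):
        """Run fn(*args) only if this step has no recorded result yet."""
        if self.index < len(self.log):
            result = self.log[self.index]  # replay the stored result
        else:
            result = fn(*args)  # costly or non-deterministic step
            self.log.append(result)
            with open(self.path, "w") as f:
                json.dump(self.log, f)
        self.index += 1
        return result


calls = []


def ask_worker(prompt):
    """Stand-in for posting a task and waiting for a human answer."""
    calls.append(prompt)
    return f"answer to {prompt!r}"


def script(store_path):
    run = CrashAndRerun(store_path)
    first = run.once(ask_worker, "describe the image")
    second = run.once(ask_worker, "improve: " + first)
    return second


store = os.path.join(tempfile.mkdtemp(), "trace.json")
result_first_run = script(store)
result_rerun = script(store)  # replays both steps; no new worker calls
```

Because recorded steps are replayed in order, the script itself can stay in a plain imperative style even if it is interrupted and restarted between human answers.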