Convo: What does conversational programming need? An exploration of machine learning interface design
Vast improvements in natural language understanding and speech recognition
have paved the way for conversational interaction with computers. While
conversational agents have often been used for short goal-oriented dialog, we
know little about agents for developing computer programs. To explore the
utility of natural language for programming, we conducted a study (n=45)
comparing different input methods to a conversational programming system we
developed. Participants completed novice and advanced tasks using voice-based,
text-based, and voice-or-text-based systems. We found that users appreciated
aspects of each system (e.g., voice-input efficiency, text-input precision) and
that novice users were more optimistic about programming using voice-input than
advanced users. Our results show that future conversational programming tools
should be tailored to users' programming experience and allow users to choose
their preferred input mode. To reduce cognitive load, future interfaces can
incorporate visualizations and possess custom natural language understanding
and speech recognition models for programming.
Comment: 9 pages, 7 figures, submitted to VL/HCC 2020, for associated user
study video: https://youtu.be/TC5P3OO5ex
A system design for human factors studies of speech-enabled Web browsing
This paper describes the design of a system that will subsequently be used as the basis of a range of empirical studies aimed at discovering how best to harness speech recognition capabilities in multimodal multimedia computing. Initial work focuses on speech-enabled browsing of the World Wide Web, which was never designed for such use. The system design is complete and is being evaluated via usability testing.
Design and implementation of a user-oriented speech recognition interface: the synergy of technology and human factors
The design and implementation of a user-oriented speech recognition interface are described. The interface enables the use of speech recognition in so-called interactive voice response systems, which can be accessed via a telephone connection. In the design of the interface a synergy of technology and human factors is achieved. This synergy is very important for making speech interfaces a natural and acceptable form of human-machine interaction. Important concepts such as interfaces, human factors and speech recognition are discussed. Additionally, a sketch of the interface's implementation indicates how the synergy of human factors and technology can be realised. An explanation is also provided of how the interface might be fruitfully integrated into different applications.
A study of the very high order natural user language (with AI capabilities) for the NASA space station common module
The requirements are identified for a very high order natural language to be used by crew members on board the Space Station. The hardware facilities, databases, realtime processes, and software support are discussed. The operations and capabilities that will be required in both normal (routine) and abnormal (nonroutine) situations are evaluated. A structure and syntax for an interface (front-end) language to satisfy the above requirements are recommended.
Developing a comprehensive framework for multimodal feature extraction
Feature extraction is a critical component of many applied data science
workflows. In recent years, rapid advances in artificial intelligence and
machine learning have led to an explosion of feature extraction tools and
services that allow data scientists to cheaply and effectively annotate their
data along a vast array of dimensions---ranging from detecting faces in images
to analyzing the sentiment expressed in coherent text. Unfortunately, the
proliferation of powerful feature extraction services has been mirrored by a
corresponding expansion in the number of distinct interfaces to feature
extraction services. In a world where nearly every new service has its own API,
documentation, and/or client library, data scientists who need to combine
diverse features obtained from multiple sources are often forced to write and
maintain ever more elaborate feature extraction pipelines. To address this
challenge, we introduce a new open-source framework for comprehensive
multimodal feature extraction. Pliers is an open-source Python package that
supports standardized annotation of diverse data types (video, images, audio,
and text), and is expressly designed with both ease-of-use and extensibility in mind.
Users can apply a wide range of pre-existing feature extraction tools to their
data in just a few lines of Python code, and can also easily add their own
custom extractors by writing modular classes. A graph-based API enables rapid
development of complex feature extraction pipelines that output results in a
single, standardized format. We describe the package's architecture, detail its
major advantages over previous feature extraction toolboxes, and use a sample
application to a large functional MRI dataset to illustrate how pliers can
significantly reduce the time and effort required to construct sophisticated
feature extraction workflows while increasing code clarity and maintainability.
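
The abstract above describes modular extractor classes whose results come back in a single, standardized format. The following is a minimal sketch of that general pattern using only the Python standard library. It does not use or reproduce the pliers API: the names `Stim`, `Extractor`, `WordCountExtractor`, `MeanWordLengthExtractor`, and `run_pipeline` are hypothetical, and the flat loop stands in for the graph-based API the abstract mentions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Stim:
    """One input item. All field names here are illustrative assumptions."""
    name: str
    kind: str   # e.g. "text", "audio", "image"
    data: str


class Extractor:
    """Base class: a subclass declares its input kind and implements _extract."""
    input_kind = "text"

    def _extract(self, stim: Stim) -> Dict[str, float]:
        raise NotImplementedError

    def transform(self, stim: Stim) -> List[dict]:
        # Reject inputs of the wrong kind before extracting anything.
        if stim.kind != self.input_kind:
            raise TypeError(
                f"{type(self).__name__} expects {self.input_kind!r}, got {stim.kind!r}"
            )
        # Every extractor emits rows with the same four keys.
        return [
            {
                "stim": stim.name,
                "extractor": type(self).__name__,
                "feature": feature,
                "value": value,
            }
            for feature, value in self._extract(stim).items()
        ]


class WordCountExtractor(Extractor):
    def _extract(self, stim: Stim) -> Dict[str, float]:
        return {"word_count": float(len(stim.data.split()))}


class MeanWordLengthExtractor(Extractor):
    def _extract(self, stim: Stim) -> Dict[str, float]:
        words = stim.data.split()
        mean = sum(len(w) for w in words) / len(words) if words else 0.0
        return {"mean_word_length": mean}


def run_pipeline(extractors: List[Extractor], stims: List[Stim]) -> List[dict]:
    """Apply every extractor to every stim and collect rows in one flat list."""
    rows: List[dict] = []
    for stim in stims:
        for extractor in extractors:
            rows.extend(extractor.transform(stim))
    return rows


if __name__ == "__main__":
    stims = [Stim(name="s1", kind="text", data="hello big world")]
    for row in run_pipeline([WordCountExtractor(), MeanWordLengthExtractor()], stims):
        print(row)
```

In this sketch, adding a new feature means writing one small subclass, and downstream code only ever has to handle one row shape regardless of which extractor produced it.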
- …