An Introduction to 3D User Interface Design
3D user interface design is a critical component of any virtual environment (VE) application. In this paper, we present a broad overview of three-dimensional (3D) interaction and user interfaces. We discuss the effect of common VE hardware devices on user interaction, as well as interaction techniques for generic 3D tasks and the use of traditional two-dimensional interaction styles in 3D environments. We divide most user interaction tasks into three categories: navigation, selection/manipulation, and system control. Throughout the paper, our focus is on presenting not only the available techniques, but also practical guidelines for 3D interaction design, along with some widely held myths. Finally, we briefly discuss two approaches to 3D interaction design, and some example applications with complex 3D interaction requirements. We also present an annotated online bibliography as a reference companion to this article.
ImageSpirit: Verbal Guided Image Parsing
Humans describe images in terms of nouns and adjectives, while algorithms operate on images represented as sets of pixels. Bridging this gap between how humans would like to access images and their typical representation is the goal of image parsing, which involves assigning object and attribute labels to pixels. In this paper we propose treating nouns as object labels and adjectives as visual attribute labels. This allows us to formulate the image parsing problem as one of jointly estimating per-pixel object and attribute labels from a set of training images. We propose an efficient (interactive-time) solution. Using the extracted labels as handles, our system empowers a user to verbally refine the results. This enables hands-free parsing of an image into pixel-wise object/attribute labels that correspond to human semantics. Verbally selecting objects of interest enables a novel and natural interaction modality that could be used to interact with new-generation devices (e.g., smartphones, Google Glass, living-room devices). We demonstrate our system on a large number of real-world images with varying complexity. To help understand the trade-offs compared to traditional mouse-based interactions, we report results for both a large-scale quantitative evaluation and a user study.
Comment: http://mmcheng.net/imagespirit
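The per-pixel labeling task the abstract describes can be pictured with a deliberately simplified sketch: given per-pixel class scores, object labels are mutually exclusive (argmax) while attributes are independent binary decisions. The scores below are invented numbers, and this baseline treats pixels independently; the paper estimates the labels jointly from training images, with spatial coupling this sketch omits.

```python
# Toy per-pixel scores for a 2x2 "image". Label sets and numbers are
# illustrative assumptions, not the paper's data or model.
OBJECTS = ["bed", "lamp", "wall"]
ATTRIBUTES = ["wooden", "white"]

object_scores = [
    [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]],
    [[0.2, 0.1, 0.7], [0.3, 0.3, 0.4]],
]
attr_scores = [
    [[0.9, 0.2], [0.1, 0.8]],
    [[0.4, 0.6], [0.7, 0.1]],
]

# Object label per pixel: argmax over mutually exclusive classes.
object_labels = [
    [max(range(len(OBJECTS)), key=lambda k: px[k]) for px in row]
    for row in object_scores
]

# Attribute labels per pixel: independent binary decisions (an object
# can be both "wooden" and "white", so no argmax here).
attr_labels = [
    [[s > 0.5 for s in px] for px in row]
    for row in attr_scores
]

print(object_labels)  # [[0, 1], [2, 2]]
```

The separation into one exclusive object label plus a set of non-exclusive attribute flags is what lets verbal refinement ("the wooden thing") act as a handle on the parse.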
Protecting Voice Controlled Systems Using Sound Source Identification Based on Acoustic Cues
Over the last few years, a rapidly increasing number of Internet-of-Things (IoT) systems that adopt voice as the primary user input have emerged. These systems have been shown to be vulnerable to various types of voice spoofing attacks. Existing defense techniques can usually only protect against a specific type of attack, or they require an additional authentication step involving another device. Such defense strategies are either not strong enough or lower the usability of the system. Based on the fact that legitimate voice commands should only come from humans rather than a playback device, we propose a novel defense strategy that detects the sound source of a voice command based on its acoustic features. The proposed defense strategy requires no information other than the voice command itself and can protect a system from multiple types of spoofing attacks. Our proof-of-concept experiments verify the feasibility and effectiveness of this defense strategy.
Comment: Proceedings of the 27th International Conference on Computer Communications and Networks (ICCCN), Hangzhou, China, July-August 2018. arXiv admin note: text overlap with arXiv:1803.0915
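The pipeline the abstract describes, extracting acoustic features from the command itself and then classifying the sound source, can be sketched as follows. The sub-band energy-ratio feature, the synthetic test tones, and the nearest-centroid classifier are illustrative stand-ins, not the paper's actual acoustic cues or model.

```python
import math

def subband_energy_ratio(samples, rate, split_hz=1000.0):
    """Energy below split_hz divided by total energy (naive DFT; fine for a sketch)."""
    n = len(samples)
    low = total = 0.0
    for k in range(n // 2):
        re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        power = re * re + im * im
        total += power
        if k * rate / n < split_hz:
            low += power
    return low / total if total else 0.0

def make_tone(freq, rate=8000, n=256):
    """Synthetic stand-in for a recorded command."""
    return [math.sin(2 * math.pi * freq * i / rate) for i in range(n)]

# Toy "training data": human speech dominated by low frequencies,
# playback dominated by higher ones. Real recordings would go here.
human = [subband_energy_ratio(make_tone(f), 8000) for f in (300, 500)]
playback = [subband_energy_ratio(make_tone(f), 8000) for f in (2000, 3000)]

def classify(samples, rate=8000):
    """Nearest-centroid decision on the single feature."""
    x = subband_energy_ratio(samples, rate)
    c_h = sum(human) / len(human)
    c_p = sum(playback) / len(playback)
    return "human" if abs(x - c_h) < abs(x - c_p) else "playback"

print(classify(make_tone(400)))  # human
```

The key property the defense relies on survives even in this toy version: the decision uses only the command audio itself, with no second device or extra authentication step.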
An information assistant system for the prevention of tunnel vision in crisis management
In the crisis management environment, tunnel vision is a set of biases in decision makers' cognitive processes that often leads to an incorrect understanding of the real crisis situation, biased perception of information, and improper decisions. The tunnel vision phenomenon is a consequence of both the challenges of the task and the natural limitations of human cognition. An information assistant system is proposed with the purpose of preventing tunnel vision. The system serves as a platform for monitoring the ongoing crisis event. All information goes through the system before it arrives at the user. The system enhances data quality, reduces data quantity, and presents the crisis information in a manner that prevents or relieves the user's cognitive overload. While working with such a system, the users (crisis managers) are expected to be more likely to stay aware of the actual situation, stay open-minded to possibilities, and make proper decisions.
Socially-distributed cognition and cognitive architectures: towards an ACT-R-based cognitive social simulation capability
ACT-R is one of the most widely used cognitive architectures, and it has been used to model hundreds of phenomena described in the cognitive psychology literature. In spite of this, there are relatively few studies that have attempted to apply ACT-R to situations involving social interaction. This is an important omission, since the social aspects of cognition have been a growing area of interest in the cognitive science community, and an understanding of the dynamics of collective cognition is of particular importance in many organizational settings. In order to support the computational modeling and simulation of socially-distributed cognitive processes, a simulation capability based on the ACT-R architecture is described. This capability features a number of extensions to the core ACT-R architecture that are intended to support social interaction and collaborative problem solving. The core features of a number of supporting applications and services are also described. These applications/services support the execution, monitoring and analysis of simulation experiments. Finally, a system designed to record human behavioral data in a collective problem-solving task is described. This system is being used to undertake a range of experiments with teams of human subjects, and it will ultimately support the development of high-fidelity ACT-R cognitive models. Such models can be used in conjunction with the ACT-R simulation capability to test hypotheses concerning the interaction between cognitive, social and technological factors in tasks involving socially-distributed information processing.
DolphinAttack: Inaudible Voice Commands
Speech recognition (SR) systems such as Siri or Google Now have become an increasingly popular human-computer interaction method, and have turned various systems into voice controllable systems (VCS). Prior work on attacking VCS shows that hidden voice commands that are incomprehensible to people can control the systems. Hidden voice commands, though hidden, are nonetheless audible. In this work, we design a completely inaudible attack, DolphinAttack, that modulates voice commands on ultrasonic carriers (e.g., f > 20 kHz) to achieve inaudibility. By leveraging the nonlinearity of the microphone circuits, the modulated low-frequency audio commands can be successfully demodulated, recovered, and, more importantly, interpreted by the speech recognition systems. We validate DolphinAttack on popular speech recognition systems, including Siri, Google Now, Samsung S Voice, Huawei HiVoice, Cortana, and Alexa. By injecting a sequence of inaudible voice commands, we show a few proof-of-concept attacks, which include activating Siri to initiate a FaceTime call on an iPhone, activating Google Now to switch the phone to airplane mode, and even manipulating the navigation system in an Audi automobile. We propose hardware and software defense solutions. We validate that it is feasible to detect DolphinAttack by classifying the audio using a support vector machine (SVM), and suggest re-designing voice controllable systems to be resilient to inaudible voice command attacks.
Comment: 15 pages, 17 figures
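The physical principle behind the attack can be illustrated numerically: an audible tone stands in for the voice command, it is amplitude-modulated onto an ultrasonic carrier, and a square-law term stands in for the microphone circuit's nonlinearity, which folds a baseband copy of the command back into the audible range. The specific frequencies, modulation depth, and nonlinearity coefficient below are illustrative assumptions, not the paper's measured parameters.

```python
import math

RATE = 192_000        # sample rate high enough to represent a 30 kHz carrier
F_CARRIER = 30_000.0  # ultrasonic carrier (> 20 kHz, inaudible)
F_VOICE = 1_000.0     # stand-in tone for the voice command's baseband content
N = 1024

t = [i / RATE for i in range(N)]

# Amplitude modulation: carrier scaled by (1 + m * voice(t)).
voice = [math.sin(2 * math.pi * F_VOICE * ti) for ti in t]
tx = [(1 + 0.8 * v) * math.cos(2 * math.pi * F_CARRIER * ti)
      for v, ti in zip(voice, t)]

# Microphone nonlinearity modeled as s + a*s^2: squaring the AM signal
# produces a term proportional to voice(t) at baseband (plus components
# near DC and 2*F_CARRIER that a low-pass stage would remove).
rx = [s + 0.5 * s * s for s in tx]

def band_power(x, f_lo, f_hi):
    """Power of x in [f_lo, f_hi) via a naive DFT (fine for a sketch)."""
    n = len(x)
    p = 0.0
    for k in range(n // 2):
        if f_lo <= k * RATE / n < f_hi:
            re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(x))
            im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(x))
            p += re * re + im * im
    return p

# The transmitted signal carries almost no energy in the audible band;
# after the nonlinearity, the command tone reappears near F_VOICE.
print(band_power(rx, 500, 1500) > band_power(tx, 500, 1500))  # True
```

Expanding the square makes the mechanism explicit: 0.5 * ((1 + 0.8v) cos wt)^2 contains the term 0.25 * (1 + 0.8v)^2, whose cross-term 0.4v is a baseband replica of the command, which is exactly why the speech recognizer can interpret a signal the user never hears.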