11,384 research outputs found
ImageSpirit: Verbal Guided Image Parsing
Humans describe images in terms of nouns and adjectives while algorithms
operate on images represented as sets of pixels. Bridging this gap between how
humans would like to access images versus their typical representation is the
goal of image parsing, which involves assigning object and attribute labels to
pixel. In this paper we propose treating nouns as object labels and adjectives
as visual attribute labels. This allows us to formulate the image parsing
problem as one of jointly estimating per-pixel object and attribute labels from
a set of training images. We propose an efficient (interactive time) solution.
Using the extracted labels as handles, our system empowers a user to verbally
refine the results. This enables hands-free parsing of an image into pixel-wise
object/attribute labels that correspond to human semantics. Verbally selecting
objects of interests enables a novel and natural interaction modality that can
possibly be used to interact with new generation devices (e.g. smart phones,
Google Glass, living room devices). We demonstrate our system on a large number
of real-world images with varying complexity. To help understand the tradeoffs
compared to traditional mouse based interactions, results are reported for both
a large scale quantitative evaluation and a user study.Comment: http://mmcheng.net/imagespirit
Computer-aided boundary delineation of agricultural lands
The National Agricultural Statistics Service of the United States Department of Agriculture (USDA) presently uses labor-intensive aerial photographic interpretation techniques to divide large geographical areas into manageable-sized units for estimating domestic crop and livestock production. Prototype software, the computer-aided stratification (CAS) system, was developed to automate the procedure, and currently runs on a Sun-based image processing system. With a background display of LANDSAT Thematic Mapper and United States Geological Survey Digital Line Graph data, the operator uses a cursor to delineate agricultural areas, called sampling units, which are assigned to strata of land-use and land-cover types. The resultant stratified sampling units are used as input into subsequent USDA sampling procedures. As a test, three counties in Missouri were chosen for application of the CAS procedures. Subsequent analysis indicates that CAS was five times faster in creating sampling units than the manual techniques were
A study of the very high order natural user language (with AI capabilities) for the NASA space station common module
The requirements are identified for a very high order natural language to be used by crew members on board the Space Station. The hardware facilities, databases, realtime processes, and software support are discussed. The operations and capabilities that will be required in both normal (routine) and abnormal (nonroutine) situations are evaluated. A structure and syntax for an interface (front-end) language to satisfy the above requirements are recommended
Object Referring in Visual Scene with Spoken Language
Object referring has important applications, especially for human-machine
interaction. While having received great attention, the task is mainly attacked
with written language (text) as input rather than spoken language (speech),
which is more natural. This paper investigates Object Referring with Spoken
Language (ORSpoken) by presenting two datasets and one novel approach. Objects
are annotated with their locations in images, text descriptions and speech
descriptions. This makes the datasets ideal for multi-modality learning. The
approach is developed by carefully taking down ORSpoken problem into three
sub-problems and introducing task-specific vision-language interactions at the
corresponding levels. Experiments show that our method outperforms competing
methods consistently and significantly. The approach is also evaluated in the
presence of audio noise, showing the efficacy of the proposed vision-language
interaction methods in counteracting background noise.Comment: 10 pages, Submitted to WACV 201
Implementation of a Human-Computer Interface for Computer Assisted Translation and Handwritten Text Recognition
A human-computer interface is developed to provide services of computer assisted machine translation (CAT) and computer assisted transcription of handwritten text images (CATTI). The back-end machine translation (MT) and handwritten text recognition (HTR) systems are provided by the Pattern Recognition and Human Language Technology (PRHLT) research group. The idea is to provide users with easy to use tools to convert interactive translation and transcription feasible tasks. The assisted service is provided by remote servers with CAT or CATTI capabilities. The interface supplies the user with tools for efficient local edition: deletion, insertion and substitution.Ocampo Sepúlveda, JC. (2009). Implementation of a Human-Computer Interface for Computer Assisted Translation and Handwritten Text Recognition. http://hdl.handle.net/10251/14318Archivo delegad
Semi-Automated SVG Programming via Direct Manipulation
Direct manipulation interfaces provide intuitive and interactive features to
a broad range of users, but they often exhibit two limitations: the built-in
features cannot possibly cover all use cases, and the internal representation
of the content is not readily exposed. We believe that if direct manipulation
interfaces were to (a) use general-purpose programs as the representation
format, and (b) expose those programs to the user, then experts could customize
these systems in powerful new ways and non-experts could enjoy some of the
benefits of programmable systems.
In recent work, we presented a prototype SVG editor called Sketch-n-Sketch
that offered a step towards this vision. In that system, the user wrote a
program in a general-purpose lambda-calculus to generate a graphic design and
could then directly manipulate the output to indirectly change design
parameters (i.e. constant literals) in the program in real-time during the
manipulation. Unfortunately, the burden of programming the desired
relationships rested entirely on the user.
In this paper, we design and implement new features for Sketch-n-Sketch that
assist in the programming process itself. Like typical direct manipulation
systems, our extended Sketch-n-Sketch now provides GUI-based tools for drawing
shapes, relating shapes to each other, and grouping shapes together. Unlike
typical systems, however, each tool carries out the user's intention by
transforming their general-purpose program. This novel, semi-automated
programming workflow allows the user to rapidly create high-level, reusable
abstractions in the program while at the same time retaining direct
manipulation capabilities. In future work, our approach may be extended with
more graphic design features or realized for other application domains.Comment: In 29th ACM User Interface Software and Technology Symposium (UIST
2016
- …