Sketch and attribute based query interfaces
In this thesis, machine learning algorithms to improve human-computer interaction are designed. The two areas of interest are (i) sketched symbol recognition and (ii) object recognition from images. Specifically, auto-completion of sketched symbols and attribute-centric recognition of objects from images are the main focus of this thesis. In the former task, the aim is to recognize partially drawn symbols before they are fully completed. Auto-completion during sketching is desirable since it eliminates the need for the user to draw symbols in their entirety if they can be recognized while partially drawn. It can thus be used to increase sketching throughput, to facilitate sketching by offering possible alternatives to the user, and to reduce user-originated errors by providing continuous feedback. The latter task allows machine learning algorithms to describe objects with visual attributes such as “square”, “metallic” and “red”. Attributes as intermediate representations can be used to create systems with human-interpretable image indexes, with zero-shot learning capability where only textual descriptions are available, or with the capability to annotate images with textual descriptions.
Stroke-based sketched symbol reconstruction and segmentation
Hand-drawn objects usually consist of multiple semantically meaningful parts.
For example, a stick figure consists of a head, a torso, and pairs of legs and
arms. Efficient and accurate identification of these subparts promises to
significantly improve algorithms for stylization, deformation, morphing and
animation of 2D drawings. In this paper, we propose a neural network model that
segments symbols into stroke-level components. Our segmentation framework has
two main elements: a fixed feature extractor and a Multilayer Perceptron (MLP)
network that identifies a component from its features. As the feature
extractor, we use the encoder of stroke-rnn, our newly proposed
generative Variational Auto-Encoder (VAE) model that reconstructs symbols on a
stroke-by-stroke basis. Experiments show that a single encoder can be reused
for segmenting multiple categories of sketched symbols with negligible effect
on segmentation accuracy. Our segmentation scores surpass existing
methodologies on an available, small, state-of-the-art dataset. Moreover,
extensive evaluations on our newly annotated large dataset demonstrate that our
framework obtains significantly better accuracies than baseline
models. We release the dataset to the community.
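The segmentation framework above (a frozen feature extractor feeding a small MLP head) can be sketched as follows. This is a minimal stand-in, not the stroke-rnn model: the encoder below just computes simple geometric statistics of a stroke, and the label set, network sizes, and random weights are all illustrative.

```python
import math
import random

random.seed(0)

def encoder(stroke):
    # Stand-in for the frozen stroke-rnn VAE encoder: a fixed mapping from
    # a stroke (list of (x, y) points) to a feature vector. Here: bounding
    # box plus arc length.
    xs = [p[0] for p in stroke]
    ys = [p[1] for p in stroke]
    length = sum(math.dist(a, b) for a, b in zip(stroke, stroke[1:]))
    return [min(xs), max(xs), min(ys), max(ys), length]

class MLP:
    """One-hidden-layer perceptron mapping a feature vector to class scores."""
    def __init__(self, n_in, n_hidden, n_out):
        rnd = lambda: random.uniform(-0.5, 0.5)
        self.w1 = [[rnd() for _ in range(n_in)] for _ in range(n_hidden)]
        self.w2 = [[rnd() for _ in range(n_hidden)] for _ in range(n_out)]

    def forward(self, x):
        # ReLU hidden layer followed by a linear output layer.
        h = [max(0.0, sum(w * v for w, v in zip(row, x))) for row in self.w1]
        return [sum(w * v for w, v in zip(row, h)) for row in self.w2]

# Hypothetical component labels for a stick-figure symbol.
LABELS = ["head", "torso", "arm", "leg"]
stroke = [(0.0, 0.0), (0.1, 0.4), (0.2, 0.9)]
mlp = MLP(n_in=5, n_hidden=8, n_out=len(LABELS))
scores = mlp.forward(encoder(stroke))
prediction = LABELS[scores.index(max(scores))]
```

In this setup only the MLP head would be trained per task, which is what makes it plausible for a single encoder to be shared across symbol categories.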
New methods, techniques and applications for sketch recognition
2012-2013
The use of diagrams is common in various disciplines. Typical examples
include maps, line graphs, bar charts, engineering blueprints, architects’
sketches, hand-drawn schematics, etc. In general, diagrams can be created
either by using pen and paper, or by using specific computer programs. These
programs provide functions to facilitate the creation of the diagram, such as
copy-and-paste, but the classic WIMP interfaces they use are unnatural when
compared to pen and paper. Indeed, it is not rare that a designer prefers
to use pen and paper at the beginning of the design, and then transfer the
diagram to the computer later.
To avoid this double step, a solution is to allow users to sketch directly on
the computer. This requires both specific hardware and sketch recognition
based software. As regards hardware, many pen/touch based devices such as
tablets, smartphones, interactive boards and tables, etc. are available today,
also at reasonable cost. Sketch recognition is needed whenever the sketch must
be processed rather than treated as a simple image, and it is crucial to the
success of this new modality of interaction. It is a difficult problem due to the
inherent imprecision and ambiguity of freehand drawing and to the many
application domains. The aim of this thesis is to propose new methods
and applications for sketch recognition. The presentation of the
results is divided into several contributions, addressing problems such as corner
detection, sketched symbol recognition and autocompletion, graphical context
detection, and sketched Euler diagram interpretation.
The first contribution regards the problem of detecting the corners present
in a stroke. Corner detection is often performed during preprocessing to
segment a stroke in single simple geometric primitives such as lines or curves.
The corner recognizer proposed in this thesis, RankFrag, is inspired by the
method proposed by Ouyang and Davis in 2011, and improves on the accuracy
of other methods recently proposed in the literature.
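The text does not restate RankFrag's procedure, so the following is only a baseline sketch of what stroke corner detection computes: flag interior points of a polyline where the direction turns sharply. RankFrag itself ranks and prunes candidate fragmentation points rather than applying a single angle threshold, so treat this as an illustration of the task, not of the method.

```python
import math

def corner_candidates(points, angle_thresh_deg=45.0):
    # Flag interior points where the polyline deviates from a straight
    # continuation by more than angle_thresh_deg.
    corners = []
    for i in range(1, len(points) - 1):
        ax, ay = points[i - 1]
        bx, by = points[i]
        cx, cy = points[i + 1]
        v1 = (ax - bx, ay - by)
        v2 = (cx - bx, cy - by)
        n1, n2 = math.hypot(*v1), math.hypot(*v2)
        if n1 == 0 or n2 == 0:
            continue
        cosang = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
        ang = math.degrees(math.acos(max(-1.0, min(1.0, cosang))))
        if ang < 180.0 - angle_thresh_deg:  # sharp turn at point i
            corners.append(i)
    return corners

# An L-shaped stroke: straight run, right-angle turn at index 2, straight run.
stroke = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
print(corner_candidates(stroke))  # [2]
```

Segmenting the stroke at the returned indices yields the simple geometric primitives (lines or curves) that later recognition stages operate on.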
The second contribution is a new method to recognize multi-stroke hand
drawn symbols, which is invariant with respect to scaling and supports symbol
recognition independently of the number and order of strokes. The method
is an adaptation of the algorithm proposed by Belongie et al. in 2002 to the
case of sketched images. This is achieved by using stroke related information.
The method has been evaluated on a set of more than 100 symbols from
the Military Course of Action domain and the results show that the new
recognizer outperforms the original one.
The third contribution is a new method for recognizing partially drawn
multi-stroke symbols, which is invariant with respect to scale and
supports symbol recognition independently of the number and order of
strokes. The recognition technique is based on subgraph isomorphism and
exploits a novel spatial descriptor, based on polar histograms, to represent
relations between two stroke primitives. Tests show that the approach
achieves a satisfactory recognition rate on partially drawn symbols, even
at very low levels of drawing completion, and outperforms existing
approaches in the literature. Furthermore, as an application, a system
with a user interface for drawing symbols, implementing the proposed
autocompletion approach, has been developed. A user study evaluating
human performance in hand-drawn symbol autocompletion is also presented.
Using the set of symbols from the Military Course of
Action domain, the user study evaluates the conditions under which the
users are willing to exploit the autocompletion functionality and those under
which they can use it efficiently. The results show that the autocompletion
functionality can be used in a profitable way, with a drawing time saving of
about 18%.
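A polar-histogram spatial descriptor of the kind used above to relate two stroke primitives can be sketched as follows: bin one stroke's points by radius and angle around the other stroke's centroid. The bin counts, the normalisation, and the choice of the centroid as reference point are assumptions; the abstract does not specify these details.

```python
import math

def polar_histogram(reference, other, n_r=2, n_theta=4):
    # Histogram of `other`'s points, binned by radius and angle around the
    # centroid of `reference`, then normalised to sum to 1.
    cx = sum(p[0] for p in reference) / len(reference)
    cy = sum(p[1] for p in reference) / len(reference)
    rmax = max(math.hypot(x - cx, y - cy) for x, y in other) or 1.0
    hist = [[0] * n_theta for _ in range(n_r)]
    for x, y in other:
        r = math.hypot(x - cx, y - cy)
        th = math.atan2(y - cy, x - cx) % (2 * math.pi)
        ri = min(int(r / rmax * n_r), n_r - 1)
        ti = min(int(th / (2 * math.pi) * n_theta), n_theta - 1)
        hist[ri][ti] += 1
    total = sum(map(sum, hist))
    return [[c / total for c in row] for row in hist]
```

Because the descriptor is normalised by the maximum radius, it is scale-invariant, which matches the invariance properties claimed for the recognizer.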
The fourth contribution regards the detection of the graphical context of
hand drawn symbols, and in particular, the development of an approach for
identifying attachment areas on sketched symbols. In the field of syntactic
recognition of hand drawn visual languages, the recognition of the relations
among graphical symbols is one of the first important tasks to be accomplished
and is usually reduced to recognizing the attachment areas of each symbol
and the relations among them. The approach is independent of the method
used to recognize symbols and assumes that the symbol has already been
recognized.
The approach is evaluated through a user study aimed at comparing the
attachment areas detected by the system to those devised by the users. The
results show that the system can identify attachment areas with a reasonable
accuracy.
The last contribution is EulerSketch, an interactive system for the sketching
and interpretation of Euler diagrams (EDs). The interpretation of a hand
drawn ED produces two types of text encodings of the ED topology called
static code and ordered Gauss paragraph (OGP) code, and a further encoding
of its regions. Given the topology of an ED expressed through static or OGP
code, EulerSketch automatically generates a new topologically equivalent ED
in its graphical representation.
To Draw or Not to Draw: Recognizing Stroke-Hover Intent in Gesture-Free Bare-Hand Mid-Air Drawing Tasks
Over the past several decades, technological advancements have introduced new modes of communication
with computers, marking a shift away from traditional mouse and keyboard interfaces.
While touch-based interactions are widely used today, recent developments in computer
vision, body-tracking stereo cameras, and augmented and virtual reality now enable communicating
with computers using spatial input in physical 3D space. These techniques are
being integrated into design-critical tasks such as sketching and modeling through sophisticated
methodologies and specialized instrumented devices. One of the prime challenges in
design research is to make this spatial interaction with the computer as intuitive as possible for the
users.
Drawing curves in mid-air with the fingers is a fundamental task with applications to 3D sketching,
geometric modeling, handwriting recognition, and authentication. Sketching in general is a
crucial mode of effective idea communication between designers. Mid-air curve input is typically
accomplished through instrumented controllers, specific hand postures, or pre-defined hand gestures,
in the presence of depth- and motion-sensing cameras. The user may use any of these modalities
to express the intention to start or stop sketching. However, apart from suffering from issues such as a
lack of robustness, such gestures, specific postures, and instrumented
controllers impose an additional cognitive load on the user.
To address the problems associated with different mid-air curve input modalities, the presented
research discusses the design, development, and evaluation of data driven models for intent recognition
in non-instrumented, gesture-free, bare-hand mid-air drawing tasks.
The research is motivated by a behavioral study that demonstrates the need for such an approach
due to the lack of robustness and intuitiveness while using hand postures and instrumented
devices. The main objective is to study how users move during mid-air sketching, develop qualitative
insights regarding such movements, and consequently implement a computational approach to
determine when the user intends to draw in mid-air without the use of an explicit mechanism (such
as an instrumented controller or a specified hand posture). The idea is to
record the user’s hand trajectory and classify each of its points as either
hover or stroke; the resulting model thus labels every point on the user’s
spatial trajectory.
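A minimal sketch of per-point hover-versus-stroke classification, assuming a timestamped 3D trajectory. The two features (speed and path straightness) and the hand-set thresholds are illustrative stand-ins for the learned, data-driven models described in the dissertation.

```python
import math

def point_features(traj, i, k=2):
    # Speed and straightness in a window around point i of a timestamped
    # trajectory [(x, y, z, t), ...] -- two plausible geometric/temporal
    # cues; the actual feature set used in the dissertation is richer.
    lo, hi = max(0, i - k), min(len(traj) - 1, i + k)
    path = sum(math.dist(traj[j][:3], traj[j + 1][:3]) for j in range(lo, hi))
    chord = math.dist(traj[lo][:3], traj[hi][:3])
    dt = (traj[hi][3] - traj[lo][3]) or 1e-9
    straightness = chord / path if path else 1.0
    return path / dt, straightness

def classify(traj, i, speed_max=0.5, straight_min=0.6):
    # Toy rule: slow, fairly straight motion -> "stroke"; fast or erratic
    # motion -> "hover". Thresholds are invented; the dissertation learns
    # this decision boundary from recorded drawing data.
    speed, straightness = point_features(traj, i)
    return "stroke" if speed <= speed_max and straightness >= straight_min else "hover"
```

Running such a classifier over the whole recorded trajectory yields the continuous, gesture-free segmentation of mid-air motion into drawing and non-drawing phases.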
Drawing inspiration from the way users sketch in mid-air, this research first establishes the necessity
of an alternate approach for processing bare-hand mid-air curves in a continuous fashion.
Further, this research presents a novel drawing-intent recognition workflow for every recorded
drawing point, using three different approaches. We begin by recording mid-air drawing data
and developing a classification model based on the extracted geometric properties of the recorded
data. The main goal behind developing this model is to identify drawing intent from critical geometric
and temporal features. In the second approach, we explore the variations in prediction
quality of the model by increasing the dimensionality of the data used as mid-air curve input. In
the third approach, we seek to understand drawing intention from mid-air curves using
sophisticated dimensionality-reduction neural networks such as autoencoders. Finally, the broad-level
implications of this research are discussed, along with potential development areas in the design
and research of mid-air interactions.
Automatic Adjacency Grammar Generation from User Drawn Sketches
In this paper we present an innovative approach to automatically generate adjacency grammars describing graphical symbols. A grammar production is formulated in terms of rulesets of geometrical constraints among symbol primitives. Given a set of symbol instances sketched by a user with a digital pen, our approach infers the grammar productions consisting of the ruleset most likely to occur. The performance of our work is evaluated using a comprehensive benchmarking database of on-line symbols.
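One way to read "inferring the ruleset most likely to occur" is to keep only the geometric constraints that hold across all sketched instances of a symbol. The sketch below checks a single hypothetical constraint type (endpoint adjacency between line-segment primitives); the paper's adjacency grammars cover richer constraints, so this is a toy reading, not the published algorithm.

```python
import math

def infer_ruleset(instances, tol=0.1):
    # Each instance is a list of primitives given as ((x1, y1), (x2, y2))
    # line segments, with primitives in the same order across instances.
    # A rule ("touches", i, j) survives only if primitives i and j have
    # nearby endpoints in every sketched instance.
    def touches(a, b):
        return any(math.dist(p, q) <= tol for p in a for q in b)

    n = len(instances[0])
    rules = []
    for i in range(n):
        for j in range(i + 1, n):
            if all(touches(inst[i], inst[j]) for inst in instances):
                rules.append(("touches", i, j))
    return rules
```

Constraints that hold only by accident in one drawing are filtered out as more instances are provided, which is the intuition behind inferring productions from multiple user sketches.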
Constructivism, epistemology and information processing
The author analyzes the main models of artificial intelligence which deal with the transition from one stage to another, a central problem in development. He describes the contributions of rule-based systems and connectionist systems to an explanation of this transition. He considers that artificial intelligence models, in spite of their limitations, establish fruitful points of contact with the constructivist position.
BROADCAST AUTOMATA: A COMPUTATIONAL MODEL FOR MASSIVELY PARALLEL SYMBOLIC PROCESSING
This paper presents a suitable formalism for the Broadcast Automata System, a model
of massively parallel computation introduced by the authors for the prototyping of scientific
applications. The model consists of a collection of identical entities, modelled as finite
state automata; a global synchroniser providing coordination between the automata; and
a broadcast communication system, to which each automaton is connected, enabling information
exchange among the automata. The formalism is based on an extension of the
classical formalism for finite state automata. Its application to a case study concerning
the recognition of first order propositional formulae is illustrated, and the correctness proof
is sketched.
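The model as described (identical finite-state automata, a global synchroniser, and a shared broadcast channel) admits a minimal simulation sketch. The states, transition function, and messages below are invented for illustration and are not the paper's formalism or its case study.

```python
class BroadcastAutomataSystem:
    # A collection of identical automata stepped in lock-step by a global
    # synchroniser; every automaton reads the same broadcast inbox each step.
    def __init__(self, n, transition, initial="idle"):
        self.states = [initial] * n
        # transition: (state, broadcast_inbox) -> (next_state, msg_or_None)
        self.transition = transition

    def step(self, inbox):
        outbox, next_states = [], []
        for s in self.states:
            s2, msg = self.transition(s, inbox)
            next_states.append(s2)
            if msg is not None:
                outbox.append(msg)
        self.states = next_states  # synchroniser: all update together
        return outbox  # messages broadcast during this step

# Toy transition: every automaton wakes up when it hears "go" and acknowledges.
def t(state, inbox):
    if state == "idle" and "go" in inbox:
        return "active", "ack"
    return state, None

sys_ = BroadcastAutomataSystem(3, t)
acks = sys_.step(["go"])
```

Because all automata share one channel and one clock, a single broadcast reaches every entity in the same step, which is the property the formalism is built around.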