5,539 research outputs found

    Sketch and attribute based query interfaces

    This thesis designs machine learning algorithms to improve human-computer interaction. The two areas of interest are (i) sketched symbol recognition and (ii) object recognition from images. Specifically, auto-completion of sketched symbols and attribute-centric recognition of objects from images are the main focus of this thesis. In the former task, the aim is to recognize partially drawn symbols before they are completed. Auto-completion during sketching is desirable since it eliminates the need for the user to draw symbols in their entirety if they can be recognized while they are partially drawn. It can thus be used to increase the sketching throughput; to facilitate sketching by offering possible alternatives to the user; and to reduce user-originated errors by providing continuous feedback. The latter task allows machine learning algorithms to describe objects with visual attributes such as “square”, “metallic” and “red”. Attributes as intermediate representations can be used to create systems with human-interpretable image indexes, zero-shot learning capability where only textual descriptions are available, or the capability to annotate images with textual descriptions.

    Stroke-based sketched symbol reconstruction and segmentation

    Hand-drawn objects usually consist of multiple semantically meaningful parts. For example, a stick figure consists of a head, a torso, and pairs of legs and arms. Efficient and accurate identification of these subparts promises to significantly improve algorithms for stylization, deformation, morphing and animation of 2D drawings. In this paper, we propose a neural network model that segments symbols into stroke-level components. Our segmentation framework has two main elements: a fixed feature extractor and a Multilayer Perceptron (MLP) network that identifies a component based on the extracted feature. As the feature extractor we use the encoder of stroke-rnn, our newly proposed generative Variational Auto-Encoder (VAE) model that reconstructs symbols on a stroke-by-stroke basis. Experiments show that a single encoder can be reused for segmenting multiple categories of sketched symbols with negligible effect on segmentation accuracy. Our segmentation scores surpass existing methodologies on an available small state-of-the-art dataset. Moreover, extensive evaluations on our newly annotated large dataset demonstrate that our framework obtains significantly better accuracy than baseline models. We release the dataset to the community.

    New methods, techniques and applications for sketch recognition

    2012-2013. The use of diagrams is common in various disciplines. Typical examples include maps, line graphs, bar charts, engineering blueprints, architects’ sketches, hand drawn schematics, etc. In general, diagrams can be created either by using pen and paper, or by using specific computer programs. These programs provide functions to facilitate the creation of the diagram, such as copy-and-paste, but the classic WIMP interfaces they use are unnatural when compared to pen and paper. Indeed, it is not rare that a designer prefers to use pen and paper at the beginning of the design, and then transfers the diagram to the computer later. To avoid this double step, a solution is to allow users to sketch directly on the computer. This requires both specific hardware and sketch recognition based software. As regards hardware, many pen/touch based devices such as tablets, smartphones, interactive boards and tables, etc. are available today at reasonable cost. Sketch recognition is needed when the sketch must be processed rather than treated as a simple image, and it is crucial to the success of this new modality of interaction. It is a difficult problem due to the inherent imprecision and ambiguity of a freehand drawing and to the many domains of application. The aim of this thesis is to propose new methods and applications for sketch recognition. The presentation of the results is divided into several contributions, addressing problems such as corner detection, sketched symbol recognition and autocompletion, graphical context detection, and sketched Euler diagram interpretation. The first contribution regards the problem of detecting the corners present in a stroke. Corner detection is often performed during preprocessing to segment a stroke into simple geometric primitives such as lines or curves. 
The corner recognizer proposed in this thesis, RankFrag, is inspired by the method proposed by Ouyang and Davis in 2011 and achieves higher accuracy than other methods recently proposed in the literature. The second contribution is a new method to recognize multi-stroke hand drawn symbols, which is invariant with respect to scaling and supports symbol recognition independently from the number and order of strokes. The method is an adaptation of the algorithm proposed by Belongie et al. in 2002 to the case of sketched images. This is achieved by using stroke related information. The method has been evaluated on a set of more than 100 symbols from the Military Course of Action domain and the results show that the new recognizer outperforms the original one. The third contribution is a new method for recognizing partially drawn multi-stroke symbols, which is invariant with respect to scale and supports symbol recognition independently from the number and order of strokes. The recognition technique is based on subgraph isomorphism and exploits a novel spatial descriptor, based on polar histograms, to represent relations between two stroke primitives. The tests show that the approach gives a satisfactory recognition rate with partially drawn symbols, even at a very low level of drawing completion, and outperforms the existing approaches proposed in the literature. Furthermore, as an application, a system presenting a user interface to draw symbols and implementing the proposed autocompletion approach has been developed. Moreover, a user study aimed at evaluating human performance in hand drawn symbol autocompletion is presented. Using the set of symbols from the Military Course of Action domain, the user study evaluates the conditions under which the users are willing to exploit the autocompletion functionality and those under which they can use it efficiently. 
The results show that the autocompletion functionality can be used profitably, with a drawing time saving of about 18%. The fourth contribution regards the detection of the graphical context of hand drawn symbols, and in particular, the development of an approach for identifying attachment areas on sketched symbols. In the field of syntactic recognition of hand drawn visual languages, the recognition of the relations among graphical symbols is one of the first important tasks to be accomplished and is usually reduced to recognizing the attachment areas of each symbol and the relations among them. The approach is independent from the method used to recognize symbols and assumes that the symbol has already been recognized. The approach is evaluated through a user study aimed at comparing the attachment areas detected by the system to those devised by the users. The results show that the system can identify attachment areas with reasonable accuracy. The last contribution is EulerSketch, an interactive system for the sketching and interpretation of Euler diagrams (EDs). The interpretation of a hand drawn ED produces two types of text encodings of the ED topology called static code and ordered Gauss paragraph (OGP) code, and a further encoding of its regions. Given the topology of an ED expressed through static or OGP code, EulerSketch automatically generates a new topologically equivalent ED in its graphical representation. [edited by author] XII n.s.

    Communication of the Master Site Plan of Rochester Institute of Technology


    To Draw or Not to Draw: Recognizing Stroke-Hover Intent in Gesture-Free Bare-Hand Mid-Air Drawing Tasks

    Over the past several decades, technological advancements have introduced new modes of communication with computers, marking a shift from traditional mouse and keyboard interfaces. While touch based interactions are widely used today, the latest developments in computer vision, body tracking stereo cameras, and augmented and virtual reality have now enabled communication with computers using spatial input in the physical 3D space. These techniques are now being integrated into several design critical tasks like sketching, modeling, etc. through sophisticated methodologies and the use of specialized instrumented devices. One of the prime challenges in design research is to make this spatial interaction with the computer as intuitive as possible for the users. Drawing curves in mid-air with fingers is a fundamental task with applications to 3D sketching, geometric modeling, handwriting recognition, and authentication. Sketching, in general, is a crucial mode for effective idea communication between designers. Mid-air curve input is typically accomplished through instrumented controllers, specific hand postures, or pre-defined hand gestures, in the presence of depth and motion sensing cameras. The user may use any of these modalities to express the intention to start or stop sketching. However, apart from suffering from issues like lack of robustness, the use of such gestures, specific postures, or the necessity of instrumented controllers for design specific tasks further results in an additional cognitive load on the user. To address the problems associated with different mid-air curve input modalities, the presented research discusses the design, development, and evaluation of data driven models for intent recognition in non-instrumented, gesture-free, bare-hand mid-air drawing tasks. 
The research is motivated by a behavioral study that demonstrates the need for such an approach due to the lack of robustness and intuitiveness while using hand postures and instrumented devices. The main objective is to study how users move during mid-air sketching, develop qualitative insights regarding such movements, and consequently implement a computational approach to determine when the user intends to draw in mid-air without the use of an explicit mechanism (such as an instrumented controller or a specified hand-posture). The idea is to record the user’s hand trajectory and classify each recorded point as either hover or stroke. The resulting model allows for the classification of points on the user’s spatial trajectory. Drawing inspiration from the way users sketch in mid-air, this research first specifies the necessity for an alternate approach for processing bare hand mid-air curves in a continuous fashion. Further, this research presents a novel drawing intent recognition workflow for every recorded drawing point, using three different approaches. We begin with recording mid-air drawing data and developing a classification model based on the extracted geometric properties of the recorded data. The main goal behind developing this model is to identify drawing intent from critical geometric and temporal features. In the second approach, we explore the variations in prediction quality of the model by improving the dimensionality of data used as mid-air curve input. In the third approach, we seek to understand the drawing intention from mid-air curves using sophisticated dimensionality reduction neural networks such as autoencoders. Finally, the broad level implications of this research are discussed, with potential development areas in the design and research of mid-air interactions.
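
The per-point hover/stroke idea described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the `(x, y, z, t)` sample layout, the choice of speed and turning angle as features, the function names `point_features` and `label_points`, and the speed-threshold rule, which is a stand-in and not the classification model developed in the thesis.

```python
import math
from typing import List, Tuple

# Hypothetical sample layout: (x, y, z, timestamp).
Point = Tuple[float, float, float, float]


def point_features(traj: List[Point]) -> List[Tuple[float, float]]:
    """Return (speed, turning_angle_in_radians) for each interior point."""
    feats = []
    for prev, cur, nxt in zip(traj, traj[1:], traj[2:]):
        v_in = [c - p for p, c in zip(prev[:3], cur[:3])]
        v_out = [n - c for c, n in zip(cur[:3], nxt[:3])]
        len_in = math.sqrt(sum(d * d for d in v_in))
        len_out = math.sqrt(sum(d * d for d in v_out))
        dt = nxt[3] - prev[3]
        # Temporal feature: path length around the point divided by elapsed time.
        speed = (len_in + len_out) / dt if dt > 0 else 0.0
        # Geometric feature: angle between incoming and outgoing segments.
        if len_in == 0.0 or len_out == 0.0:
            angle = 0.0
        else:
            dot = sum(a * b for a, b in zip(v_in, v_out))
            cos = max(-1.0, min(1.0, dot / (len_in * len_out)))
            angle = math.acos(cos)
        feats.append((speed, angle))
    return feats


def label_points(feats: List[Tuple[float, float]], speed_threshold: float) -> List[str]:
    """Stand-in rule (an assumption): slow points are 'stroke', fast ones 'hover'."""
    return ["stroke" if speed <= speed_threshold else "hover" for speed, _ in feats]


straight = [(0.0, 0.0, 0.0, 0.0), (1.0, 0.0, 0.0, 1.0),
            (2.0, 0.0, 0.0, 2.0), (3.0, 0.0, 0.0, 3.0)]
features = point_features(straight)
labels = label_points(features, speed_threshold=1.5)
```

In practice the threshold rule would be replaced by a trained classifier over such features; the sketch only shows how a trajectory becomes per-point inputs and per-point labels.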

    Automatic Adjacency Grammar Generation from User Drawn Sketches

    http://www.ieee.org In this paper we present an innovative approach to automatically generate adjacency grammars describing graphical symbols. A grammar production is formulated in terms of rulesets of geometrical constraints among symbol primitives. Given a set of symbol instances sketched by a user using a digital pen, our approach infers the grammar productions consisting of the ruleset most likely to occur. The performance of our work is evaluated using a comprehensive benchmarking database of on-line symbols.

    Constructivism, epistemology and information processing

    The author analyzes the main models of artificial intelligence which deal with the transition from one stage to another, a central problem in development. He describes the contributions of rule-based systems and connectionist systems to an explanation of this transition. He considers that artificial intelligence models, in spite of their limitations, establish fruitful points of contact with the constructivist position.

    BROADCAST AUTOMATA: A COMPUTATIONAL MODEL FOR MASSIVELY PARALLEL SYMBOLIC PROCESSING

    This paper presents a suitable formalism for the Broadcast Automata System, a model of massively parallel computation introduced by the authors for prototyping scientific applications. The model consists of a collection of identical entities, modelled as finite state automata; a global synchroniser providing coordination between the automata; and a broadcast communication system, to which each automaton is connected, enabling information exchange among the automata. The formalism is based on an extension of the classical formalism for finite state automata. The application to a case study concerning the recognition of first order propositional formulae is illustrated and the correctness proof is sketched.
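
The three ingredients named in this abstract (identical automata, a global synchroniser, and a broadcast channel) can be pictured with a toy simulation. This is a sketch under stated assumptions, not the paper's formalism: the class `MaxAutomaton`, the function `run_synchronised`, and the task of agreeing on the maximum value are all hypothetical and chosen only because they are small; the paper's own case study is formula recognition.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MaxAutomaton:
    """One of several identical automata; its state is the largest value seen."""
    state: int

    def emit(self) -> int:
        # Message this automaton places on the broadcast channel.
        return self.state

    def step(self, broadcast: List[int]) -> None:
        # Transition: read every broadcast message and keep the maximum.
        self.state = max([self.state] + broadcast)


def run_synchronised(automata: List[MaxAutomaton], max_rounds: int = 10) -> int:
    """Global synchroniser: advance all automata in lock-step rounds.

    Returns the number of rounds executed, stopping at the first round
    in which no automaton changes state.
    """
    for round_no in range(1, max_rounds + 1):
        # All messages are collected before any automaton transitions,
        # so every automaton sees the same broadcast in a given round.
        messages = [a.emit() for a in automata]
        before = [a.state for a in automata]
        for a in automata:
            a.step(messages)
        if [a.state for a in automata] == before:
            return round_no
    return max_rounds


automata = [MaxAutomaton(v) for v in (3, 9, 4)]
rounds = run_synchronised(automata)
print([a.state for a in automata], rounds)  # prints: [9, 9, 9] 2
```

Collecting all messages before any transition is what makes the rounds synchronous: the result does not depend on the order in which the automata are stepped.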