    Integrating Multiple Sketch Recognition Methods to Improve Accuracy and Speed

    Sketch recognition is the computer understanding of hand-drawn diagrams. Recognizing sketches instantaneously is necessary to build beautiful interfaces with real-time feedback. Various techniques can quickly classify sketches into ten or twenty classes. However, for much larger datasets with sketches from a large number of classes, these existing techniques can take a long time to classify an incoming sketch accurately and require significant computational overhead. To make classification of large datasets feasible, we propose using multiple stages of recognition. In the initial stage, gesture-based feature values are calculated and a trained model is used to classify the incoming sketch. Sketches whose accuracy falls below a threshold value go through a second stage of geometric recognition. In this geometric stage, the sketch is segmented and sent to shape-specific recognizers. The sketches are matched against predefined shape descriptions, and confidence values are calculated. The system outputs a list of classes that the sketch could be classified as, along with the accuracy and precision for each sketch. This process significantly reduces the time taken to classify such large datasets of sketches and increases both the accuracy and the precision of the recognition.
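
    The two-stage flow described in this abstract can be illustrated with a minimal sketch. It assumes each stage is a function that returns (label, confidence) pairs and that the first stage's top confidence is what gets compared with the threshold. The names classify_two_stage, gesture_stage, geometric_stage and the default threshold of 0.8 are hypothetical and do not come from the paper.

```python
from typing import Callable, List, Tuple

# Assumption: a recognizer maps a sketch to (label, confidence) pairs.
Candidates = List[Tuple[str, float]]
Recognizer = Callable[[object], Candidates]


def classify_two_stage(
    sketch: object,
    gesture_stage: Recognizer,
    geometric_stage: Recognizer,
    threshold: float = 0.8,  # hypothetical cut-off, not taken from the paper
) -> Tuple[Candidates, str]:
    """Return ranked candidates and the name of the stage that produced them."""
    # Stage 1: cheap gesture-based classification, best candidate first.
    ranked = sorted(gesture_stage(sketch), key=lambda c: c[1], reverse=True)
    if ranked and ranked[0][1] >= threshold:
        return ranked, "gesture"
    # Stage 2: only uncertain sketches pay for the slower geometric pass.
    ranked = sorted(geometric_stage(sketch), key=lambda c: c[1], reverse=True)
    return ranked, "geometric"
```

    With stub recognizers, a sketch that the first stage scores at 0.95 is returned immediately, while one scored at 0.4 falls through to the geometric stage, which is where the time saving on large datasets would come from.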

    Evaluation of Conceptual Sketches on Stylus-Based Devices

    Design sketching is an important tool that designers and creative professionals use to express their ideas and thoughts in a visual medium. Because it is a critical and versatile skill for engineering students, it is often taught in universities with pen and paper. However, this traditional pedagogy is limited by the availability of human instructors to give feedback. In addition, students with low self-efficacy do not learn efficiently in a traditional learning environment. Intelligent interfaces can address this problem by mimicking the feedback an instructor would give and assessing student-drawn sketches to show students the areas they need to improve. PerSketchTivity is an intelligent tutoring system that allows students to practice their drawing fundamentals and gives them real-time assessment and feedback. This research deals with finding the evaluation metrics that will enable us to grade students from their sketch data. We work with seven metrics and analyse how each of them contributes to determining the quality of the sketches. The main contribution of this research is to identify the features that distinguish a good-quality sketch from a poor one and to design a grading metric that assigns each user sketch a final score between 0 and 1. Using these features and our grading metric, we grade all the sketches of students and experts.
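
    One simple way to combine several per-sketch metrics into a single score between 0 and 1 is a weighted average of clamped metric values. The sketch below is only an illustration of that idea: the abstract does not say how the seven metrics are combined, and the function name grade_sketch, the metric names, and the weights are all hypothetical.

```python
from typing import Mapping


def grade_sketch(metrics: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted average of metric values, each clamped to [0, 1].

    Assumption: every metric is already scaled so that 1.0 is best.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(
        weight * min(1.0, max(0.0, metrics[name]))  # clamp out-of-range values
        for name, weight in weights.items()
    )
    return weighted / total_weight
```

    For example, grade_sketch({"smoothness": 1.0, "accuracy": 0.0}, {"smoothness": 1.0, "accuracy": 1.0}) gives 0.5. Because each term is clamped and the sum is divided by the total weight, the result always stays within the 0 to 1 range the abstract asks for.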

    Extracción automática de modelos UML contenidos en imágenes (Automatic Extraction of UML Models Contained in Images)

    Many websites offer reusable source code, but none lets developers identify, find, and reuse design models expressed in UML. Yet a vast amount of freely available documentation contains such models, as images embedded in text documents: searching Google Images for “UML Class diagram” returns thousands of results. The bad news is that they are images. With current technology it is not possible to search precisely for semantic information inside images, so the diagrams cannot be searched by content or compared with one another. All developers can do is look for relevant documents, read them, and decide whether the designs suit their needs.
To solve this problem and make hundreds of thousands of UML designs available to the developer community, this work intends to develop the methodology needed to extract the textual and graphical information contained in images of UML diagrams and convert it into pure UML information (real UML models represented in a UML object model). Offering such a quantity of diagrams to software analysts, software developers, and other interested stakeholders will let them apply real, systematic, and modern software reuse based on UML diagram information retrieval. Being able to find all kinds of diagrams (static, dynamic, architectural, Use Case, etc.) by similarity to a given one will strengthen software development in terms of quality, controlled cost, and time to market: the three real benefits of software reuse.
The proposal faces several intertwined difficulties. First, the source information comes in varying quality, as color or grayscale bitmaps, and is usually stored in low-resolution images in which text is difficult to read and boxes and arrows are not properly drawn. Second, the semantics come from the combination of natural-language text and graphical structures. These structures carry semantic information, understandable by humans, that depends on the diagram type. Diagrams that represent software designs are highly structured visual documents with semantic content, and they must be distinguished from one another. Because they are stored as images, they require preprocessing with computer vision, OCR, and clustering or classification techniques based on machine learning. The main goal of this thesis is therefore to extract the semantics of UML diagram images found on the web. The extracted information will be both textual and structural: the textual information will be obtained with existing OCR technology, and the structural information will be detected by shape recognition and a semantic model combined with Artificial Intelligence.
The result of the proposal will be a methodology that allows repositories (on the web or private) to be loaded with UML diagrams based on, and pointing to, images found on the web, for further reuse: a Google of UML diagrams. Official Doctoral Program in Computer Science and Technology (Programa Oficial de Doctorado en Ciencia y Tecnología Informática). Chair: Antonio de Amescua Seco. Secretary: Susana Irene Díaz Rodríguez. Committee member: Pascual Campoy Cerver

    Pen-based Methods For Recognition and Animation of Handwritten Physics Solutions

    There has been considerable interest in constructing pen-based intelligent tutoring systems due to the natural interaction metaphor and low cognitive load afforded by pen-based interaction. We believe that pen-based intelligent tutoring systems can be further enhanced by integrating animation techniques. In this work, we explore methods for recognizing and animating sketched physics diagrams. Our methodologies enable an Intelligent Tutoring System (ITS) to understand the scenario and requirements posed by a given problem statement and to couple this knowledge with a computational model of the student's handwritten solution. These pieces of information are used to construct meaningful animations and feedback mechanisms that can highlight errors in student solutions. We have constructed a prototype ITS that can recognize mathematics and diagrams in a handwritten solution and infer implicit relationships among diagram elements, mathematics and annotations such as arrows and dotted lines. We use natural language processing to identify the domain of a given problem, and use this information to select one or more of four domain-specific physics simulators to animate the user's sketched diagram. We enable students to use their answers to guide animation behavior and also describe a novel algorithm for checking recognized student solutions. We provide examples of scenarios that can be modeled using our prototype system and discuss the strengths and weaknesses of our current prototype. Additionally, we present the findings of a user study that aimed to identify animation requirements for physics tutoring systems. We describe a taxonomy for categorizing different types of animations for physics problems and highlight how the taxonomy can be used to define requirements for 50 physics problems chosen from a university textbook. 
We also present a discussion of 56 handwritten solutions acquired from physics students and describe how suitable animations could be constructed for each of them

    Rethinking Pen Input Interaction: Enabling Freehand Sketching Through Improved Primitive Recognition

    Online sketch recognition uses machine learning and artificial intelligence techniques to interpret markings made by users via an electronic stylus or pen. The goal of sketch recognition is to understand the intention and meaning of a particular user's drawing. Diagramming applications have been the primary beneficiaries of sketch recognition technology, as it is commonplace for the users of these tools to first create a rough sketch of a diagram on paper before translating it, using computer-aided design tools, into a machine-understandable model that can then be used to perform simulations or other meaningful tasks. Traditional methods for performing sketch recognition can be broken down into three distinct categories: appearance-based, gesture-based, and geometric-based. Although each approach has its advantages and disadvantages, geometric-based methods have proven to be the most generalizable for multi-domain recognition. Tools such as the LADDER symbol description language have been shown to be capable of recognizing sketches from over 30 different domains using generalizable, geometric techniques. The LADDER system is limited, however, in that it uses a low-level recognizer that supports only a few primitive shapes, the building blocks for describing higher-level symbols. Systems that support a larger number of primitive shapes have been shown to have questionable accuracy as the number of primitives increases, or they place constraints on how users must input shapes (e.g. circles can only be drawn in a clockwise motion; rectangles must be drawn starting at the top-left corner). This dissertation significantly expands what is possible for free-sketch recognition systems, those that place little to no drawing constraints on users. In this dissertation, we describe multiple techniques to recognize upwards of 18 primitive shapes while maintaining high accuracy. 
We also provide methods for producing confidence values and generating multiple interpretations, and explore the difficulties of recognizing multi-stroke primitives. In addition, we show the need for a standardized data repository for sketch recognition algorithm testing and propose SOUSA (sketch-based online user study application), our online system for performing and sharing user study sketch data. Finally, we will show how the principles we have learned through our work extend to other domains, including activity recognition using trained hand posture cues

    Sketch recognition of digital ink diagrams : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Computer Science at Massey University, Palmerston North, New Zealand

    Figures are either re-used with permission, or abstracted with permission from the source article. Sketch recognition of digital ink diagrams is the process of automatically identifying hand-drawn elements in a diagram. This research focuses on the simultaneous grouping and recognition of shapes in digital ink diagrams. In order to recognise a shape, we need to group the strokes belonging to it; however, strokes cannot be grouped until the shape is identified. Therefore, we treat grouping and recognition as a simultaneous task. Our grouping technique uses spatial proximity to hypothesise shape candidates. Many of the hypothesised shape candidates are invalid, so we need a way to reject them. We present a novel rejection technique based on novelty detection. The rejection method uses proximity measures to validate a shape candidate. In addition, we investigate improving the accuracy of the current shape recogniser by adding extra features. We also present a novel connector recognition system that localises connector heads around recognised shapes. We perform a full comparative study on two datasets. The results show that our approach is significantly more accurate in finding shapes, and faster on process diagrams, than that of Stahovich et al. (2014), demonstrating its superiority in terms of computation time and accuracy. Furthermore, we evaluate our system on two public datasets and compare our results with other approaches reported in the literature that have used these datasets. The results show that our approach is more accurate in finding and recognising the shapes in the FC dataset (finding and recognising 91.7% of the shapes) than the results reported in the literature
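
    The idea of using spatial proximity to hypothesise shape candidates can be sketched as follows. This is an illustration under stated assumptions, not the thesis's algorithm: strokes are taken to be lists of (x, y) points, two strokes count as neighbours when their closest points are within max_gap, and every connected group of up to max_size neighbouring strokes becomes a candidate. The function names and both parameters are hypothetical.

```python
from itertools import combinations
from math import hypot


def stroke_distance(a, b):
    """Smallest point-to-point distance between two strokes."""
    return min(hypot(ax - bx, ay - by) for ax, ay in a for bx, by in b)


def hypothesise_candidates(strokes, max_gap=10.0, max_size=3):
    """Return connected groups of nearby stroke indices as frozensets."""
    n = len(strokes)
    # Proximity graph: which strokes lie within max_gap of each other.
    near = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if stroke_distance(strokes[i], strokes[j]) <= max_gap:
            near[i].add(j)
            near[j].add(i)
    # Grow groups one stroke at a time, starting from single strokes.
    candidates = set()
    frontier = {frozenset([i]) for i in range(n)}
    for _ in range(max_size):
        candidates |= frontier
        frontier = {
            group | {j}
            for group in frontier
            for i in group
            for j in near[i]
            if j not in group
        }
    return sorted(candidates, key=lambda g: (len(g), sorted(g)))
```

    Enumerating groups this way deliberately over-generates, which matches the abstract's remark that many hypothesised candidates are invalid and need a separate rejection step before recognition.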