12 research outputs found

    A visual approach to sketched symbol recognition

    There is increasing interest in building systems that can automatically interpret hand-drawn sketches. However, many challenges remain in terms of recognition accuracy, robustness to different drawing styles, and ability to generalize across multiple domains. To address these challenges, we propose a new approach to sketched symbol recognition that focuses on the visual appearance of the symbols. This allows us to better handle the range of visual and stroke-level variations found in freehand drawings. We also present a new symbol classifier that is computationally efficient and invariant to rotation and local deformations. We show that our method exceeds state-of-the-art performance on all three domains we evaluated: handwritten digits, PowerPoint shapes, and electrical circuit symbols.

    Toward natural interaction in the real world: real-time gesture recognition

    Using a new hand tracking technology capable of tracking 3D hand postures in real time, we developed a recognition system for continuous natural gestures. By natural gestures, we mean those encountered in spontaneous interaction, rather than a set of artificial gestures chosen to simplify recognition. To date we have achieved 95.6% accuracy on isolated gesture recognition, and a 73% recognition rate on continuous gesture recognition, with data from three users and twelve gesture classes. We connected our gesture recognition system to Google Earth, enabling real-time gestural control of a 3D map. We describe the challenges of signal accuracy and signal interpretation presented by working in a real-world environment, and detail how we overcame them. National Science Foundation (U.S.) (award IIS-1018055); Pfizer Inc.; Foxconn Technolog

    Clock drawing test digit recognition using static and dynamic features

    The clock drawing test (CDT) is a standard neurological test for detection of cognitive impairment. A computerised version of the test promises to improve the accessibility of the test in addition to obtaining more detailed data about the subject's performance. Automatic handwriting recognition is one of the first stages in the analysis of the computerised test, which produces a set of recognized digits and symbols together with their positions on the clock face. Subsequently, these are used in the test scoring. This is a challenging problem because the average CDT taker has a high likelihood of cognitive impairment, and writing is one of the first functional activities to be affected. Current handwritten digit recognition systems perform less well on this kind of data due to its unintelligibility. In this paper, a new system for numeral handwriting recognition in the CDT is proposed. The system is based on two complementary sources of data, namely static and dynamic features extracted from handwritten data. The main novelty of this paper is the new handwriting digit recognition system, which combines two classifiers: a fuzzy k-nearest neighbour classifier for dynamic stroke-based features and a convolutional neural network for static image-based features. The combination can take advantage of both static and dynamic data. The proposed digit recognition system is tested on two sets of data: first, Pendigits online handwriting digits; and second, digits from the actual CDTs. The latter data set came from 65 drawings made by healthy people and 100 drawings reproduced from the drawings by dementia patients. The test on both data sets shows that the proposed combination system can outperform each classifier individually in terms of recognition accuracy, especially when assessing the handwriting of people with dementia.
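
    The abstract above does not say how the outputs of the two classifiers are merged. As a minimal sketch only, assuming each classifier emits one score per digit class, a weighted average of the two score vectors is one common way to combine them. The function name `fuse_scores` and the `weight` parameter are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def fuse_scores(
    dynamic_scores: Sequence[float],
    static_scores: Sequence[float],
    weight: float = 0.5,
) -> int:
    """Return the index of the class with the highest fused score.

    ASSUMPTION: weighted averaging is an illustrative fusion rule,
    not necessarily the rule used in the paper.
    `weight` is the share given to the dynamic (stroke-based) scores.
    """
    if len(dynamic_scores) != len(static_scores):
        raise ValueError("both classifiers must score the same classes")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    # Blend the two score vectors class by class.
    fused = [
        weight * d + (1.0 - weight) * s
        for d, s in zip(dynamic_scores, static_scores)
    ]
    # Pick the class whose fused score is largest.
    return max(range(len(fused)), key=fused.__getitem__)
```

    With `weight=1.0` the result follows the dynamic classifier alone, and with `weight=0.0` the static one, so the two individual classifiers are special cases of the fused rule.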

    ChemInk: A Natural Real-Time Recognition System for Chemical Drawings

    We describe a new sketch recognition framework for chemical structure drawings that combines multiple levels of visual features using a jointly trained conditional random field. This joint model of appearance at different levels of detail makes our framework less sensitive to noise and drawing variations, improving accuracy and robustness. In addition, we present a novel learning-based approach to corner detection that achieves nearly perfect accuracy in our domain. The result is a recognizer that is better able to handle the wide range of drawing styles found in messy freehand sketches. Our system handles both graphics and text, producing a complete molecular structure as output. It works in real time, providing visual feedback about the recognition progress. On a dataset of chemical drawings our system achieved an accuracy rate of 97.4%, an improvement over the best reported results in literature. A preliminary user study also showed that participants were on average over twice as fast using our sketch-based system compared to ChemDraw, a popular CAD-based tool for authoring chemical diagrams. This was the case even though most of the users had years of experience using ChemDraw and little or no experience using Tablet PCs. National Science Foundation (U.S.) (Grant 0729422); United States. Dept. of Homeland Security (Graduate Research Fellowship); Pfizer Inc.

    Stroke-based sketched symbol reconstruction and segmentation

    Hand-drawn objects usually consist of multiple semantically meaningful parts. For example, a stick figure consists of a head, a torso, and pairs of legs and arms. Efficient and accurate identification of these subparts promises to significantly improve algorithms for stylization, deformation, morphing and animation of 2D drawings. In this paper, we propose a neural network model that segments symbols into stroke-level components. Our segmentation framework has two main elements: a fixed feature extractor and a Multilayer Perceptron (MLP) network that identifies a component based on the feature. As the feature extractor we utilize an encoder of a stroke-rnn, which is our newly proposed generative Variational Auto-Encoder (VAE) model that reconstructs symbols on a stroke-by-stroke basis. Experiments show that a single encoder could be reused for segmenting multiple categories of sketched symbols with negligible effects on segmentation accuracies. Our segmentation scores surpass existing methodologies on an available small state-of-the-art dataset. Moreover, extensive evaluations on our newly annotated big dataset demonstrate that our framework obtains significantly better accuracies as compared to baseline models. We release the dataset to the community.

    Automatic interpretation of clock drawings for computerised assessment of dementia

    The clock drawing test (CDT) is a standard neurological test for detection of cognitive impairment. A computerised version of the test has potential to improve test accessibility and accuracy. CDT sketch interpretation is one of the first stages in the analysis of the computerised test. It produces a set of recognised digits and symbols together with their positions on the clock face. Subsequently, these are used in the test scoring. This is a challenging problem because the average CDT taker has a high likelihood of cognitive impairment, and writing is one of the first functional activities to be affected. Current interpretation systems perform less well on this kind of data due to its unintelligibility. In this thesis, a novel automatic interpretation system for CDT sketches is proposed and developed. The proposed interpretation system and all the related algorithms developed in this thesis are evaluated using a CDT data set collected for this study. The data consist of two sets, the first consisting of 65 drawings made by healthy people, and the second consisting of 100 drawings reproduced from drawings of dementia patients. This thesis has four main contributions. The first is a conceptual model of the proposed CDT sketch interpretation system based on integrating prior knowledge of the expected CDT sketch structure and human reasoning into the drawing interpretation system. The second is a novel CDT sketch segmentation algorithm based on supervised machine learning and a new set of temporal and spatial features automatically extracted from the CDT data. The evaluation of the proposed method shows that it outperforms the current state-of-the-art method for CDT drawing segmentation. The third contribution is a new handwritten digit recognition algorithm based on a set of static and dynamic features extracted from handwritten data. The algorithm combines two classifiers, a fuzzy k-nearest neighbour classifier and a Convolutional Neural Network (CNN), which together take advantage of both static and dynamic data representations. The proposed digit recognition algorithm is shown to outperform each classifier individually in terms of recognition accuracy. The final contribution of this study is the probabilistic Situational Bayesian Network (SBN), which is a new hierarchical probabilistic model for addressing the problem of fusing diverse data sources, such as CDT sketches created by healthy volunteers and dementia patients, in a probabilistic Bayesian network. The evaluation of the proposed SBN-based CDT sketch interpretation system on CDT data shows highly promising results, with 100% recognition accuracy for healthy CDT drawings and 97.15% for dementia data. To conclude, the proposed automatic CDT sketch interpretation system shows high accuracy in terms of recognising different sketch objects and thus paves the way for further research in dementia and clinical computer-assisted diagnosis of dementia.

    Pen-based Methods For Recognition and Animation of Handwritten Physics Solutions

    There has been considerable interest in constructing pen-based intelligent tutoring systems due to the natural interaction metaphor and low cognitive load afforded by pen-based interaction. We believe that pen-based intelligent tutoring systems can be further enhanced by integrating animation techniques. In this work, we explore methods for recognizing and animating sketched physics diagrams. Our methodologies enable an Intelligent Tutoring System (ITS) to understand the scenario and requirements posed by a given problem statement and to couple this knowledge with a computational model of the student's handwritten solution. These pieces of information are used to construct meaningful animations and feedback mechanisms that can highlight errors in student solutions. We have constructed a prototype ITS that can recognize mathematics and diagrams in a handwritten solution and infer implicit relationships among diagram elements, mathematics and annotations such as arrows and dotted lines. We use natural language processing to identify the domain of a given problem, and use this information to select one or more of four domain-specific physics simulators to animate the user's sketched diagram. We enable students to use their answers to guide animation behavior and also describe a novel algorithm for checking recognized student solutions. We provide examples of scenarios that can be modeled using our prototype system and discuss the strengths and weaknesses of our current prototype. Additionally, we present the findings of a user study that aimed to identify animation requirements for physics tutoring systems. We describe a taxonomy for categorizing different types of animations for physics problems and highlight how the taxonomy can be used to define requirements for 50 physics problems chosen from a university textbook. We also present a discussion of 56 handwritten solutions acquired from physics students and describe how suitable animations could be constructed for each of them.

    Eye Tracking Methods for Analysis of Visuo-Cognitive Behavior in Medical Imaging

    Predictive modeling of human visual search behavior and the underlying metacognitive processes is now possible thanks to significant advances in bio-sensing device technology and machine intelligence. Eye tracking bio-sensors, for example, can measure psycho-physiological response through change events in configuration of the human eye. These events include positional changes such as visual fixation, saccadic movements, and scanpath, and non-positional changes such as blinks and pupil dilation and constriction. Using data from eye-tracking sensors, we can model human perception, cognitive processes, and responses to external stimuli. In this study, we investigated the visuo-cognitive behavior of clinicians during the diagnostic decision process for breast cancer screening under clinically equivalent experimental conditions involving multiple monitors and breast projection views. Using a head-mounted eye tracking device and a customized user interface, we recorded eye change events and diagnostic decisions from 10 clinicians (three breast-imaging radiologists and seven radiology residents) for a corpus of 100 screening mammograms (comprising cases of varied pathology and breast parenchyma density). We proposed novel features and gaze analysis techniques, which help to encode discriminative pattern changes in positional and non-positional measures of eye events. These changes were shown to correlate with individual image readers' identity and experience level, mammographic case pathology and breast parenchyma density, and diagnostic decision. Furthermore, our results suggest that a combination of machine intelligence and bio-sensing modalities can provide adequate predictive capability for the characterization of a mammographic case and image readers' diagnostic performance. Lastly, features characterizing eye movements can be utilized for biometric identification purposes. These findings are impactful in real-time performance monitoring and personalized intelligent training and evaluation systems in screening mammography. Further, the developed algorithms are applicable in other application domains involving high-risk visual tasks.

    Sketch recognition of digital ink diagrams : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Computer Science at Massey University, Palmerston North, New Zealand

    Figures are either re-used with permission, or abstracted with permission from the source article. Sketch recognition of digital ink diagrams is the process of automatically identifying hand-drawn elements in a diagram. This research focuses on the simultaneous grouping and recognition of shapes in digital ink diagrams. In order to recognise a shape, we need to group the strokes belonging to it; however, strokes cannot be grouped until the shape is identified. Therefore, we treat grouping and recognition as a simultaneous task. Our grouping technique uses spatial proximity to hypothesise shape candidates. Many of the hypothesised shape candidates are invalid, so we need a way to reject them. We present a novel rejection technique based on novelty detection. The rejection method uses proximity measures to validate a shape candidate. In addition, we investigate improving the accuracy of the current shape recogniser by adding extra features. We also present a novel connector recognition system that localises connector heads around recognised shapes. We perform a full comparative study on two datasets. The results show that our approach is significantly more accurate in finding shapes and faster on process diagrams than that of Stahovich et al. (2014), demonstrating its superiority in terms of computation time and accuracy. Furthermore, we evaluate our system on two public datasets and compare our results with other approaches reported in the literature that have used these datasets. The results show that our approach is more accurate in finding and recognising the shapes in the FC dataset (by finding and recognising 91.7% of the shapes) compared to the reported results in the literature.
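
    To make the idea of grouping strokes by spatial proximity concrete, here is a minimal sketch. It is not the thesis's method: it assumes each stroke is a list of (x, y) points and simply merges strokes whose closest points lie within a distance threshold, using connected components. The names `min_distance`, `group_strokes` and `threshold` are hypothetical, and the rejection of invalid candidates described in the abstract is not modelled.

```python
from itertools import combinations
from math import dist
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Stroke = List[Point]


def min_distance(a: Stroke, b: Stroke) -> float:
    """Smallest Euclidean distance between any point of a and any point of b."""
    return min(dist(p, q) for p in a for q in b)


def group_strokes(strokes: List[Stroke], threshold: float) -> List[List[int]]:
    """Group stroke indices into candidate shapes by spatial proximity.

    ASSUMPTION: connected components over a distance threshold stand in
    for the (unspecified) hypothesis generation used in the thesis.
    """
    parent = list(range(len(strokes)))

    def find(i: int) -> int:
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge every pair of strokes that come within the threshold.
    for i, j in combinations(range(len(strokes)), 2):
        if min_distance(strokes[i], strokes[j]) <= threshold:
            parent[find(i)] = find(j)

    # Collect stroke indices by their root.
    groups: Dict[int, List[int]] = {}
    for i in range(len(strokes)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

    A real recogniser would generate several overlapping candidates per neighbourhood and pass each to a validation step; this sketch produces only one partition of the strokes.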