3 research outputs found

    Combining representations for improved sketch recognition

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. Cataloged from PDF version of thesis. Includes bibliographical references (p. 89-96). Sketching is a common means of conveying, representing, and preserving information, and it has become a subject of research as a method for human-computer interaction, specifically in the area of computer-aided design. Digitally collected sketches contain both spatial and temporal information; additionally, they may contain a conceptual structure of shapes and subshapes. These multiple aspects suggest several ways of representing sketches, each with advantages and disadvantages for recognition. Most existing sketch recognition systems are based on a single representation and do not use all available information. We propose combining several representations and systems as a way to improve recognition accuracy. This thesis presents two methods for combining recognition systems. The first improves recognition by improving segmentation, while the second seeks to predict how well systems will recognize a given domain or symbol and combines their outputs accordingly. We show that combining several recognition systems based on different representations can improve the accuracy of existing recognition methods. By Sonya J. Cates. Ph.D.

    Using computer vision to categorize tyres and estimate the number of visible tyres in tyre stockpile images

    Pressures from environmental agencies contribute to the challenges associated with the disposal of waste tyres, particularly in South Africa. Recycling of waste tyres in South Africa is in its infancy, resulting in the historically undocumented and uncontrolled existence of waste tyre stockpiles across the country. The remote locations of such stockpiles typically complicate the logistics of collecting, transporting and storing waste tyres before they enter the recycling process. To optimize the logistics of collecting waste tyres from stockpiles, useful information about a stockpile would include estimates of the types of tyres it contains and the quantity of each type. This research proposes the use of computer vision for categorizing individual tyres and estimating the number of visible tyres in tyre stockpile images to support the logistics of tyre recycling efforts. The study begins with a broad review of image processing and computer vision algorithms for categorizing and counting objects in images. The bag of visual words (BoVW) model for categorization is tested on two small data sets of tyre tread images using a random sub-sampling holdout method. The categorization results are evaluated using performance metrics for multiclass classifiers, namely the average accuracy, precision, and recall. The results indicate that corner-based local feature detectors combined with speeded up robust features (SURF) descriptors in a BoVW model provide moderately accurate categorization of tyres based on tread images. Two feature extraction methods are proposed for training neural networks (NNs) to estimate tyre counts in tyre stockpiles. Both methods describe images as feature vectors that can be used as input for NNs. The first feature extraction method uses the BoVW model with histograms of oriented gradients (HOG) features collected from overlapping sub-images to create a visual vocabulary and describe the images in terms of their visual word occurrence histogram. The second feature extraction method uses the image gradient magnitude, gradient orientation, and edge orientations of edges detected using the Canny edge detector. A concatenated histogram is constructed from individual histograms of gradient orientations and gradient magnitude. The histograms are then used to train NNs using backpropagation to approximate functions from the feature vectors describing the images to scalar count estimations. The visible object count predictions are evaluated using NN evaluation techniques to determine the accuracy of the predictions and the generalization ability of the fitted model. The count estimation experiments using the two feature extraction methods for input to NNs showed that fairly accurate count estimations can be obtained and that the fitted model could generalize fairly well to unseen images.
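The concatenated histogram described in the second feature extraction method can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `gradient_histogram_features`, the use of central differences, the bin counts, and the per-histogram normalisation are all assumptions made for illustration, and the Canny edge-orientation component is omitted to keep the example free of external libraries.

```python
import math


def gradient_histogram_features(image, n_orient_bins=8, n_mag_bins=8):
    """Build a feature vector from gradient-orientation and gradient-magnitude histograms.

    `image` is a list of equal-length rows of grayscale values in [0, 255].
    Hypothetical helper: bin counts and the gradient operator are assumptions.
    """
    height, width = len(image), len(image[0])
    # Largest magnitude possible with central differences on 8-bit pixels.
    max_mag = 127.5 * math.sqrt(2.0)
    orient_hist = [0.0] * n_orient_bins
    mag_hist = [0.0] * n_mag_bins

    # Only interior pixels have both neighbours in each direction.
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = (image[y][x + 1] - image[y][x - 1]) / 2.0
            gy = (image[y + 1][x] - image[y - 1][x]) / 2.0
            mag = math.hypot(gx, gy)

            mag_bin = min(int(mag / max_mag * n_mag_bins), n_mag_bins - 1)
            mag_hist[mag_bin] += 1.0

            # Orientation is undefined where the gradient vanishes.
            if mag > 0.0:
                theta = math.atan2(gy, gx) % (2.0 * math.pi)
                o_bin = min(int(theta / (2.0 * math.pi) * n_orient_bins),
                            n_orient_bins - 1)
                orient_hist[o_bin] += 1.0

    def normalise(hist):
        total = sum(hist)
        return [v / total for v in hist] if total > 0 else hist

    # Concatenate the two histograms into one fixed-length feature vector.
    return normalise(orient_hist) + normalise(mag_hist)


# A 4x4 image with a vertical step edge: all interior gradients point along +x.
step_image = [[0, 0, 255, 255] for _ in range(4)]
features = gradient_histogram_features(step_image)
```

A vector of this kind has a fixed length regardless of image size, which is what makes it usable as input to a neural network regressor that outputs a scalar count.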

    An Efficient Graph-Based Symbol Recognizer

    We describe a trainable symbol recognizer for pen-based user interfaces. Symbols are represented internally as attributed relational graphs that describe both the geometry and topology of the symbols. Symbol recognition reduces to the task of finding the definition symbol whose attributed relational graph best matches that of the unknown symbol. One challenge addressed in the current work is how to perform this graph matching efficiently enough to achieve interactive performance. We present four approximate graph matching techniques: Stochastic Matching, which is based on stochastic search; Error-driven Matching, which uses local matching errors to drive the solution to an optimal match; Greedy Matching, which uses greedy search; and Sort Matching, which relies on geometric information to accelerate the matching. Finally, we present promising results of initial user studies, and discuss the tradeoffs between the various matching techniques.
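As a rough illustration of approximate matching between attributed relational graphs, the sketch below assigns each node of an unknown symbol to the cheapest unused node of a definition symbol, one node at a time. It is a generic greedy assignment and not the paper's Greedy Matching algorithm, whose details the abstract does not give. The graph layout (a dict with scalar node and edge attributes), the function names, and the missing-edge penalty are all hypothetical.

```python
def edge_attr(graph, a, b):
    """Return the attribute of the undirected edge a-b, or None if absent."""
    edges = graph["edges"]
    if (a, b) in edges:
        return edges[(a, b)]
    if (b, a) in edges:
        return edges[(b, a)]
    return None


def greedy_match(unknown, definition, missing_edge_penalty=1.0):
    """Greedily map nodes of `unknown` onto nodes of `definition`.

    Each graph is {"nodes": {id: float}, "edges": {(id, id): float}}.
    Returns (mapping, total_error). Hypothetical cost model: absolute
    differences of node attributes plus edge attributes to already-mapped nodes.
    """
    mapping = {}
    total_error = 0.0
    unused = set(definition["nodes"])

    for u, u_attr in unknown["nodes"].items():
        best_node, best_cost = None, None
        for d in sorted(unused):  # sorted for deterministic tie-breaking
            cost = abs(u_attr - definition["nodes"][d])
            # Compare relations to nodes that have already been assigned.
            for u2, d2 in mapping.items():
                eu = edge_attr(unknown, u, u2)
                ed = edge_attr(definition, d, d2)
                if eu is None and ed is None:
                    continue
                if eu is None or ed is None:
                    cost += missing_edge_penalty
                else:
                    cost += abs(eu - ed)
            if best_cost is None or cost < best_cost:
                best_node, best_cost = d, cost
        if best_node is None:
            break  # definition has no nodes left to assign
        mapping[u] = best_node
        unused.remove(best_node)
        total_error += best_cost

    return mapping, total_error


definition = {"nodes": {"a": 1.0, "b": 2.0, "c": 3.0},
              "edges": {("a", "b"): 0.5, ("b", "c"): 0.9}}
unknown = {"nodes": {"x": 2.1, "y": 0.9, "z": 3.2},
           "edges": {("y", "x"): 0.5, ("x", "z"): 0.8}}
mapping, total_error = greedy_match(unknown, definition)
```

Under this scheme, recognition would amount to running the match against every definition symbol and choosing the one with the lowest total error. A greedy pass never revisits an earlier assignment, so it is fast but can settle on a suboptimal match.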