
    A tree grammar-based visual password scheme

    A thesis submitted to the Faculty of Science, University of the Witwatersrand, Johannesburg, in fulfilment of the requirements for the degree of Doctor of Philosophy. Johannesburg, August 31, 2015.

    Visual password schemes can be considered an alternative to alphanumeric passwords. Studies have shown that alphanumeric passwords can, among other attacks, be eavesdropped, shoulder surfed, or guessed, and are susceptible to automated brute-force attacks. Visual password schemes use images, in place of alphanumeric characters, for authentication. For example, users of visual password schemes either select images (Cognometric), select points on an image (Locimetric), or attempt to redraw their password image (Drawmetric) in order to gain authentication. Visual passwords are limited by the so-called password space, i.e., the size of the alphabet from which users can draw to create a password, and by susceptibility to the theft of pass-images by someone looking over the user's shoulder, referred to in the literature as shoulder surfing. The use of automatically generated, highly similar abstract images defeats shoulder surfing and means that an almost unlimited pool of images is available for use in a visual password scheme, thus also overcoming the issue of limited password space.

    This research investigated visual password schemes. In particular, this study looked at the possibility of using tree picture grammars to generate abstract graphics for use in a visual password scheme. We also examined how humans determine the similarity of abstract computer-generated images, referred to in the literature as perceptual similarity. We drew on the psychological idea of similarity and matched it as closely as possible with mathematical measures of image similarity, using Content Based Image Retrieval (CBIR) and tree edit distance measures. To this end, an online similarity survey was conducted, involving 661 respondents and 50 images, in which respondents ordered answer images by similarity to question images. The survey images were also compared using eight state-of-the-art computer-based similarity measures, to determine how closely those measures model perceptual similarity. Since all the images were generated with tree grammars, the most popular measure of tree similarity, the tree edit distance, was also used to compare the images. Eight different tree edit distance measures were used, covering the broad range of tree edit distance and tree edit distance approximation methods. All the computer-based similarity methods were then correlated with the online survey results, to determine which ones most closely model perceptual similarity. The results were then analysed in the light of some modern psychological theories of perceptual similarity.

    This work represents a novel approach to Passfaces-type visual password schemes, using dynamically generated pass-images and their highly similar distractors instead of static pictures stored in an online database. The results of the online survey were accurately modelled using the most suitable tree edit distance measure, in order to automate the determination of similarity of our generated distractor images. The information gathered from our various experiments was then used in the design of a prototype visual password scheme. The generated images were similar, but not identical, in order to defeat shoulder surfing. This approach overcomes the following problems with this category of visual password schemes: shoulder surfing, bias in image selection, selection of easy-to-guess pictures, and infrastructural limitations such as large picture databases, network speed, and database security issues. The resulting prototype is highly secure, resilient to shoulder surfing, and easy for humans to use, and it overcomes the aforementioned limitations of this category of visual password schemes.
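    Since tree edit distance is central to the approach above, a minimal sketch may help. The abstract does not give an implementation, so the following assumes the Zhang-Shasha algorithm (as implemented in the Python zss package) as a representative tree edit distance measure, applied to two small hypothetical derivation trees of the kind a tree picture grammar might produce.

```python
# Minimal sketch: Zhang-Shasha tree edit distance between two small trees,
# one of several tree similarity measures the study compares.
# pip install zss
from zss import Node, simple_distance

# Hypothetical derivation trees, standing in for tree picture grammar output.
t1 = (Node("draw")
      .addkid(Node("circle").addkid(Node("small")))
      .addkid(Node("square")))

t2 = (Node("draw")
      .addkid(Node("circle").addkid(Node("large")))
      .addkid(Node("triangle")))

# Number of node insertions, deletions, and relabelings needed to turn
# t1 into t2 under unit costs (lower = more similar).
print(simple_distance(t1, t2))  # -> 2 (relabel "small" and "square")
```

    Under this reading, a candidate image whose derivation tree sits at a small, but nonzero, distance from the pass-image's tree would be a plausible "highly similar" distractor.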

    Image Annotation and Topic Extraction Using Super-Word Latent Dirichlet Allocation

    This research presents a multi-domain solution that uses text and images to iteratively improve automated information extraction. Stage I uses the local text surrounding an embedded image to provide clues that help rank-order possible image annotations. These annotations are forwarded to Stage II, where they are used as highly relevant super-words to improve the extraction of topics. The model probabilities from the super-words in Stage II are forwarded to Stage III, where they are used to refine the automated image annotation developed in Stage I. All stages demonstrate improvement over equivalent existing algorithms in the literature.
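    The abstract does not define the super-word mechanism precisely. One plausible reading, sketched below purely as an assumption, is to up-weight the Stage I annotation terms in each document's token counts before fitting a standard LDA model (here scikit-learn's LatentDirichletAllocation). The repetition-based weighting, the weight value, and the toy documents are all hypothetical.

```python
# Hedged sketch: treating Stage I image annotations as "super-words" by
# boosting their counts before topic extraction with ordinary LDA.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = ["the sailboat crossed the bay at dawn",
        "engines and sails compared for coastal trips"]
annotations = [["boat", "water"], ["boat", "engine"]]  # hypothetical Stage I output
SUPER_WEIGHT = 5  # assumed boost factor for annotation-derived terms

# Repeat each annotation term SUPER_WEIGHT times so it dominates the counts.
boosted = [doc + " " + " ".join(ann * SUPER_WEIGHT)
           for doc, ann in zip(docs, annotations)]

X = CountVectorizer().fit_transform(boosted)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
doc_topics = lda.transform(X)  # per-document topic mixtures, per Stage II
print(doc_topics.round(2))     # these would feed back into Stage III
```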

    Local selection of features and its applications to image search and annotation

    In multimedia applications, direct representations of data objects typically involve hundreds or thousands of features. Given a query object, the similarity between the query object and a database object can be computed as the distance between their feature vectors. The neighborhood of the query object consists of those database objects that are close to the query object. The semantic quality of the neighborhood, which can be measured as the proportion of neighboring objects that share the same class label as the query object, is crucial for many applications, such as content-based image retrieval and automated image annotation. However, due to the existence of noisy or irrelevant features, errors introduced into similarity measurements are detrimental to the neighborhood quality of data objects. One way to alleviate the negative impact of noisy features is to use feature selection techniques in data preprocessing. From the original vector space, feature selection techniques select a subset of features, which can be used subsequently in supervised or unsupervised learning algorithms for better performance. However, their performance in improving the quality of data neighborhoods is rarely evaluated in the literature. In addition, most traditional feature selection techniques are global, in the sense that they compute a single set of features across the entire database. As a consequence, the possibility that feature importance may vary across different data objects or classes of objects is neglected.

    To compute a better neighborhood structure for objects in high-dimensional feature spaces, this dissertation proposes several techniques for selecting features that are important to the local neighborhood of individual objects. These techniques are then applied to image applications such as content-based image retrieval and image label propagation.

    First, an iterative K-NN graph construction method for image databases is proposed. A local variant of the Laplacian Score is designed for the selection of features for individual images. Noisy features are detected and sparsified iteratively from the original standardized feature vectors. This technique is incorporated into an approximate K-NN graph construction method so as to improve the semantic quality of the graph.

    Second, in a content-based image retrieval system, a generalized version of the Laplacian Score is used to compute different feature subspaces for images in the database. For online search, a query image is ranked in the feature spaces of database images, and those database images for which the query image is ranked highly are selected as the query results.

    Finally, a supervised method for the local selection of image features is proposed, for refining the similarity graph used in an image label propagation framework. By using only the selected features to compute the edges leading from labeled image nodes to unlabeled image nodes, better annotation accuracy can be achieved.

    Experimental results on several datasets are provided in this dissertation, to demonstrate the effectiveness of the proposed techniques for the local selection of features and for the image applications under consideration.
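    As a point of reference for the techniques above, the following is a minimal sketch of the standard, global Laplacian Score of He, Cai, and Niyogi (2005), which the dissertation generalizes into local and supervised variants; those variants themselves are not reproduced here. Lower scores indicate features that better preserve the locality structure of the k-NN graph.

```python
# Hedged sketch: global Laplacian Score for unsupervised feature selection.
# Score for feature r: (f_t' L f_t) / (f_t' D f_t), where f_t is the
# degree-weighted centering of feature column f, L = D - S the graph
# Laplacian of a k-NN affinity matrix S, and D its degree matrix.
import numpy as np
from sklearn.neighbors import kneighbors_graph

def laplacian_scores(X, k=5):
    # Symmetric 0/1 k-NN affinity matrix (heat-kernel weights also common).
    S = kneighbors_graph(X, k, mode="connectivity").toarray()
    S = np.maximum(S, S.T)
    D = np.diag(S.sum(axis=1))
    L = D - S
    ones = np.ones(X.shape[0])
    scores = []
    for r in range(X.shape[1]):
        f = X[:, r]
        # Center the feature with respect to the degree distribution.
        f_t = f - (f @ D @ ones) / (ones @ D @ ones)
        scores.append((f_t @ L @ f_t) / (f_t @ D @ f_t))
    return np.array(scores)

X = np.random.default_rng(0).normal(size=(100, 8))  # toy feature matrix
print(laplacian_scores(X).round(3))  # lower = more locality-preserving
```

    A local variant in the spirit of the dissertation would, roughly speaking, restrict this computation to the neighborhood of one image rather than the whole database; the exact construction is the dissertation's contribution and is not attempted here.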

    Evaluation of Alternative Face Detection Techniques and Video Segment Lengths on Sign Language Detection

    Sign language is the primary medium of communication for people who are hearing impaired. Sign language videos are hard to discover on video sharing sites because text-based search relies on metadata rather than on the content of the videos. The sign language community currently shares content through ad-hoc mechanisms, as no library meets their requirements. Low-cost or even real-time classification techniques are valuable for creating a sign language digital library whose content is updated as new videos are uploaded to YouTube and other video sharing sites.

    Prior research was able to detect sign language videos using face detection and background subtraction, with recall and precision suitable for creating a digital library. That approach analyzed one minute of each video being classified. Polar Motion Profiles achieved better recall on videos containing multiple signers, but at a significant computational cost, as the method included five face trackers. This thesis explores techniques to reduce the computation time involved in feature extraction without overly impacting precision and recall, through three optimizations to the above techniques. First, we compared the individual performance of the five face detectors and determined the best performing single face detector. Second, we evaluated detection performance using Polar Motion Profiles when face detection was performed on sampled frames rather than on every frame; Polar Motion Profiles performed well even when the information between frames was sacrificed. Finally, we looked at the effect of using shorter video segment lengths for feature extraction, and found that the drop in precision was minor as video segments were made shorter than the initial empirical length of one minute.

    Through this work, we found an empirical configuration that can classify videos with close to two orders of magnitude less computation, with precision and recall only slightly below those of the original voting scheme. Our model improves the detection time of sign language videos, which in turn helps enrich the digital library with fresh content quickly. Future work could focus on enabling diarization, segmenting videos into sign language and non-sign language content with effective background subtraction techniques for shorter videos.
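    As an illustration of the frame-sampling optimization described above, the sketch below runs a single face detector only on every Nth frame of a video. The detector (an OpenCV Haar cascade), the sampling interval, and the input path are assumptions for illustration, not the thesis's actual configuration.

```python
# Hedged sketch: detect faces on sampled frames rather than every frame,
# trading some temporal detail for a large reduction in computation.
import cv2

SAMPLE_EVERY = 10  # assumed sampling interval, in frames
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

cap = cv2.VideoCapture("video.mp4")  # hypothetical input file
faces_per_sample = []
idx = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    if idx % SAMPLE_EVERY == 0:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        faces_per_sample.append(len(faces))  # face counts feed later features
    idx += 1
cap.release()
print(faces_per_sample)
```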

    Open Access to Cataloguing Rules

    The possibility for librarians and developers to have access to cataloguing rules is not a minor issue. There are many open access movements all over the world, involving all kinds of content: not only research and data, but also standards. Librarians are at the forefront of these struggles when it comes to access to information. However, as stated in Terry's Worklog: Can We Have Open Library Standards, Please? Free RDA/AACR2 (2012), when it comes to our own work, we librarians "refuse to follow the same open access principles that we preach".

    Automatic caption generation for content-based image information retrieval.

    Ma, Ka Ho. Thesis (M.Phil.)--Chinese University of Hong Kong, 1999. Includes bibliographical references (leaves 82-87). Abstract and appendix in English and Chinese.

    Chapter 1 --- Introduction --- p.1
    Chapter 1.1 --- Objective of This Research --- p.4
    Chapter 1.2 --- Organization of This Thesis --- p.5
    Chapter 2 --- Background --- p.6
    Chapter 2.1 --- Textual-Image Query Approach --- p.7
    Chapter 2.1.1 --- Yahoo! Image Surfer --- p.7
    Chapter 2.1.2 --- QBIC (Query By Image Content) --- p.8
    Chapter 2.2 --- Feature-based Approach --- p.9
    Chapter 2.2.1 --- Texture Thesaurus for Aerial Photos --- p.9
    Chapter 2.3 --- Caption-aided Approach --- p.10
    Chapter 2.3.1 --- PICTION (Picture and capTION) --- p.10
    Chapter 2.3.2 --- MARIE --- p.11
    Chapter 2.4 --- Summary --- p.11
    Chapter 3 --- Caption Generation --- p.13
    Chapter 3.1 --- System Architecture --- p.13
    Chapter 3.2 --- Domain Pool --- p.15
    Chapter 3.3 --- Image Feature Extraction --- p.16
    Chapter 3.3.1 --- Preprocessing --- p.16
    Chapter 3.3.2 --- Image Segmentation --- p.17
    Chapter 3.4 --- Classification --- p.24
    Chapter 3.4.1 --- Self-Organizing Map (SOM) --- p.26
    Chapter 3.4.2 --- Learning Vector Quantization (LVQ) --- p.28
    Chapter 3.4.3 --- Output of the Classification --- p.30
    Chapter 3.5 --- Caption Generation --- p.30
    Chapter 3.5.1 --- Phase One: Logical Form Generation --- p.31
    Chapter 3.5.2 --- Phase Two: Simplification --- p.32
    Chapter 3.5.3 --- Phase Three: Captioning --- p.33
    Chapter 3.6 --- Summary --- p.35
    Chapter 4 --- Query Examples --- p.37
    Chapter 4.1 --- Query Types --- p.37
    Chapter 4.1.1 --- Non-content-based Retrieval --- p.38
    Chapter 4.1.2 --- Content-based Retrieval --- p.38
    Chapter 4.2 --- Hierarchy Graph --- p.41
    Chapter 4.3 --- Matching --- p.42
    Chapter 4.4 --- Summary --- p.48
    Chapter 5 --- Evaluation --- p.49
    Chapter 5.1 --- Experimental Set-up --- p.50
    Chapter 5.2 --- Experimental Results --- p.51
    Chapter 5.2.1 --- Segmentation --- p.51
    Chapter 5.2.2 --- Classification --- p.53
    Chapter 5.2.3 --- Captioning --- p.55
    Chapter 5.2.4 --- Overall Performance --- p.56
    Chapter 5.3 --- Observations --- p.57
    Chapter 5.4 --- Summary --- p.58
    Chapter 6 --- Another Application --- p.59
    Chapter 6.1 --- Police Force Crimes Investigation --- p.59
    Chapter 6.1.1 --- Image Feature Extraction --- p.61
    Chapter 6.1.2 --- Caption Generation --- p.64
    Chapter 6.1.3 --- Query --- p.66
    Chapter 6.2 --- An Illustrative Example --- p.68
    Chapter 6.3 --- Summary --- p.72
    Chapter 7 --- Conclusions --- p.74
    Chapter 7.1 --- Contribution --- p.77
    Chapter 7.2 --- Future Work --- p.78
    Bibliography --- p.81
    Appendices --- p.88
    Appendix A --- Segmentation Result Under Different Parameters --- p.89
    Appendix B --- Segmentation Time of 10 Randomly Selected Images --- p.90
    Appendix C --- Sample Captions --- p.9

    The BOLD Project: The BOLD Translator

    The Lloyd and Bleek Collection contains over 14000 dictionary pages, each with an English word and its Bushman language translation. The notebooks in the Lloyd and Bleek Collection contain Bushman stories for which, in many cases, English translations do not exist or are not clear. It is natural to assume that people using the notebooks would like to use the dictionary to translate words that appear in the notebooks. This, however, is not practical, simply due to the magnitude of the dictionary. A need therefore exists for a tool that links the dictionary pages and the notebooks to allow for translation. A content based image retrieval (CBIR) system was built to do this, and it was shown that it is possible to find the corresponding words in the dictionary by providing a single word image from the notebooks as a search key. The system shows promising potential, with well-selected search keys returning relevant results.
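    The abstract does not name the CBIR features the BOLD Translator uses. As one hedged illustration of matching a notebook word image against candidate dictionary word images, the sketch below uses ORB keypoint descriptors with brute-force Hamming matching (OpenCV); the file names and the match-distance threshold are hypothetical.

```python
# Hedged sketch: rank dictionary word images against a notebook word image
# by counting low-distance ORB descriptor matches.
import cv2

orb = cv2.ORB_create()
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

query = cv2.imread("notebook_word.png", cv2.IMREAD_GRAYSCALE)  # hypothetical path
candidates = ["dict_word_01.png", "dict_word_02.png"]          # hypothetical paths

_, q_desc = orb.detectAndCompute(query, None)
for path in candidates:
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    _, desc = orb.detectAndCompute(img, None)
    if q_desc is None or desc is None:
        continue  # no keypoints found; skip this candidate
    matches = bf.match(q_desc, desc)
    # More low-distance matches -> more likely the same word.
    score = sum(1 for m in matches if m.distance < 40)  # assumed threshold
    print(path, score)
```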

    Computer Vision and Image Processing Techniques for Mobile Applications

    Camera phones have penetrated every corner of society and have become a focal point for communications. In our research we extend the traditional use of such devices to help bridge the gap between the physical and digital worlds. Their combined image acquisition, processing, storage, and communication capabilities in a compact, portable device make them an ideal platform for embedding computer vision and image processing capabilities in the pursuit of new mobile applications.

    This dissertation is presented as a series of computer vision and image processing techniques together with their applications on the mobile device. We have developed a set of techniques for ego-motion estimation, enhancement, feature extraction, perspective correction, object detection, and document retrieval that serve as a basis for such applications. Our applications include a dynamic video barcode that can transfer significant amounts of information visually, a document retrieval system that can retrieve documents from low-resolution snapshots, and a series of applications for users with visual disabilities, such as a currency reader.

    Solutions for mobile devices require a fundamentally different approach than traditional vision techniques that run on conventional computers, so we consider user-device interaction and the fact that these algorithms must execute in a resource-constrained environment. For each problem we perform both theoretical and empirical analysis in an attempt to optimize performance and usability. The thesis makes contributions related to the efficient implementation of image processing and computer vision techniques, information-theoretic analysis, feature extraction from and analysis of low-quality images, and device usability.
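    Of the techniques listed, perspective correction is the easiest to illustrate in isolation. The sketch below maps four detected corners of a document or barcode region to an upright rectangle using OpenCV's homography utilities; the corner coordinates and file names are assumed stand-ins for a real detector's output, not values from the dissertation.

```python
# Hedged sketch: rectify a perspective-distorted region from a camera-phone
# snapshot by warping its four corners onto an axis-aligned rectangle.
import cv2
import numpy as np

img = cv2.imread("snapshot.jpg")  # hypothetical camera-phone capture

# Corners of the region as a detector might report them
# (top-left, top-right, bottom-right, bottom-left) -- assumed values.
src = np.float32([[40, 60], [580, 90], [560, 400], [30, 380]])
w, h = 600, 400  # target size of the rectified image
dst = np.float32([[0, 0], [w, 0], [w, h], [0, h]])

# Homography mapping the distorted quadrilateral to the upright rectangle.
M = cv2.getPerspectiveTransform(src, dst)
rectified = cv2.warpPerspective(img, M, (w, h))
cv2.imwrite("rectified.jpg", rectified)
```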