A tree grammar-based visual password scheme
A thesis submitted to the Faculty of Science, University of the Witwatersrand, Johannesburg, in fulfilment of the requirements for the degree of Doctor of Philosophy. Johannesburg, August 31, 2015.
Visual password schemes are an alternative to alphanumeric passwords. Studies have shown that alphanumeric passwords can, among other weaknesses, be eavesdropped, shoulder surfed, or guessed, and are susceptible to automated brute-force attacks. Visual password schemes use images, in place of alphanumeric characters, for authentication. For example, users of visual password schemes either select images (Cognometric), select points on an image (Locimetric), or attempt to redraw their password image (Drawmetric) in order to gain authentication. Visual passwords are limited by the so-called password space, i.e., the size of the alphabet from which users can draw to create a password, and by susceptibility to theft of pass-images by someone looking over the user's shoulder, referred to in the literature as shoulder surfing. The use of automatically generated, highly similar abstract images defeats shoulder surfing and makes an almost unlimited pool of images available for use in a visual password scheme, thus also overcoming the problem of limited password space.
This research investigated visual password schemes. In particular, this study examined the possibility of using tree picture grammars to generate abstract graphics for use in a visual password scheme. This work also examined how humans determine the similarity of abstract computer-generated images, referred to in the literature as perceptual similarity. We drew on the psychological notion of similarity and matched it as closely as possible with mathematical measures of image similarity, using Content-Based Image Retrieval (CBIR) and tree edit distance measures. To this end, an online similarity survey was conducted, involving 661 respondents and 50 images, in which respondents ordered answer images by their similarity to question images. The survey images were also compared with eight state-of-the-art computer-based similarity measures to determine how closely those measures model perceptual similarity. Since all the images were generated with tree grammars, the most popular measure of tree similarity, the tree edit distance, was also used to compare the images. Eight different tree edit distance measures were used, covering the broad range of tree edit distance and tree edit distance approximation methods. All the computer-based similarity methods were then correlated with the online survey results to determine which most closely model perceptual similarity. The results were then analysed in the light of modern psychological theories of perceptual similarity.
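To illustrate the kind of tree similarity measure discussed above, the classic ordered tree edit distance can be computed with the standard forest recursion (the recursion that algorithms such as Zhang-Shasha optimise). This is a memoized sketch with unit edit costs, not any of the eight specific measures used in the study; trees are represented as nested (label, children) tuples.

```python
from functools import lru_cache

# A tree is a (label, children) tuple; children is a tuple of trees.
# A forest is a tuple of trees.

def forest_size(forest):
    """Total number of nodes in a forest."""
    return sum(1 + forest_size(children) for _, children in forest)

@lru_cache(maxsize=None)
def forest_dist(f, g):
    """Unit-cost edit distance between two ordered forests."""
    if not f and not g:
        return 0
    if not f:
        return forest_size(g)   # insert every node of g
    if not g:
        return forest_size(f)   # delete every node of f
    (l1, c1), (l2, c2) = f[-1], g[-1]
    return min(
        # delete the root of f's rightmost tree, promoting its children
        forest_dist(f[:-1] + c1, g) + 1,
        # insert the root of g's rightmost tree
        forest_dist(f, g[:-1] + c2) + 1,
        # match the two roots (relabel costs 1 if the labels differ)
        forest_dist(f[:-1], g[:-1]) + forest_dist(c1, c2) + (l1 != l2),
    )

def tree_edit_distance(t1, t2):
    return forest_dist((t1,), (t2,))

# a(b, c) vs a(b): one deletion
t1 = ('a', (('b', ()), ('c', ())))
t2 = ('a', (('b', ()),))
print(tree_edit_distance(t1, t2))  # 1
```

Memoization keeps the recursion tractable for small trees; polynomial-time algorithms such as Zhang-Shasha compute the same distance on larger trees.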
This work represents a novel approach to Passfaces-style visual password schemes, using dynamically generated pass-images and their highly similar distractors instead of static pictures stored in an online database. The results of the online survey were accurately modelled using the most suitable tree edit distance measure, in order to automate the determination of similarity of our generated distractor images. The information gathered from our various experiments was then used in the design of a prototype visual password scheme. The generated images were similar, but not identical, in order to defeat shoulder surfing. This approach overcomes the following problems with this category of visual password schemes: shoulder surfing, bias in image selection, selection of easy-to-guess pictures, and infrastructural limitations such as large picture databases, network speed, and database security. The resulting prototype is highly secure, resilient to shoulder surfing, and easy for humans to use, and overcomes the aforementioned limitations in this category of visual password schemes.
Interactive Imaging via Hand Gesture Recognition.
With the growth of computing power, digital image processing plays an increasingly important role in the modern world, in fields including industry, medicine, communications, and spaceflight technology. As a sub-field, interactive image processing emphasizes communication between machine and human. The basic workflow is definition of the object, an analysis and training phase, then recognition and feedback. Generally speaking, the core issue is how to define the objects of interest and track them accurately enough to complete the interaction successfully.
This thesis proposes a novel dynamic simulation scheme for interactive image processing. The work consists of two main parts: hand motion detection and hand gesture recognition. In hand motion detection, movement of the hand is identified and extracted. In each detection period, the current image is compared with the previous image to compute the difference between them; if the difference exceeds a predefined threshold, a hand motion event is detected. Furthermore, in some situations, changes of hand gesture also need to be detected and classified. This task requires feature extraction and feature comparison among the gesture types. The essential features of a hand gesture include low-level features such as colour and shape. Another important feature is the orientation histogram: each type of hand gesture has a distinctive representation in the orientation histogram domain. Because the Gaussian Mixture Model (GMM) represents an object well with a small set of essential feature elements, and Expectation-Maximization (EM) is an efficient procedure for computing the maximum likelihood of test images against predefined standard samples of each gesture, the similarity between a test image and the samples of each gesture type is estimated with the EM algorithm on a GMM. Experiments show that the proposed method works well and accurately.
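The orientation-histogram feature described above can be sketched as follows: gradient orientations are binned, weighted by gradient magnitude, and the resulting histogram feeds a classifier. For brevity this sketch classifies by nearest template histogram rather than a full GMM/EM fit; the gesture labels, ramp images, and the name `classify_gesture` are illustrative assumptions, not from the thesis.

```python
import numpy as np

def orientation_histogram(img, bins=8):
    """Magnitude-weighted histogram of gradient orientations in [0, pi)."""
    gy, gx = np.gradient(img.astype(float))
    angles = np.arctan2(gy, gx) % np.pi          # fold opposite directions together
    magnitudes = np.hypot(gx, gy)
    hist, _ = np.histogram(angles, bins=bins, range=(0.0, np.pi),
                           weights=magnitudes)
    total = hist.sum()
    return hist / total if total > 0 else hist   # normalise for scale invariance

def classify_gesture(img, templates):
    """Return the label of the template histogram closest to the image's."""
    h = orientation_histogram(img)
    return min(templates, key=lambda label: np.linalg.norm(h - templates[label]))

# Synthetic stand-ins for gesture images: intensity ramps with distinct gradients.
ramp_down = np.tile(np.arange(16.0), (16, 1)).T   # intensity increases down rows
ramp_right = np.tile(np.arange(16.0), (16, 1))    # intensity increases across columns
templates = {
    'gesture_A': orientation_histogram(ramp_down),
    'gesture_B': orientation_histogram(ramp_right),
}
print(classify_gesture(ramp_down, templates))  # 'gesture_A'
```

A full implementation would replace the nearest-template step with per-gesture mixture models fitted by EM and classify by maximum likelihood, as the thesis describes.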
Image Annotation and Topic Extraction Using Super-Word Latent Dirichlet Allocation
This research presents a multi-domain solution that uses text and images to iteratively improve automated information extraction. Stage I uses the local text surrounding an embedded image to provide clues that help rank-order possible image annotations. These annotations are forwarded to Stage II, where they are used as highly relevant super-words to improve the extraction of topics. The model probabilities from the super-words in Stage II are forwarded to Stage III, where they are used to refine the automated image annotation developed in Stage I. All stages demonstrate improvement over equivalent existing algorithms in the literature.
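The super-word idea above can be sketched minimally: annotations produced in Stage I are up-weighted in the term counts before topic extraction, so the topic model treats them as highly relevant. Real LDA inference is omitted; the boost multiplier, example tokens, and function name are illustrative assumptions, not details from the paper.

```python
from collections import Counter

def weighted_term_counts(document_tokens, super_words, boost=3):
    """Count terms, multiplying counts of Stage-I image annotations (super-words)."""
    counts = Counter(document_tokens)
    for word in super_words:
        if word in counts:
            counts[word] *= boost   # super-words dominate the topic model's input
    return counts

tokens = "the engine image shows a turbine blade near the engine intake".split()
annotations = {"turbine", "engine"}   # hypothetical Stage-I annotations
counts = weighted_term_counts(tokens, annotations)
print(counts["engine"], counts["turbine"], counts["blade"])  # 6 3 1
```

These boosted counts would replace raw term frequencies in the document-term matrix fed to LDA in Stage II, and the fitted topic-word probabilities would then re-rank candidate annotations in Stage III.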
Local selection of features and its applications to image search and annotation
In multimedia applications, direct representations of data objects typically involve hundreds or thousands of features. Given a query object, the similarity between the query object and a database object can be computed as the distance between their feature vectors. The neighborhood of the query object consists of those database objects that are close to the query object. The semantic quality of the neighborhood, which can be measured as the proportion of neighboring objects that share the same class label as the query object, is crucial for many applications, such as content-based image retrieval and automated image annotation. However, due to the existence of noisy or irrelevant features, errors introduced into similarity measurements are detrimental to the neighborhood quality of data objects.
One way to alleviate the negative impact of noisy features is to use feature selection techniques in data preprocessing. From the original vector space, feature selection techniques select a subset of features, which can be used subsequently in supervised or unsupervised learning algorithms for better performance. However, their performance on improving the quality of data neighborhoods is rarely evaluated in the literature. In addition, most traditional feature selection techniques are global, in the sense that they compute a single set of features across the entire database. As a consequence, the possibility that the feature importance may vary across different data objects or classes of objects is neglected.
To compute a better neighborhood structure for objects in high-dimensional feature spaces, this dissertation proposes several techniques for selecting features that are important to the local neighborhood of individual objects. These techniques are then applied to image applications such as content-based image retrieval and image label propagation. Firstly, an iterative K-NN graph construction method for image databases is proposed. A local variant of the Laplacian Score is designed for the selection of features for individual images. Noisy features are detected and sparsified iteratively from the original standardized feature vectors. This technique is incorporated into an approximate K-NN graph construction method so as to improve the semantic quality of the graph. Secondly, in a content-based image retrieval system, a generalized version of the Laplacian Score is used to compute different feature subspaces for images in the database. For online search, a query image is ranked in the feature spaces of database images. Those database images for which the query image is ranked highly are selected as the query results. Finally, a supervised method for the local selection of image features is proposed, for refining the similarity graph used in an image label propagation framework. By using only the selected features to compute the edges leading from labeled image nodes to unlabeled image nodes, better annotation accuracy can be achieved.
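As a sketch of the Laplacian Score that the local variants above build on (He, Cai, and Niyogi, 2005): features that vary smoothly over the similarity graph score low and are preferred. The toy block-structured graph below is an illustrative assumption; the dissertation's graphs and its local and generalized variants differ.

```python
import numpy as np

def laplacian_scores(X, W):
    """Laplacian Score per feature; lower = more locality-preserving.

    X: (n, d) feature matrix; W: (n, n) symmetric similarity graph.
    """
    d = W.sum(axis=1)
    D = np.diag(d)
    L = D - W                                  # graph Laplacian
    scores = np.empty(X.shape[1])
    for r in range(X.shape[1]):
        f = X[:, r]
        f_t = f - (f @ d) / d.sum()            # remove the degree-weighted mean
        den = f_t @ D @ f_t
        scores[r] = (f_t @ L @ f_t) / den if den > 1e-12 else np.inf
    return scores

# Two 3-node cliques; feature 0 follows the cluster structure, feature 1 is noise.
W = np.zeros((6, 6))
W[:3, :3] = 1.0
W[3:, 3:] = 1.0
np.fill_diagonal(W, 0.0)
X = np.array([[0, 1], [0, 0], [0, 1], [1, 0], [1, 1], [1, 0]], dtype=float)
s = laplacian_scores(X, W)
print(s[0] < s[1])  # True: the structured feature scores lower
```

A local variant, as described above, would restrict this computation to the neighbourhood graph of an individual object rather than the whole database.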
Experimental results on several datasets are provided in this dissertation to demonstrate the effectiveness of the proposed techniques for the local selection of features, and for the image applications under consideration.
Evaluation of Alternative Face Detection Techniques and Video Segment Lengths on Sign Language Detection
Sign language is the primary medium of communication for people who are hearing impaired. Sign language videos are hard to discover on video sharing sites, as text-based search relies on metadata rather than the content of the videos. The sign language community currently shares content through ad-hoc mechanisms, as no library meets its requirements. Low-cost or even real-time classification techniques are valuable for creating a sign language digital library whose content is updated as new videos are uploaded to YouTube and other video sharing sites.
Prior research was able to detect sign language videos using face detection and background subtraction, with recall and precision suitable for creating a digital library. This approach analyzed one minute of each video being classified. Polar Motion Profiles achieved better recall on videos containing multiple signers, but at significant computational cost, as the method included five face trackers. This thesis explores techniques to reduce the computation time involved in feature extraction without unduly impacting precision and recall.
This thesis explores three optimizations to the above techniques. First, we compared the individual performance of the five face detectors and determined the best-performing single face detector. Second, we evaluated detection performance using Polar Motion Profiles when face detection was performed on sampled frames rather than on every frame. From our results, Polar Motion Profiles performed well even when the information between frames is sacrificed. Finally, we looked at the effect of using shorter video segment lengths for feature extraction. We found that the drop in precision is minor as video segments are made shorter than the initial empirical length of one minute.
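The frame-sampling optimisation described above can be sketched as follows: the expensive face detector runs only on every k-th frame, and the last detection is reused in between. The `detect_face` stub stands in for a real detector (e.g. a Viola-Jones cascade); the names and sampling policy are illustrative assumptions, not the thesis's implementation.

```python
def sampled_face_tracks(frames, detect_face, sample_every=5):
    """Run detect_face on every k-th frame; reuse the last box otherwise.

    Returns (boxes, detector_calls) so the computational saving can be measured.
    """
    boxes, calls, last_box = [], 0, None
    for i, frame in enumerate(frames):
        if i % sample_every == 0:
            last_box = detect_face(frame)   # expensive call, sampled
            calls += 1
        boxes.append(last_box)              # in-between frames inherit the box
    return boxes, calls

# Stub detector: pretends every frame has a face at a fixed bounding box.
stub = lambda frame: (10, 10, 40, 40)
frames = list(range(100))                   # stand-in for decoded video frames
boxes, calls = sampled_face_tracks(frames, stub, sample_every=5)
print(calls)  # 20 detector invocations instead of 100
```

The trade-off studied above is exactly this: each increase in `sample_every` divides detector cost proportionally, at the price of staler boxes between samples.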
Through our work, we found an empirical configuration that can classify videos with close to two orders of magnitude less computation, with precision and recall only slightly below those of the original voting scheme. Our model improves the detection time of sign language videos, which in turn helps enrich the digital library with fresh content quickly. Future work can focus on enabling diarization by segmenting videos into sign language and non-sign-language content, with effective background subtraction techniques for shorter videos.
Open Access to Cataloguing Rules
The possibility for librarians and developers to have access to cataloguing rules is not a minor issue. There are many open access movements all over the world, involving all kinds of content: not only research and data, but also standards. Librarians are at the forefront of these struggles when it comes to access to information. However, as stated in Terry's Worklog: Can We Have Open Library Standards, Please? Free RDA/AACR2 (2012), when it comes to our own work, we librarians "refuse to follow the same open access principles that we preach".
Automatic caption generation for content-based image information retrieval.
Ma, Ka Ho. Thesis (M.Phil.), Chinese University of Hong Kong, 1999. Includes bibliographical references (leaves 82-87). Abstract and appendix in English and Chinese.
Contents:
1 Introduction (1.1 Objective of This Research; 1.2 Organization of This Thesis)
2 Background (2.1 Textual-Image Query Approach: 2.1.1 Yahoo! Image Surfer, 2.1.2 QBIC (Query By Image Content); 2.2 Feature-based Approach: 2.2.1 Texture Thesaurus for Aerial Photos; 2.3 Caption-aided Approach: 2.3.1 PICTION (Picture and capTION), 2.3.2 MARIE; 2.4 Summary)
3 Caption Generation (3.1 System Architecture; 3.2 Domain Pool; 3.3 Image Feature Extraction: 3.3.1 Preprocessing, 3.3.2 Image Segmentation; 3.4 Classification: 3.4.1 Self-Organizing Map (SOM), 3.4.2 Learning Vector Quantization (LVQ), 3.4.3 Output of the Classification; 3.5 Caption Generation: 3.5.1 Logical Form Generation, 3.5.2 Simplification, 3.5.3 Captioning; 3.6 Summary)
4 Query Examples (4.1 Query Types: 4.1.1 Non-content-based Retrieval, 4.1.2 Content-based Retrieval; 4.2 Hierarchy Graph; 4.3 Matching; 4.4 Summary)
5 Evaluation (5.1 Experimental Set-up; 5.2 Experimental Results: 5.2.1 Segmentation, 5.2.2 Classification, 5.2.3 Captioning, 5.2.4 Overall Performance; 5.3 Observations; 5.4 Summary)
6 Another Application (6.1 Police Force Crimes Investigation: 6.1.1 Image Feature Extraction, 6.1.2 Caption Generation, 6.1.3 Query; 6.2 An Illustrative Example; 6.3 Summary)
7 Conclusions (7.1 Contribution; 7.2 Future Work)
Bibliography; Appendices (A Segmentation Result Under Different Parameters; B Segmentation Time of 10 Randomly Selected Images; C Sample Captions)
The BOLD Project: The BOLD Translator
The Lloyd and Bleek Collection contains over 14,000 dictionary pages, each with an English word and its Bushman language translation. The notebooks in the collection contain Bushman stories for which, in many cases, English translations do not exist or are unclear. It is natural to assume that people using the notebooks would like to use the dictionary to translate words that appear in them. This, however, is not practical, simply due to the magnitude of the dictionary. A need therefore exists for a tool that links the dictionary pages and the notebooks to allow for translation. A content-based image retrieval (CBIR) system was built to do this, and it was shown that it is possible to find the corresponding words in the dictionary by providing a single word from the notebooks as a search key. The system shows promising potential, with well-selected search keys returning relevant results.
Computer Vision and Image Processing Techniques for Mobile Applications
Camera phones have penetrated every corner of society and have become a focal point for communications. In our research we extend the traditional use of such devices to help bridge the gap between the physical and digital worlds. Their combined image acquisition, processing, storage, and communication capabilities in a compact, portable device make them an ideal platform for embedding computer vision and image processing capabilities in the pursuit of new mobile applications. This dissertation is presented as a series of computer vision and image processing techniques together with their applications on the mobile device. We have developed a set of techniques for ego-motion estimation, enhancement, feature extraction, perspective correction, object detection, and document retrieval that serve as a basis for such applications. Our applications include a dynamic video barcode that can transfer significant amounts of information visually, a document retrieval system that can retrieve documents from low-resolution snapshots, and a series of applications for users with visual disabilities, such as a currency reader. Solutions for mobile devices require a fundamentally different approach than traditional vision techniques that run on desktop computers, so we consider user-device interaction and the fact that these algorithms must execute in a resource-constrained environment. For each problem we perform both theoretical and empirical analysis in an attempt to optimize performance and usability. The thesis makes contributions related to efficient implementation of image processing and computer vision techniques, information-theoretic analysis, feature extraction and analysis of low-quality images, and device usability.
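One of the techniques listed above, perspective correction, typically reduces to estimating a 3x3 homography from point correspondences, such as a document's four detected corners. Below is a sketch of the standard direct linear transform (DLT); this is the textbook method, assumed here for illustration, not necessarily the exact formulation in the dissertation.

```python
import numpy as np

def homography_dlt(src, dst):
    """Estimate H (3x3, up to scale) mapping src points to dst via the DLT."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1,  0,  0,  0, u * x, u * y, u])
        A.append([ 0,  0,  0, -x, -y, -1, v * x, v * y, v])
    # The homography is the null vector of A (the smallest singular vector).
    _, _, Vt = np.linalg.svd(np.array(A, dtype=float))
    return Vt[-1].reshape(3, 3)

def apply_h(H, point):
    """Apply H to a 2-D point and convert back from homogeneous coordinates."""
    p = H @ np.array([point[0], point[1], 1.0])
    return p[:2] / p[2]

# Warp the unit square onto a square twice the size.
src = [(0, 0), (1, 0), (1, 1), (0, 1)]
dst = [(0, 0), (2, 0), (2, 2), (0, 2)]
H = homography_dlt(src, dst)
print(np.allclose(apply_h(H, (0.5, 0.5)), [1.0, 1.0]))  # True
```

In a mobile perspective-correction pipeline, `src` would be the four detected page corners and `dst` the corners of the target rectangle, with the image resampled through the inverse of H.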