53 research outputs found

    Compressing Word Embeddings

    Recent methods for learning vector space representations of words have succeeded in capturing fine-grained semantic and syntactic regularities using vector arithmetic. However, these vector space representations (created through large-scale text analysis) are typically stored verbatim, since their internal structure is opaque. Using word-analogy tests to monitor the level of detail stored in compressed re-representations of the same vector space, the trade-offs between the reduction in memory usage and expressiveness are investigated. A simple scheme is outlined that can reduce the memory footprint of a state-of-the-art embedding by a factor of 10, with only minimal impact on performance. Then, using the same 'bit budget', a binary (approximate) factorisation of the same space is also explored, with the aim of creating an equivalent representation with better interpretability.
    Comment: 10 pages, 0 figures, submitted to ICONIP-2016. Previous experimental results were submitted to ICLR-2016, but the paper has been significantly updated, since a new experimental set-up worked much better.
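    The abstract's core idea, trading embedding precision for memory, can be illustrated with a generic uniform quantiser. This is a minimal sketch of per-matrix low-bit quantisation, not the paper's actual scheme; all function names here are illustrative.

    ```python
    import numpy as np

    def quantize_embeddings(E, bits=8):
        """Uniformly quantise an embedding matrix to low-bit integer codes.

        Returns the codes plus the scale/offset needed to reconstruct
        approximate float vectors (a generic sketch, not the paper's method).
        """
        lo, hi = float(E.min()), float(E.max())
        levels = 2 ** bits - 1
        scale = (hi - lo) / levels
        codes = np.round((E - lo) / scale).astype(np.uint8)
        return codes, scale, lo

    def dequantize(codes, scale, lo):
        # Reconstruct approximate float embeddings from the integer codes
        return codes.astype(np.float32) * scale + lo

    rng = np.random.default_rng(0)
    E = rng.normal(size=(1000, 50)).astype(np.float32)
    codes, scale, lo = quantize_embeddings(E)
    E_hat = dequantize(codes, scale, lo)
    ```

    Storing 8-bit codes instead of 32-bit floats already gives a 4x reduction; the larger factors the paper reports require a more aggressive bit budget than this sketch uses.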

    Learning to hash for large scale image retrieval

    This thesis is concerned with improving the effectiveness of nearest neighbour search. Nearest neighbour search is the problem of finding the most similar data-points to a query in a database, and is a fundamental operation that has found wide applicability in many fields. In this thesis the focus is placed on hashing-based approximate nearest neighbour search methods that generate similar binary hashcodes for similar data-points. These hashcodes can be used as the indices into the buckets of hashtables for fast search. This work explores how the quality of search can be improved by learning task-specific binary hashcodes. The generation of a binary hashcode comprises two main steps carried out sequentially: projection of the image feature vector onto the normal vectors of a set of hyperplanes partitioning the input feature space, followed by a quantisation operation that uses a single threshold to binarise the resulting projections to obtain the hashcodes. The degree to which these operations preserve the relative distances between the data-points in the input feature space has a direct influence on the effectiveness of using the resulting hashcodes for nearest neighbour search. In this thesis I argue that the retrieval effectiveness of existing hashing-based nearest neighbour search methods can be increased by learning the thresholds and hyperplanes based on the distribution of the input data. The first contribution is a model for learning multiple quantisation thresholds. I demonstrate that the best threshold positioning is projection specific and introduce a novel clustering algorithm for threshold optimisation. The second contribution extends this algorithm by learning the optimal allocation of quantisation thresholds per hyperplane. In doing so I argue that some hyperplanes are naturally more effective than others at capturing the distribution of the data and should therefore attract a greater allocation of quantisation thresholds.
The third contribution focuses on the complementary problem of learning the hashing hyperplanes. I introduce a multi-step iterative model that, in the first step, regularises the hashcodes over a data-point adjacency graph, which encourages similar data-points to be assigned similar hashcodes. In the second step, binary classifiers are learnt to separate opposing bits with maximum margin. This algorithm is extended to learn hyperplanes that can generate similar hashcodes for similar data-points in two different feature spaces (e.g. text and images). Individually the performance of these algorithms is often superior to competitive baselines. I unify my contributions by demonstrating that learning hyperplanes and thresholds as part of the same model can yield an additive increase in retrieval effectiveness.
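    The two-step pipeline the abstract describes (project onto hyperplane normals, then binarise at a threshold) can be sketched as follows. The hyperplanes here are random and the thresholds are simple per-projection medians; the thesis learns both, so this is only a baseline illustration with hypothetical function names.

    ```python
    import numpy as np

    def train_hasher(X, n_bits, seed=0):
        """Random-hyperplane hasher with data-driven thresholds (a sketch)."""
        rng = np.random.default_rng(seed)
        H = rng.normal(size=(X.shape[1], n_bits))  # hyperplane normal vectors
        t = np.median(X @ H, axis=0)               # one threshold per projection
        return H, t

    def hash_points(X, H, t):
        # Step 1: project onto the normals; step 2: binarise at the threshold
        return (X @ H > t).astype(np.uint8)

    rng = np.random.default_rng(1)
    X = rng.normal(size=(500, 32))
    H, t = train_hasher(X, n_bits=16)
    codes = hash_points(X, H, t)
    # Hamming distance between hashcodes stands in for distance in input space
    d01 = int(np.count_nonzero(codes[0] != codes[1]))
    ```

    Using the median as the threshold balances each bit across the dataset, which keeps hashtable buckets from collapsing onto a few codes.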

    Exploring geometrical structures in high-dimensional computer vision data

    In computer vision, objects such as local features, images and video sequences are often represented as high dimensional data points, although it is commonly believed that there are low dimensional geometrical structures that underlie the data set. The low dimensional geometric information enables us to have a better understanding of the high dimensional data sets and is useful in solving computer vision problems. In this thesis, the geometrical structures are investigated from different perspectives according to different computer vision applications. For spectral clustering, the distribution of data points in the local region is summarised by a covariance matrix which is viewed as defining a Mahalanobis distance. For the action recognition problem, we extract subspace information for each action class. The query video sequence is labelled using its distance to the subspaces of the corresponding video classes. Three new algorithms are introduced for hashing-based approaches to approximate nearest neighbour (ANN) search problems: NOKMeans relaxes the orthogonality condition on the encoding functions in previous quantisation-error based methods by representing data points in a new feature space; Auto-JacoBin uses a robust auto-encoder model to preserve the geometric information from the original space in the binary codes; and AGreedy assigns a score, which reflects the ability to preserve the order information in the local regions, to any set of encoding functions, with an alternating greedy method used to find a locally optimal solution. The geometric information has the potential to bring better solutions for computer vision problems. As shown in our experiments, the benefits include increased clustering accuracy, reduced computation for recognising actions in videos and increased retrieval performance for ANN problems.
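    The idea of summarising a local region by a covariance matrix and reading it as a Mahalanobis distance can be sketched directly. This is a generic illustration of the local-covariance idea, not the thesis's algorithm; the function name and neighbourhood size are assumptions.

    ```python
    import numpy as np

    def local_mahalanobis(X, query, k=20):
        """Distance from `query` to every row of X under a Mahalanobis metric
        whose covariance is estimated from the query's k nearest Euclidean
        neighbours (an illustrative sketch)."""
        d2 = np.sum((X - query) ** 2, axis=1)
        nn = X[np.argsort(d2)[:k]]                                # local region
        C = np.cov(nn, rowvar=False) + 1e-6 * np.eye(X.shape[1])  # regularised covariance
        Cinv = np.linalg.inv(C)
        diff = X - query
        # Quadratic form diff^T C^{-1} diff for every row at once
        return np.sqrt(np.einsum('ij,jk,ik->i', diff, Cinv, diff))

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    dists = local_mahalanobis(X, X[0])
    ```

    Because the covariance is estimated locally, the metric stretches and shrinks axes to match the shape of the neighbourhood rather than treating all directions equally.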

    Privacy-preserving iVector-based speaker verification

    This work introduces an efficient algorithm for privacy-preserving (PP) voice verification based on iVector and linear discriminant analysis techniques. This research considers a scenario in which users enrol their voice biometric to access different services (e.g., banking). Once enrolment is completed, users can verify themselves using their voice-print instead of alphanumeric passwords. Since a voice-print is unique to each person, storing it with a third-party server raises several privacy concerns. To address this challenge, this work proposes a novel technique based on randomisation to carry out voice authentication, which allows the user to enrol and verify their voice in the randomised domain. To achieve this, the iVector-based voice verification technique has been redesigned to work in the randomised domain. The proposed algorithm is validated using a well-known speech dataset. The proposed algorithm neither compromises the authentication accuracy nor adds complexity due to the randomisation operations.
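    One way randomisation can preserve verification accuracy is to apply a secret orthogonal rotation to every iVector before it leaves the user's device: rotations preserve inner products and norms, so cosine scoring on the server is unchanged while the raw voice-print is never exposed. This is a hedged sketch of that general idea, not the paper's specific protocol.

    ```python
    import numpy as np

    def random_rotation(dim, seed):
        # QR decomposition of a Gaussian matrix yields a random orthogonal matrix
        rng = np.random.default_rng(seed)
        Q, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
        return Q

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    dim = 100
    R = random_rotation(dim, seed=42)            # secret held by the user
    rng = np.random.default_rng(0)
    enrol, test = rng.normal(size=dim), rng.normal(size=dim)

    # The server only ever sees the rotated vectors, yet scoring is identical
    score_plain = cosine(enrol, test)
    score_randomised = cosine(R @ enrol, R @ test)
    ```

    Any orthogonal R works here, which is why the randomisation adds essentially no accuracy cost; keeping R secret is what provides the privacy.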

    Low-bit quantization for attributed network representation learning

    © 2019 International Joint Conferences on Artificial Intelligence. All rights reserved. Attributed network embedding plays an important role in transforming network data into compact vectors for effective network analysis. Existing attributed network embedding models are designed either in continuous Euclidean spaces, which introduce data redundancy, or in binary coding spaces, which incur a significant loss of representation accuracy. To this end, we present a new Low-Bit Quantization for Attributed Network Representation Learning model (LQANR for short) that can learn compact node representations with low bit-width values while preserving high representation accuracy. Specifically, we formulate a new representation learning function based on matrix factorization that can jointly learn the low-bit node representations and the layer aggregation weights under the low-bit quantization constraint. Because the new learning function falls into the category of mixed-integer optimization, we propose an efficient mixed-integer based alternating direction method of multipliers (ADMM) algorithm as the solution. Experiments on real-world node classification and link prediction tasks validate the promising results of the proposed LQANR model.
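    The low-bit constraint the abstract describes amounts to forcing every entry of the representation matrix onto a small symmetric set of levels. The discrete projection step an ADMM-style solver would apply at each iteration can be sketched as follows; this is an illustrative projection, not the LQANR update itself.

    ```python
    import numpy as np

    def project_low_bit(W, bits=2):
        """Project a real matrix entrywise onto the nearest of 2**bits
        symmetric levels with a shared scale (a hedged sketch of the
        discrete projection inside a mixed-integer ADMM loop)."""
        n_levels = 2 ** bits
        scale = np.abs(W).max() / (n_levels // 2)
        # e.g. bits=2 gives levels {-1.5s, -0.5s, 0.5s, 1.5s}
        levels = scale * (np.arange(n_levels) - (n_levels - 1) / 2)
        idx = np.abs(W[..., None] - levels).argmin(axis=-1)
        return levels[idx], scale

    rng = np.random.default_rng(0)
    W = rng.normal(size=(8, 4))
    Q, scale = project_low_bit(W, bits=2)
    ```

    Each entry of Q can now be stored in `bits` bits plus one shared float scale, which is where the memory saving over a continuous embedding comes from.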