143 research outputs found

    Large-scale image retrieval using similarity preserving binary codes

    Get PDF
    Image retrieval is a fundamental problem in computer vision, and has many applications. When the dataset size gets very large, retrieving images in Internet image collections becomes very challenging. The challenges come from storage, computation speed, and similarity representation. My thesis addresses learning compact similarity preserving binary codes, which represent each image by a short binary string, for fast retrieval in large image databases. I will first present an approach called Iterative Quantization to convert high-dimensional vectors to compact binary codes, which works by learning a rotation to minimize the quantization error of mapping data to the vertices of a binary Hamming cube. This approach achieves state-of-the-art accuracy for preserving neighbors in the original feature space, as well as state-of-the-art semantic precision. Second, I will extend this approach to two different scenarios in large-scale recognition and retrieval problems. The first extension is aimed at high-dimensional histogram data, such as bag-of-words features or text documents. Such vectors are typically sparse and nonnegative. I develop an algorithm that explores the special structure of such data by mapping feature vectors to binary vertices in the positive orthant, which gives improved performance. The second extension is for Fisher Vectors, which are dense descriptors having tens of thousands to millions of dimensions. I develop a novel method for converting such descriptors to compact similarity-preserving binary codes that exploits their natural matrix structure to reduce their dimensionality using compact bilinear projections instead of a single large projection matrix. This method achieves retrieval and classification accuracy comparable to that of the original descriptors and to the state-of-the-art Product Quantization approach while having orders of magnitude faster code generation time and smaller memory footprint. Finally, I present two applications of using Internet images and tags/labels to learn binary codes with label supervision, and show improved retrieval accuracy on several large Internet image datasets. First, I will present an application that performs cross-modal retrieval in the Hamming space. Then I will present an application on using supervised binary classeme representations for large-scale image retrieval.Doctor of Philosoph

    Multi-Modality Human Action Recognition

    Get PDF
    Human action recognition is very useful in many applications in various areas, e.g. video surveillance, HCI (Human computer interaction), video retrieval, gaming and security. Recently, human action recognition becomes an active research topic in computer vision and pattern recognition. A number of action recognition approaches have been proposed. However, most of the approaches are designed on the RGB images sequences, where the action data was collected by RGB/intensity camera. Thus the recognition performance is usually related to various occlusion, background, and lighting conditions of the image sequences. If more information can be provided along with the image sequences, more data sources other than the RGB video can be utilized, human actions could be better represented and recognized by the designed computer vision system.;In this dissertation, the multi-modality human action recognition is studied. On one hand, we introduce the study of multi-spectral action recognition, which involves the information from different spectrum beyond visible, e.g. infrared and near infrared. Action recognition in individual spectra is explored and new methods are proposed. Then the cross-spectral action recognition is also investigated and novel approaches are proposed in our work. On the other hand, since the depth imaging technology has made a significant progress recently, where depth information can be captured simultaneously with the RGB videos. The depth-based human action recognition is also investigated. I first propose a method combining different type of depth data to recognize human actions. Then a thorough evaluation is conducted on spatiotemporal interest point (STIP) based features for depth-based action recognition. Finally, I advocate the study of fusing different features for depth-based action analysis. Moreover, human depression recognition is studied by combining facial appearance model as well as facial dynamic model

    Quantum Entanglement in Time

    Full text link
    In this doctoral thesis we provide one of the first theoretical expositions on a quantum effect known as entanglement in time. It can be viewed as an interdependence of quantum systems across time, which is stronger than could ever exist between classical systems. We explore this temporal effect within the study of quantum information and its foundations as well as through relativistic quantum information. An original contribution of this thesis is the design of one of the first applications of entanglement in time.Comment: 271 pages, PhD Thesis (Victoria University of Wellington
    • …
    corecore