2,151 research outputs found

    Learning compact hashing codes with complex objectives from multiple sources for large scale similarity search

    Get PDF
    Similarity search is a key problem in many real world applications including image and text retrieval, content reuse detection and collaborative filtering. The purpose of similarity search is to identify similar data examples given a query example. Due to the explosive growth of the Internet, a huge amount of data such as texts, images and videos has been generated, which indicates that efficient large scale similarity search becomes more important.^ Hashing methods have become popular for large scale similarity search due to their computational and memory efficiency. These hashing methods design compact binary codes to represent data examples so that similar examples are mapped into similar codes. This dissertation addresses five major problems for utilizing supervised information from multiple sources in hashing with respect to different objectives. Firstly, we address the problem of incorporating semantic tags by modeling the latent correlations between tags and data examples. More precisely, the hashing codes are learned in a unified semi-supervised framework by simultaneously preserving the similarities between data examples and ensuring the tag consistency via a latent factor model. Secondly, we solve the missing data problem by latent subspace learning from multiple sources. The hashing codes are learned by enforcing the data consistency among different sources. Thirdly, we address the problem of hashing on structured data by graph learning. A weighted graph is constructed based on the structured knowledge from the data. The hashing codes are then learned by preserving the graph similarities. Fourthly, we address the problem of learning high ranking quality hashing codes by utilizing the relevance judgments from users. The hashing code/function is learned via optimizing a commonly used non-smooth non-convex ranking measure, NDCG. Finally, we deal with the problem of insufficient supervision by active learning. We propose to actively select the most informative data examples and tags in a joint manner based on the selection criteria that both the data examples and tags should be most uncertain and dissimilar with each other.^ Extensive experiments on several large scale datasets demonstrate the superior performance of the proposed approaches over several state-of-the-art hashing methods from different perspectives

    Traffic Management Applications for Stateful SDN Data Plane

    Get PDF
    The successful OpenFlow approach to Software Defined Networking (SDN) allows network programmability through a central controller able to orchestrate a set of dumb switches. However, the simple match/action abstraction of OpenFlow switches constrains the evolution of the forwarding rules to be fully managed by the controller. This can be particularly limiting for a number of applications that are affected by the delay of the slow control path, like traffic management applications. Some recent proposals are pushing toward an evolution of the OpenFlow abstraction to enable the evolution of forwarding policies directly in the data plane based on state machines and local events. In this paper, we present two traffic management applications that exploit a stateful data plane and their prototype implementation based on OpenState, an OpenFlow evolution that we recently proposed.Comment: 6 pages, 9 figure

    Encoding techniques for complex information structures in connectionist systems

    Get PDF
    Two general information encoding techniques called relative position encoding and pattern similarity association are presented. They are claimed to be a convenient basis for the connectionist implementation of complex, short term information processing of the sort needed in common sense reasoning, semantic/pragmatic interpretation of natural language utterances, and other types of high level cognitive processing. The relationships of the techniques to other connectionist information-structuring methods, and also to methods used in computers, are discussed in detail. The rich inter-relationships of these other connectionist and computer methods are also clarified. The particular, simple forms are discussed that the relative position encoding and pattern similarity association techniques take in the author's own connectionist system, called Conposit, in order to clarify some issues and to provide evidence that the techniques are indeed useful in practice

    Learning Dynamic Feature Selection for Fast Sequential Prediction

    Full text link
    We present paired learning and inference algorithms for significantly reducing computation and increasing speed of the vector dot products in the classifiers that are at the heart of many NLP components. This is accomplished by partitioning the features into a sequence of templates which are ordered such that high confidence can often be reached using only a small fraction of all features. Parameter estimation is arranged to maximize accuracy and early confidence in this sequence. Our approach is simpler and better suited to NLP than other related cascade methods. We present experiments in left-to-right part-of-speech tagging, named entity recognition, and transition-based dependency parsing. On the typical benchmarking datasets we can preserve POS tagging accuracy above 97% and parsing LAS above 88.5% both with over a five-fold reduction in run-time, and NER F1 above 88 with more than 2x increase in speed.Comment: Appears in The 53rd Annual Meeting of the Association for Computational Linguistics, Beijing, China, July 201
    • …
    corecore