308 research outputs found

    Online supervised hashing

    Full text link
    Fast nearest neighbor search is becoming more and more crucial given the advent of large-scale data in many computer vision applications. Hashing approaches provide both fast search mechanisms and compact index structures to address this critical need. In image retrieval problems where labeled training data is available, supervised hashing methods prevail over unsupervised methods. Most state-of-the-art supervised hashing approaches employ batch-learners. Unfortunately, batch-learning strategies may be inefficient when confronted with large datasets. Moreover, with batch-learners, it is unclear how to adapt the hash functions as the dataset continues to grow and new variations appear over time. To handle these issues, we propose OSH: an Online Supervised Hashing technique that is based on Error Correcting Output Codes. We consider a stochastic setting where the data arrives sequentially and our method learns and adapts its hashing functions in a discriminative manner. Our method makes no assumption about the number of possible class labels, and accommodates new classes as they are presented in the incoming data stream. In experiments with three image retrieval benchmarks, our method yields state-of-the-art retrieval performance as measured in Mean Average Precision, while also being orders-of-magnitude faster than competing batch methods for supervised hashing. Also, our method significantly outperforms recently introduced online hashing solutions.https://pdfs.semanticscholar.org/555b/de4f14630d8606e37096235da8933df228f1.pdfAccepted manuscrip

    Social Values in Social Work: A Developmental Model

    Get PDF
    The article suggests a five stage model which describes the development of social values in the socialization to social work and other human service professions. The five stages of development include the following: antecedent factors, anticipatory socialization, professional training period, performance in a professional organization and the crystalization of a professional worldview. The main thrust of the paper is the idea that the professional person develops himelf for a very long time before reaching professional maturity. This development represents a constant dialogue between the persons background factors needs and motives and the institutional and organizational contexts he encounters in his career. Along this developmental sequence, conflicts and incongruencies many times arise between the developing professional and his system of values and the organization and its demands. These conflicts will force the developing professional to come out with coping solutions that may involve leaving the field or the profession. Some practical as well as theoretical implications for further research and application are discussed at the end of the article

    Grounding deep models of visual data

    Get PDF
    Deep models are state-of-the-art for many computer vision tasks including object classification, action recognition, and captioning. As Artificial Intelligence systems that utilize deep models are becoming ubiquitous, it is also becoming crucial to explain why they make certain decisions: Grounding model decisions. In this thesis, we study: 1) Improving Model Classification. We show that by utilizing web action images along with videos in training for action recognition, significant performance boosts of convolutional models can be achieved. Without explicit grounding, labeled web action images tend to contain discriminative action poses, which highlight discriminative portions of a video’s temporal progression. 2) Spatial Grounding. We visualize spatial evidence of deep model predictions using a discriminative top-down attention mechanism, called Excitation Backprop. We show how such visualizations are equally informative for correct and incorrect model predictions, and highlight the shift of focus when different training strategies are adopted. 3) Spatial Grounding for Improving Model Classification at Training Time. We propose a guided dropout regularizer for deep networks based on the evidence of a network prediction. This approach penalizes neurons that are most relevant for model prediction. By dropping such high-saliency neurons, the network is forced to learn alternative paths in order to maintain loss minimization. We demonstrate better generalization ability, an increased utilization of network neurons, and a higher resilience to network compression. 4) Spatial Grounding for Improving Model Classification at Test Time. We propose Guided Zoom, an approach that utilizes spatial grounding to make more informed predictions at test time. Guided Zoom compares the evidence used to make a preliminary decision with the evidence of correctly classified training examples to ensure evidenceprediction consistency, otherwise refines the prediction. We demonstrate accuracy gains for fine-grained classification. 5) Spatiotemporal Grounding. We devise a formulation that simultaneously grounds evidence in space and time, in a single pass, using top-down saliency. We visualize the spatiotemporal cues that contribute to a deep recurrent neural network’s classification/captioning output. Based on these spatiotemporal cues, we are able to localize segments within a video that correspond with a specific action, or phrase from a caption, without explicitly optimizing/training for these tasks

    MIHash: Online Hashing with Mutual Information

    Full text link
    Learning-based hashing methods are widely used for nearest neighbor retrieval, and recently, online hashing methods have demonstrated good performance-complexity trade-offs by learning hash functions from streaming data. In this paper, we first address a key challenge for online hashing: the binary codes for indexed data must be recomputed to keep pace with updates to the hash functions. We propose an efficient quality measure for hash functions, based on an information-theoretic quantity, mutual information, and use it successfully as a criterion to eliminate unnecessary hash table updates. Next, we also show how to optimize the mutual information objective using stochastic gradient descent. We thus develop a novel hashing method, MIHash, that can be used in both online and batch settings. Experiments on image retrieval benchmarks (including a 2.5M image dataset) confirm the effectiveness of our formulation, both in reducing hash table recomputations and in learning high-quality hash functions.Comment: International Conference on Computer Vision (ICCV), 201

    Hashing as Tie-Aware Learning to Rank

    Full text link
    Hashing, or learning binary embeddings of data, is frequently used in nearest neighbor retrieval. In this paper, we develop learning to rank formulations for hashing, aimed at directly optimizing ranking-based evaluation metrics such as Average Precision (AP) and Normalized Discounted Cumulative Gain (NDCG). We first observe that the integer-valued Hamming distance often leads to tied rankings, and propose to use tie-aware versions of AP and NDCG to evaluate hashing for retrieval. Then, to optimize tie-aware ranking metrics, we derive their continuous relaxations, and perform gradient-based optimization with deep neural networks. Our results establish the new state-of-the-art for image retrieval by Hamming ranking in common benchmarks.Comment: 15 pages, 3 figures. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 201

    Kurt Lewin: 70 Years On

    Get PDF
    First paragraph: In introducing this Special Issue devoted to evaluating the continuing influence of Kurt Lewin, we begin by briefly examining his life. This is followed by a more detailed examination of the conceptual roots of field theory, which is the foundation for his approach to change. We then discuss Lewin’s commitment to creating a better world, which provided the underpinning motivation for all his work. The editorial concludes by discussing how the four articles in this Special Issue build on, develop and demonstrate the continuing relevance of Lewin’s work

    Do less and achieve more: Training CNNs for action recognition utilizing action images from the Web

    Full text link
    Recently, attempts have been made to collect millions of videos to train Convolutional Neural Network (CNN) models for action recognition in videos. However, curating such large-scale video datasets requires immense human labor, and training CNNs on millions of videos demands huge computational resources. In contrast, collecting action images from the Web is much easier and training on images requires much less computation. In addition, labeled web images tend to contain discriminative action poses, which highlight discriminative portions of a video’s temporal progression. Through extensive experiments, we explore the question of whether we can utilize web action images to train better CNN models for action recognition in videos. We collect 23.8K manually filtered images from the Web that depict the 101 actions in the UCF101 action video dataset. We show that by utilizing web action images along with videos in training, significant performance boosts of CNN models can be achieved. We also investigate the scalability of the process by leveraging crawled web images (unfiltered) for UCF101 and ActivityNet. Using unfiltered images we can achieve performance improvements that are on-par with using filtered images. This means we can further reduce annotation labor and easily scale-up to larger problems. We also shed light on an artifact of finetuning CNN models that reduces the effective parameters of the CNN and show that using web action images can significantly alleviate this problem.https://arxiv.org/pdf/1512.07155v1.pdfFirst author draf
    • …
    corecore