Search CORE

13,626 research outputs found

Learning compact binary codes for visual tracking

Author: Dick A.
Li X.
Shen C.
Van Den Hengel A.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2013
Field of study

A key problem in visual tracking is to represent the appearance of an object in a way that is robust to visual changes. To attain this robustness, increasingly complex models are used to capture appearance variations. However, such models can be difficult to maintain accurately and efficiently. In this paper, we propose a visual tracker in which objects are represented by compact and discriminative binary codes. This representation can be processed very efficiently, and is capable of effectively fusing information from multiple cues. An incremental discriminative learner is then used to construct an appearance model that optimally separates the object from its surrounds. Furthermore, we design a hypergraph propagation method to capture the contextual information on samples, which further improves the tracking accuracy. Experimental results on challenging videos demonstrate the effectiveness and robustness of the proposed tracker.Xi Li, Chunhua Shen, Anthony Dick, Anton van den Hengelhttp://www.pamitc.org/cvpr13

Crossref

Adelaide Research & Scholarship

Online supervised hashing

Author: Bargal Sarah Adel
Cakir Fatih
Sclaroff Stan
Publication venue: 'Elsevier BV'
Publication date: 01/03/2017
Field of study

Fast nearest neighbor search is becoming more and more crucial given the advent of large-scale data in many computer vision applications. Hashing approaches provide both fast search mechanisms and compact index structures to address this critical need. In image retrieval problems where labeled training data is available, supervised hashing methods prevail over unsupervised methods. Most state-of-the-art supervised hashing approaches employ batch-learners. Unfortunately, batch-learning strategies may be inefficient when confronted with large datasets. Moreover, with batch-learners, it is unclear how to adapt the hash functions as the dataset continues to grow and new variations appear over time. To handle these issues, we propose OSH: an Online Supervised Hashing technique that is based on Error Correcting Output Codes. We consider a stochastic setting where the data arrives sequentially and our method learns and adapts its hashing functions in a discriminative manner. Our method makes no assumption about the number of possible class labels, and accommodates new classes as they are presented in the incoming data stream. In experiments with three image retrieval benchmarks, our method yields state-of-the-art retrieval performance as measured in Mean Average Precision, while also being orders-of-magnitude faster than competing batch methods for supervised hashing. Also, our method significantly outperforms recently introduced online hashing solutions.https://pdfs.semanticscholar.org/555b/de4f14630d8606e37096235da8933df228f1.pdfAccepted manuscrip

Boston University Institutional Repository (OpenBU)

Deep Discrete Hashing with Self-supervised Pairwise Labels

Author: A Andoni
B Fan
E Tola
H Bay
J Guo
J Song
J Song
O Russakovsky
Q Tian
R Salakhutdinov
T-T Do
Y Gong
Z Pan
Publication venue
Publication date: 07/07/2017
Field of study

Hashing methods have been widely used for applications of large-scale image retrieval and classification. Non-deep hashing methods using handcrafted features have been significantly outperformed by deep hashing methods due to their better feature representation and end-to-end learning framework. However, the most striking successes in deep hashing have mostly involved discriminative models, which require labels. In this paper, we propose a novel unsupervised deep hashing method, named Deep Discrete Hashing (DDH), for large-scale image retrieval and classification. In the proposed framework, we address two main problems: 1) how to directly learn discrete binary codes? 2) how to equip the binary representation with the ability of accurate image retrieval and classification in an unsupervised way? We resolve these problems by introducing an intermediate variable and a loss function steering the learning process, which is based on the neighborhood structure in the original space. Experimental results on standard datasets (CIFAR-10, NUS-WIDE, and Oxford-17) demonstrate that our DDH significantly outperforms existing hashing methods by large margin in terms of~mAP for image retrieval and object recognition. Code is available at \url{https://github.com/htconquer/ddh}

arXiv.org e-Print Archive

Crossref

Content-Based Video Retrieval in Historical Collections of the German Broadcasting Archive

Author: B Zhou
D Albertson
G Marchionini
J Matas
M Mühling
PN Belhumeur
R Salakhutdinov
T Ahonen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 13/02/2017
Field of study

The German Broadcasting Archive (DRA) maintains the cultural heritage of radio and television broadcasts of the former German Democratic Republic (GDR). The uniqueness and importance of the video material stimulates a large scientific interest in the video content. In this paper, we present an automatic video analysis and retrieval system for searching in historical collections of GDR television recordings. It consists of video analysis algorithms for shot boundary detection, concept classification, person recognition, text recognition and similarity search. The performance of the system is evaluated from a technical and an archival perspective on 2,500 hours of GDR television recordings.Comment: TPDL 2016, Hannover, Germany. Final version is available at Springer via DO

arXiv.org e-Print Archive

Crossref

DeepFactors: Real-time probabilistic dense monocular SLAM

Author: Clark R
Czarnowski J
Davison AJ
Laidlow T
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 23/12/2019
Field of study

The ability to estimate rich geometry and camera motion from monocular imagery is fundamental to future interactive robotics and augmented reality applications. Different approaches have been proposed that vary in scene geometry representation (sparse landmarks, dense maps), the consistency metric used for optimising the multi-view problem, and the use of learned priors. We present a SLAM system that unifies these methods in a probabilistic framework while still maintaining real-time performance. This is achieved through the use of a learned compact depth map representation and reformulating three different types of errors: photometric, reprojection and geometric, which we make use of within standard factor graph software. We evaluate our system on trajectory estimation and depth reconstruction on real-world sequences and present various examples of estimated dense geometry

arXiv.org e-Print Archive

Spiral - Imperial College Digital Repository