Long-Term Visual Object Tracking Benchmark
We propose a new long-video dataset (called Track Long and Prosper - TLP) and benchmark for single object tracking. The dataset consists of 50 HD videos from real-world scenarios, spanning over 400 minutes (676K frames), making it more than 20-fold larger in average duration per sequence and more than 8-fold larger in total covered duration than existing generic datasets for visual tracking. The proposed dataset paves the way to suitably assess long-term tracking performance and to train better deep learning architectures (avoiding/reducing augmentation, which may not reflect real-world behaviour). We benchmark 17 state-of-the-art trackers on the dataset and rank them by tracking accuracy and runtime speed. We further present a thorough qualitative and quantitative evaluation highlighting the importance of the long-term aspect of tracking. Our most interesting observations are (a) existing short-sequence benchmarks fail to bring out the inherent differences between tracking algorithms, which widen when tracking on long sequences, and (b) the accuracy of trackers drops abruptly on challenging long sequences, suggesting the need for research efforts in the direction of long-term tracking.
Comment: ACCV 2018 (Oral)
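The abstract does not specify the exact accuracy protocol, but single-object tracking benchmarks conventionally score trackers by intersection-over-union (IoU) overlap with the ground truth, summarized as a success rate or its area under the curve. Below is a minimal sketch of that standard measure; the box arrays are hypothetical placeholders, not the benchmark's own evaluation code.

```python
import numpy as np

def iou(pred, gt):
    """Per-frame intersection-over-union of boxes given as (x, y, w, h)."""
    x1 = np.maximum(pred[:, 0], gt[:, 0])
    y1 = np.maximum(pred[:, 1], gt[:, 1])
    x2 = np.minimum(pred[:, 0] + pred[:, 2], gt[:, 0] + gt[:, 2])
    y2 = np.minimum(pred[:, 1] + pred[:, 3], gt[:, 1] + gt[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    union = pred[:, 2] * pred[:, 3] + gt[:, 2] * gt[:, 3] - inter
    return inter / np.maximum(union, 1e-9)

def success_auc(pred, gt, thresholds=np.linspace(0, 1, 21)):
    """Mean fraction of frames whose IoU exceeds each overlap threshold."""
    overlaps = iou(pred, gt)
    return np.mean([(overlaps > t).mean() for t in thresholds])
```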
RAD51B in Familial Breast Cancer
Common variation on 14q24.1, close to RAD51B, has been associated with breast cancer: rs999737 and rs2588809 with the risk of female breast cancer and rs1314913 with the risk of male breast cancer. The aim of this study was to investigate the role of RAD51B variants in breast cancer predisposition, particularly in the context of familial breast cancer in Finland. We sequenced the coding region of RAD51B in 168 Finnish breast cancer patients from the Helsinki region to identify possible recurrent founder mutations. In addition, we studied the known rs999737, rs2588809, and rs1314913 SNPs and RAD51B haplotypes in 44,791 breast cancer cases and 43,583 controls from 40 studies participating in the Breast Cancer Association Consortium (BCAC) that were genotyped on a custom chip (iCOGS). We identified one putatively pathogenic missense mutation, c.541C>T, among the Finnish cancer patients and subsequently genotyped the mutation in additional breast cancer cases (n = 5259) and population controls (n = 3586) from Finland and Belarus. No significant association with breast cancer risk was seen in the meta-analysis of the Finnish datasets or in the large BCAC dataset. The association with the previously identified risk variants rs999737, rs2588809, and rs1314913 was replicated among all breast cancer cases and also among familial cases in the BCAC dataset. The most significant association was observed for the haplotype carrying the risk alleles of all three SNPs, both among all cases (odds ratio (OR): 1.15, 95% confidence interval (CI): 1.11–1.19, P = 8.88 × 10⁻¹⁶) and among familial cases (OR: 1.24, 95% CI: 1.16–1.32, P = 6.19 × 10⁻¹¹), compared to the haplotype with the respective protective alleles. Our results suggest that loss-of-function mutations in RAD51B are rare, but common variation at the RAD51B region is significantly associated with familial breast cancer risk.
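For readers unfamiliar with the statistics quoted above, the sketch below shows how an odds ratio and its 95% confidence interval are derived from a 2×2 table of carrier counts. The counts are invented for illustration only; the study's actual estimates come from a meta-analysis across the BCAC studies.

```python
import math

# Hypothetical 2x2 table (illustrative counts, not from the study):
a = 5200   # cases carrying the risk haplotype
b = 4300   # controls carrying the risk haplotype
c = 4400   # cases without it
d = 4500   # controls without it

odds_ratio = (a * d) / (b * c)
se_log_or = math.sqrt(1/a + 1/b + 1/c + 1/d)   # standard error of log(OR)
ci_low = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
ci_high = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
print(f"OR = {odds_ratio:.2f}, 95% CI: {ci_low:.2f}-{ci_high:.2f}")
```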
NOVA : Rendering Virtual Worlds with Humans for Computer Vision Tasks
Today, the cutting edge of computer vision research greatly depends on the availability of large datasets, which are critical for effectively training and testing new methods. Manually annotating visual data, however, is not only a labor-intensive process but also prone to errors. In this study, we present NOVA, a versatile framework to create realistic-looking 3D rendered worlds containing procedurally generated humans with rich pixel-level ground truth annotations. NOVA can simulate various environmental factors such as weather conditions or different times of day, and bring an exceptionally diverse set of humans to life, each with a distinct body shape, gender, and age. To demonstrate NOVA's capabilities, we generate two synthetic datasets for person tracking. The first includes 108 sequences with different levels of difficulty, such as tracking in crowded scenes or at nighttime, and aims to test the limits of current state-of-the-art trackers. A second dataset of 97 sequences with normal weather conditions shows how our synthetic sequences can be used to train and boost the performance of deep learning-based trackers. Our results indicate that the synthetic data generated by NOVA is a good proxy for the real world and can be exploited for computer vision tasks.
Efficient articulated trajectory reconstruction using dynamic programming and filters
This paper considers the problem of reconstructing the motion of a 3D articulated tree from 2D point correspondences subject to a temporal prior. Hitherto, smooth motion has been encouraged using a trajectory basis, yielding a hard combinatorial problem whose time complexity grows exponentially in the number of frames. Branch-and-bound strategies have previously attempted to curb this complexity whilst maintaining global optimality, but they provide no guarantee of being more efficient than exhaustive search. Inspired by recent work that reconstructs general trajectories using compact high-pass filters, we develop a dynamic programming approach that scales linearly in the number of frames, leveraging the intrinsically local nature of filter interactions. An extension to affine projection enables reconstruction without estimating cameras.
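The recurrence itself is not given in the abstract; under the assumption that the filter prior couples only adjacent frames, the linear-time dynamic program can be pictured as a Viterbi-style pass over per-frame pose candidates. In the sketch below, the unary and pairwise costs are hypothetical stand-ins for the reprojection error and the filter penalty.

```python
import numpy as np

def dp_reconstruct(unary, pairwise):
    """Viterbi-style DP over per-frame candidates.

    unary:    (T, K) cost of candidate k at frame t (e.g. reprojection error)
    pairwise: (K, K) cost linking candidates in adjacent frames (the prior)
    Returns the minimizing candidate index per frame in O(T * K^2) time,
    i.e. linear in the number of frames T.
    """
    T, K = unary.shape
    cost = unary[0].copy()
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        total = cost[:, None] + pairwise + unary[t][None, :]  # (K, K)
        back[t] = total.argmin(axis=0)
        cost = total.min(axis=0)
    # Backtrack the optimal candidate sequence.
    path = [int(cost.argmin())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```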
Learning feed-forward one-shot learners
One-shot learning is usually tackled by using generative models or discriminative embeddings. Discriminative methods based on deep learning, which are very effective in other learning scenarios, are ill-suited for one-shot learning as they need large amounts of training data. In this paper, we propose a method to learn the parameters of a deep model in one shot. We construct the learner as a second deep network, called a learnet, which predicts the parameters of a pupil network from a single exemplar. In this manner we obtain an efficient feed-forward one-shot learner, trained end-to-end by minimizing a one-shot classification objective in a learning-to-learn formulation. To make the construction feasible, we propose a number of factorizations of the parameters of the pupil network. We demonstrate encouraging results by learning characters from single exemplars in Omniglot, and by tracking visual objects from a single initial exemplar on the Visual Object Tracking benchmark.
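A minimal sketch of the learnet idea follows; the module names and sizes are assumptions rather than the paper's architecture, and the factorizations and siamese training objective are omitted. One network maps a single exemplar to the weights of a convolutional layer, which is then applied to a new input as the pupil network.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Learnet(nn.Module):
    """Predicts the weights of one pupil conv layer from a single exemplar
    (a hedged sketch, not the paper's exact model)."""
    def __init__(self, in_ch=3, feat=64, k=3):
        super().__init__()
        self.in_ch, self.feat, self.k = in_ch, feat, k
        self.encoder = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat * in_ch * k * k),  # emits the pupil weights
        )

    def forward(self, exemplar, x):
        # Predict the pupil layer's weights from the exemplar...
        w = self.encoder(exemplar).view(self.feat, self.in_ch, self.k, self.k)
        # ...then run the pupil layer on the new input with those weights.
        return F.conv2d(x, w, padding=self.k // 2)

exemplar = torch.randn(1, 3, 32, 32)   # the single labelled example
query = torch.randn(1, 3, 32, 32)      # a new input
features = Learnet()(exemplar, query)  # (1, 64, 32, 32) feature map
```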
Devon: Deformable volume network for learning optical flow
We propose a new neural network module, the Deformable Cost Volume, for learning large-displacement optical flow. The module does not distort the original images or their feature maps and therefore avoids the artifacts associated with warping. Based on this module, a new neural network model is proposed. The full version of this paper is available online (https://arxiv.org/abs/1802.07351).
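The linked arXiv version gives the exact formulation; as a hedged sketch of the general idea, a deformable cost volume can be built by sampling the second feature map at displaced positions perturbed by learned per-pixel offsets (via grid_sample), so that neither the images nor the feature maps are ever warped wholesale. The offset predictor is assumed to live elsewhere in the network.

```python
import torch
import torch.nn.functional as F

def deformable_cost_volume(f1, f2, offsets, disps):
    """Correlation cost volume whose sampling points are shifted by learned
    offsets (an illustrative sketch, not the paper's exact module).

    f1, f2:  (B, C, H, W) feature maps of the two frames
    offsets: (B, 2, H, W) learned per-pixel (dx, dy) offsets in pixels
    disps:   iterable of (dx, dy) integer displacements to test
    Returns (B, len(disps), H, W) matching costs.
    """
    B, C, H, W = f1.shape
    ys, xs = torch.meshgrid(torch.arange(H, dtype=f1.dtype),
                            torch.arange(W, dtype=f1.dtype), indexing="ij")
    base = torch.stack([xs, ys]).expand(B, 2, H, W)  # pixel-coordinate grid
    costs = []
    for dx, dy in disps:
        d = torch.tensor([dx, dy], dtype=f1.dtype).view(1, 2, 1, 1)
        loc = base + d + offsets                    # deformed sample points
        gx = 2 * loc[:, 0] / (W - 1) - 1            # normalize to [-1, 1]
        gy = 2 * loc[:, 1] / (H - 1) - 1
        grid = torch.stack([gx, gy], dim=-1)        # (B, H, W, 2)
        f2s = F.grid_sample(f2, grid, align_corners=True)
        costs.append((f1 * f2s).sum(1) / C)         # normalized correlation
    return torch.stack(costs, dim=1)
```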