338,466 research outputs found

    Efficient On-the-fly Category Retrieval using ConvNets and GPUs

    Full text link
    We investigate the gains in precision and speed, that can be obtained by using Convolutional Networks (ConvNets) for on-the-fly retrieval - where classifiers are learnt at run time for a textual query from downloaded images, and used to rank large image or video datasets. We make three contributions: (i) we present an evaluation of state-of-the-art image representations for object category retrieval over standard benchmark datasets containing 1M+ images; (ii) we show that ConvNets can be used to obtain features which are incredibly performant, and yet much lower dimensional than previous state-of-the-art image representations, and that their dimensionality can be reduced further without loss in performance by compression using product quantization or binarization. Consequently, features with the state-of-the-art performance on large-scale datasets of millions of images can fit in the memory of even a commodity GPU card; (iii) we show that an SVM classifier can be learnt within a ConvNet framework on a GPU in parallel with downloading the new training images, allowing for a continuous refinement of the model as more images become available, and simultaneous training and ranking. The outcome is an on-the-fly system that significantly outperforms its predecessors in terms of: precision of retrieval, memory requirements, and speed, facilitating accurate on-the-fly learning and ranking in under a second on a single GPU.Comment: Published in proceedings of ACCV 201

    OD-NeRF: Efficient Training of On-the-Fly Dynamic Neural Radiance Fields

    Full text link
    Dynamic neural radiance fields (dynamic NeRFs) have demonstrated impressive results in novel view synthesis on 3D dynamic scenes. However, they often require complete video sequences for training followed by novel view synthesis, which is similar to playing back the recording of a dynamic 3D scene. In contrast, we propose OD-NeRF to efficiently train and render dynamic NeRFs on-the-fly which instead is capable of streaming the dynamic scene. When training on-the-fly, the training frames become available sequentially and the model is trained and rendered frame-by-frame. The key challenge of efficient on-the-fly training is how to utilize the radiance field estimated from the previous frames effectively. To tackle this challenge, we propose: 1) a NeRF model conditioned on the multi-view projected colors to implicitly track correspondence between the current and previous frames, and 2) a transition and update algorithm that leverages the occupancy grid from the last frame to sample efficiently at the current frame. Our algorithm can achieve an interactive speed of 6FPS training and rendering on synthetic dynamic scenes on-the-fly, and a significant speed-up compared to the state-of-the-art on real-world dynamic scenes

    Unsupervised Training for 3D Morphable Model Regression

    Full text link
    We present a method for training a regression network from image pixels to 3D morphable model coordinates using only unlabeled photographs. The training loss is based on features from a facial recognition network, computed on-the-fly by rendering the predicted faces with a differentiable renderer. To make training from features feasible and avoid network fooling effects, we introduce three objectives: a batch distribution loss that encourages the output distribution to match the distribution of the morphable model, a loopback loss that ensures the network can correctly reinterpret its own output, and a multi-view identity loss that compares the features of the predicted 3D face and the input photograph from multiple viewing angles. We train a regression network using these objectives, a set of unlabeled photographs, and the morphable model itself, and demonstrate state-of-the-art results.Comment: CVPR 2018 version with supplemental material (http://openaccess.thecvf.com/content_cvpr_2018/html/Genova_Unsupervised_Training_for_CVPR_2018_paper.html

    Learning recurrent representations for hierarchical behavior modeling

    Get PDF
    We propose a framework for detecting action patterns from motion sequences and modeling the sensory-motor relationship of animals, using a generative recurrent neural network. The network has a discriminative part (classifying actions) and a generative part (predicting motion), whose recurrent cells are laterally connected, allowing higher levels of the network to represent high level phenomena. We test our framework on two types of data, fruit fly behavior and online handwriting. Our results show that 1) taking advantage of unlabeled sequences, by predicting future motion, significantly improves action detection performance when training labels are scarce, 2) the network learns to represent high level phenomena such as writer identity and fly gender, without supervision, and 3) simulated motion trajectories, generated by treating motion prediction as input to the network, look realistic and may be used to qualitatively evaluate whether the model has learnt generative control rules
    • …
    corecore