    New deep learning approaches to domain adaptation and their applications in 3D hand pose estimation

    This study investigates several methods for using artificial intelligence to give machines the ability to see. It introduces several image recognition methods that are more accurate and efficient than existing approaches.

    State of the Art in Dense Monocular Non-Rigid 3D Reconstruction

    3D reconstruction of deformable (or non-rigid) scenes from a set of monocular 2D image observations is a long-standing and actively researched area of computer vision and graphics. It is an ill-posed inverse problem, since--without additional prior assumptions--it permits infinitely many solutions leading to accurate projection to the input 2D images. Non-rigid reconstruction is a foundational building block for downstream applications like robotics, AR/VR, or visual content creation. The key advantage of using monocular cameras is their omnipresence and availability to the end users as well as their ease of use compared to more sophisticated camera set-ups such as stereo or multi-view systems. This survey focuses on state-of-the-art methods for dense non-rigid 3D reconstruction of various deformable objects and composite scenes from monocular videos or sets of monocular views. It reviews the fundamentals of 3D reconstruction and deformation modeling from 2D image observations. We then start from general methods--that handle arbitrary scenes and make only a few prior assumptions--and proceed towards techniques making stronger assumptions about the observed objects and types of deformations (e.g. human faces, bodies, hands, and animals). A significant part of this STAR is also devoted to classification and a high-level comparison of the methods, as well as an overview of the datasets for training and evaluation of the discussed techniques. We conclude by discussing open challenges in the field and the social aspects associated with the usage of the reviewed methods. Comment: 25 pages

    Weakly supervised learning with stochastic supervision and knowledge transfer

    In recent years, machine learning methods, especially supervised learning methods, have achieved great progress in both methodology and applications. However, in supervised learning, each training sample requires a label to indicate its ground truth. In many machine learning tasks, it is hard to obtain enough accurately labelled training samples. Weakly supervised learning extends the supervised learning setting to more general tasks. In this thesis, we focus on proposing novel methods for inaccurate supervision and incomplete supervision under the setting of weakly supervised learning. In inaccurate supervision, problems with nondeterministic labels, such as stochastic supervision problems, are rarely discussed. In stochastic supervision, the supervision is a probabilistic assessment rather than a deterministic label. In Chapter 2, we provide four generalisations of stochastic supervision models, extending them to asymmetric assessments, multiple classes, feature-dependent assessments, and multi-modal classes, respectively. Corresponding to these generalisations, four new EM algorithms are derived. We show the effectiveness of our generalisations through illustrative examples on simulated datasets, as well as real-world examples on two well-known datasets, MNIST and CIFAR-10. For incomplete supervision problems, we focus on improving semi-supervised learning in one domain/task by transferring knowledge from another domain/task or from many domains/tasks. In Chapter 3, a novel domain-adaptation-based method is proposed to improve a typical application of semi-supervised learning, pose estimation, in which the implicit density estimation problem in the domain adaptation is solved by using a neural network to approximate it. The proposed method transfers knowledge from the training samples in the synthetic data domain to improve the learner in the real data domain, and achieves state-of-the-art performance.
    In Chapter 4, we focus on transferring knowledge from many tasks to improve semi-supervised few-shot learning. We use meta-learning to transfer knowledge from many meta-train tasks. A tailor-made ensemble method for few-shot learning is proposed to relieve the pseudo-label noise problem in semi-supervised few-shot learning. The proposed method also achieves state-of-the-art performance on two widely used few-shot learning benchmark datasets (miniImageNet and tieredImageNet).