
    Beauty3DFaceNet: Deep geometry and texture fusion for 3D facial attractiveness prediction

    We present Beauty3DFaceNet, the first deep convolutional neural network to predict the attractiveness of 3D faces using both geometry and texture information. The proposed network learns discriminative and complementary 2D and 3D facial features, allowing accurate attractiveness prediction for 3D faces. The main component of our network is a fusion module that fuses geometric features and texture features. We further employ a novel sampling strategy based on a prior of facial landmarks, which improves the learning of aesthetic features from a face point cloud. Compared to previous work, our approach takes full advantage of 3D geometry and 2D texture and does not rely on handcrafted features derived from highly accurate facial characteristics such as feature points. To facilitate 3D facial attractiveness research, we also construct the first 3D face dataset for this task, ShadowFace3D, which contains 6,000 high-quality 3D faces with attractiveness labeled by human annotators. Extensive quantitative and qualitative evaluations show that Beauty3DFaceNet achieves a significant correlation with average human ratings, validating that a deep learning network can effectively learn and predict 3D facial attractiveness.
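
    The fusion module is the paper's central component. Below is a minimal sketch, in PyTorch, of one plausible geometry-texture fusion head: pooled point-cloud features and CNN texture features are concatenated and regressed to a scalar score. The module name, layer sizes, and the concatenation-plus-MLP design are illustrative assumptions, not the authors' Beauty3DFaceNet implementation.

        # Minimal sketch of geometry-texture fusion for attractiveness
        # regression; names and sizes are assumptions, not the paper's code.
        import torch
        import torch.nn as nn

        class FusionHead(nn.Module):
            def __init__(self, geo_dim=256, tex_dim=256):
                super().__init__()
                # Fuse by concatenation followed by an MLP regressor.
                self.mlp = nn.Sequential(
                    nn.Linear(geo_dim + tex_dim, 256),
                    nn.ReLU(inplace=True),
                    nn.Linear(256, 1),  # scalar attractiveness score
                )

            def forward(self, geo_feat, tex_feat):
                fused = torch.cat([geo_feat, tex_feat], dim=-1)
                return self.mlp(fused)

        # Usage: geometry features pooled from a point-cloud encoder and
        # texture features from a 2D CNN, both of shape (batch, 256).
        head = FusionHead()
        score = head(torch.randn(4, 256), torch.randn(4, 256))  # (4, 1)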

    Relational Self-Supervised Learning

    Self-supervised learning (SSL), including mainstream contrastive learning, has achieved great success in learning visual representations without data annotations. However, most methods focus mainly on instance-level information (i.e., the different augmented images of the same instance should have the same features or cluster into the same class), while paying little attention to the relationships between different instances. In this paper, we introduce a novel SSL paradigm, termed the relational self-supervised learning (ReSSL) framework, which learns representations by modeling the relationships between different instances. Specifically, our proposed method employs a sharpened distribution of pairwise similarities among different instances as the relation metric, which is then used to match the feature embeddings of different augmentations. To boost performance, we argue that weak augmentations matter for representing a more reliable relation, and we leverage a momentum strategy for practical efficiency. The designed asymmetric predictor head and an InfoNCE warm-up strategy enhance robustness to hyper-parameters and benefit the resulting performance. Experimental results show that our proposed ReSSL substantially outperforms state-of-the-art methods across different network architectures, including various lightweight networks (e.g., EfficientNet and MobileNet). Comment: Extended version of NeurIPS 2021 paper. arXiv admin note: substantial text overlap with arXiv:2107.0928
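
    The relation metric can be made concrete with a short sketch: the sharpened similarity distribution of a weakly augmented view (the teacher) supervises the distribution of a strongly augmented view (the student) over a set of anchor features. The temperature values and the random anchor set below are illustrative assumptions; in the paper the anchors come from a momentum-encoder memory queue.

        # Sketch of a ReSSL-style relation loss; temperatures and the anchor
        # set are assumptions, not the paper's exact configuration.
        import torch
        import torch.nn.functional as F

        def relation_loss(z_weak, z_strong, anchors, t_teacher=0.04, t_student=0.1):
            z_weak = F.normalize(z_weak, dim=-1)
            z_strong = F.normalize(z_strong, dim=-1)
            anchors = F.normalize(anchors, dim=-1)
            # Pairwise similarities of each view against the anchor set.
            sim_t = z_weak @ anchors.T / t_teacher    # sharper (lower temperature)
            sim_s = z_strong @ anchors.T / t_student
            # Cross-entropy between teacher and student relation distributions.
            p_t = F.softmax(sim_t, dim=-1).detach()   # no gradient to the teacher
            return -(p_t * F.log_softmax(sim_s, dim=-1)).sum(dim=-1).mean()

        # Usage with random embeddings: 8 samples, 128-dim, 4096 anchors.
        loss = relation_loss(torch.randn(8, 128), torch.randn(8, 128),
                             torch.randn(4096, 128))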

    Dynamic fusion with intra- and inter-modality attention flow for visual question answering

    Learning effective fusion of multi-modality features is at the heart of visual question answering. We propose a novel method that dynamically fuses multi-modal features with intra- and inter-modality information flow, alternately passing dynamic information between and across the visual and language modalities. It robustly captures high-level interactions between the language and vision domains, thus significantly improving visual question answering performance. We also show that the proposed dynamic intra-modality attention flow, conditioned on the other modality, can dynamically modulate the intra-modality attention of the target modality, which is vital for multi-modality feature fusion. Experimental evaluations on the VQA 2.0 dataset show that the proposed method achieves state-of-the-art VQA performance. Extensive ablation studies are carried out for a comprehensive analysis of the proposed method. Comment: CVPR 2019 Oral
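
    As a rough illustration of inter-modality information flow, the sketch below implements plain scaled dot-product cross-attention, letting one modality's features attend to the other's. The single-head design and the feature dimensions are assumptions; the paper's actual flows are dynamic and richer than this.

        # Sketch of inter-modality attention: queries from one modality
        # attend to the other. A simplification of the paper's design.
        import math
        import torch

        def cross_attend(query_feats, context_feats):
            # query_feats: (batch, Nq, d); context_feats: (batch, Nc, d)
            d = query_feats.size(-1)
            scores = query_feats @ context_feats.transpose(1, 2) / math.sqrt(d)
            attn = torch.softmax(scores, dim=-1)
            # Information flows from the context modality into the queries.
            return attn @ context_feats

        visual = torch.randn(2, 36, 512)    # e.g., image region features
        language = torch.randn(2, 14, 512)  # e.g., question word features
        visual_updated = cross_attend(visual, language)
        language_updated = cross_attend(language, visual)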

    CoNe: Contrast Your Neighbours for Supervised Image Classification

    Image classification is a longstanding problem in computer vision and machine learning research. Most recent works (e.g., SupCon, Triplet, and max-margin) focus mainly on grouping intra-class samples aggressively and compactly, under the assumption that all intra-class samples should be pulled tightly towards their class centers. However, such an objective is very hard to achieve, since it ignores the intra-class variance in the dataset (i.e., different instances from the same class can differ significantly), so this monotonous objective is not sufficient. To provide a more informative objective, we introduce Contrast Your Neighbours (CoNe), a simple yet practical learning framework for supervised image classification. Specifically, in CoNe each sample is not only supervised by its class center but also directly employs the features of its similar neighbours as anchors to generate more adaptive and refined targets. Moreover, to further boost performance, we propose "distributional consistency" as a more informative regularization that encourages similar instances to have similar probability distributions. Extensive experimental results demonstrate that CoNe achieves state-of-the-art performance across different benchmark datasets, network architectures, and settings. Notably, even without a complicated training recipe, CoNe achieves 80.8% Top-1 accuracy on ImageNet with ResNet-50, surpassing the recent Timm training recipe (80.4%). Code and pre-trained models are available at https://github.com/mingkai-zheng/CoNe
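
    The distributional-consistency idea can be sketched directly: pull each sample's predicted class distribution toward that of its most similar in-batch neighbour. Neighbour selection by cosine similarity and the KL-divergence form below are illustrative assumptions rather than the paper's exact formulation.

        # Sketch of a distributional-consistency regularizer; neighbour
        # selection and loss form are assumptions, not CoNe's exact method.
        import torch
        import torch.nn.functional as F

        def distributional_consistency(features, logits):
            f = F.normalize(features, dim=-1)
            sim = f @ f.T
            sim.fill_diagonal_(-float("inf"))  # exclude self-matches
            nn_idx = sim.argmax(dim=-1)        # most similar neighbour per sample
            log_p_self = F.log_softmax(logits, dim=-1)
            p_nn = F.softmax(logits[nn_idx], dim=-1).detach()  # neighbour target
            return F.kl_div(log_p_self, p_nn, reduction="batchmean")

        # Usage: 16 samples, 128-dim features, 100-class logits.
        reg = distributional_consistency(torch.randn(16, 128), torch.randn(16, 100))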

    Blending using ODE swept surfaces with shape control and C1 continuity

    Surface blending with tangential continuity is widely applied in computer-aided design, manufacturing systems, and geometric modeling. In this paper, we propose a new blending method that effectively controls the shape of blending surfaces while exactly satisfying the blending constraints of tangent continuity. The method is based on the concept of swept surfaces controlled by a vector-valued fourth-order ordinary differential equation (ODE). It creates blending surfaces by sweeping a generator along two trimlines while making the generator exactly satisfy the tangential constraints at the trimlines. The shape of the blending surfaces is controlled by manipulating the generator with the solution to a vector-valued fourth-order ODE. This new blending method has the following advantages: (1) exact satisfaction of C1-continuous blending boundary constraints, (2) effective shape control of blending surfaces, (3) high computational efficiency due to the explicit mathematical representation of blending surfaces, and (4) the ability to blend multiple (more than two) primary surfaces.
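
    One way to see why a fourth-order ODE suits C1 blending: its general solution admits exactly four vector-valued boundary conditions, matching position and tangent at each of the two trimlines. A representative, assumed form (the paper's exact equation and coefficients may differ) with shape-control coefficients a and b is:

        % Assumed representative form; not the paper's exact equation.
        a \frac{d^4 \mathbf{S}(u)}{du^4} + b \frac{d^2 \mathbf{S}(u)}{du^2} = \mathbf{0},
        \qquad u \in [0, 1],
        % Four boundary conditions enforce C1 continuity at the trimlines:
        \mathbf{S}(0) = \mathbf{C}_0, \quad \mathbf{S}'(0) = \mathbf{T}_0, \quad
        \mathbf{S}(1) = \mathbf{C}_1, \quad \mathbf{S}'(1) = \mathbf{T}_1.

    Varying a and b then reshapes the interior of the swept generator without disturbing the boundary constraints, which is the claimed shape-control mechanism.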

    Dynamic skin deformation using finite difference solutions for character animation

    We present a new skin deformation method for creating dynamic skin deformations. The core elements of our approach are a dynamic deformation model, an efficient data-driven finite difference solution, and a curve-based representation of 3D models. We first reconstruct skin deformation models at different poses from photographs of a male human arm in motion to obtain real deformed skin shapes. Then, we extract curves from these reconstructed skin deformation models. A new dynamic deformation model is proposed to describe the physics of dynamic curve deformations, and its finite difference solution is developed to determine the shape changes of the extracted curves. To improve the visual realism of the skin deformations, we employ data-driven methods and introduce the skin shapes at the initial and final poses into our proposed dynamic deformation model. The experimental examples and comparisons in this paper indicate that the proposed dynamic skin deformation technique can efficiently create realistic deformed skin shapes from a small data size.
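
    To make the finite-difference idea concrete, the sketch below explicitly time-steps a generic damped dynamic deformation equation on a sampled curve. The equation form (rho*x_tt + eta*x_t = k*x_uu + f), its coefficients, and the fixed-endpoint boundary handling are illustrative assumptions, not the paper's model.

        # Explicit finite-difference step for a generic damped dynamic
        # curve-deformation equation; the model form is an assumption.
        import numpy as np

        def step(x, x_prev, f, rho=1.0, eta=0.5, k=1.0, du=0.1, dt=0.01):
            # Second spatial derivative along the curve (interior points only).
            x_uu = np.zeros_like(x)
            x_uu[1:-1] = (x[2:] - 2 * x[1:-1] + x[:-2]) / du**2
            # Backward difference for velocity, then solve for acceleration.
            accel = (k * x_uu + f - eta * (x - x_prev) / dt) / rho
            # Central difference in time gives the next curve position.
            x_next = 2 * x - x_prev + dt**2 * accel
            x_next[0], x_next[-1] = x[0], x[-1]  # fixed curve endpoints
            return x_next

        # Usage: 50 curve points in 3D, zero external force.
        x = np.zeros((50, 3)); x_prev = x.copy(); f = np.zeros_like(x)
        x = step(x, x_prev, f)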