345 research outputs found

    MVF-Net: Multi-View 3D Face Morphable Model Regression

    We address the problem of recovering the 3D geometry of a human face from a set of facial images in multiple views. While recent studies have shown impressive progress in 3D Morphable Model (3DMM) based facial reconstruction, the settings are mostly restricted to a single view. The single-view setting has an inherent drawback: the lack of reliable 3D constraints can cause unresolvable ambiguities. In this paper, we explore 3DMM-based shape recovery in a different setting, where a set of multi-view facial images is given as input. We propose a novel approach that regresses 3DMM parameters from multi-view inputs with an end-to-end trainable Convolutional Neural Network (CNN). Multi-view geometric constraints are incorporated into the network by establishing dense correspondences between different views through a novel self-supervised view alignment loss. The main ingredient of the view alignment loss is a differentiable dense optical flow estimator that can backpropagate the alignment errors between an input view and a synthetic rendering from another input view, which is projected to the target view through the 3D shape to be inferred. Minimizing the view alignment loss yields better 3D shapes, such that the synthetic projections from one view to another align better with the observed image. Extensive experiments demonstrate the superiority of the proposed method over other 3DMM methods. Comment: 2019 Conference on Computer Vision and Pattern Recognition

    CycleACR: Cycle Modeling of Actor-Context Relations for Video Action Detection

    Relation modeling between actors and scene context advances video action detection, where the correlation of multiple actors makes their action recognition challenging. Existing studies model each actor and scene relation to improve action recognition. However, scene variations and background interference limit the effectiveness of this relation modeling. In this paper, we propose to select actor-related scene context, rather than directly leveraging the raw video scene, to improve relation modeling. We develop a Cycle Actor-Context Relation network (CycleACR) in which a symmetric graph models the actor and context relations in a bidirectional form. Our CycleACR consists of the Actor-to-Context Reorganization (A2C-R), which collects actor features for context feature reorganization, and the Context-to-Actor Enhancement (C2A-E), which dynamically utilizes the reorganized context features for actor feature enhancement. Compared to existing designs that focus on C2A-E, our CycleACR introduces A2C-R for more effective relation modeling. This modeling enables CycleACR to achieve state-of-the-art performance on two popular action detection datasets (i.e., AVA and UCF101-24). We also provide ablation studies and visualizations to show how our cycle actor-context relation modeling improves video action detection. Code is available at https://github.com/MCG-NJU/CycleACR. Comment: technical report

    TD^2-Net: Toward Denoising and Debiasing for Dynamic Scene Graph Generation

    Dynamic scene graph generation (SGG) focuses on detecting objects in a video and determining their pairwise relationships. Existing dynamic SGG methods usually suffer from several issues, including 1) contextual noise, as some frames might contain occluded and blurred objects, and 2) label bias, primarily due to the high imbalance between a few positive relationship samples and numerous negative ones. Additionally, the distribution of relationships exhibits a long-tailed pattern. To address these problems, we introduce a network named TD^2-Net that aims at denoising and debiasing for dynamic SGG. Specifically, we first propose a denoising spatio-temporal transformer module that enhances object representation with robust contextual information. This is achieved by designing a differentiable Top-K object selector that uses the Gumbel-softmax sampling strategy to select the relevant neighborhood for each object. Second, we introduce an asymmetrical reweighting loss to relieve the issue of label bias. This loss function integrates asymmetry focusing factors and the volume of samples to adjust the weights assigned to individual samples. Systematic experimental results demonstrate the superiority of our proposed TD^2-Net over existing state-of-the-art approaches on the Action Genome database. In more detail, TD^2-Net outperforms the second-best competitors by 12.7% on mean-Recall@10 for predicate classification. Comment: Accepted by AAAI 202
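    The abstract above names Gumbel-softmax sampling as the basis of its Top-K object selector but gives no further detail. The following is only a generic sketch of that sampling idea, not the authors' module: the function names `gumbel_softmax` and `relaxed_top_k`, the temperature value, and the mask-and-repeat scheme for picking K distinct items are all assumptions made for illustration.

```python
import math
import random


def gumbel_softmax(logits, temperature=1.0, rng=None):
    """Relaxed one-hot sample: softmax((logits + Gumbel noise) / temperature)."""
    rng = rng or random.Random()
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise is -log(-log(U)) with U ~ Uniform(0, 1).
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        noisy.append((logit - math.log(-math.log(u))) / temperature)
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]


def relaxed_top_k(logits, k, temperature=1.0, rng=None):
    """Pick k distinct indices by repeated sampling, masking earlier picks.

    Hypothetical scheme: each round draws a Gumbel-softmax sample, keeps its
    arg-max as the hard choice, and sets that logit to -inf for later rounds.
    """
    if not 0 < k <= len(logits):
        raise ValueError("k must be between 1 and len(logits)")
    rng = rng or random.Random()
    masked = list(logits)
    chosen, soft_weights = [], []
    for _ in range(k):
        probs = gumbel_softmax(masked, temperature, rng)
        index = max(range(len(probs)), key=probs.__getitem__)
        chosen.append(index)
        soft_weights.append(probs)
        masked[index] = float("-inf")
    return chosen, soft_weights
```

    In an autodiff framework the soft weights would carry gradients while the hard indices drive the selection; this plain-Python version only shows the sampling arithmetic.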

    Self-supervised Learning of Detailed 3D Face Reconstruction

    In this paper, we present an end-to-end learning framework for detailed 3D face reconstruction from a single image. Our approach uses a 3DMM-based coarse model and a displacement map in UV-space to represent a 3D face. Unlike previous work addressing the problem, our learning framework does not require supervision from surrogate ground-truth 3D models computed with traditional approaches. Instead, we use the input image itself as supervision during learning. In the first stage, we combine a photometric loss and a facial perceptual loss between the input face and the rendered face to regress a 3DMM-based coarse model. In the second stage, both the input image and the regressed texture of the coarse model are unwrapped into UV-space and then sent through an image-to-image translation network to predict a displacement map in UV-space. The displacement map and the coarse model are used to render a final detailed face, which can again be compared with the original input image to serve as a photometric loss for the second stage. The advantage of learning the displacement map in UV-space is that face alignment can be done explicitly during the unwrapping, so facial details are easier to learn from a large amount of data. Extensive experiments demonstrate the superiority of the proposed method over previous work. Comment: Accepted by IEEE Transactions on Image Processing (TIP)

    A Completed Multiple Threshold Encoding Pattern for Texture Classification

    The binary pattern family has drawn wide attention for texture representation due to its promising performance and simple operation. However, most binary pattern methods focus on local neighborhoods but ignore center pixels. Although some studies introduce a center-based sub-pattern to provide complementary information, existing center-based sub-patterns are much weaker than other local neighborhood-based sub-patterns. This severe imbalance significantly limits the classification performance of fusion features. To alleviate this problem, this paper designs a multiple threshold center pattern (MTCP) to provide a more discriminative and complementary local texture representation in a compact form. First, a multiple threshold encoding strategy is designed to encode the center pixel, generating three 1-bit binary patterns. Second, a compact multi-pattern encoding strategy combines them into a 3-bit MTCP. Furthermore, this paper proposes a completed multiple threshold encoding pattern by fusing the MTCP, local sign pattern, and local magnitude pattern. Comprehensive experimental evaluations on three popular texture classification benchmarks confirm that the completed multiple threshold encoding pattern achieves superior texture classification performance.
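    The abstract above describes encoding the center pixel as three 1-bit patterns that are combined into a 3-bit code, but it does not say which thresholds are used or how the bits are packed. The sketch below is therefore only one possible reading: the names `mtcp_code` and `mtcp_histogram`, the three placeholder thresholds, the `>=` comparison, and the simple bit concatenation are all assumptions, not the paper's definition.

```python
def mtcp_code(center, thresholds):
    """Pack three threshold comparisons of one pixel into a 3-bit code (0..7).

    `thresholds` is a hypothetical triple; the abstract does not specify
    how the real thresholds are chosen.
    """
    if len(thresholds) != 3:
        raise ValueError("exactly three thresholds are expected")
    code = 0
    for threshold in thresholds:
        bit = 1 if center >= threshold else 0  # one 1-bit binary pattern
        code = (code << 1) | bit               # concatenate into 3 bits
    return code


def mtcp_histogram(image, thresholds):
    """Count how often each 3-bit code occurs over a 2D list of pixels."""
    histogram = [0] * 8
    for row in image:
        for pixel in row:
            histogram[mtcp_code(pixel, thresholds)] += 1
    return histogram
```

    An 8-bin histogram like this could then be fused with sign and magnitude descriptors, which is the kind of fusion the abstract mentions; how that fusion is actually done is not stated there.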

    Effects of Condylar Elastic Properties to Temporomandibular Joint Stress

    The mandibular condyle plays an important role in the growth and reconstruction of the temporomandibular joint (TMJ). We aimed to obtain orthotropic elastic parameters of the condyle using a continuous-wave ultrasonic technique and to observe the effects of condylar elastic parameters on the stress distribution of the TMJ using finite element analysis (FEA). Using the ultrasonic technique, all nine elastic parameters were obtained, which showed that the mandibular condyle was orthotropic. With the condyle defined as orthotropic, the occlusal stress was transferred smoothly and uniformly from the mandible to the TMJ. The stress distribution in the isotropic model showed stepped variation among different anatomical structures, with higher stress values in the cartilage and condyle than in the orthotropic model. We conclude that anisotropy has subtle yet significant effects on the stress distribution of the TMJ and could improve the realism of simulations.
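    For readers unfamiliar with why an orthotropic material has nine elastic parameters, the standard compliance form of Hooke's law in Voigt notation is shown below. The symbols (Young's moduli $E_i$, shear moduli $G_{ij}$, Poisson's ratios $\nu_{ij}$) are textbook notation assumed here; the abstract does not state which parameterization the authors used.

```latex
\begin{equation}
\begin{bmatrix}
\varepsilon_{11} \\ \varepsilon_{22} \\ \varepsilon_{33} \\
\gamma_{23} \\ \gamma_{13} \\ \gamma_{12}
\end{bmatrix}
=
\begin{bmatrix}
\frac{1}{E_1} & -\frac{\nu_{21}}{E_2} & -\frac{\nu_{31}}{E_3} & 0 & 0 & 0 \\
-\frac{\nu_{12}}{E_1} & \frac{1}{E_2} & -\frac{\nu_{32}}{E_3} & 0 & 0 & 0 \\
-\frac{\nu_{13}}{E_1} & -\frac{\nu_{23}}{E_2} & \frac{1}{E_3} & 0 & 0 & 0 \\
0 & 0 & 0 & \frac{1}{G_{23}} & 0 & 0 \\
0 & 0 & 0 & 0 & \frac{1}{G_{13}} & 0 \\
0 & 0 & 0 & 0 & 0 & \frac{1}{G_{12}}
\end{bmatrix}
\begin{bmatrix}
\sigma_{11} \\ \sigma_{22} \\ \sigma_{33} \\
\tau_{23} \\ \tau_{13} \\ \tau_{12}
\end{bmatrix},
\qquad
\frac{\nu_{ij}}{E_i} = \frac{\nu_{ji}}{E_j}.
\end{equation}
```

    The symmetry condition leaves nine independent constants: $E_1, E_2, E_3$, $G_{12}, G_{13}, G_{23}$, and $\nu_{12}, \nu_{13}, \nu_{23}$. An isotropic model collapses these to two, which is the simplification the abstract compares against.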

    VITAL: VIsual Tracking via Adversarial Learning

    The tracking-by-detection framework consists of two stages: drawing samples around the target object in the first stage, and classifying each sample as the target object or as background in the second stage. The performance of existing trackers using deep classification networks is limited in two respects. First, the positive samples in each frame overlap heavily in space and fail to capture rich appearance variations. Second, there is an extreme class imbalance between positive and negative samples. This paper presents the VITAL algorithm to address these two problems via adversarial learning. To augment positive samples, we use a generative network to randomly generate masks, which are applied to adaptively drop out input features and so capture a variety of appearance changes. Through adversarial learning, our network identifies the mask that maintains the most robust features of the target objects over a long temporal span. In addition, to handle class imbalance, we propose a high-order cost-sensitive loss that decreases the effect of easy negative samples and thereby facilitates training the classification network. Extensive experiments on benchmark datasets demonstrate that the proposed tracker performs favorably against state-of-the-art approaches. Comment: Spotlight in CVPR 201

    Discovery of novel inhibitors of Streptococcus pneumoniae based on the virtual screening with the homology-modeled structure of histidine kinase (VicK)

    Background: Due to the widespread abuse of antibiotics, antibiotic resistance in Streptococcus pneumoniae (S. pneumoniae) has been increasing quickly in recent years, and there is an urgent need to develop new types of antibiotics. Two-component systems (TCSs) are the major signal transduction pathways in bacteria and have emerged as potential targets for antibacterial drugs. Among the 13 pairs of TCS proteins present in S. pneumoniae, VicR/K is the only one essential for bacterial growth, and agents that block it, if they can be found, may be developed into effective antibiotics against S. pneumoniae infection. Results: Using a structure-based virtual screening (SBVS) method, 105 compounds from the compound library SPECS were computationally identified as potential inhibitors of the histidine kinase (HK) VicK protein. Six of them were then validated in vitro as active in inhibiting the growth of S. pneumoniae without obvious cytotoxicity to Vero cells. In mouse sepsis models, these compounds were still able to decrease the mortality of mice infected with S. pneumoniae, and one compound even had a significant therapeutic effect. Conclusion: To our knowledge, these compounds are the first reported inhibitors of HK with antibacterial activity in vitro and in vivo, and are novel lead structures for developing new drugs to combat pneumococcal infection.