6 research outputs found

    Deep Neural Networks to Learn Basis Functions with a Temporal Covariance Loss

    Department of Computer Science and Engineering. Deep Neural Networks (DNNs) and Gaussian Processes (GPs) are commonly used prediction models for regression problems on time series data. A GP can approximate a smooth function arbitrarily well when the function satisfies some conditions. We apply the principles of GP learning to DNN learning on time series data. While previous approaches need to change the architecture of DNNs or must be explicitly derived from GP algorithms, we concentrate on the learning scheme of DNNs and leverage the important principles of GPs by proposing the Temporal Covariance loss function. Whereas the conventional loss function of DNNs captures only the mean of the target values, the Temporal Covariance loss function also captures their covariance; the covariance function is the other factor that defines a GP, along with the mean function. We show that training DNNs and Convolutional Neural Networks (CNNs) with the Temporal Covariance loss function yields more accurate models on sets of regression problems with US groundwater data and NASDAQ 100 stock data.
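The abstract does not give the formula for the Temporal Covariance loss, so the following is only a minimal sketch of the general idea it describes: a mean-matching term (ordinary squared error) plus a term that penalizes mismatch between the temporal covariance of the predictions and that of the targets. The use of lagged autocovariances, the names `autocovariance` and `temporal_covariance_loss`, and the parameters `max_lag` and `weight` are all assumptions made for illustration, not the authors' definition.

```python
def mean(xs):
    """Arithmetic mean of a non-empty sequence."""
    return sum(xs) / len(xs)


def autocovariance(xs, lag):
    """Empirical covariance between a series and itself shifted by `lag` steps."""
    m = mean(xs)
    n = len(xs) - lag
    return sum((xs[t] - m) * (xs[t + lag] - m) for t in range(n)) / n


def temporal_covariance_loss(pred, target, max_lag=2, weight=1.0):
    """Squared error plus a penalty on autocovariance mismatch (illustrative only)."""
    # Conventional term: how far each prediction is from its target value.
    mse = mean([(p - y) ** 2 for p, y in zip(pred, target)])
    # Assumed covariance term: compare autocovariances at lags 0..max_lag.
    cov_gap = mean([
        (autocovariance(pred, k) - autocovariance(target, k)) ** 2
        for k in range(max_lag + 1)
    ])
    return mse + weight * cov_gap


# A flat prediction at the target mean is penalized again by the covariance
# term, because it has none of the temporal variation of the target.
flat = temporal_covariance_loss([1, 1, 1, 1], [0, 2, 0, 2])
exact = temporal_covariance_loss([0, 2, 0, 2], [0, 2, 0, 2])
print(flat, exact)
```

In this sketch, `weight` trades off the two terms; setting it to zero recovers the plain squared-error loss.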

    A Large-Scale 3D Face Mesh Video Dataset via Neural Re-parameterized Optimization

    We propose NeuFace, a 3D face mesh pseudo-annotation method for videos based on neural re-parameterized optimization. Despite the huge progress in 3D face reconstruction methods, generating reliable 3D face labels for in-the-wild dynamic videos remains challenging. Using NeuFace optimization, we annotate accurate and consistent per-view and per-frame face meshes on large-scale face videos, yielding the NeuFace-dataset. We investigate, via gradient analysis, how neural re-parameterization helps to reconstruct image-aligned facial details on 3D meshes. By exploiting the naturalness and diversity of 3D faces in our dataset, we demonstrate its usefulness for 3D face-related tasks: improving the reconstruction accuracy of an existing 3D face reconstruction model and learning a 3D facial motion prior. Code and datasets will be available at https://neuface-dataset.github.io. Comment: 9 pages, 7 figures, and 3 tables for the main paper; 8 pages, 6 figures, and 3 tables for the appendix.

    LaughTalk: Expressive 3D Talking Head Generation with Laughter

    Laughter is a unique expression, essential to affirmative social interactions of humans. Although current 3D talking head generation methods produce convincing verbal articulations, they often fail to capture the vitality and subtleties of laughter and smiles despite their importance in social contexts. In this paper, we introduce a novel task: generating 3D talking heads capable of both articulate speech and authentic laughter. Our newly curated dataset comprises 2D laughing videos paired with pseudo-annotated and human-validated 3D FLAME parameters and vertices. Given our proposed dataset, we present a strong baseline with a two-stage training scheme: the model first learns to talk and then acquires the ability to express laughter. Extensive experiments demonstrate that our method performs favorably compared to existing approaches in both talking head generation and expressing laughter signals. We further explore potential applications of our proposed method for rigging realistic avatars. Comment: Accepted to WACV202

    Global Deconvolutional Networks for Semantic Segmentation

    Semantic image segmentation is a principal problem in computer vision, where the aim is to correctly assign a semantic label to each individual pixel of an image. Its widespread use in many areas, including medical imaging and autonomous driving, has fostered extensive research in recent years. Empirical improvements in tackling this task have primarily been motivated by successful exploitation of Convolutional Neural Networks (CNNs) pre-trained for image classification and object recognition. However, pixel-wise labelling with CNNs has its own unique challenges: (1) an accurate deconvolution, or upsampling, of low-resolution output into a higher-resolution segmentation mask and (2) an inclusion of global information, or context, within locally extracted features. To address these issues, we propose a novel architecture that conducts the equivalent of the deconvolution operation globally and acquires dense predictions. We demonstrate that it improves the performance of state-of-the-art semantic segmentation models on the PASCAL VOC 2012 benchmark, reaching 74.0% mean IU accuracy on the test set.
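For readers unfamiliar with the "mean IU" figure quoted above, the following is a minimal sketch of mean intersection-over-union computed from flattened per-pixel labels. The function name `mean_iu` and the toy inputs are illustrative assumptions; the official PASCAL VOC evaluation accumulates counts over the whole test set and handles ignored pixels, which this sketch does not reproduce.

```python
def mean_iu(pred, truth, num_classes):
    """Mean intersection-over-union across classes for flat label sequences."""
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c.
        inter = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        if union > 0:  # skip classes absent from both prediction and truth
            ious.append(inter / union)
    if not ious:
        raise ValueError("no class is present in either sequence")
    return sum(ious) / len(ious)


# Toy example: four pixels, two classes, one pixel mislabelled as class 0.
score = mean_iu(pred=[0, 0, 1, 1], truth=[0, 1, 1, 1], num_classes=2)
print(round(score, 4))
```

Here class 0 scores 1/2 and class 1 scores 2/3, so the mean is 7/12.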

    Development of Intraoperative Near-Infrared Fluorescence Imaging System Using a Dual-CMOS Single Camera

    We developed a single-camera-based near-infrared (NIR) fluorescence imaging device using indocyanine green (ICG) NIR fluorescence contrast agents for image-guided surgery. In general, a fluorescence imaging system that simultaneously provides color and NIR images uses two cameras, which is disadvantageous because it increases the size of the system's imaging head. Recently, a single-camera-based NIR optical imaging device with quantum efficiency partially extended to the NIR region was developed to overcome this drawback. That system used RGB_NIR filters on the camera sensor to provide color and NIR images simultaneously; however, the sensitivity and resolution of the infrared images are reduced by 1/4, and the exposure time and gain cannot be set individually when acquiring color and NIR images. To overcome these shortcomings, this study developed a compact fluorescence imaging system that uses a single camera with two complementary metal–oxide semiconductor (CMOS) image sensors. To evaluate the performance of the imaging system, sensitivity and signal-to-background ratio were measured as functions of ICG solution concentration, exposure time, and camera gain. The clinical applicability of the system was then confirmed through toxicity analysis of the light source and in vivo testing.
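The abstract does not state how the signal-to-background ratio was computed. A common convention, assumed here purely for illustration, is the mean pixel intensity inside the fluorescent region divided by the mean intensity of a nearby background region. The function name and the sample intensities below are hypothetical.

```python
def signal_to_background_ratio(signal_pixels, background_pixels):
    """Mean intensity of the signal region over mean intensity of the background."""
    signal = sum(signal_pixels) / len(signal_pixels)
    background = sum(background_pixels) / len(background_pixels)
    if background == 0:
        raise ValueError("background mean is zero; ratio is undefined")
    return signal / background


# Hypothetical 8-bit intensities from an NIR frame (not data from the study).
sbr = signal_to_background_ratio([200, 220, 180], [20, 20, 20])
print(sbr)
```

Repeating such a measurement while varying one setting at a time (concentration, exposure time, or gain) gives the kind of characterization curves the abstract refers to.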