
    Speech2Lip: High-fidelity Speech to Lip Generation by Learning from a Short Video

    Synthesizing realistic videos from a given speech signal remains an open challenge. Previous works have been plagued by issues such as inaccurate lip-shape generation and poor image quality. The key reason is that the input speech mainly drives motion and appearance in only limited facial areas (e.g., the lip region). Directly learning a mapping function from speech to the entire head image is therefore prone to ambiguity, particularly when training on a short video. We thus propose a decomposition-synthesis-composition framework named Speech to Lip (Speech2Lip) that disentangles speech-sensitive and speech-insensitive motion/appearance to enable effective learning from limited training data and the generation of natural-looking videos. First, given a fixed head pose (i.e., a canonical space), we present a speech-driven implicit model for lip-image generation that concentrates on learning speech-sensitive motion and appearance. Next, to model the major speech-insensitive motion (i.e., head movement), we introduce a geometry-aware mutual explicit mapping (GAMEM) module that establishes geometric mappings between different head poses. This allows us to paste lip images generated in the canonical space onto head images with arbitrary poses and to synthesize talking videos with natural head movements. In addition, a Blend-Net and a contrastive sync loss are introduced to enhance overall synthesis quality. Quantitative and qualitative results on three benchmarks demonstrate that our model can be trained on a video just a few minutes long and achieves state-of-the-art performance in both visual quality and speech-visual synchronization. Code: https://github.com/CVMI-Lab/Speech2Lip
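    The contrastive sync loss mentioned in the abstract is not specified there; as a hedged illustration only, an InfoNCE-style audio-visual contrastive objective (the function name, batch-pairing scheme, and temperature below are assumptions, not Speech2Lip's actual loss) might look like:

```python
import numpy as np

def contrastive_sync_loss(audio_emb, visual_emb, temperature=0.1):
    """InfoNCE-style sync loss sketch: matching audio/lip pairs (same batch
    index) are pulled together; mismatched pairs are pushed apart."""
    # L2-normalise so the dot product is cosine similarity
    a = audio_emb / np.linalg.norm(audio_emb, axis=1, keepdims=True)
    v = visual_emb / np.linalg.norm(visual_emb, axis=1, keepdims=True)
    logits = a @ v.T / temperature                  # (B, B) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)     # numerical stability
    # cross-entropy with the diagonal (true audio/lip pairs) as targets
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```

    Synchronized embedding pairs yield a lower loss than shuffled pairs, which is the signal a sync loss is meant to provide.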

    Spatial distribution of job opportunities in China: Evidence from the opening of the high-speed rail

    The provision of sufficient job opportunities has traditionally been a primary objective for both local and central governments. In response to this concern, we investigate the spatial dependence of job opportunities among 30 Chinese provincial capital cities (PCCs) from 2002 to 2016, giving special attention to the spatial spillovers of the opening of the high-speed rail (HSR). Using appropriate spatial panel data models, our findings suggest the presence of significant spatial autocorrelation of job opportunities among PCCs. While the HSR has been found to increase job opportunities at the national level, this effect is not confirmed at the regional level: the spatial spillover effects of the HSR are significant and positive only in the eastern/northeastern region. These findings can help the central government more fully understand the spatial dependence of job opportunities, better plan future HSR networks, and efficiently allocate transportation resources, encouraging cross-regional collaboration to promote regional employment.
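    Spatial autocorrelation of the kind reported above is commonly measured with global Moran's I (positive values indicate that neighboring cities have similar values; values near -1/(n-1) indicate spatial randomness). A minimal sketch, where the spatial weight matrix and data are illustrative rather than the paper's:

```python
import numpy as np

def morans_i(x, w):
    """Global Moran's I for values x under spatial weight matrix w:
    I = (n / sum(w)) * sum_ij w_ij * z_i * z_j / sum_i z_i^2,
    where z are deviations from the mean."""
    x = np.asarray(x, dtype=float)
    z = x - x.mean()
    num = (w * np.outer(z, z)).sum()   # weighted cross-products of deviations
    return x.size / w.sum() * num / (z @ z)
```

    On a simple chain of four regions, clustered values give a positive statistic and alternating values a negative one, matching the usual interpretation.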

    ART·V: Auto-Regressive Text-to-Video Generation with Diffusion Models

    We present ART·V, an efficient framework for auto-regressive video generation with diffusion models. Unlike existing methods that generate entire videos in one shot, ART·V generates a single frame at a time, conditioned on the previous ones. The framework offers three distinct advantages. First, it only learns simple continual motions between adjacent frames, thereby avoiding the modeling of complex long-range motions that requires huge amounts of training data. Second, it preserves the high-fidelity generation ability of pre-trained image diffusion models by making only minimal network modifications. Third, it can generate arbitrarily long videos conditioned on a variety of prompts such as text, image, or their combinations, making it highly versatile and flexible. To combat the common drifting issue in auto-regressive models, we propose a masked diffusion model that implicitly learns which information can be drawn from reference images rather than from network predictions, reducing the risk of generating inconsistent appearances that cause drifting. Moreover, we further enhance generation coherence by conditioning generation on the initial frame, which typically contains minimal noise; this is particularly useful for long video generation. When trained for only two weeks on four GPUs, ART·V can already generate videos with natural motions, rich details, and a high level of aesthetic quality. It also enables various appealing applications, e.g., composing a long video from multiple text prompts.
    Comment: 24 pages, 21 figures. Project page at https://warranweng.github.io/art.
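    The frame-at-a-time scheme described in the abstract can be sketched as a loop in which each new frame is denoised conditioned on its predecessor (for motion) and on the clean initial frame (to anchor appearance against drift). Everything here is a toy stand-in, not ART·V's actual network; `toy_denoise` merely blends its conditions:

```python
import numpy as np

def generate_video(first_frame, num_frames, denoise, noise_scale=1.0, seed=0):
    """Auto-regressive generation sketch: start from a given first frame,
    then produce each frame from noise via a denoiser conditioned on the
    previous frame and the anchor (initial) frame."""
    rng = np.random.default_rng(seed)
    frames = [first_frame]
    for _ in range(num_frames - 1):
        noise = rng.normal(scale=noise_scale, size=first_frame.shape)
        # condition on the preceding frame (motion) and the anchor
        # frame (appearance), the abstract's anti-drift ingredients
        frames.append(denoise(noise, prev=frames[-1], anchor=first_frame))
    return np.stack(frames)

def toy_denoise(noise, prev, anchor):
    """Hypothetical stand-in for the masked diffusion denoiser:
    blends the two conditions and damps the noise."""
    return 0.6 * prev + 0.3 * anchor + 0.1 * noise
```

    Because generation is sequential, the loop runs for any `num_frames`, which is how arbitrarily long videos fall out of the design.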

    Petroleum Hydrocarbon-Degrading Bacteria for the Remediation of Oil Pollution Under Aerobic Conditions: A Perspective Analysis

    With the sharp increase in population and the modernization of society, environmental pollution from petroleum hydrocarbons has increased, creating an urgent need for remediation. Petroleum hydrocarbon-degrading bacteria are ubiquitous in nature and can utilize these compounds as sources of carbon and energy; bacteria displaying such capabilities are often exploited for the bioremediation of petroleum oil-contaminated environments. Recently, microbial remediation technology has developed rapidly and achieved major gains. However, the technology is not a panacea: many environmental factors hinder its practical application and limit its large-scale deployment. This paper provides an overview of the recent literature on the use of bacteria as biodegraders, discusses barriers to the implementation of this microbial technology, and provides suggestions for further development.

    A Resource Aware MapReduce Based Parallel SVM for Large Scale Image Classifications

    Machine learning techniques have facilitated image retrieval by automatically classifying and annotating images with keywords. Among them, support vector machines (SVMs) are used extensively due to their generalization properties. However, SVM training is a notably computationally intensive process, especially when the training dataset is large. This paper presents RASMO, a resource-aware MapReduce-based parallel SVM algorithm for large-scale image classification, which partitions the training dataset into smaller subsets and optimizes SVM training in parallel using a cluster of computers. A genetic algorithm based load-balancing scheme is designed to optimize the performance of RASMO in heterogeneous computing environments. RASMO is evaluated in both experimental and simulation environments. The results show that the parallel SVM algorithm reduces training time significantly compared with the sequential SMO algorithm while maintaining a high level of classification accuracy.
    Funding: National Basic Research Program (973) of China under Grant 2014CB34040
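    The partition-train-combine structure described above can be sketched in a few lines. This is not RASMO's implementation: a nearest-centroid learner stands in for the per-partition SMO solver, a thread pool stands in for the MapReduce cluster, and majority voting stands in for the combination step; all names are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def train_partition(shard):
    """Stand-in for per-partition SVM training (RASMO runs SMO here):
    fit a nearest-centroid model on one data shard."""
    X, y = shard
    return {label: X[y == label].mean(axis=0) for label in np.unique(y)}

def parallel_train(X, y, num_partitions=4):
    """Map step: split the training set into shards and train each in
    parallel. Reduce step: collect the shard models for ensemble voting."""
    shards = [(X[i::num_partitions], y[i::num_partitions])
              for i in range(num_partitions)]
    with ThreadPoolExecutor() as pool:
        return list(pool.map(train_partition, shards))

def predict(models, x):
    """Majority vote over the shard models."""
    votes = [min(m, key=lambda lbl: np.linalg.norm(x - m[lbl])) for m in models]
    return max(set(votes), key=votes.count)
```

    The strided split keeps each shard's class mix close to the full set's, which is one simple way to make the sub-problems balanced before any genetic-algorithm load balancing is layered on top.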

    A genetic variation map for chicken with 2.8 million single-nucleotide polymorphisms

    We describe a genetic variation map for the chicken genome containing 2.8 million single-nucleotide polymorphisms (SNPs). This map is based on a comparison of the sequences of three domestic chicken breeds (a broiler, a layer and a Chinese silkie) with that of their wild ancestor, red jungle fowl. Subsequent experiments indicate that at least 90% of the variant sites are true SNPs, and at least 70% are common SNPs that segregate in many domestic breeds. Mean nucleotide diversity is about five SNPs per kilobase for almost every possible comparison between red jungle fowl and domestic lines, between two different domestic lines, and within domestic lines - in contrast to the notion that domestic animals are highly inbred relative to their wild ancestors. In fact, most of the SNPs originated before domestication, and there is little evidence of selective sweeps for adaptive alleles on length scales greater than 100 kilobases.
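    The per-kilobase SNP rates quoted above come down to counting mismatches between aligned sequences and scaling by length. A toy sketch only; real pipelines perform alignment first and filter indels and low-quality calls:

```python
def snp_density_per_kb(seq_a, seq_b):
    """Pairwise SNP rate between two aligned sequences of equal length:
    mismatched bases (ignoring gap characters) per kilobase."""
    assert len(seq_a) == len(seq_b), "sequences must be aligned to equal length"
    snps = sum(1 for a, b in zip(seq_a, seq_b)
               if a != b and a != '-' and b != '-')
    return 1000 * snps / len(seq_a)
```

    For example, five mismatches over a 1,000-base alignment gives 5.0 SNP/kb, the order of diversity reported between red jungle fowl and domestic lines.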

    ARGONAUTE10 and ARGONAUTE1 Regulate the Termination of Floral Stem Cells through Two MicroRNAs in Arabidopsis

    Stem cells are crucial in morphogenesis in plants and animals. Much is known about the mechanisms that maintain stem cell fates or trigger their terminal differentiation. However, little is known about how developmental time impacts stem cell fates. Using Arabidopsis floral stem cells as a model, we show that stem cells can undergo precise temporal regulation governed by mechanisms that are distinct from, but integrated with, those that specify cell fates. We show that two microRNAs, miR172 and miR165/166, through targeting APETALA2 and type III homeodomain-leucine zipper (HD-Zip) genes, respectively, regulate the temporal program of floral stem cells. In particular, we reveal a role of the type III HD-Zip genes, previously known to specify lateral organ polarity, in stem cell termination. Both reduction in HD-Zip expression by over-expression of miR165/166 and mis-expression of HD-Zip genes by rendering them resistant to miR165/166 lead to prolonged floral stem cell activity, indicating that the expression of HD-Zip genes needs to be precisely controlled to achieve floral stem cell termination. We also show that both the ubiquitously expressed ARGONAUTE1 (AGO1) gene and its homolog AGO10, which exhibits highly restricted spatial expression patterns, are required to maintain the correct temporal program of floral stem cells. We provide evidence that AGO10, like AGO1, associates with miR172 and miR165/166 in vivo and exhibits “slicer” activity in vitro. Despite the common biological functions and similar biochemical activities, AGO1 and AGO10 exert different effects on miR165/166 in vivo. This work establishes a network of microRNAs and transcription factors governing the temporal program of floral stem cells and sheds light on the relationships among different AGO genes, which tend to exist in gene families in multicellular organisms.

    The Genomes of Oryza sativa: A History of Duplications

    We report improved whole-genome shotgun sequences for the genomes of indica and japonica rice, both with multimegabase contiguity, or almost 1,000-fold improvement over the drafts of 2002. Tested against a nonredundant collection of 19,079 full-length cDNAs, 97.7% of the genes are aligned, without fragmentation, to the mapped super-scaffolds of one or the other genome. We introduce a gene identification procedure for plants that does not rely on similarity to known genes to remove erroneous predictions resulting from transposable elements. Using the available EST data to adjust for residual errors in the predictions, the estimated gene count is at least 38,000–40,000. Only 2%–3% of the genes are unique to any one subspecies, comparable to the amount of sequence that might still be missing. Despite this lack of variation in gene content, there is enormous variation in the intergenic regions. At least a quarter of the two sequences could not be aligned, and where they could be aligned, single nucleotide polymorphism (SNP) rates varied from as little as 3.0 SNP/kb in the coding regions to 27.6 SNP/kb in the transposable elements. A more inclusive new approach for analyzing duplication history is introduced here. It reveals an ancient whole-genome duplication, a recent segmental duplication on Chromosomes 11 and 12, and massive ongoing individual gene duplications. We find 18 distinct pairs of duplicated segments that cover 65.7% of the genome; 17 of these pairs date back to a common time before the divergence of the grasses. More important, ongoing individual gene duplications provide a never-ending source of raw material for gene genesis and are major contributors to the differences between members of the grass family.