226 research outputs found

    An Analysis of Colour Metaphors of Emotions in Jimmy’s Picture Book Turn Left, Turn Right

    The analysis in this paper aims to identify colour metaphors in the artist Jimmy Liao’s picture book Turn Left, Turn Right and to use the pictures in this book to establish relations between colour and emotions. The paper selects seven pictures of indoor scenes from the book and, by observing them, illustrates how different colours can be used as metaphors to convey different emotions. The results clearly indicate that the author, Jimmy, chooses a large number of warm colours (such as yellow, red and orange) to present positive emotions (like energy, passion and strength), while cool colours (like blue and gray) convey negative feelings (such as fear of loss, sorrow and heartbreak).

    Hybrid Neural Rendering for Large-Scale Scenes with Motion Blur

    Rendering novel view images is highly desirable for many applications. Despite recent progress, it remains challenging to render high-fidelity and view-consistent novel views of large-scale scenes from in-the-wild images with inevitable artifacts (e.g., motion blur). To this end, we develop a hybrid neural rendering model that makes image-based representation and neural 3D representation join forces to render high-quality, view-consistent images. Moreover, because images captured in the wild inevitably contain artifacts such as motion blur that deteriorate the quality of rendered images, we propose strategies to simulate blur effects on the rendered images to mitigate the negative influence of blurry images, and to reduce their importance during training based on precomputed quality-aware weights. Extensive experiments on real and synthetic data demonstrate that our model surpasses state-of-the-art point-based methods for novel view synthesis. The code is available at https://daipengwa.github.io/Hybrid-Rendering-ProjectPage
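
    The abstract describes two training-time ideas: simulating blur on rendered images before comparing them to blurry captured frames, and down-weighting low-quality frames with precomputed quality-aware weights. The following is a minimal PyTorch sketch of those two ideas under stated assumptions (a fixed Gaussian blur kernel, an L1 photometric loss, per-frame quality weights in [0, 1]); it is not the authors' implementation.

    import torch
    import torch.nn.functional as F

    def gaussian_kernel(size: int = 5, sigma: float = 1.5) -> torch.Tensor:
        """Build a normalized 2D Gaussian kernel."""
        coords = torch.arange(size, dtype=torch.float32) - size // 2
        g = torch.exp(-(coords ** 2) / (2 * sigma ** 2))
        kernel = torch.outer(g, g)
        return kernel / kernel.sum()

    def blur(img: torch.Tensor, kernel: torch.Tensor) -> torch.Tensor:
        """Depthwise-convolve an image batch (B, C, H, W) with one kernel per channel."""
        c = img.shape[1]
        weight = kernel.to(img).repeat(c, 1, 1, 1)               # (C, 1, k, k)
        return F.conv2d(img, weight, padding=kernel.shape[-1] // 2, groups=c)

    def quality_weighted_loss(rendered, target, quality_weight, kernel):
        """Blur the rendering before comparing it to the (possibly blurry) captured frame,
        then scale each frame's loss by its precomputed quality weight of shape (B,)."""
        per_frame = (blur(rendered, kernel) - target).abs().mean(dim=(1, 2, 3))
        return (quality_weight * per_frame).mean()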

    MarS3D: A Plug-and-Play Motion-Aware Model for Semantic Segmentation on Multi-Scan 3D Point Clouds

    3D semantic segmentation on multi-scan large-scale point clouds plays an important role in autonomous systems. Unlike the single-scan semantic segmentation task, this task requires distinguishing the motion states of points in addition to their semantic categories. However, methods designed for single-scan segmentation perform poorly on the multi-scan task due to the lack of an effective way to integrate temporal information. We propose MarS3D, a plug-and-play motion-aware module for semantic segmentation on multi-scan 3D point clouds. This module can be flexibly combined with single-scan models to give them multi-scan perception capabilities. The model encompasses two key designs: the Cross-Frame Feature Embedding module for enriching representation learning and the Motion-Aware Feature Learning module for enhancing motion awareness. Extensive experiments show that MarS3D can improve the performance of the baseline model by a large margin. The code is available at https://github.com/CVMI-Lab/MarS3D
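
    A multi-scan input simply means several consecutive LiDAR sweeps fed to a backbone originally built for a single sweep. Purely as an illustration (not the MarS3D code), the sketch below shows one common way to prepare such an input: motion-compensate past scans into the current frame and tag every point with a frame-offset feature that a cross-frame embedding could later exploit. The function name, pose convention, and feature layout are assumptions.

    import numpy as np

    def merge_scans(scans, poses):
        """scans: list of (N_i, 3) point arrays, newest first.
        poses: list of 4x4 transforms mapping each scan into the current frame."""
        merged, time_tags = [], []
        for t, (pts, pose) in enumerate(zip(scans, poses)):
            homo = np.concatenate([pts, np.ones((len(pts), 1))], axis=1)   # (N_i, 4)
            aligned = (homo @ pose.T)[:, :3]                # motion-compensate into current frame
            merged.append(aligned)
            time_tags.append(np.full((len(pts), 1), float(t)))  # 0 = current, 1 = previous, ...
        points = np.concatenate(merged, axis=0)
        tags = np.concatenate(time_tags, axis=0)
        return np.concatenate([points, tags], axis=1)        # (sum N_i, 4): xyz + frame offset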

    Adaptive Sparse and Monotonic Attention for Transformer-based Automatic Speech Recognition

    The Transformer architecture, based on self-attention and multi-head attention, has achieved remarkable success in offline end-to-end Automatic Speech Recognition (ASR). However, self-attention and multi-head attention cannot be easily applied to streaming or online ASR. For self-attention in Transformer ASR, the softmax normalization in the attention mechanism makes it impossible to highlight important speech information. For multi-head attention in Transformer ASR, it is not easy to model monotonic alignments in different heads. To overcome these two limitations, we integrate sparse attention and monotonic attention into Transformer-based ASR. The sparse mechanism introduces a learned sparsity scheme that enables each self-attention structure to fit the corresponding head better. The monotonic attention deploys regularization to prune redundant heads in the multi-head attention structure. Experiments show that our method can effectively improve the attention mechanism on widely used speech recognition benchmarks.
    Comment: Accepted to DSAA 202
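
    The paper's sparse mechanism is a learned sparsity scheme; as a simpler stand-in, the sketch below uses the fixed sparsemax transform (Martins & Astudillo, 2016), which, unlike softmax, can assign exactly zero weight to uninformative positions, and shows how such a normalizer can replace softmax inside scaled dot-product attention. Shapes and variable names are illustrative; this is not the authors' implementation.

    import torch

    def sparsemax(logits: torch.Tensor, dim: int = -1) -> torch.Tensor:
        """Sparsemax: project logits onto the probability simplex, allowing exact zeros."""
        z, _ = torch.sort(logits, dim=dim, descending=True)
        cumsum = z.cumsum(dim)
        k = torch.arange(1, logits.size(dim) + 1, device=logits.device, dtype=logits.dtype)
        shape = [1] * logits.dim()
        shape[dim] = -1
        k = k.view(shape)
        support = (1 + k * z) > cumsum                        # positions kept in the support
        k_z = support.sum(dim=dim, keepdim=True).clamp(min=1)
        tau = (cumsum.gather(dim, k_z - 1) - 1) / k_z         # threshold so weights sum to 1
        return torch.clamp(logits - tau, min=0.0)

    # Example: replace softmax in scaled dot-product attention.
    q = torch.randn(2, 4, 10, 64)       # (batch, heads, queries, d_k)
    k_mat = torch.randn(2, 4, 20, 64)   # (batch, heads, keys, d_k)
    v = torch.randn(2, 4, 20, 64)
    scores = q @ k_mat.transpose(-2, -1) / 64 ** 0.5
    attn = sparsemax(scores, dim=-1)    # many key weights become exactly zero
    context = attn @ v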

    Speech2Lip: High-fidelity Speech to Lip Generation by Learning from a Short Video

    Synthesizing realistic videos according to a given speech is still an open challenge. Previous works have been plagued by issues such as inaccurate lip shape generation and poor image quality. The key reason is that only motions and appearances on limited facial areas (e.g., the lip area) are mainly driven by the input speech. Therefore, directly learning a mapping function from speech to the entire head image is prone to ambiguity, particularly when using a short video for training. We thus propose a decomposition-synthesis-composition framework named Speech to Lip (Speech2Lip) that disentangles speech-sensitive and speech-insensitive motion/appearance to facilitate effective learning from limited training data, resulting in the generation of natural-looking videos. First, given a fixed head pose (i.e., canonical space), we present a speech-driven implicit model for lip image generation which concentrates on learning speech-sensitive motion and appearance. Next, to model the major speech-insensitive motion (i.e., head movement), we introduce a geometry-aware mutual explicit mapping (GAMEM) module that establishes geometric mappings between different head poses. This allows us to paste generated lip images at the canonical space onto head images with arbitrary poses and synthesize talking videos with natural head movements. In addition, a Blend-Net and a contrastive sync loss are introduced to enhance the overall synthesis performance. Quantitative and qualitative results on three benchmarks demonstrate that our model can be trained on a video of just a few minutes in length and achieve state-of-the-art performance in both visual quality and speech-visual synchronization. Code: https://github.com/CVMI-Lab/Speech2Lip
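
    The composition step described above pastes lip images generated in the canonical space onto head images with arbitrary poses (via GAMEM and a Blend-Net in the paper). Purely as an illustration of that kind of compositing, and not the Speech2Lip code, the sketch below warps a canonical-space lip crop into a posed frame with a single homography and alpha-blends it inside a mouth mask; the function name, mask format, and use of a homography are assumptions.

    import cv2
    import numpy as np

    def composite_lip(head_frame, canonical_lip, mouth_mask, homography):
        """head_frame: (H, W, 3) uint8 posed frame; canonical_lip: (h, w, 3) uint8 lip image;
        mouth_mask: (h, w) uint8 mask in canonical space;
        homography: 3x3 matrix mapping canonical lip coordinates into the head frame."""
        h, w = head_frame.shape[:2]
        warped_lip = cv2.warpPerspective(canonical_lip, homography, (w, h))
        warped_mask = cv2.warpPerspective(mouth_mask, homography, (w, h))
        alpha = (warped_mask.astype(np.float32) / 255.0)[..., None]        # (H, W, 1)
        out = alpha * warped_lip.astype(np.float32) + (1 - alpha) * head_frame.astype(np.float32)
        return out.astype(np.uint8)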

    Early patterning of cloned mouse embryos contributes to post-implantation development

    Several research groups have suggested that the embryonic–abembryonic (Em–Ab) axis in the mouse can be predicted by the first cleavage plane of the early embryo. It is currently not known whether this early patterning occurs in cloned embryos produced by nuclear transfer or whether it affects development to term. In this work, the relationship between the first cleavage plane and the Em–Ab axis was determined by labeling one blastomere of cloned mouse embryos at the 2-cell stage, followed by ex-vivo tracking until the blastocyst stage. The results demonstrate that approximately half of the cloned blastocysts had an Em–Ab axis perpendicular to the initial cleavage plane of the 2-cell stage; these embryos were classified as “orthogonal” and the remainder as “deviant”. Additionally, we report that cloned embryos were orthogonal significantly more often than their naturally fertilized counterparts and overexpressed Sox2. Orthogonal cloned embryos demonstrated a higher rate of post-implantation embryonic development than deviant embryos, although not all cloned pups survived. These results reveal that the angular relationship between the Em–Ab axis and the first cleavage plane can influence later development, and they support the hypothesis that proper early patterning of mammalian embryos is required after nuclear transfer.