160 research outputs found

    Deep Multimodal Speaker Naming

    Automatic speaker naming is the problem of localizing and identifying each speaking character in a TV, movie, or live-show video. The problem is challenging mainly because of its multimodal nature: face cues alone are insufficient to achieve good performance. Previous multimodal approaches usually process the data of each modality individually and merge the results using handcrafted heuristics. Such approaches work well for simple scenes but fail to achieve high performance for speakers with large appearance variations. In this paper, we propose a novel convolutional neural network (CNN) based learning framework that automatically learns the fusion function of face and audio cues. We show that, without using face tracking, facial landmark localization, or subtitles/transcripts, our system with robust multimodal feature extraction achieves state-of-the-art speaker naming performance on two diverse TV series. The dataset and the implementation of our algorithm are publicly available online.

    DynaST: Dynamic Sparse Transformer for Exemplar-Guided Image Generation

    One key challenge of exemplar-guided image generation lies in establishing fine-grained correspondences between input and guided images. Prior approaches, despite promising results, have relied either on estimating dense attention to compute per-point matching, which is limited to coarse scales by its quadratic memory cost, or on fixing the number of correspondences to achieve linear complexity, which lacks flexibility. In this paper, we propose a dynamic sparse attention based Transformer model, termed Dynamic Sparse Transformer (DynaST), to achieve fine-level matching with favorable efficiency. The heart of our approach is a novel dynamic-attention unit, dedicated to covering the variation in the optimal number of tokens each position should focus on. Specifically, DynaST leverages the multi-layer nature of the Transformer structure and performs the dynamic attention scheme in a cascaded manner to refine matching results and synthesize visually pleasing outputs. In addition, we introduce a unified training objective for DynaST, making it a versatile reference-based image translation framework for both supervised and unsupervised scenarios. Extensive experiments on three applications (pose-guided person image generation, edge-based face synthesis, and undistorted image style transfer) demonstrate that DynaST achieves superior performance in local details, outperforming the state of the art while significantly reducing the computational cost. Our code is available at https://github.com/Huage001/DynaST. Comment: ECCV 202

    Towards Efficient SDRTV-to-HDRTV by Learning from Image Formation

    Modern displays are capable of rendering video content with high dynamic range (HDR) and wide color gamut (WCG). However, the majority of available resources are still in standard dynamic range (SDR). As a result, there is significant value in transforming existing SDR content into the HDRTV standard. In this paper, we define and analyze the SDRTV-to-HDRTV task by modeling the formation of SDRTV/HDRTV content. Our analysis and observations indicate that a naive end-to-end supervised training pipeline suffers from severe gamut transition errors. To address this issue, we propose a novel three-step solution pipeline called HDRTVNet++, which includes adaptive global color mapping, local enhancement, and highlight refinement. The adaptive global color mapping step uses global statistics as guidance to perform image-adaptive color mapping. A local enhancement network is then deployed to enhance local details. Finally, we combine the two sub-networks above as a generator and achieve highlight consistency through GAN-based joint training. Our method is primarily designed for ultra-high-definition TV content and is therefore effective and lightweight for processing 4K-resolution images. We also construct a dataset from HDR videos in the HDR10 standard, named HDRTV1K, that contains 1235 training images and 117 testing images, all in 4K resolution. In addition, we select five metrics to evaluate the results of SDRTV-to-HDRTV algorithms. Our final results demonstrate state-of-the-art performance both quantitatively and visually. The code, model, and dataset are available at https://github.com/xiaom233/HDRTVNet-plus. Comment: Extended version of HDRTVNe

    Allylic oxidation of olefins with a manganese-based metal-organic framework

    Selective oxidation of olefins to α,β-unsaturated ketones under mild reaction conditions has attracted considerable interest, since α,β-unsaturated ketones can serve as synthetic precursors for various downstream chemical products. The major challenges inherent to this oxidation are chemo- and regioselectivity, as well as environmental and practical concerns, namely catalyst recycling, safety, and cost. Using atmospheric oxygen as an environmentally friendly oxidant, we found that a metal-organic framework (MOF) constructed from Mn and a tetrazolate ligand (CPF-5) showed good activity and selectivity for the allylic oxidation of olefins to α,β-unsaturated ketones. Under optimized conditions, we achieved 98% conversion of cyclohexene and 87% selectivity toward cyclohexanone. The combination of a substoichiometric amount of TBHP (tert-butyl hydroperoxide) and oxygen not only provides a cost-effective oxidation system but also significantly enhances the selectivity to α,β-unsaturated ketones, outperforming most reported oxidation methods. The catalytic system is heterogeneous in nature, and CPF-5 could be reused at least five times without a significant decrease in its catalytic activity or selectivity.

    Statistical Optimization of Operational Parameters for Enhanced Naphthalene Degradation by Photocatalyst

    The operational parameters for enhanced naphthalene degradation by a TiO2/Fe3O4-SiO2 (TFS) photocatalyst were optimized using statistical experimental design and analysis. The central composite design method of response surface methodology (RSM) was adopted to find the optimum values of the selected factors for maximum naphthalene degradation. Experimental results showed that irradiation time, pH, and TFS photocatalyst loading had a significant influence on naphthalene degradation, and a maximum degradation rate of 97.39% was predicted at an irradiation time of 97.1 min, pH 2.1, and a catalyst loading of 0.962 g/L. The results were further verified by repeated experiments under the optimal conditions. The excellent correlation between predicted and measured values confirmed the validity and practicability of this statistical optimization strategy.

    Superconductivity in a new layered cobalt oxychalcogenide Na$_6$Co$_3$Se$_6$O$_3$ with a 3$d^5$ triangular lattice

    Unconventional superconductivity in bulk materials under ambient pressure is extremely rare among the 3$d$ transition-metal compounds outside the layered cuprates and the iron-based family. It is predominantly linked to highly anisotropic electronic properties and quasi-two-dimensional (2D) Fermi surfaces. To date, the only known example of a Co-based exotic superconductor has been the hydrated layered cobaltate, Na$_x$CoO$_2\cdot y$H$_2$O, whose superconductivity is realized in the vicinity of a spin-1/2 Mott state. However, the nature of the superconductivity in these materials is still an active subject of debate, and finding a new class of superconductors will therefore help unravel the mysteries of their unconventional superconductivity. Here we report the discovery of unconventional superconductivity at $\sim$6.3 K in our newly synthesized layered compound Na$_6$Co$_3$Se$_6$O$_3$, in which the edge-shared CoSe$_6$ octahedra form [CoSe$_2$] layers with a perfect triangular lattice of Co ions. It is the first 3$d$ transition-metal oxychalcogenide superconductor with distinct structural and chemical characteristics. Despite its relatively low $T_c$, the material exhibits extremely high superconducting upper critical fields, $\mu_0 H_{c2}(0)$, which exceed the Pauli paramagnetic limit by a factor of 3 to 4. First-principles calculations show that Na$_6$Co$_3$Se$_6$O$_3$ is a rare example of a negative charge-transfer superconductor. This new cobalt oxychalcogenide, with geometrical frustration among the Co spins, shows great potential as a candidate for the realization of high-$T_c$ and/or unconventional superconductivity beyond the well-established Cu- and Fe-based superconductor families, and opens a new field in the physics and chemistry of low-dimensional superconductors.

    Transcriptome analysis of the hepatopancreas from the Litopenaeus vannamei infected with different flagellum types of Vibrio alginolyticus strains

    Vibrio alginolyticus, one of the most prevalent harmful Vibrio species found in the ocean, causes significant economic damage to the shrimp farming industry. Its flagellum serves as a crucial virulence factor in the invasion of host organisms. However, the processes of bacterial flagella recognition and activation of the downstream immune system in shrimp remain unclear. To clarify these processes, a ΔflhG strain was created by in-frame deletion of the flhG gene in V. alginolyticus strain HN08155. We then used transcriptome analysis to examine the different immune responses in the Litopenaeus vannamei hepatopancreas after infection with the wild-type and mutant strains. The results showed that the ΔflhG strain, unlike the wild type, lost its ability to negatively regulate flagella number and displayed multiple flagella. RNA-seq revealed the upregulation of several immune-related genes in the shrimp hepatopancreas after infection with the hyperflagellated strain. Notably, two C-type lectins (CTLs), namely galactose-specific lectin nattectin and macrophage mannose receptor 1, and the TNF receptor-associated factor (TRAF) 6 gene were significantly upregulated. These findings suggest that C-type lectins are potentially involved in flagella recognition in shrimp and that the immune system is activated through the TRAF6 pathway after flagella detection by CTLs.