196 research outputs found

MVControl: Adding Conditional Control to Multi-view Diffusion for Controllable Text-to-3D Generation

Author: Chen Yiming
Li Zhiqi
Liu Peidong
Zhao Lingzhe
Publication date: 27/11/2023

We introduce MVControl, a novel neural network architecture that enhances existing pre-trained multi-view 2D diffusion models by incorporating additional input conditions, e.g. edge maps. Our approach enables the generation of controllable multi-view images and view-consistent 3D content. To achieve controllable multi-view image generation, we leverage MVDream as our base model and train a new neural network module as an additional plugin for end-to-end task-specific condition learning. To precisely control the shapes and views of generated images, we propose a new conditioning mechanism that predicts an embedding encapsulating the input spatial and view conditions, which is then injected into the network globally. Once MVControl is trained, score-distillation (SDS) loss-based optimization can be performed to generate 3D content, for which we propose to use a hybrid diffusion prior. The hybrid prior relies on a pre-trained Stable-Diffusion network and our trained MVControl for additional guidance. Extensive experiments demonstrate that our method achieves robust generalization and enables the controllable generation of high-quality 3D content. Code available at https://github.com/WU-CVGL/MVControl/. Comment: Project page: https://lizhiqi49.github.io/MVControl

arXiv.org e-Print Archive

BALF: Simple and Efficient Blur Aware Local Feature Detector

Author: Chen Ben M.
Liu Peidong
Zhai Yu
Zhao Zhenjun
Publication date: 29/11/2022

Local feature detection is a key ingredient of many image processing and computer vision applications, such as visual odometry and localization. Most existing algorithms focus on feature detection from a sharp image. Their performance therefore degrades once the image is blurred, which can easily happen under low-lighting conditions. To address this issue, we propose a simple yet efficient and effective keypoint detection method that is able to accurately localize the salient keypoints in a blurred image. Our method takes advantage of a novel multi-layer perceptron (MLP) based architecture that significantly improves the detection repeatability for a blurred image. The network is also lightweight and able to run in real time, which enables its deployment for time-constrained applications. Extensive experimental results demonstrate that our detector is able to improve the detection repeatability with blurred images, while keeping performance comparable to existing state-of-the-art detectors for sharp images.

arXiv.org e-Print Archive

Synthesis of Silver Nanowires with Reduced Diameters Using Benzoin-Derived Radicals to Make Transparent Conductors with High Transparency and Low Haze.

Author: Chen Hong
Cui Fan
Dehestani Ahmad
Kuttner Elisabeth
Niu Zhiqiang
Schierle-Arndt Kerstin
Sun Yuchun
Xie Chenlu
Yang Peidong
Publication venue: eScholarship, University of California
Publication date: 01/08/2018

Reducing the diameter of silver nanowires has been proven to be an effective way to improve their optoelectronic performance by lessening light attenuation. State-of-the-art silver nanowires are typically around 20 nm in diameter. Herein we report a modified polyol synthesis of silver nanowires with average diameters as thin as 13 nm and aspect ratios up to 3000. The success of this synthesis is based on the employment of benzoin-derived radicals in the polyol approach and does not require high-pressure conditions. The strong reducing power of radicals allows the reduction of silver precursors to occur at relatively low temperatures, wherein the lateral growth of silver nanowires is restrained because of efficient surface passivation. The as-prepared 13 nm silver nanowires show a sheet resistance of 28 Ω sq⁻¹ at a transmittance of 95% with a haze factor of ∼1.2%, comparable to that of commercial indium tin oxide (ITO).

eScholarship - University of California

Token-Level Serialized Output Training for Joint Streaming ASR and ST Leveraging Textual Alignments

Author: Chen Junkun
Gaur Yashesh
Li Jinyu
Papi Sara
Wang Peidong
Xue Jian
Publication date: 06/07/2023

In real-world applications, users often require both translations and transcriptions of speech to enhance their comprehension, particularly in streaming scenarios where incremental generation is necessary. This paper introduces a streaming Transformer-Transducer that jointly generates automatic speech recognition (ASR) and speech translation (ST) outputs using a single decoder. To produce ASR and ST content effectively with minimal latency, we propose a joint token-level serialized output training method that interleaves source and target words by leveraging an off-the-shelf textual aligner. Experiments in monolingual (it-en) and multilingual ({de,es,it}-en) settings demonstrate that our approach achieves the best quality-latency balance. With an average ASR latency of 1s and ST latency of 1.3s, our model shows no degradation or even improves output quality compared to separate ASR and ST models, yielding an average improvement of 1.1 WER and 0.4 BLEU in the multilingual case.

arXiv.org e-Print Archive
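
The abstract above describes interleaving source (ASR) and target (ST) words using a textual alignment. The following is a minimal sketch of one way such an interleaved training target could be built, not the authors' implementation. The function name `interleave`, the `"asr"`/`"st"` tags, the alignment format (pairs of source and target word indices) and the rule "emit a target word once every source word it is aligned to has been emitted, keeping target order" are all assumptions made for illustration.

```python
def interleave(src_words, tgt_words, alignment):
    """Build one serialized sequence of tagged source and target words.

    alignment is an iterable of (src_index, tgt_index) pairs, as a textual
    aligner might produce. This emission rule is an illustrative assumption.
    """
    # Latest source position each target word is aligned to.
    last_src = {}
    for s, t in alignment:
        last_src[t] = max(last_src.get(t, -1), s)

    # Make the "ready" positions non-decreasing so target word order is kept.
    ready = []
    current = -1
    for t in range(len(tgt_words)):
        current = max(current, last_src.get(t, current))
        ready.append(current)

    serialized = []
    t = 0
    for s, word in enumerate(src_words):
        serialized.append(("asr", word))
        # Emit every target word whose aligned source words are all out.
        while t < len(tgt_words) and ready[t] <= s:
            serialized.append(("st", tgt_words[t]))
            t += 1
    # Safeguard: flush any target words left over.
    serialized.extend(("st", w) for w in tgt_words[t:])
    return serialized


# Toy Italian-to-English example with a reordering (made-up alignment).
src = ["il", "gatto", "nero"]
tgt = ["the", "black", "cat"]
align = [(0, 0), (1, 2), (2, 1)]
result = interleave(src, tgt, align)
print(result)
```

In this toy case "black" has to wait for "nero", so "cat" waits as well to preserve target order; this shows how reordering between languages delays the translation stream relative to the transcription.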

Leveraging Timestamp Information for Serialized Joint Streaming Recognition and Translation

Author: Chen Junkun
Gaur Yashesh
Kanda Naoyuki
Li Jinyu
Papi Sara
Wang Peidong
Xue Jian
Publication date: 23/10/2023

The growing need for instant spoken language transcription and translation is driven by increased global communication and cross-lingual interactions. This has made offering translations in multiple languages essential for user applications. Traditional approaches to automatic speech recognition (ASR) and speech translation (ST) have often relied on separate systems, leading to inefficiencies in computational resources and increased synchronization complexity in real time. In this paper, we propose a streaming Transformer-Transducer (T-T) model able to jointly produce many-to-one and one-to-many transcription and translation using a single decoder. We introduce a novel method for joint token-level serialized output training based on timestamp information to effectively produce ASR and ST outputs in the streaming setting. Experiments on {it,es,de}->en prove the effectiveness of our approach, enabling the generation of one-to-many joint outputs with a single decoder for the first time. Comment: © 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other work.

arXiv.org e-Print Archive

Structural and spectral dynamics of single-crystalline Ruddlesden-Popper phase halide perovskite blue light-emitting diodes.

Author: Chen Hong
Jin Jianbo
Kang Joohoon
Kang Jun
Kong Qiao
Lai Minliang
Lin Jia
Lin Zhenni
Lu Dylan
Quan Li Na
Toney Michael F
Wang Lin-Wang
Yang Peidong
Publication venue: eScholarship, University of California
Publication date: 01/01/2020

Achieving perovskite-based high-color-purity blue-emitting light-emitting diodes (LEDs) is still challenging. Here, we report successful synthesis of a series of blue-emissive two-dimensional Ruddlesden-Popper phase single crystals and their high-color-purity blue-emitting LED demonstrations. Although this approach successfully achieves a series of bandgap emissions based on the different layer thicknesses, it still suffers from a conventional temperature-induced device degradation mechanism during high-voltage operations. To understand the underlying mechanism, we further elucidate temperature-induced device degradation by investigating the crystal structural and spectral evolution dynamics via in situ temperature-dependent single-crystal X-ray diffraction, photoluminescence (PL) characterization, and density functional theory calculation. The PL peak becomes asymmetrically broadened with a marked intensity decay as temperature increases, owing to [PbBr₆]⁴⁻ octahedra tilting and the organic chain disordering, which results in bandgap decrease. This study indicates that careful heat management under LED operation is a key factor to maintain the sharp and intense emission.

IBS Publications Repository

eScholarship - University of California

Distributed topology identification algorithm of distribution network based on neighboring interaction

Author: Huazhen Cao
Lingxue Lin
Peidong Chen
Xuan He
Yifei Yang
Publication venue: Frontiers Media SA
Publication date: 01/01/2023

Intelligent distributed control and protection is a promising route towards flexible and safe operation of distribution networks with widespread access of distributed energy resources. A fundamental premise of distributed decision-making is that each smart terminal can identify the topological structure of the feeder and track its changes. This paper proposes a distributed topology identification algorithm with high fault tolerance based on peer-to-peer communication. The smart terminal units (STUs) installed on the nodes can dynamically track and identify the network topology through local measurement and information exchange with neighboring STUs. The proposed algorithm combines local measurement mutual check with contralateral connectivity predictive correction, and significantly improves the tolerance of measurement errors in topology identification. Test examples are presented to verify the effectiveness of the method.

Directory of Open Access Journals

Assessment of multi-source observation merged 1 km-grid precipitation product during the disastrous rainstorms in Guangdong

Author: Chunyan ZHANG
Gang XIANG
Peidong WANG
Shangyou XIE
Sixiao YANG
Yanping ZHENG
Yizhi CHEN
Publication venue: Editorial Office of Torrential Rain and Disasters
Publication date: 01/12/2023

This paper aims to assess the latest 1 km-grid Analysis Real Time (ART_1 km) precipitation product developed by the National Meteorological Information Center of China Meteorological Administration (CMA), which can provide great support for disaster weather monitoring and warning, intelligent grid forecasting and weather services. Observed precipitation data from independent stations (including non-uploaded regional meteorological stations and hydrometric stations) that were not integrated into the ART_1 km precipitation product, as well as precipitation classification inspection, are used to assess the quality of this product in twenty disastrous rainstorm cases from May to August of 2019-2022 in Guangdong. The results show that the ART_1 km precipitation product successfully reproduces the precipitation location, strength, and trends in these cases, with the best performance in the Pearl River Delta, the east of eastern Guangdong, and the north of northern Guangdong. The stronger the precipitation, the greater the correlation as well as the root mean square error (RMSE) and mean error (ME) between the ART_1 km precipitation and the observed precipitation. When the hourly precipitation is not classified, about 60% of these independent stations present a correlation coefficient ≥ 0.8, more than 90% of the stations present an RMSE within the range of [1.0, 5.0) mm, and more than 60% of the stations present an ME within ±0.1 mm. When the hourly precipitation is < 5 mm, most of the stations have a correlation coefficient < 0.5, an RMSE within the range of [1.0, 5.0) mm, and an ME within [0.0, 0.5] mm. When the hourly precipitation is ≥ 20 mm, 42% to 56% of the stations have a correlation coefficient ≥ 0.5, and most of the stations have an RMSE ≥ 10 mm and an ME < 0 mm; when the hourly precipitation is ≥ 50 mm, most of the stations have an ME < -10 mm.
Overall, ART_1 km precipitation is usually underestimated at the independent stations, and integrating observations from more sites into producing ART_1 km precipitation is helpful to improve the quality of the products.

Directory of Open Access Journals
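
The ART_1 km abstract above reports three per-station statistics: correlation coefficient, RMSE and ME. The sketch below shows how these are commonly computed for paired product and station values. It assumes the Pearson correlation and the sign convention ME = product minus observation (so a negative ME means underestimation, consistent with the abstract's conclusion); the function names and the sample values are made up for illustration and are not taken from the paper.

```python
import math


def mean_error(product, observed):
    """Mean error (bias): average of product minus observation."""
    return sum(p - o for p, o in zip(product, observed)) / len(observed)


def rmse(product, observed):
    """Root mean square error between product and observation."""
    squared = sum((p - o) ** 2 for p, o in zip(product, observed))
    return math.sqrt(squared / len(observed))


def pearson_r(product, observed):
    """Pearson correlation coefficient (assumes neither series is constant)."""
    n = len(observed)
    mean_p = sum(product) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(product, observed))
    var_p = sum((p - mean_p) ** 2 for p in product)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov / math.sqrt(var_p * var_o)


# Made-up hourly precipitation (mm) at one hypothetical station.
observed = [0.0, 2.0, 10.0, 25.0, 60.0]
product = [0.0, 2.5, 9.0, 20.0, 45.0]

me = mean_error(product, observed)
err = rmse(product, observed)
r = pearson_r(product, observed)
print(f"ME={me:.2f} mm, RMSE={err:.2f} mm, r={r:.3f}")
```

The toy series also illustrates why a high correlation can coexist with a large negative ME: the product follows the shape of the observed rainfall but falls short at the heaviest hours.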