Towards Consistent Video Editing with Text-to-Image Diffusion Models

Author: Guo Tiande
Han Congying
Li Bonan
Liu Luoqi
Nie Xuecheng
Zhang Zicheng
Publication date: 27/05/2023

Existing works have advanced Text-to-Image (TTI) diffusion models for video editing in a one-shot learning manner. Despite their low data and computation requirements, these methods may produce results with unsatisfactory consistency with the text prompt and across the temporal sequence, limiting their real-world applications. In this paper, we propose to address these issues with a novel EI$^2$ model towards Enhancing vIdeo Editing consIstency of TTI-based frameworks. Specifically, we analyze and find that the inconsistency is caused by the modules newly added to TTI models for learning temporal information. These modules lead to covariate shift in the feature space, which harms the editing capability. Thus, we design EI$^2$ to tackle the above drawbacks with two classical modules: Shift-restricted Temporal Attention Module (STAM) and Fine-coarse Frame Attention Module (FFAM). First, through theoretical analysis, we demonstrate that covariate shift is highly related to Layer Normalization, so STAM replaces it with an Instance Centering layer to preserve the distribution of temporal features. In addition, STAM employs an attention layer with normalized mapping to transform temporal features while constraining the variance shift. As the second part, we combine STAM with a novel FFAM, which efficiently leverages fine-coarse spatial information of overall frames to further enhance temporal consistency. Extensive experiments demonstrate the superiority of the proposed EI$^2$ model for text-driven video editing.

arXiv.org e-Print Archive

DiffBFR: Bootstrapping Diffusion Model Towards Blind Face Restoration

Author: Guo Tiande
Han Congying
Li Bonan
Nie Xuecheng
Qiu Xinmin
Zhang Zicheng
Publication date: 08/08/2023

Blind face restoration (BFR) is important but challenging. Prior works prefer GAN-based frameworks for this task because of their balance of quality and efficiency. However, these methods suffer from poor stability and poor adaptability to long-tail distributions, failing to simultaneously retain source identity and restore detail. We propose DiffBFR, which introduces the Diffusion Probabilistic Model (DPM) to BFR to tackle the above problem, given its superiority over GANs in avoiding training collapse and generating long-tail distributions. DiffBFR uses a two-step design that first restores identity information from low-quality (LQ) images and then enhances texture details according to the distribution of real faces. This design is implemented with two key components: 1) Identity Restoration Module (IRM) for preserving the face details in results. Instead of denoising from a pure Gaussian random distribution with LQ images as the condition during the reverse process, we propose a novel truncated sampling method which starts from LQ images with partial noise added. We theoretically prove that this change shrinks the evidence lower bound of DPM and then restores more original details. Building on this theoretical proof, two cascaded conditional DPMs with different input sizes are introduced to strengthen this sampling effect and to reduce the difficulty of training to generate high-resolution images directly. 2) Texture Enhancement Module (TEM) for polishing the texture of the image. Here an unconditional DPM, an LQ-free model, is introduced to further force the restorations to appear realistic. We theoretically prove that this unconditional DPM, trained on pure high-quality (HQ) images, contributes to justifying the correct distribution of inference images output from IRM in pixel-level space. Truncated sampling with a fractional time step is used to polish pixel-level textures while preserving identity information.
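The starting point of the truncated sampling described above can be sketched with the standard DDPM forward-noising formula. This is a minimal illustration, not the authors' implementation: the function name `truncated_start`, the linear beta schedule, and the choice of step `t0` are assumptions made here, and the reverse (denoising) network is omitted.

```python
import numpy as np

def truncated_start(x_lq, t0, betas, rng):
    """Noise an LQ image up to step t0 (1-indexed) of a DDPM forward process.

    Reverse sampling would then begin from this partially noised image
    instead of from pure Gaussian noise at the final step.
    """
    # Cumulative product of (1 - beta) gives the signal-retention factor.
    alpha_bar = np.cumprod(1.0 - betas)[t0 - 1]
    noise = rng.standard_normal(x_lq.shape)
    x_t0 = np.sqrt(alpha_bar) * x_lq + np.sqrt(1.0 - alpha_bar) * noise
    return x_t0, alpha_bar

# Hypothetical schedule and image, for illustration only.
betas = np.linspace(1e-4, 0.02, 1000)
rng = np.random.default_rng(0)
x_lq = np.ones((8, 8))
x_start, a_bar = truncated_start(x_lq, t0=300, betas=betas, rng=rng)
```

A smaller `t0` keeps more of the LQ image (and hence more identity information); a larger `t0` moves the starting point closer to pure noise.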

arXiv.org e-Print Archive

DropKey

Author: Guo Tiande
Han Congying
Hu Yinhan
Jiang Xiangjian
Li Bonan
Liu Luoqi
Nie Xuecheng
Publication date: 11/04/2023

In this paper, we focus on analyzing and improving the dropout technique for self-attention layers of Vision Transformer (ViT), which is important yet surprisingly ignored by prior works. In particular, we investigate three core questions. First, what to drop in self-attention layers? Unlike dropping attention weights as in the literature, we propose to move the dropout operation ahead of the attention matrix calculation and set the Key as the dropout unit, yielding a novel dropout-before-softmax scheme. We theoretically verify that this scheme helps keep both the regularization and the probability features of attention weights, alleviating overfitting to specific patterns and helping the model capture vital information globally. Second, how to schedule the drop ratio in consecutive layers? Rather than using a constant drop ratio for all layers, we present a new decreasing schedule that gradually decreases the drop ratio along the stack of self-attention layers. We experimentally validate that the proposed schedule can avoid overfitting in low-level features and missing high-level semantics, thus improving the robustness and stability of model training. Third, is a structured dropout operation needed, as in CNNs? We try a patch-based block version of the dropout operation and find that this useful trick for CNNs is not essential for ViT. Given the exploration of these three questions, we present the novel DropKey method, which regards the Key as the drop unit and uses a decreasing schedule for the drop ratio, improving ViTs in a general way. Comprehensive experiments demonstrate the effectiveness of DropKey for various ViT architectures, e.g., T2T and VOLO, as well as for various vision tasks, e.g., image classification, object detection, human-object interaction detection and human body shape recovery. Comment: Accepted by CVPR202
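The dropout-before-softmax idea and the decreasing schedule can be sketched in NumPy as follows. This is a hedged illustration under stated assumptions, not the authors' code: the function names, the large negative masking constant, and the linear form of the schedule are choices made here (the abstract only says the ratio decreases along the stack).

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def dropkey_attention(q, k, v, drop_ratio, rng, training=True):
    """Single-head attention with random Key masking applied before softmax."""
    d = q.shape[-1]
    logits = q @ k.swapaxes(-1, -2) / np.sqrt(d)
    if training and drop_ratio > 0.0:
        # Mask logits rather than weights, so softmax still returns
        # a proper probability distribution over the remaining keys.
        drop = rng.random(logits.shape) < drop_ratio
        logits = np.where(drop, -1e9, logits)
    weights = softmax(logits)
    return weights @ v, weights

def decreasing_drop_ratios(num_layers, start, end=0.0):
    """Assumed linear schedule: highest ratio in the first layer, lowest in the last."""
    return np.linspace(start, end, num_layers)
```

Because the mask is applied to the logits, each row of `weights` still sums to 1, which is the "probability feature" the abstract refers to; dropping attention weights after softmax would break that property unless the weights were renormalized.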

arXiv.org e-Print Archive

Modeling Single-Phase Inverter and Its Decentralized Coordinated Control by Using Feedback Linearization

Author: Bonan Huang
Dazhong Ma
Qiuye Sun
Renke Han
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2014

Operating a microgrid reasonably and stably is a crucial problem. Starting from the nonlinear mathematical model of the inverter established in this paper, the input-output feedback linearization method is used to transform that nonlinear model into a linear tracking synchronization and consensus regulation control problem. Based on the linear model and a multiagent consensus algorithm, a decentralized coordinated controller is proposed that drives the amplitudes and angles of the inverter voltages to consensus and shares active and reactive power in the desired ratio. The proposed control is fully distributed because each inverter requires only local information and that of one neighbor, over a sparse communication structure based on a multiagent system. The hybrid consensus algorithm keeps the amplitudes of the output voltages following the leader and brings the angles of the output voltages to consensus. The microgrid can then be operated more efficiently, and the circulating current between DGs can be effectively suppressed. The effectiveness of the proposed method is demonstrated through simulation results of a typical microgrid system.

Crossref

Directory of Open Access Journals

Discovery of multi-anion antiperovskites X6NFSn2 (X = Ca, Sr) as promising thermoelectric materials by computational screening

Author: Bein Thomas
Cai Zenghua
Ebert Hubert
Han Dan
Rudel Stefan S.
Scanlon David O.
Schnick Wolfgang
Spooner Kieran B.
Zhu Bonan
Publication date: 03/01/2024

The thermoelectric performance of existing perovskites lags far behind that of state-of-the-art thermoelectric materials such as SnSe. Despite halide perovskites showing promising thermoelectric properties, namely, high Seebeck coefficients and ultralow thermal conductivities, their thermoelectric performance is significantly restricted by low electrical conductivities. Here, we explore new multi-anion antiperovskites X6NFSn2 (X = Ca, Sr, and Ba) via B-site anion mutation in antiperovskite and global structure searches and demonstrate their phase stability by first-principles calculations. Ca6NFSn2 and Sr6NFSn2 exhibit decent Seebeck coefficients and ultralow lattice thermal conductivities (<1 W m−1 K−1). Notably, Ca6NFSn2 and Sr6NFSn2 show remarkably larger electrical conductivities compared to the halide perovskite CsSnI3. The combined superior electrical and thermal properties of Ca6NFSn2 and Sr6NFSn2 lead to high thermoelectric figures of merit (ZTs) of ∼1.9 and ∼2.3 at high temperatures. Our exploration of multi-anion antiperovskites X6NFSn2 (X = Ca, Sr) realizes the “phonon-glass, electron-crystal” concept within the antiperovskite structure
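For readers unfamiliar with the figure of merit mentioned above, the standard definition is ZT = S²σT/κ, where S is the Seebeck coefficient, σ the electrical conductivity, T the absolute temperature and κ the total thermal conductivity (lattice plus electronic). A small sketch follows; the function name and the input values are hypothetical and are not taken from the paper.

```python
def figure_of_merit(seebeck_v_per_k, sigma_s_per_m, kappa_w_per_m_k, temperature_k):
    """Dimensionless thermoelectric figure of merit ZT = S^2 * sigma * T / kappa."""
    return seebeck_v_per_k ** 2 * sigma_s_per_m * temperature_k / kappa_w_per_m_k

# Illustrative numbers only (not results from the paper):
# S = 200 microvolts/K, sigma = 1e5 S/m, kappa = 1 W/(m K), T = 500 K.
zt_example = figure_of_merit(200e-6, 1e5, 1.0, 500.0)
```

The formula shows why low electrical conductivity limits halide perovskites even when S is high and κ is low: ZT scales linearly with σ.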

University of Birmingham Research Portal

Bessel terahertz pulses from superluminal laser plasma filaments

Author: Chen Yanping
Han Bonan
He Feng
Sheng Zhengming
Wang Linzheng
Xia Tianhao
Zhang Jiayang
Zhang Jie
Zhang Zhelin
Publication venue: 'American Association for the Advancement of Science (AAAS)'
Publication date: 02/03/2022

Terahertz radiation with a Bessel beam profile is demonstrated experimentally from a two-color laser filament in air, which is induced by tailored femtosecond laser pulses with an axicon. The temporal and spatial distributions of Bessel rings of the terahertz radiation are retrieved after being collected in the far field. A theoretical model is proposed, which suggests that such Bessel terahertz pulses are produced due to the combined effects of the inhomogeneous superluminal filament structure and the phase change of the two-color laser components inside the plasma channel. These two effects lead to wavefront crossover and constructive/destructive interference of terahertz radiation from different plasma sources along the laser filament, respectively. Compared with other methods, our technique can support the generation of Bessel pulses with broad spectral bandwidth. Such Bessel pulses can propagate to the far field without significant spatial spreading, which shall provide new opportunities for terahertz applications

University of Strathclyde Institutional Repository

Combining sap flow measurements and modelling to assess water needs in an oasis farmland shelterbelt of Populus simonii Carr in Northwest China

Author: Anderegg
Anderegg
Bernier
Bonan
Breda
Brito
Campbell
Cermak
Chang
Chen
Chen
Chen
Chen
Dong
Fan
Fernandez
Gharun
Granier
Granier
Granier
Han
Han
Jarvis
Jarvis
Jonard
Jung
Kang
Komatsu
Kumagai
Kume
Lagergren
Li
Li
Lin Sun
Lu
Monteith
Naithani
Ryszkowski
Shan
She
Shen
Shi
Shuai Fu
Sommer
Stewart
Wang
Whitley
Whitley
Xinli
Xinli
Xiubin
Yi Luo
Yunusa
Zeppel
Zhao
Zhou
Zweifel
Publication venue: 'Elsevier BV'

Crossref

Retrieval of seasonal Rubisco-limited photosynthetic capacity at global FLUXNET sites from hyperspectral satellite remote sensing: Impact on carbon modelling

Author: Abramowitz
Alton
Alton
Alton
Alton
Alton
Baker
Ball
Bauerle
Beerling
Bonan
Boyd
Campbell
Carswell
Chen
Clark
Collatz
Combal
Curran
Dang
Dash
Dash
De Kauwe
Doughty
Ellsworth
Evans
Falge
Fang
Farquhar
Friend
Friend
Grassi
Gurevitch
Han
Heinsch
Hikosaka
Houborg
IPCC
Kattge
Law
Lewis
Los
Mauseth
McCallum
Medlyn
Medvigy
Meir
Melaas
Middleton
Misson
Monteith
Nolan
Ollinger
Paul B. Alton
Reichstein
Richardson
Ripullone
Running
Ryu
Sato
Schulze
Sellers
Serbin
Serbin
Shabanov
Sheffield
Singsaas
Smith
Vina
Vuolo
Walker
Wang
Warren
Williams
Wilson
Wilson
Wright
Wullschleger
Xu
Yang
Yang
Yuan
Zaehle
Zhang
Zhao
Publication venue: 'Elsevier BV'
Publication date: 17/08/2016

Crossref

Cronfa at Swansea University

Disaggregation of SMOS soil moisture over West Africa using the Temperature and Vegetation Dryness Index based on SEVIRI land surface parameters

Author: Alcaraz-Segura
Anderson
Bach
Bonan
C.E. Bulgin
Carlson
Carlson
Collow
Crago
D. Ghent
de Beurs
de Robert
de Tomás
Djamai
Dorigo
Dorigo
Eastman
ECMWF
FEWSNET
FEWSNET
G. Mendiguren González
Garcia
García-Haro
Gillies
GLOBTEMP
H. Nieto
Han
Hassan
Hirsch
ISMN
Jiang
Jimenez-Munoz
Kerr
Kerr
Kustas
Leroux
Li
Li
Long
Louvet
Malbéteau
Mallick
Merlin
Merlin
Merlin
Minacapilli
Moran
Moran
Nieto
Noilhan
Panciera
Patel
Peng
Piles
Piles
Piles
Price
Proud
Proud
R. Fensholt
Rasmussen
S. Horion
Sadeghi
Sandholt
Schmugge
Sobrino
Stisen
Stisen
Sun
T. Tagesson
Tagesson
Tagesson
Tang
Tang
Tang
Trigo
Trigo
V. Zaldo Fornies
Vanbelle
Vinukollu
Wan
Wang
Wang
Wang
Wigneron
Publication venue: 'Elsevier BV'
Publication date: 01/01/2018

The overarching objective of this study was to produce a disaggregated SMOS Soil Moisture (SM) product using land surface parameters from a geostationary satellite in a region covering a diverse range of ecosystem types. SEVIRI data at 15 minute temporal resolution were used to derive the Temperature and Vegetation Dryness Index (TVDI), which served as the SM proxy within the disaggregation process. West Africa (3 N, 26 W; 28 N, 26 E) was selected as a case study as it presents both an important North-South climate gradient and a diverse range of ecosystem types. The main challenge was to set up a methodology applicable over a large area that overcomes the constraints of SMOS (low spatial resolution) and TVDI (which requires similar atmospheric forcing and a triangular shape formed when plotting morning rise temperature versus fraction of vegetation cover) in order to produce a 0.05 degree resolution disaggregated SMOS SM product at sub-continental scale. Consistent cloud cover appeared as one of the main constraints for deriving TVDI, especially during the rainy season and in the southern parts of the region, and a large adjustment window (105x105 SEVIRI pixels) was therefore deemed necessary. Both the original and the disaggregated SMOS SM products described well the seasonal dynamics observed at six locations of in situ observations. However, there was an overestimation in both products for sites in the humid southern regions, most likely caused by the presence of forest. Both TVDI and the associated disaggregated SM product were found to be highly sensitive to algorithm input parameters, especially under conditions of high fraction of vegetation cover. Additionally, seasonal dynamics in TVDI did not follow the seasonal patterns of SM. Still, its spatial heterogeneity was found to be a good proxy for disaggregating SMOS SM data; main river networks and spatial patterns of SM extremes (i.e. droughts and floods) not seen in the original SMOS SM product were revealed in the disaggregated SM product for a test case of July-September 2012. The disaggregation methodology thereby successfully increased the spatial resolution of SMOS SM, with potential application for local drought/flood monitoring of importance for the livelihood of the population of West Africa.
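The triangle-based index can be sketched with the commonly used TVDI definition, in which a pixel's temperature is scaled between a wet edge and a dry edge that varies linearly with vegetation cover. This is a minimal sketch under assumptions: the abstract does not give the exact formulation used in the study, and the edge parameters below are hypothetical. Here the temperature variable stands for the morning rise temperature and the vegetation variable for the fraction of vegetation cover.

```python
import numpy as np

def tvdi(temp, fvc, dry_intercept, dry_slope, wet_edge):
    """TVDI = (T - T_wet) / (T_dry(fvc) - T_wet), with a linear dry edge.

    Values near 1 indicate dry conditions, values near 0 wet conditions.
    """
    temp = np.asarray(temp, dtype=float)
    fvc = np.asarray(fvc, dtype=float)
    dry_edge = dry_intercept + dry_slope * fvc  # upper side of the triangle
    return (temp - wet_edge) / (dry_edge - wet_edge)

# Hypothetical edge parameters for one adjustment window.
values = tvdi(temp=[30.0, 17.5, 10.0], fvc=[0.0, 0.5, 0.8],
              dry_intercept=30.0, dry_slope=-10.0, wet_edge=10.0)
```

In practice the dry and wet edges would be fitted per adjustment window from the scatter of pixels, which is why the abstract stresses the need for similar atmospheric forcing and a well-formed triangle within each window.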

Central Archive at the University of Reading

Lund University Publications

Crossref

IRTA Pubpro

Copenhagen University Research Information System

Online Research Database In Technology