
6,015 research outputs found

SCANet: A Self- and Cross-Attention Network for Audio-Visual Speech Separation

Authors: Hu Xiaolin, Li Kai, Yang Runxuan
Publication date: 16/08/2023

The integration of different modalities, such as audio and visual information, plays a crucial role in human perception of the surrounding environment. Recent research has made significant progress in designing fusion modules for audio-visual speech separation. However, existing methods predominantly place multi-modal fusion at either the top or the bottom of the network, rather than considering fusion comprehensively at multiple hierarchical positions. In this paper, we propose a novel model called self- and cross-attention network (SCANet), which leverages the attention mechanism for efficient audio-visual feature fusion. SCANet consists of two types of attention blocks: self-attention (SA) and cross-attention (CA) blocks, where the CA blocks are distributed at the top (TCA), middle (MCA) and bottom (BCA) of SCANet. These blocks maintain the ability to learn modality-specific features and enable the extraction of different semantics from audio-visual features. Comprehensive experiments on three standard audio-visual separation benchmarks (LRS2, LRS3, and VoxCeleb2) demonstrate the effectiveness of SCANet, which outperforms existing state-of-the-art (SOTA) methods while maintaining comparable inference time. Comment: 14 pages, 3 figures
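The cross-attention idea named in the abstract can be illustrated with a minimal sketch. This is not SCANet's implementation: it is plain scaled dot-product attention with no learned projections, and the function names and toy feature values are assumptions made for illustration. Audio frames act as queries and video frames supply the keys and values, so each audio frame is re-expressed as a weighted mix of visual features.

```python
import math

def softmax(row):
    # Numerically stable softmax over a list of scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    # Scaled dot-product attention: queries come from one modality,
    # keys and values from the other (hypothetical helper, no learned weights).
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        fused = [sum(w * v[j] for w, v in zip(weights, values))
                 for j in range(len(values[0]))]
        out.append(fused)
    return out

# Toy features (assumed): 3 audio frames and 2 video frames, dimension 2.
audio = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
visual = [[1.0, 0.0], [0.0, 1.0]]

# Audio attends to video; one fused vector is produced per audio frame.
fused = cross_attention(audio, visual, visual)
```

Calling the same function with queries, keys, and values all taken from one modality gives the self-attention case.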

arXiv.org e-Print Archive

Endocytic recycling and vesicular transport systems mediate transcytosis of Leptospira interrogans across cell monolayer.

Authors: Fang Jia-Qi, Hu Wei-Lin, Li Kai-Xuan, Li Shi-Jun, Li Yang, Lin Xu'ai, Ojcius David M, Yan Jie
Publication venue: eScholarship, University of California
Publication date: 01/04/2019

Many bacterial pathogens can cause septicemia and spread from the bloodstream into internal organs. During leptospirosis, individuals are infected through contact with water contaminated by Leptospira-containing animal urine. The spirochetes invade internal organs after septicemia and aggravate the disease, but the mechanism of leptospiral excretion and spreading remains unknown. Here, we demonstrated that Leptospira interrogans entered human/mouse endothelial and epithelial cells and fibroblasts by caveolae/integrin-β1-PI3K/FAK-mediated microfilament-dependent endocytosis to form Leptospira (Lep)-vesicles that did not fuse with lysosomes. Lep-vesicles recruited Rab5/Rab11 and Sec/Exo-SNARE proteins in endocytic recycling and vesicular transport systems for intracellular transport and release by SNARE-complex/FAK-mediated microfilament/microtubule-dependent exocytosis. Both intracellular leptospires and infected cells maintained their viability. Leptospiral propagation was only observed in mouse fibroblasts. Our study revealed that L. interrogans utilizes endocytic recycling and vesicular transport systems for transcytosis across endothelial or epithelial barriers in blood vessels or renal tubules, which contributes to spreading in vivo and transmission of leptospirosis.

eScholarship - University of California

Variability in the impacts of partisan conflict: a new perspective from bank credit

Authors: Hu Xuan, Nie Li, Shi Kai, Yan Meng
Publication venue: Taylor and Francis Group and Juraj Dobrila University of Pula, Faculty of economics and tourism Dr. Mijo Mirković
Publication date: 01/01/2023

The purpose of this article is to analyse the impact of partisan conflict on bank credit, taking the global financial crisis as the dividing point to analyse how this impact varies before and after the crisis. The article examines the impact of partisan conflict on bank credit using US data covering the past 40 years and captures the variability in the effects of partisan conflict through rolling-sample and time-varying parameter VAR analysis. The full-sample results reveal that a one-standard-deviation partisan conflict shock reduces the growth rate of bank credit to nonfinancial sectors, and that the negative effects of partisan conflict on bank credit are more substantial after the global financial crisis. The rolling-sample and time-varying parameter VAR analysis further confirms that the impact of partisan conflict shocks has varied substantially over time, with bank credit still reacting negatively to partisan conflict in recent periods. Additionally, we estimate two extended models, which support the intermediate role of economic policy uncertainty in transmitting partisan conflict and the substitution effect of cross-border bank lending on domestic bank credit. Finally, our major results are unchanged under a series of robustness checks. The conclusion of this article is that partisan conflict has a significant impact on bank credit and that this impact shows obvious variability, being more significant after the global financial crisis.

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

Learning Motor Skills of Reactive Reaching and Grasping of Objects

Authors: Hu Wenbin, Li Zhibin, Yang Chuanyu, Yuan Kai
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 28/03/2022

Reactive grasping of objects is an essential capability of autonomous robot manipulation, yet it remains challenging to learn sensorimotor control that coordinates coherent hand-finger motions and is robust against disturbances and failures. This work proposed a deep reinforcement learning based scheme to train feedback control policies that can coordinate reaching and grasping actions in the presence of uncertainties. We formulated geometric metrics and task-orientated quantities to design the reward, which enabled efficient exploration of grasping policies. Further, to improve the success rate, we deployed key initial states of difficult hand-finger poses to train policies to overcome potential failures due to challenging configurations. Extensive simulation validations and benchmarks demonstrated that the learned policy was robust in grasping both static and moving objects. Moreover, the policy recovered from failures within a short time in difficult configurations and was robust to synthetic noise in the state feedback that was unseen during training.

UCL Discovery

Edinburgh Research Explorer