SCANet: A Self- and Cross-Attention Network for Audio-Visual Speech Separation
The integration of different modalities, such as audio and visual
information, plays a crucial role in human perception of the surrounding
environment. Recent research has made significant progress in designing fusion
modules for audio-visual speech separation. However, they predominantly focus
on multi-modal fusion architectures situated either at the top or bottom
positions, rather than comprehensively considering multi-modal fusion at
various hierarchical positions within the network. In this paper, we propose a
novel model called self- and cross-attention network (SCANet), which leverages
the attention mechanism for efficient audio-visual feature fusion. SCANet
consists of two types of attention blocks: self-attention (SA) and
cross-attention (CA) blocks, where the CA blocks are distributed at the top
(TCA), middle (MCA) and bottom (BCA) of SCANet. These blocks maintain the
ability to learn modality-specific features and enable the extraction of
different semantics from audio-visual features. Comprehensive experiments on
three standard audio-visual separation benchmarks (LRS2, LRS3, and VoxCeleb2)
demonstrate the effectiveness of SCANet, outperforming existing
state-of-the-art (SOTA) methods while maintaining comparable inference time.
Comment: 14 pages, 3 figures
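The difference between the self-attention and cross-attention blocks described above can be illustrated with a minimal NumPy sketch. It is not SCANet's implementation: the learned projections, multiple heads, and the placement of blocks at the top, middle, and bottom of the network are omitted, and the feature sizes and the names `attention`, `audio`, and `video` are assumptions made for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(queries, keys, values):
    # Scaled dot-product attention without learned projections (a simplification).
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)
    weights = softmax(scores, axis=-1)
    return weights @ values, weights

rng = np.random.default_rng(0)
audio = rng.standard_normal((50, 16))  # hypothetical: 50 audio frames, 16 features
video = rng.standard_normal((25, 16))  # hypothetical: 25 video frames, 16 features

# Self-attention: queries, keys, and values all come from the same modality.
self_out, self_w = attention(audio, audio, audio)

# Cross-attention: audio frames query the visual stream, so each output row
# is a mixture of visual features weighted by audio-visual similarity.
cross_out, cross_w = attention(audio, video, video)
```

In this sketch the cross-attention output keeps the audio time axis (50 rows) while drawing its content from the visual features, which is one common way to fuse two streams of different lengths.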
Endocytic recycling and vesicular transport systems mediate transcytosis of Leptospira interrogans across cell monolayer.
Many bacterial pathogens can cause septicemia and spread from the bloodstream into internal organs. During leptospirosis, individuals are infected by contact with water contaminated by Leptospira-containing animal urine. The spirochetes invade internal organs after septicemia and aggravate the disease, but the mechanism of leptospiral excretion and spreading remains unknown. Here, we demonstrated that Leptospira interrogans entered human/mouse endothelial and epithelial cells and fibroblasts by caveolae/integrin-β1-PI3K/FAK-mediated microfilament-dependent endocytosis to form Leptospira (Lep)-vesicles that did not fuse with lysosomes. Lep-vesicles recruited Rab5/Rab11 and Sec/Exo-SNARE proteins in endocytic recycling and vesicular transport systems for intracellular transport and release by SNARE-complex/FAK-mediated microfilament/microtubule-dependent exocytosis. Both intracellular leptospires and infected cells maintained their viability. Leptospiral propagation was only observed in mouse fibroblasts. Our study revealed that L. interrogans utilizes endocytic recycling and vesicular transport systems for transcytosis across endothelial or epithelial barriers in blood vessels or renal tubules, which contributes to spreading in vivo and transmission of leptospirosis.
Variability in the impacts of partisan conflict: a new perspective from bank credit
This article analyses the impact of partisan conflict on bank credit
and uses the global financial crisis as a breakpoint to compare that
impact before and after the crisis. It examines the effect of partisan
conflict on bank credit using US data covering the past 40 years and
captures the variability of that effect through rolling-sample and
time-varying parameter VAR analysis. The full-sample results reveal
that a one-standard-deviation partisan conflict shock reduces the
growth rate of bank credit to nonfinancial sectors, and that the
negative effects of partisan conflict on bank credit are more
substantial after the global financial crisis. The rolling-sample and
time-varying parameter VAR analyses further confirm that the impact of
partisan conflict shocks has varied substantially over time, with bank
credit still reacting negatively to partisan conflict in recent
periods. Additionally, we estimate two extended models, which support
the intermediate role of economic policy uncertainty in transmitting
partisan conflict and the substitution effect of cross-border bank
lending on domestic bank credit. Finally, our main results are
unchanged across a series of robustness checks. We conclude that
partisan conflict has a significant impact on bank credit and that this
impact varies over time, being more pronounced after the global
financial crisis.
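The rolling-sample idea mentioned above can be sketched as re-estimating a VAR on a moving window and watching how the coefficients change. The example below is only an illustration under stated assumptions: it fits a first-order VAR by ordinary least squares, runs on simulated data rather than the article's series, and does not implement the time-varying parameter VAR. The function names, the lag order, the window length, and the two-variable setup (standing in for a conflict index and credit growth) are all hypothetical.

```python
import numpy as np

def fit_var1(y):
    """OLS fit of y_t = c + A @ y_{t-1} + e_t, where y has shape (T, k)."""
    X = np.hstack([np.ones((len(y) - 1, 1)), y[:-1]])  # intercept + lagged values
    Y = y[1:]
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)       # coef has shape (k + 1, k)
    c = coef[0]
    A = coef[1:].T                                     # A[j, i]: effect of lagged variable i on variable j
    return c, A

def rolling_var1(y, window):
    """Re-estimate the lag matrix A on every window of the given length."""
    return [fit_var1(y[s:s + window])[1] for s in range(len(y) - window + 1)]

# Simulated two-variable system (synthetic data, for illustration only).
rng = np.random.default_rng(1)
A_true = np.array([[0.5, 0.0],
                   [-0.3, 0.4]])   # the second variable reacts negatively to the first
T = 400
y = np.zeros((T, 2))
for t in range(1, T):
    y[t] = A_true @ y[t - 1] + rng.standard_normal(2)

_, A_full = fit_var1(y)                     # full-sample estimate
estimates = rolling_var1(y, window=120)     # one lag matrix per window
cross_effect_path = [A[1, 0] for A in estimates]
```

Plotting `cross_effect_path` against the window start date would show how the estimated response of the second variable to the first drifts across subsamples, which is the kind of variability a rolling-sample analysis is meant to expose.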
Learning Motor Skills of Reactive Reaching and Grasping of Objects
Reactive grasping of objects is an essential capability of autonomous robot manipulation, yet it remains challenging to learn sensorimotor control that coordinates coherent hand-finger motions and is robust against disturbances and failures. This work proposed a deep reinforcement learning based scheme to train feedback control policies that can coordinate reaching and grasping actions in the presence of uncertainties. We formulated geometric metrics and task-orientated quantities to design the reward, which enabled efficient exploration of grasping policies. Further, to improve the success rate, we used key initial states with difficult hand-finger poses to train policies to overcome potential failures due to challenging configurations. Extensive simulation validations and benchmarks demonstrated that the learned policy robustly grasped both static and moving objects. Moreover, the policy recovered from failures within a short time in difficult configurations and was robust to synthetic noise in the state feedback that was unseen during training.