Tackling Data Bias in MUSIC-AVQA: Crafting a Balanced Dataset for Unbiased Question-Answering
In recent years, there has been a growing emphasis on the intersection of
audio, vision, and text modalities, driving forward the advancements in
multimodal research. However, strong bias that exists in any modality can lead
to the model neglecting the others. Consequently, the model's ability to
effectively reason across these diverse modalities is compromised, impeding
further advancement. In this paper, we meticulously review each question type
from the original dataset, selecting those with pronounced answer biases. To
counter these biases, we gather complementary videos and questions, ensuring
that no answer has a markedly skewed distribution. In particular, for binary
questions, we strive to ensure that both answers are almost uniformly spread
within each question category. As a result, we construct a new dataset, named
MUSIC-AVQA v2.0, which is more challenging and, we believe, could better foster
progress on the AVQA task. Furthermore, we present a novel baseline model that
delves deeper into the audio-visual-text interrelation. On MUSIC-AVQA v2.0,
this model surpasses all existing benchmarks, improving accuracy by 2% and
setting a new state of the art.
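As a rough illustration of the balancing idea described in this abstract, the sketch below measures how skewed the answers are within each question category and estimates how many complementary samples a binary category would need to even out. The record layout, the category names, and the 0.6 cutoff are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict

def answer_distribution(samples):
    """Count answers per question category from (category, answer) pairs."""
    counts = defaultdict(Counter)
    for category, answer in samples:
        counts[category][answer] += 1
    return counts

def skewed_categories(counts, threshold=0.6):
    """Flag categories whose most frequent answer exceeds `threshold` (assumed cutoff)."""
    flagged = {}
    for category, answers in counts.items():
        total = sum(answers.values())
        top_answer, top_count = answers.most_common(1)[0]
        share = top_count / total
        if share > threshold:
            flagged[category] = (top_answer, share)
    return flagged

def samples_needed_to_balance(answers):
    """For a binary category, number of minority-answer samples needed for an even split."""
    ordered = answers.most_common()
    majority = ordered[0][1]
    minority = ordered[1][1] if len(ordered) > 1 else 0
    return majority - minority

# Hypothetical toy data: (question category, answer)
samples = ([("existential", "yes")] * 8 + [("existential", "no")] * 2
           + [("location", "yes")] * 5 + [("location", "no")] * 5)
counts = answer_distribution(samples)
flagged = skewed_categories(counts)
needed = samples_needed_to_balance(counts["existential"])
```

In this toy data only the "existential" category is flagged, and collecting six more "no" samples would bring it to an even split.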
SimpleNet: A Simple Network for Image Anomaly Detection and Localization
We propose a simple and application-friendly network (called SimpleNet) for
detecting and localizing anomalies. SimpleNet consists of four components: (1)
a pre-trained Feature Extractor that generates local features, (2) a shallow
Feature Adapter that transfers local features towards the target domain, (3) a
simple Anomaly Feature Generator that counterfeits anomaly features by adding
Gaussian noise to normal features, and (4) a binary Anomaly Discriminator that
distinguishes anomaly features from normal features. During inference, the
Anomaly Feature Generator would be discarded. Our approach is based on three
intuitions. First, transforming pre-trained features to target-oriented
features helps avoid domain bias. Second, generating synthetic anomalies in
feature space is more effective, as defects may not have much commonality in
the image space. Third, a simple discriminator is much more efficient and practical.
Despite its simplicity, SimpleNet outperforms previous methods quantitatively
and qualitatively. On the MVTec AD benchmark, SimpleNet achieves an anomaly
detection AUROC of 99.6%, reducing the error by 55.5% compared to the next best
performing model. Furthermore, SimpleNet is faster than existing methods, with
a high frame rate of 77 FPS on a 3080ti GPU. Additionally, SimpleNet
demonstrates significant improvements in performance on the One-Class Novelty
Detection task. Code: https://github.com/DonaldRR/SimpleNet.
Comment: Accepted to CVPR 202
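The feature-space anomaly generation described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the features are synthetic, the function names are hypothetical, and the discriminator is swapped for a one-dimensional logistic model on squared distance to the mean normal feature, which stands in for the paper's binary Anomaly Discriminator.

```python
import math
import random

def counterfeit_anomalies(normal_features, sigma, rng):
    """Create pseudo-anomalous features by adding Gaussian noise to normal ones."""
    return [[x + rng.gauss(0.0, sigma) for x in feat] for feat in normal_features]

def sq_dist(feat, center):
    """Squared Euclidean distance between a feature vector and a center."""
    return sum((x - c) ** 2 for x, c in zip(feat, center))

def train_discriminator(normal, anomalous, epochs=50, lr=0.05):
    """Fit a logistic model on squared distance to the normal mean (assumed stand-in)."""
    dim = len(normal[0])
    center = [sum(f[i] for f in normal) / len(normal) for i in range(dim)]
    data = [(sq_dist(f, center), 0.0) for f in normal]
    data += [(sq_dist(f, center), 1.0) for f in anomalous]
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for d, label in data:
            p = 1.0 / (1.0 + math.exp(-(w * d + b)))
            grad = p - label  # gradient of the logistic loss w.r.t. the logit
            w -= lr * grad * d
            b -= lr * grad

    def score(feat):
        """Anomaly score in (0, 1); higher means more anomalous."""
        return 1.0 / (1.0 + math.exp(-(w * sq_dist(feat, center) + b)))

    return score

rng = random.Random(0)
# Synthetic "normal" features clustered near 1.0 (hypothetical data)
normal = [[1.0 + rng.gauss(0.0, 0.05) for _ in range(8)] for _ in range(50)]
anomalous = counterfeit_anomalies(normal, sigma=1.0, rng=rng)
score = train_discriminator(normal, anomalous)
```

At inference only `score` is used, which mirrors the abstract's point that the Anomaly Feature Generator is needed only during training.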
Spin attributes of structured vector fields constructed by Hertz potentials
In this paper, we use the Hertz vector potential to define the
electromagnetic vector of different structured wavefields, and analyze the spin
properties of the wavefields. We show that for single evanescent waves, the
total spin is provided by the transverse spin and originates from the spatial
inhomogeneity of the momentum density of the field. However, for non-single
evanescent waves, there may be a part of the extraordinary spin component sE,
and the direction of sE is also perpendicular to the wave propagation
direction. In other words, it is transverse, but it does not originate from the
curl of the wave field momentum density. In addition, we also calculate the
spins of non-planar propagating waves, and analyze the spin characteristics of
these wave fields.
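For orientation, the quantities named in this abstract can be written in their standard textbook form. The expressions below assume a source-free vacuum, an electric Hertz vector, and a monochromatic field of angular frequency ω; the paper's own conventions and prefactors may differ.

```latex
% Fields derived from an electric Hertz vector \mathbf{\Pi}_e (source-free vacuum)
\mathbf{E} = \nabla(\nabla\cdot\mathbf{\Pi}_e)
           - \mu_0\varepsilon_0\,\frac{\partial^2 \mathbf{\Pi}_e}{\partial t^2},
\qquad
\mathbf{H} = \varepsilon_0\,\nabla\times\frac{\partial \mathbf{\Pi}_e}{\partial t}

% Time-averaged spin density and kinetic momentum density of a monochromatic field
\mathbf{s} = \frac{1}{4\omega}\,\mathrm{Im}\!\left(
      \varepsilon_0\,\mathbf{E}^{*}\times\mathbf{E}
    + \mu_0\,\mathbf{H}^{*}\times\mathbf{H}\right),
\qquad
\mathbf{p} = \frac{1}{2c^{2}}\,\mathrm{Re}\!\left(\mathbf{E}^{*}\times\mathbf{H}\right)
```

In this notation, the abstract's statement about single evanescent waves says that the spin density is fixed by the curl of the momentum density, while the extraordinary component sE is a transverse contribution that is not.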
Tuberous Sclerosis complex protein 2-independent activation of mTORC1 by human cytomegalovirus pUL38
The mammalian target of rapamycin complex 1 (mTORC1) controls cell growth and anabolic metabolism and is a critical host factor activated by human cytomegalovirus (HCMV) for successful infection. The multifunctional HCMV protein pUL38 previously has been reported to activate mTORC1 by binding to and antagonizing tuberous sclerosis complex protein 2 (TSC2) (N. J. Moorman et al., Cell Host Microbe 3:253–262, 2008, http://dx.doi.org/10.1016/j.chom.2008.03.002). pUL38 also plays a role in blocking endoplasmic reticulum stress-induced cell death during HCMV infection. In this study, we showed that a mutant pUL38 lacking the N-terminal 24 amino acids (pHA-UL38(25–331)) was fully functional in suppressing cell death during infection. Interestingly, pHA-UL38(25–331) lost the ability to interact with TSC2 but retained the ability to activate mTORC1, although to a lesser extent than full-length pHA-UL38. Recombinant virus expressing pHA-UL38(25–331) replicated with ∼10-fold less efficiency than the wild-type virus at a low multiplicity of infection (MOI), but it grew similarly well at a high MOI, suggesting an MOI-dependent importance of pUL38-TSC2 interaction in supporting virus propagation. Site-directed mutational analysis identified a TQ motif at amino acid residues 23 and 24 as critical for pUL38 interaction with TSC2. Importantly, when expressed in isolation, the TQ/AA substitution mutant pHA-UL38 TQ/AA was capable of activating mTORC1 just like pHA-UL38(25–331). We also created TSC2-null U373-MG cell lines by CRISPR genome editing and showed that pUL38 was capable of further increasing mTORC1 activity in TSC2-null cells. Therefore, this study identified the residues important for pUL38-TSC2 interaction and demonstrated that pUL38 can activate mTORC1 in both TSC2-dependent and -independent manners.
IMPORTANCE HCMV, like other viruses, depends exclusively on its host cell to propagate. Therefore, it has developed methods to protect against host stress responses and to usurp cellular processes to complete its life cycle. mTORC1 is believed to be important for virus replication, and HCMV maintains high mTORC1 activity despite the stressful cellular environment associated with infection. mTORC1 inhibitors suppressed HCMV replication in vitro and reduced the incidence of HCMV reactivation in transplant recipients. We demonstrated that mTORC1 was activated by HCMV protein pUL38 in both TSC2-dependent and TSC2-independent manners. The pUL38-independent mode of mTORC1 activation also has been reported. These novel findings suggest the evolution of sophisticated approaches whereby HCMV activates mTORC1, indicating its importance in the biology and pathogenesis of HCMV.
Percutaneous Nephrolithotomy under Local Infiltration Anesthesia in Kneeling Prone Position for a Patient with Spinal Deformity
Urolithiasis, a common condition in patients with spinal deformity, poses a challenge to surgical procedures and anesthetic management. A 51-year-old Chinese male presented with bilateral complex renal calculi. He was also affected by severe kyphosis deformity and spinal stiffness due to ankylosing spondylitis. Dr. Li performed the percutaneous nephrolithotomy under local infiltration anesthesia with the patient in a kneeling prone position, achieving satisfactory stone clearance with no severe complications. We found this protocol safe and effective for managing kidney stones in patients with spinal deformity. Local infiltration anesthesia may benefit patients for whom epidural anesthesia and intubation anesthesia are difficult.
Effect of Football Shoe Collar Type on Ankle Biomechanics and Dynamic Stability During Anterior and Lateral Single-Leg Jump Landings
In this study, we investigated the effects of football shoes with different collar heights on ankle biomechanics and dynamic postural stability. Fifteen healthy college football players performed anterior and lateral single-leg jump landings when wearing high collar, elastic collar, or low collar football shoes. The kinematics of lower limbs and ground reaction forces were collected by simultaneously using a stereo-photogrammetric system with markers (Vicon) and a force plate (Kistler). During the anterior single-leg jump landing, a high collar shoe resulted in a significantly smaller ankle dorsiflexion range of motion (ROM), compared to both elastic (p = 0.031, dz = 0.511) and low collar (p = 0.043, dz = 0.446) types, while also presenting lower total ankle sagittal ROM, compared to the low collar type (p = 0.023, dz = 0.756). Ankle joint stiffness was significantly greater for the high collar, compared to the elastic collar (p = 0.003, dz = 0.629) and low collar (p = 0.030, dz = 1.040). Medial-lateral stability was significantly improved with the high collar, compared to the low collar (p = 0.001, dz = 1.232). During the lateral single-leg jump landing, ankle inversion ROM (p = 0.028, dz = 0.615) and total ankle frontal ROM (p = 0.019, dz = 0.873) were significantly smaller for the high collar, compared to the elastic collar. The high collar also resulted in a significantly smaller total ankle sagittal ROM, compared to the low collar (p = 0.001, dz = 0.634). Therefore, the high collar shoe should be effective in decreasing the amount of ROM and increasing the dynamic stability, leading to high ankle joint stiffness due to differences in design and material characteristics of the collar types.
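The dz values reported in this abstract are within-subject effect sizes. Assuming they follow the usual definition of Cohen's d_z (mean of the paired differences divided by the standard deviation of those differences), a minimal sketch of the calculation looks like this. The ROM numbers are invented for illustration and are not data from the study, and the abstract does not say which statistical test produced the p values.

```python
import math
import statistics

def cohens_dz(condition_a, condition_b):
    """Cohen's d_z: mean of paired differences divided by their sample standard deviation."""
    diffs = [a - b for a, b in zip(condition_a, condition_b)]
    return statistics.mean(diffs) / statistics.stdev(diffs)

def paired_t_statistic(condition_a, condition_b):
    """Paired-samples t statistic, which equals d_z * sqrt(n)."""
    n = len(condition_a)
    return cohens_dz(condition_a, condition_b) * math.sqrt(n)

# Hypothetical ROM values (degrees) for the same four players in two shoe types
low_collar = [30.0, 32.0, 35.0, 31.0]
high_collar = [29.0, 30.0, 32.0, 29.0]
dz = cohens_dz(low_collar, high_collar)
t = paired_t_statistic(low_collar, high_collar)
```

Because d_z is scaled by the variability of the differences rather than by the spread of each condition, it can be large even when the raw mean difference is small, provided the players respond consistently.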
SAM3D: Zero-Shot 3D Object Detection via Segment Anything Model
With the development of large language models, many remarkable linguistic
systems like ChatGPT have thrived and achieved astonishing success on many
tasks, showing the incredible power of foundation models. In the spirit of
unleashing the capability of foundation models on vision tasks, the Segment
Anything Model (SAM), a vision foundation model for image segmentation, has
been proposed recently and presents strong zero-shot ability on many downstream
2D tasks. However, whether SAM can be adapted to 3D vision tasks has yet to be
explored, especially 3D object detection. With this inspiration, we explore
adapting the zero-shot ability of SAM to 3D object detection in this paper. We
propose a SAM-powered BEV processing pipeline to detect objects and get
promising results on the large-scale Waymo open dataset. As an early attempt,
our method takes a step toward 3D object detection with vision foundation
models and presents the opportunity to unleash their power on 3D vision tasks.
The code is released at https://github.com/DYZhang09/SAM3D.
Comment: Technical Report.
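The abstract does not spell out the BEV processing pipeline, so the following is only a sketch of the general idea under stated assumptions: LiDAR points are rasterized into a bird's-eye-view grid, a segmentation model produces masks on that grid, and each mask is converted back to a box in metric coordinates. The function names, grid parameters, and point layout are hypothetical, and a simple intensity threshold stands in for the SAM call.

```python
def points_to_bev(points, x_range, y_range, cell_size):
    """Rasterize (x, y, intensity) points into a BEV grid, keeping the max intensity per cell."""
    cols = int((x_range[1] - x_range[0]) / cell_size)
    rows = int((y_range[1] - y_range[0]) / cell_size)
    grid = [[0.0] * cols for _ in range(rows)]
    for x, y, intensity in points:
        if not (x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]):
            continue  # drop points outside the BEV window
        col = min(int((x - x_range[0]) / cell_size), cols - 1)
        row = min(int((y - y_range[0]) / cell_size), rows - 1)
        grid[row][col] = max(grid[row][col], intensity)
    return grid

def mask_to_box(mask, x_range, y_range, cell_size):
    """Convert a binary BEV mask to an axis-aligned box (x_min, y_min, x_max, y_max) in metres."""
    cells = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not cells:
        return None
    rows = [r for r, _ in cells]
    cols = [c for _, c in cells]
    return (x_range[0] + min(cols) * cell_size,
            y_range[0] + min(rows) * cell_size,
            x_range[0] + (max(cols) + 1) * cell_size,
            y_range[0] + (max(rows) + 1) * cell_size)

# Hypothetical points: (x, y, intensity); the last one lies outside the window
points = [(0.5, 0.5, 0.2), (2.5, 1.5, 0.9), (2.7, 1.2, 0.4), (9.0, 9.0, 1.0)]
x_range, y_range, cell_size = (0.0, 4.0), (0.0, 4.0), 1.0
grid = points_to_bev(points, x_range, y_range, cell_size)
# Stand-in for a segmentation mask: threshold the BEV image (not the SAM model)
mask = [[value > 0.5 for value in row] for row in grid]
box = mask_to_box(mask, x_range, y_range, cell_size)
```

In a real pipeline the thresholding step would be replaced by masks predicted on the BEV image, and the resulting 2D boxes would still need height and orientation estimates to become full 3D detections.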