157 research outputs found

    Tackling Data Bias in MUSIC-AVQA: Crafting a Balanced Dataset for Unbiased Question-Answering

    In recent years, there has been a growing emphasis on the intersection of audio, vision, and text modalities, driving advances in multimodal research. However, a strong bias in any one modality can lead the model to neglect the others. Consequently, the model's ability to reason effectively across these diverse modalities is compromised, impeding further advancement. In this paper, we meticulously review each question type from the original dataset, selecting those with pronounced answer biases. To counter these biases, we gather complementary videos and questions, ensuring that no answer has a markedly skewed distribution. In particular, for binary questions, we strive to ensure that both answers are almost uniformly spread within each question category. As a result, we construct a new dataset, named MUSIC-AVQA v2.0, which is more challenging and which we believe could better foster progress on the AVQA task. Furthermore, we present a novel baseline model that delves deeper into the audio-visual-text interrelation. On MUSIC-AVQA v2.0, this model surpasses all existing benchmarks, improving accuracy by 2% and setting a new state of the art.
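The balancing criterion described in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the category names, the sample answers, and the 0.6 threshold below are hypothetical values chosen only to show how a skewed answer distribution inside a question category might be flagged.

```python
from collections import Counter, defaultdict


def answer_skew(samples):
    """Map each question category to the share of its most frequent answer."""
    by_category = defaultdict(Counter)
    for category, answer in samples:
        by_category[category][answer] += 1
    return {
        category: max(counts.values()) / sum(counts.values())
        for category, counts in by_category.items()
    }


# Hypothetical (category, answer) pairs; not taken from MUSIC-AVQA.
samples = [
    ("existential", "yes"),
    ("existential", "yes"),
    ("existential", "yes"),
    ("existential", "no"),
    ("counting", "two"),
    ("counting", "three"),
]

skew = answer_skew(samples)

# Assumed threshold: flag a category when one answer exceeds 60% of its samples.
biased = [category for category, share in skew.items() if share > 0.6]
```

Under these assumptions only the "existential" category is flagged, since "yes" accounts for three of its four answers; a balanced binary category would sit close to a share of 0.5.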

    SimpleNet: A Simple Network for Image Anomaly Detection and Localization

    We propose a simple and application-friendly network (called SimpleNet) for detecting and localizing anomalies. SimpleNet consists of four components: (1) a pre-trained Feature Extractor that generates local features, (2) a shallow Feature Adapter that transfers local features towards the target domain, (3) a simple Anomaly Feature Generator that counterfeits anomaly features by adding Gaussian noise to normal features, and (4) a binary Anomaly Discriminator that distinguishes anomaly features from normal features. During inference, the Anomaly Feature Generator is discarded. Our approach is based on three intuitions. First, transforming pre-trained features to target-oriented features helps avoid domain bias. Second, generating synthetic anomalies in feature space is more effective, as defects may not have much commonality in the image space. Third, a simple discriminator is much more efficient and practical. Despite its simplicity, SimpleNet outperforms previous methods quantitatively and qualitatively. On the MVTec AD benchmark, SimpleNet achieves an anomaly detection AUROC of 99.6%, reducing the error by 55.5% compared to the next best performing model. Furthermore, SimpleNet is faster than existing methods, with a high frame rate of 77 FPS on a 3080ti GPU. Additionally, SimpleNet demonstrates significant improvements in performance on the One-Class Novelty Detection task. Code: https://github.com/DonaldRR/SimpleNet. Comment: Accepted to CVPR 202
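The Anomaly Feature Generator step (component 3) can be sketched in a few lines. This is an illustration under stated assumptions, not the released SimpleNet code: the function name, the noise scale `sigma`, the toy feature vectors, and the 0/1 label convention are all hypothetical.

```python
import random


def make_pseudo_anomalies(normal_features, sigma=0.015, seed=0):
    """Return one noisy copy of each feature vector (Gaussian noise, std sigma)."""
    rng = random.Random(seed)
    return [
        [value + rng.gauss(0.0, sigma) for value in vector]
        for vector in normal_features
    ]


# Hypothetical adapted local features: 3 patches with 4 dimensions each.
normal = [
    [0.1, 0.2, 0.3, 0.4],
    [0.0, 0.5, 0.1, 0.9],
    [0.7, 0.7, 0.2, 0.3],
]
anomalous = make_pseudo_anomalies(normal)

# Training pairs for a binary discriminator.
# Assumed label convention: 0 = normal, 1 = pseudo-anomalous.
training_set = [(v, 0) for v in normal] + [(v, 1) for v in anomalous]
```

Because the noise is only needed to produce negative examples for the discriminator, a generator like this has no role once training is done, which matches the abstract's remark that it is dropped at inference.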

    Spin attributes of structured vector fields constructed by Hertz potentials

    In this paper, we use the Hertz vector potential to define the electromagnetic vector of different structured wavefields, and analyze the spin properties of the wavefields. We show that for single evanescent waves, the total spin is provided by the transverse spin and originates from the spatial inhomogeneity of the momentum density of the field. However, for non-single evanescent waves, there may be an extraordinary spin component sE, whose direction is also perpendicular to the wave propagation direction. In other words, it is transverse, but it does not originate from the curl of the wave field momentum density. In addition, we calculate the spins of non-planar propagating waves and analyze the spin characteristics of these wave fields.

    Tuberous Sclerosis complex protein 2-independent activation of mTORC1 by human cytomegalovirus pUL38

    The mammalian target of rapamycin complex 1 (mTORC1) controls cell growth and anabolic metabolism and is a critical host factor activated by human cytomegalovirus (HCMV) for successful infection. The multifunctional HCMV protein pUL38 previously has been reported to activate mTORC1 by binding to and antagonizing tuberous sclerosis complex protein 2 (TSC2) (J. N. Moorman et al., Cell Host Microbe 3:253–262, 2008, http://dx.doi.org/10.1016/j.chom.2008.03.002). pUL38 also plays a role in blocking endoplasmic reticulum stress-induced cell death during HCMV infection. In this study, we showed that a mutant pUL38 lacking the N-terminal 24 amino acids (pHA-UL38(25–331)) was fully functional in suppressing cell death during infection. Interestingly, pHA-UL38(25–331) lost the ability to interact with TSC2 but retained the ability to activate mTORC1, although to a lesser extent than full-length pHA-UL38. Recombinant virus expressing pHA-UL38(25–331) replicated with ∼10-fold less efficiency than the wild-type virus at a low multiplicity of infection (MOI), but it grew similarly well at a high MOI, suggesting an MOI-dependent importance of pUL38-TSC2 interaction in supporting virus propagation. Site-directed mutational analysis identified a TQ motif at amino acid residues 23 and 24 as critical for pUL38 interaction with TSC2. Importantly, when expressed in isolation, the TQ/AA substitution mutant pHA-UL38 TQ/AA was capable of activating mTORC1 just like pHA-UL38(25–331). We also created TSC2-null U373-MG cell lines by CRISPR genome editing and showed that pUL38 was capable of further increasing mTORC1 activity in TSC2-null cells. Therefore, this study identified the residues important for pUL38-TSC2 interaction and demonstrated that pUL38 can activate mTORC1 in both TSC2-dependent and -independent manners. IMPORTANCE HCMV, like other viruses, depends exclusively on its host cell to propagate. 
    Therefore, it has developed methods to protect against host stress responses and to usurp cellular processes to complete its life cycle. mTORC1 is believed to be important for virus replication, and HCMV maintains high mTORC1 activity despite the stressful cellular environment associated with infection. mTORC1 inhibitors suppressed HCMV replication in vitro and reduced the incidence of HCMV reactivation in transplant recipients. We demonstrated that mTORC1 was activated by HCMV protein pUL38 in both TSC2-dependent and TSC2-independent manners. The pUL38-independent mode of mTORC1 activation also has been reported. These novel findings suggest the evolution of sophisticated approaches whereby HCMV activates mTORC1, indicating its importance in the biology and pathogenesis of HCMV.

    Percutaneous Nephrolithotomy under Local Infiltration Anesthesia in Kneeling Prone Position for a Patient with Spinal Deformity

    Urolithiasis, a common condition in patients with spinal deformity, poses a challenge to surgical procedures and anesthetic management. A 51-year-old Chinese male presented with bilateral complex renal calculi. He was also affected by severe kyphotic deformity and spinal stiffness due to ankylosing spondylitis. Dr. Li performed the percutaneous nephrolithotomy under local infiltration anesthesia with the patient in a kneeling prone position, achieving satisfactory stone clearance with no severe complications. We found this protocol safe and effective for managing kidney stones in patients with spinal deformity. Local infiltration anesthesia may benefit patients for whom epidural anesthesia and intubation anesthesia are difficult.

    Parameter Stability Region Analysis of Islanded Microgrid Based on Bifurcation Theory


    Effect of Football Shoe Collar Type on Ankle Biomechanics and Dynamic Stability During Anterior and Lateral Single-Leg Jump Landings

    In this study, we investigated the effects of football shoes with different collar heights on ankle biomechanics and dynamic postural stability. Fifteen healthy college football players performed anterior and lateral single-leg jump landings when wearing high collar, elastic collar, or low collar football shoes. The kinematics of lower limbs and ground reaction forces were collected by simultaneously using a stereo-photogrammetric system with markers (Vicon) and a force plate (Kistler). During the anterior single-leg jump landing, a high collar shoe resulted in a significantly smaller ankle dorsiflexion range of motion (ROM), compared to both elastic (p = 0.031, dz = 0.511) and low collar (p = 0.043, dz = 0.446) types, while also presenting lower total ankle sagittal ROM, compared to the low collar type (p = 0.023, dz = 0.756). Ankle joint stiffness was significantly greater for the high collar, compared to the elastic collar (p = 0.003, dz = 0.629) and low collar (p = 0.030, dz = 1.040). Medial-lateral stability was significantly improved with the high collar, compared to the low collar (p = 0.001, dz = 1.232). During the lateral single-leg jump landing, ankle inversion ROM (p = 0.028, dz = 0.615) and total ankle frontal ROM (p = 0.019, dz = 0.873) were significantly smaller for the high collar, compared to the elastic collar. The high collar also resulted in a significantly smaller total ankle sagittal ROM, compared to the low collar (p = 0.001, dz = 0.634). Therefore, the high collar shoe should be effective in decreasing the amount of ROM and increasing the dynamic stability, leading to high ankle joint stiffness due to differences in design and material characteristics of the collar types

    SAM3D: Zero-Shot 3D Object Detection via Segment Anything Model

    With the development of large language models, many remarkable linguistic systems like ChatGPT have thrived and achieved astonishing success on many tasks, showing the incredible power of foundation models. In the spirit of unleashing the capability of foundation models on vision tasks, the Segment Anything Model (SAM), a vision foundation model for image segmentation, has been proposed recently and presents strong zero-shot ability on many downstream 2D tasks. However, whether SAM can be adapted to 3D vision tasks, especially 3D object detection, has yet to be explored. With this inspiration, we explore adapting the zero-shot ability of SAM to 3D object detection in this paper. We propose a SAM-powered BEV processing pipeline to detect objects and get promising results on the large-scale Waymo open dataset. As an early attempt, our method takes a step toward 3D object detection with vision foundation models and presents the opportunity to unleash their power on 3D vision tasks. The code is released at https://github.com/DYZhang09/SAM3D. Comment: Technical Report.