62 research outputs found

    Rethink Cross-Modal Fusion in Weakly-Supervised Audio-Visual Video Parsing

    Existing works on weakly-supervised audio-visual video parsing adopt the hybrid attention network (HAN) as the multi-modal embedding to capture cross-modal context. HAN embeds the audio and visual modalities with a shared network, where cross-attention is performed at the input. However, such early fusion highly entangles the two non-fully correlated modalities and leads to sub-optimal performance in detecting single-modality events. To deal with this problem, we propose the messenger-guided mid-fusion transformer to reduce the uncorrelated cross-modal context in the fusion. The messengers condense the full cross-modal context into a compact representation that preserves only useful cross-modal information. Furthermore, because microphones capture audio events from all directions while cameras only record visual events within a restricted field of view, unaligned cross-modal context from audio occurs more frequently in visual event predictions. We thus propose cross-audio prediction consistency to suppress the impact of irrelevant audio information on visual event prediction. Experiments consistently illustrate the superior performance of our framework compared to existing state-of-the-art methods. Comment: WACV 202

    Generalized Few-Shot Point Cloud Segmentation Via Geometric Words

    Existing fully-supervised point cloud segmentation methods suffer in dynamic testing environments where new classes emerge. Few-shot point cloud segmentation algorithms address this problem by learning to adapt to new classes at the cost of segmentation accuracy on the base classes, which severely impedes their practicality. This motivates us to present the first attempt at a more practical paradigm, generalized few-shot point cloud segmentation, which requires the model to generalize to new categories with only a few support point clouds while retaining the capability to segment base classes. We propose geometric words to represent geometric components shared between the base and novel classes, and incorporate them into a novel geometric-aware semantic representation to facilitate better generalization to the new classes without forgetting the old ones. Moreover, we introduce geometric prototypes to guide the segmentation with geometric prior knowledge. Extensive experiments on S3DIS and ScanNet consistently illustrate the superior performance of our method over baseline methods. Our code is available at: https://github.com/Pixie8888/GFS-3DSeg_GWs. Comment: Accepted by ICCV 202

    ContraNeRF: Generalizable Neural Radiance Fields for Synthetic-to-real Novel View Synthesis via Contrastive Learning

    Although many recent works have investigated generalizable NeRF-based novel view synthesis for unseen scenes, they seldom consider synthetic-to-real generalization, which is desired in many practical applications. In this work, we first investigate the effects of synthetic data in synthetic-to-real novel view synthesis and surprisingly observe that models trained with synthetic data tend to produce sharper but less accurate volume densities. For pixels where the volume densities are correct, fine-grained details are obtained; otherwise, severe artifacts are produced. To maintain the advantages of using synthetic data while avoiding its negative effects, we propose geometry-aware contrastive learning to learn multi-view consistent features with geometric constraints. We also adopt cross-view attention to further enhance the geometry perception of features by querying features across input views. Experiments demonstrate that under the synthetic-to-real setting, our method can render images with higher quality and better fine-grained details, outperforming existing generalizable novel view synthesis methods in terms of PSNR, SSIM, and LPIPS. When trained on real data, our method also achieves state-of-the-art results.

    Some conceptual difficulties regarding "net" multipliers

    Multipliers are routinely used for impact evaluation of private projects and public policies at the national and subnational levels. Oosterhaven and Stelder (2002) correctly pointed out the misuse of standard 'gross' multipliers and proposed the concept of the 'net' multiplier as a solution to this bad practice. We prove that their proposal is not well founded. We do so by showing that the supporting theorems are faulty in both enunciation and demonstration. The proofs are flawed due to an analytical error, and the theorems themselves cannot be salvaged, as generic, non-curiosum counterexamples demonstrate. We also provide a general analytical framework for multipliers and, using it, we show that standard 'gross' multipliers are all that is needed within the interindustry model since they follow the causal logic of the economic model, are well defined and independent of exogenous shocks, and are interpretable as predictors for change.
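
    As an illustration of what a standard 'gross' multiplier is, the sketch below uses the textbook Leontief formulation, in which the output multiplier of a sector is the column sum of (I - A)^-1. This is an assumption about the setting, not the paper's own framework: the two-sector coefficient matrix `A` and all function names are hypothetical, chosen only to show that the multipliers depend on `A` alone and not on the size of the exogenous shock.

```python
def leontief_inverse_2x2(a):
    """Return (I - A)^-1 for a 2x2 technical-coefficient matrix A."""
    m = [[1 - a[0][0], -a[0][1]],
         [-a[1][0], 1 - a[1][1]]]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if det == 0:
        raise ValueError("I - A is singular; no Leontief inverse exists")
    # Closed-form inverse of a 2x2 matrix.
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def gross_output_multipliers(a):
    """Column sums of the Leontief inverse: one gross multiplier per sector."""
    inv = leontief_inverse_2x2(a)
    return [inv[0][j] + inv[1][j] for j in range(2)]


def output_change(a, delta_f):
    """Change in gross output by sector for a final-demand shock delta_f."""
    inv = leontief_inverse_2x2(a)
    return [inv[i][0] * delta_f[0] + inv[i][1] * delta_f[1] for i in range(2)]


# Hypothetical two-sector economy (illustrative numbers only).
A = [[0.2, 0.3],
     [0.4, 0.1]]

multipliers = gross_output_multipliers(A)

# A unit shock to sector 0 raises total output by that sector's multiplier;
# doubling the shock doubles the effect, so the multiplier itself is unchanged.
total_unit = sum(output_change(A, [1.0, 0.0]))
total_double = sum(output_change(A, [2.0, 0.0]))
```

    Under these assumptions the total output response per unit of final demand equals the sector's multiplier whatever the shock size, which is the sense in which such multipliers can be read as predictors for change.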

    Spintronics: Fundamentals and applications

    Spintronics, or spin electronics, involves the study of active control and manipulation of spin degrees of freedom in solid-state systems. This article reviews the current status of this subject, including both recent advances and well-established results. The primary focus is on the basic physical principles underlying the generation of carrier spin polarization, spin dynamics, and spin-polarized transport in semiconductors and metals. Spin transport differs from charge transport in that spin is a nonconserved quantity in solids due to spin-orbit and hyperfine coupling. The authors discuss in detail spin decoherence mechanisms in metals and semiconductors. Various theories of spin injection and spin-polarized transport are applied to hybrid structures relevant to spin-based devices and fundamental studies of materials properties. Experimental work is reviewed with the emphasis on projected applications, in which external electric and magnetic fields and illumination by light will be used to control spin and charge dynamics to create new functionalities not feasible or ineffective with conventional electronics. Comment: invited review, 36 figures, 900+ references; minor stylistic changes from the published version

    A prospective cohort study of dietary patterns of non-western migrants in the Netherlands in relation to risk factors for cardiovascular diseases: HELIUS-Dietary Patterns

    Background: In Western countries the prevalence of cardiovascular disease (CVD) is often higher in non-Western migrants than in the host population. Diet is an important modifiable determinant of CVD. Increasingly, dietary patterns rather than single nutrients are the focus of research, in an attempt to account for the complexity of nutrient interactions in foods. Research on dietary patterns in non-Western migrants is limited and may be hampered by a lack of validated instruments that can be used to assess the habitual diet of non-Western migrants in large-scale epidemiological studies. The ultimate aims of this study are (1) to understand whether differences in dietary patterns explain differences in CVD risk between ethnic groups, by developing and validating ethnic-specific Food Frequency Questionnaires (FFQs), and (2) to investigate the determinants of these dietary patterns. This paper outlines the design and methods used in the HELIUS-Dietary Patterns study and describes a systematic approach to overcome difficulties in the assessment and analysis of dietary intake data in ethnically diverse populations.
    Methods/Design: The HELIUS-Dietary Patterns study is embedded in the HELIUS study, a Dutch multi-ethnic cohort study. After developing ethnic-specific FFQs, we will gather data on the habitual intake of 5000 participants (18-70 years old) of ethnic Dutch, Surinamese (of African or of South Asian origin), Turkish, or Moroccan origin. Dietary patterns will be derived using factor analysis, but we will also evaluate diet quality using hypothesis-driven approaches. The relation between dietary patterns and CVD risk factors will be analysed using multiple linear regression analysis. Potential underlying determinants of dietary patterns, such as migration history, acculturation, socio-economic factors and lifestyle, will be considered.
    Discussion: This study will allow us to investigate the contribution of dietary patterns to CVD risk factors in a multi-ethnic population. Inclusion of five ethnic groups residing in one setting makes this study highly innovative, as confounding by local environment characteristics is limited. Heterogeneity in the study population will provide variance in dietary patterns, which is a great advantage when studying the link between diet and disease.

    ConsistentNeRF: Enhancing Neural Radiance Fields with 3D Consistency for Sparse View Synthesis

    Neural Radiance Fields (NeRF) have demonstrated remarkable 3D reconstruction capabilities with dense view images. However, their performance deteriorates significantly under sparse view settings. We observe that learning the 3D consistency of pixels among different views is crucial for improving reconstruction quality in such cases. In this paper, we propose ConsistentNeRF, a method that leverages depth information to regularize both multi-view and single-view 3D consistency among pixels. Specifically, ConsistentNeRF employs depth-derived geometry information and a depth-invariant loss to concentrate on pixels that exhibit 3D correspondence and maintain consistent depth relationships. Extensive experiments on recent representative works reveal that our approach can considerably enhance model performance in sparse view conditions, achieving improvements of up to 94% in PSNR, 76% in SSIM, and 31% in LPIPS compared to the vanilla baselines across various benchmarks, including DTU, NeRF Synthetic, and LLFF. Comment: https://github.com/skhu101/ConsistentNeR