723 research outputs found

    One-Versus-Others Attention: Scalable Multimodal Integration

    Multimodal learning models have become increasingly important as they surpass single-modality approaches on diverse tasks ranging from question-answering to autonomous driving. Despite the importance of multimodal learning, existing efforts focus on NLP applications, where the number of modalities is typically less than four (audio, video, text, images). However, data inputs in other domains, such as the medical field, may include X-rays, PET scans, MRIs, genetic screening, clinical notes, and more, creating a need for both efficient and accurate information fusion. Many state-of-the-art models rely on pairwise cross-modal attention, which does not scale well for applications with more than three modalities. For $n$ modalities, computing attention will result in $\binom{n}{2}$ operations, potentially requiring considerable amounts of computational resources. To address this, we propose a new domain-neutral attention mechanism, One-Versus-Others (OvO) attention, that scales linearly with the number of modalities and requires only $n$ attention operations, thus offering a significant reduction in computational complexity compared to existing cross-modal attention algorithms. Using three diverse real-world datasets as well as an additional simulation experiment, we show that our method improves performance compared to popular fusion techniques while decreasing computation costs.
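The scaling claim in this abstract can be checked with a few lines of arithmetic. The sketch below only counts attention operations under the two schemes the abstract names (pairwise cross-modal attention versus a scheme needing one operation per modality); it does not implement OvO attention, and the function names are hypothetical, chosen for illustration.

```python
from math import comb


def pairwise_attention_ops(n: int) -> int:
    """Attention operations when every unordered pair of modalities attends: C(n, 2)."""
    return comb(n, 2)


def linear_attention_ops(n: int) -> int:
    """Attention operations when one operation per modality suffices: n."""
    return n


if __name__ == "__main__":
    # Compare the two counts as the number of modalities grows.
    for n in (3, 5, 10, 20):
        print(f"n={n:2d}  pairwise={pairwise_attention_ops(n):3d}  "
              f"linear={linear_attention_ops(n):2d}")
```

At three modalities the two counts coincide (3 each), which is consistent with the abstract's remark that pairwise attention becomes a problem only beyond three modalities; at ten modalities the counts are 45 versus 10.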

    GraFT: Gradual Fusion Transformer for Multimodal Re-Identification

    Object Re-Identification (ReID) is pivotal in computer vision, witnessing an escalating demand for adept multimodal representation learning. Current models, although promising, reveal scalability limitations with increasing modalities as they rely heavily on late fusion, which postpones the integration of specific modality insights. Addressing this, we introduce the Gradual Fusion Transformer (GraFT) for multimodal ReID. At its core, GraFT employs learnable fusion tokens that guide self-attention across encoders, adeptly capturing both modality-specific and object-specific features. Further bolstering its efficacy, we introduce a novel training paradigm combined with an augmented triplet loss, optimizing the ReID feature embedding space. We demonstrate these enhancements through extensive ablation studies and show that GraFT consistently surpasses established multimodal ReID benchmarks. Additionally, aiming for deployment versatility, we've integrated neural network pruning into GraFT, offering a balance between model size and performance.
    Comment: 3 Borderline Reviews at WACV, 8 pages, 5 figures, 8 tables

    Wastewater treatment using artificial wetlands

    In the study described in this paper, pilot-scale vertical flow wetlands were evaluated as a potential treatment system for agricultural wastewater from a swine farm. The evaluation criteria were based on water quality requirements for irrigation.

    Violent video game use as a longitudinal risk factor for cyberbullying in early adolescence (Nutzung gewalthaltiger Bildschirmspiele als längsschnittlicher Risikofaktor für Cyberbullying in der frühen Adoleszenz)

    This study investigates longitudinal associations between violent video game playing and cyberbullying using a cross-lagged panel design. The selection and socialization hypotheses were tested as possible explanations. Traditional bullying, victimization, and overt and relational aggression were considered as covariates in the analyses. Self-reports were collected from 271 adolescents, aged between 10 and 13 years, at two measurement occasions one year apart. Structural equation models identified violent video game playing as a longitudinal risk factor for cyberbullying, traditional bullying, and overt aggression (socialization effect). Furthermore, traditional bullying was found to be a longitudinal risk factor for violent video game playing (selection effect). (author's abstract)

    Transformations of the National: Rammstein’s “Deutschland” as a Provocation of German History

    This contribution examines the nexus between ‘the political’ and popular music from an interdisciplinary perspective. Using Rammstein’s highly provocative single “Deutschland” (2019) as an example, this case study showcases multiple different and often contradictory readings of the band’s work with a view to its textual, visual, sonic and performative dimensions. Overall, this contribution suggests that the political in Rammstein oscillates between self-reference and historical reference, between deconstruction and marketability, and between scandalous irony and ambiguous sincerity.