
    EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE

    Building scalable vision-language models that learn from diverse, multimodal data remains an open challenge. In this paper, we introduce an Efficient Vision-languagE foundation model, namely EVE, which is one unified multimodal Transformer pre-trained solely with one unified pre-training task. Specifically, EVE encodes both vision and language within a shared Transformer network integrated with modality-aware sparse Mixture-of-Experts (MoE) modules, which capture modality-specific information by selectively switching to different experts. To unify the pre-training tasks of vision and language, EVE performs masked signal modeling on image-text pairs to reconstruct masked signals, i.e., image pixels and text tokens, given visible signals. This simple yet effective pre-training objective accelerates training by 3.5x compared to the model pre-trained with Image-Text Contrastive and Image-Text Matching losses. Owing to the combination of the unified architecture and pre-training task, EVE is easy to scale up, enabling better downstream performance with fewer resources and faster training speed. Despite its simplicity, EVE achieves state-of-the-art performance on various vision-language downstream tasks, including visual question answering, visual reasoning, and image-text retrieval. Comment: Accepted by AAAI 202

    Efficient Secure Multiparty Computation for Multidimensional Arithmetics and Its Application in Privacy-Preserving Biometric Identification

    Over years of development of secure multi-party computation (MPC), many sophisticated functionalities have been made practical, and multi-dimensional operations occur more and more frequently in MPC protocols, especially in protocols involving datasets of vector elements, such as privacy-preserving biometric identification and privacy-preserving machine learning. In this paper, we introduce a new kind of correlation, called tensor triples, which is designed to make multi-dimensional MPC protocols more efficient. We discuss the generation process, the usage, and the applications of tensor triples, and show that they can accelerate privacy-preserving biometric identification protocols, such as FingerCode, Eigenfaces and FaceNet, by more than 1000 times.

    Judging a Book by Its Cover: The Effect of Facial Perception on Centrality in Social Networks

    Facial appearance matters in social networks. Individuals frequently make trait judgments from facial cues. Although the validity of these face-based impressions lacks supporting evidence, they are of vital importance, because they may relate to human network-based social behavior, such as seeking certain individuals for help, advice, dating, and cooperation, and thus they may relate to centrality in social networks. However, little to no work has investigated the apparent facial traits that influence network centrality, despite the large amount of research on attributes associated with central positions, including personality and behavior. In this paper, we examine whether perceived traits based on facial appearance affect network centrality by exploring the initial stage of social network formation in a first-year college residential area. We took face photos of participants who were freshmen living in the same residential area, and we asked them to nominate community members linked to different networks. We then collected facial perception data by asking other participants to rate facial images on three main attributes: dominance, trustworthiness, and attractiveness. Meanwhile, we proposed a framework to discover how facial appearance affects social networks. Our results revealed that perceived facial traits were correlated with network centrality and that they were predictive of the centrality of people in different networks. Our findings provide psychological evidence regarding the interaction between faces and network centrality. Our findings also offer insights into a combination of psychological and social network techniques, and they highlight the function of facial bias in cuing and signaling social traits. To the best of our knowledge, we are the first to explore the influence of facial perception on centrality in social networks. Comment: 11 pages, 8 figures

    Research Progress on Preparation and Functions of Alginate Oligosaccharides with Different Structures

    Alginate is one of the most abundant marine polysaccharides and can be degraded into alginate oligosaccharides (AOS) by chemical, physical and biological methods. AOS, linear oligosaccharides composed of guluronic acid and mannuronic acid, have a variety of biological activities such as antimicrobial, anti-inflammatory, and immunoregulatory activities, and these biological activities are closely related to their structural diversity. AOS prepared by different degradation methods have different uronic acid compositions, polymerization degrees and special structures. Therefore, exploring the structure-function relationship of AOS will help to fully understand their activity and improve their application value. In this paper, the reaction mechanisms of the various methods to prepare AOS, the structures of the resulting products, and the structure-function relationship of AOS are reviewed in order to provide a reference for further research and application of AOS.

    Enhancing Textual Personality Detection toward Social Media: Integrating Long-term and Short-term Perspectives

    Textual personality detection aims to identify personality characteristics by analyzing user-generated content on social media platforms. Numerous psychological studies have highlighted that personality encompasses both long-term stable traits and short-term dynamic states. However, existing studies often concentrate only on either long-term or short-term personality representations, without effectively combining both aspects. This limitation hinders a comprehensive understanding of individuals' personalities, as both stable traits and dynamic states are vital. To bridge this gap, we propose a Dual Enhanced Network (DEN) to jointly model users' long-term and short-term personality for textual personality detection. In DEN, a Long-term Personality Encoding is devised to effectively model long-term stable personality traits. A Short-term Personality Encoding is presented to capture short-term dynamic personality states. The Bi-directional Interaction component facilitates the integration of both personality aspects, allowing for a comprehensive representation of the user's personality. Experimental results on two personality detection datasets demonstrate the effectiveness of the DEN model and the benefits of considering both the dynamic and stable nature of personality characteristics for textual personality detection. Comment: 11 pages, 9 figures