EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE
Building scalable vision-language models to learn from diverse, multimodal
data remains an open challenge. In this paper, we introduce an Efficient
Vision-languagE foundation model, namely EVE, which is one unified multimodal
Transformer pre-trained solely by one unified pre-training task. Specifically,
EVE encodes both vision and language within a shared Transformer network
integrated with modality-aware sparse Mixture-of-Experts (MoE) modules, which
capture modality-specific information by selectively switching to different
experts. To unify pre-training tasks of vision and language, EVE performs
masked signal modeling on image-text pairs to reconstruct masked signals, i.e.,
image pixels and text tokens, given visible signals. This simple yet effective
pre-training objective accelerates training by 3.5x compared to the model
pre-trained with Image-Text Contrastive and Image-Text Matching losses. Owing
to the combination of the unified architecture and pre-training task, EVE is
easy to scale up, enabling better downstream performance with fewer resources
and faster training speed. Despite its simplicity, EVE achieves
state-of-the-art performance on various vision-language downstream tasks,
including visual question answering, visual reasoning, and image-text
retrieval.
Comment: Accepted by AAAI 202
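As a rough illustration of the masked signal modeling objective described in the abstract above, the following Python sketch masks a fraction of positions and scores the reconstruction only on the masked ones. The function names (`make_mask`, `masked_mse`, `masked_cross_entropy`), the masking ratio, and the choice of mean squared error for pixels and cross-entropy for text tokens are assumptions made for illustration; the abstract does not specify EVE's exact losses or masking scheme.

```python
import math
import random


def make_mask(n, ratio, rng):
    """Return a boolean list with round(n * ratio) positions marked as masked."""
    k = round(n * ratio)
    masked = set(rng.sample(range(n), k))
    return [i in masked for i in range(n)]


def masked_mse(target, prediction, mask):
    """Mean squared error over masked positions only (toy image-pixel loss)."""
    errs = [(t - p) ** 2 for t, p, m in zip(target, prediction, mask) if m]
    return sum(errs) / len(errs) if errs else 0.0


def masked_cross_entropy(target_ids, probs, mask):
    """Average negative log-likelihood over masked positions (toy text-token loss).

    probs[i] is a predicted probability distribution over the vocabulary
    for position i; target_ids[i] is the index of the true token.
    """
    losses = [-math.log(p[t]) for t, p, m in zip(target_ids, probs, mask) if m]
    return sum(losses) / len(losses) if losses else 0.0


# Hypothetical usage: one shared objective, applied to both modalities.
rng = random.Random(0)
pixel_mask = make_mask(8, 0.5, rng)   # hide half of 8 toy "pixels"
pixels = [0.1 * i for i in range(8)]
reconstruction = [0.0] * 8            # a deliberately poor reconstruction
image_loss = masked_mse(pixels, reconstruction, pixel_mask)

token_ids = [0, 1]
token_probs = [[0.5, 0.5], [0.25, 0.75]]
text_loss = masked_cross_entropy(token_ids, token_probs, [True, False])

total_loss = image_loss + text_loss
```

Visible positions contribute nothing to either loss, which is the sense in which both modalities are trained by reconstructing masked signals from visible ones.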
Efficient Secure Multiparty Computation for Multidimensional Arithmetics and Its Application in Privacy-Preserving Biometric Identification
Over the years of development of secure multi-party computation (MPC), many sophisticated functionalities have been made practical, and multi-dimensional operations occur more and more frequently in MPC protocols, especially in protocols involving datasets of vector elements, such as privacy-preserving biometric identification and privacy-preserving machine learning. In this paper, we introduce a new kind of correlation, called tensor triples, which is designed to make multi-dimensional MPC protocols more efficient. We discuss the generation process, the usage, and the applications of tensor triples, and show that they can accelerate privacy-preserving biometric identification protocols, such as FingerCode, Eigenfaces and FaceNet, by more than 1000 times.
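The abstract above does not define tensor triples, so the following Python sketch is only one plausible reading: a Beaver-style multiplication triple generalized to vectors, where two parties hold additive shares of random vectors a and b and of their outer product C = a ⊗ b, and use them to compute shares of x ⊗ y. The modulus `P`, the trusted-dealer function `deal_tensor_triple`, and all other names are assumptions for illustration, not the paper's protocol.

```python
import random

P = 2**61 - 1  # prime modulus for additive sharing (an assumption of this sketch)


def share(vec, rng):
    """Split a vector into two additive shares modulo P."""
    s0 = [rng.randrange(P) for _ in vec]
    s1 = [(v - r) % P for v, r in zip(vec, s0)]
    return [s0, s1]


def open_vec(shares):
    """Reconstruct a vector from its two additive shares."""
    return [(x + y) % P for x, y in zip(shares[0], shares[1])]


def sub(u, v):
    """Element-wise subtraction modulo P."""
    return [(x - y) % P for x, y in zip(u, v)]


def deal_tensor_triple(n, m, rng):
    """Trusted-dealer stand-in: shares of random a, b and of C = a (outer) b."""
    a = [rng.randrange(P) for _ in range(n)]
    b = [rng.randrange(P) for _ in range(m)]
    rows = [share([(ai * bj) % P for bj in b], rng) for ai in a]
    c_sh = [[row[p] for row in rows] for p in (0, 1)]
    return share(a, rng), share(b, rng), c_sh


def shared_outer(x_sh, y_sh, triple):
    """Compute additive shares of the outer product x (outer) y."""
    a_sh, b_sh, c_sh = triple
    # Both parties open the masked values d = x - a and e = y - b.
    d = open_vec([sub(x_sh[p], a_sh[p]) for p in (0, 1)])
    e = open_vec([sub(y_sh[p], b_sh[p]) for p in (0, 1)])
    z_sh = []
    for p in (0, 1):
        # x (outer) y = C + a (outer) e + d (outer) b + d (outer) e;
        # the public term d (outer) e is added by party 0 only.
        z = [[(c_sh[p][i][j] + a_sh[p][i] * e[j] + d[i] * b_sh[p][j]
               + (d[i] * e[j] if p == 0 else 0)) % P
              for j in range(len(e))] for i in range(len(d))]
        z_sh.append(z)
    return z_sh


def open_matrix(z_sh):
    """Reconstruct a matrix from its two additive shares, row by row."""
    return [open_vec([z_sh[0][i], z_sh[1][i]]) for i in range(len(z_sh[0]))]
```

In this reading, one triple and one round of opening d and e yield all n × m products at once, whereas scalar triples would need one triple per entry; whether this matches the construction or the speedup reported in the paper is not something the abstract settles.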
Judging a Book by Its Cover: The Effect of Facial Perception on Centrality in Social Networks
Facial appearance matters in social networks. Individuals frequently make
trait judgments from facial cues. Although the evidence is insufficient to
establish the validity of these face-based impressions, they are of vital importance, because they
may relate to human network-based social behavior, such as seeking certain
individuals for help, advice, dating, and cooperation, and thus they may relate
to centrality in social networks. However, little to no work has investigated
the apparent facial traits that influence network centrality, despite the large
amount of research on attributes associated with central positions, including
personality and behavior. In this paper, we examine whether perceived traits
based on facial appearance affect network centrality by exploring the initial
stage of social network formation in a first-year college residential area. We
took face photos of participants who are freshmen living in the same
residential area, and we asked them to nominate community members linking to
different networks. We then collected facial perception data by requiring other
participants to rate facial images on three main attributes: dominance,
trustworthiness, and attractiveness. Meanwhile, we proposed a framework to
discover how facial appearance affects social networks. Our results revealed
that perceived facial traits were correlated with network centrality and
that they were predictive of the centrality of people in different
networks. Our findings provide psychological evidence regarding the interaction
between faces and network centrality. Our findings also offer insights into a
combination of psychological and social network techniques, and they highlight
the function of facial bias in cuing and signaling social traits. To the best
of our knowledge, we are the first to explore the influence of facial
perception on centrality in social networks.
Comment: 11 pages, 8 figures
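The kind of analysis the abstract above describes, relating nomination-based centrality to rated facial traits, can be sketched in Python as follows. In-degree centrality and Pearson correlation are assumptions chosen for illustration; the abstract does not say which centrality measure or statistical framework the authors used, and the function names are hypothetical.

```python
from math import sqrt


def in_degree_centrality(nominations, members):
    """Fraction of the other members who nominated each person.

    nominations is an iterable of (nominator, nominee) pairs;
    repeated pairs and self-nominations are ignored.
    """
    counts = {m: 0 for m in members}
    for src, dst in set(nominations):
        if src != dst:
            counts[dst] += 1
    n = len(members)
    return {m: counts[m] / (n - 1) for m in members}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Toy placeholder values only; these are NOT data from the study.
members = ["a", "b", "c"]
nominations = [("a", "b"), ("c", "b"), ("b", "a")]
centrality = in_degree_centrality(nominations, members)

trait_rating = {"a": 4.0, "b": 6.0, "c": 3.0}  # hypothetical mean ratings
r = pearson([trait_rating[m] for m in members],
            [centrality[m] for m in members])
```

Repeating the correlation for each rated trait and each nomination network would give one coefficient per trait-network pair.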
Research Progress on Preparation and Functions of Alginate Oligosaccharides with Different Structures
Alginate is one of the most abundant marine polysaccharides, and it can be degraded into alginate oligosaccharides (AOS) by chemical, physical and biological methods. AOS, linear oligosaccharides composed of guluronic acid and mannuronic acid, have a variety of biological activities such as antimicrobial, anti-inflammatory, and immunoregulatory activities, and these biological activities are closely related to their structural diversity. AOS prepared by different degradation methods have different uronic acid compositions, polymerization degrees and special structures. Therefore, exploring the structure-function relationship of AOS will help to fully understand the activity of AOS and improve their application value. In this paper, the reaction mechanisms of the various methods to prepare AOS, the structures of the resulting products, and the structure-function relationship of AOS are reviewed in order to provide a reference for further research and application of AOS.
Enhancing Textual Personality Detection toward Social Media: Integrating Long-term and Short-term Perspectives
Textual personality detection aims to identify personality characteristics by
analyzing user-generated content on social media platforms. A large body of
psychological literature has highlighted that personality encompasses both
long-term stable traits and short-term dynamic states. However, existing
studies often concentrate only on either long-term or short-term personality
representations, without effectively combining both aspects. This limitation
hinders a comprehensive understanding of individuals' personalities, as both
stable traits and dynamic states are vital. To bridge this gap, we propose a
Dual Enhanced Network (DEN) to jointly model users' long-term and short-term
personality for textual personality detection. In DEN, a Long-term Personality
Encoding is devised to effectively model long-term stable personality traits.
Short-term Personality Encoding is presented to capture short-term dynamic
personality states. The Bi-directional Interaction component facilitates the
integration of both personality aspects, allowing for a comprehensive
representation of the user's personality. Experimental results on two
personality detection datasets demonstrate the effectiveness of the DEN model
and the benefits of considering both the dynamic and stable nature of
personality characteristics for textual personality detection.
Comment: 11 pages, 9 figures
