Empowering truth discovery with multi-truth prediction
Truth discovery is the problem of detecting true values from the conflicting data provided by multiple sources on the same data items. Since sources' reliability is unknown a priori, a truth discovery method usually estimates sources' reliability along with the truth discovery process. A major limitation of existing truth discovery methods is that they commonly assume exactly one true value on each data item and therefore cannot deal with the more general case in which a data item may have multiple true values (or multi-truth). Since the number of true values may vary from data item to data item, truth discovery methods must be able to detect varying numbers of true values from multi-source data. In this paper, we propose a multi-truth discovery approach, which addresses the above challenges by providing a generic framework for enhancing existing truth discovery methods. In particular, we treat the number of true values as an important clue for facilitating multi-truth discovery. We present the procedure and components of our approach, and propose three models, namely the byproduct model, the joint model, and the synthesis model, to implement our approach. We further propose two extensions to enhance our approach, which leverage the implications of similar numerical values and the co-occurrence information of values in sources' claims to improve truth discovery accuracy. Experimental studies on real-world datasets demonstrate the effectiveness of our approach.
Xianzhi Wang, Quan Z. Sheng, Lina Yao, Xue Li, Xiu Susie Fang, Xiaofei Xu, and Boualem Benatalla
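The idea in the abstract above of estimating source reliability together with the truths, while allowing several true values per item, can be illustrated with a minimal sketch. This is not the authors' byproduct, joint, or synthesis model; it is a generic reliability-weighted voting loop. The function name `discover_truths`, the acceptance `threshold`, the initial reliability of 0.8, and the toy `claims` data are all assumptions made for illustration.

```python
from collections import defaultdict


def discover_truths(claims, n_iter=20, threshold=0.5):
    """Hypothetical multi-truth voting sketch (not the paper's models).

    claims maps source -> {item -> set of values the source claims are true}.
    Returns (truths, reliability).
    """
    sources = list(claims)
    items = {item for s in sources for item in claims[s]}
    reliability = {s: 0.8 for s in sources}  # assumed uniform starting point
    truths = {}
    for _ in range(n_iter):
        # Score each value by the reliability-weighted share of the sources
        # covering the item that claim it.
        score = defaultdict(dict)
        for item in items:
            covering = [s for s in sources if item in claims[s]]
            total = sum(reliability[s] for s in covering)
            values = set().union(*(claims[s][item] for s in covering))
            for v in values:
                support = sum(reliability[s] for s in covering
                              if v in claims[s][item])
                score[item][v] = support / total
        # Accept every value whose score reaches the threshold, so the number
        # of accepted truths can differ from item to item.
        truths = {item: {v for v, sc in vs.items() if sc >= threshold}
                  for item, vs in score.items()}
        # Re-estimate reliability as the smoothed precision of each source.
        for s in sources:
            claimed = [(item, v) for item, vals in claims[s].items()
                       for v in vals]
            correct = sum(1 for item, v in claimed if v in truths[item])
            reliability[s] = (correct + 1) / (len(claimed) + 2)
    return truths, reliability


# Toy data (invented): three sources listing authors of two books.
claims = {
    "A": {"book1": {"Alice", "Bob"}, "book2": {"Carol"}},
    "B": {"book1": {"Alice", "Bob"}, "book2": {"Carol"}},
    "C": {"book1": {"Alice"}, "book2": {"Dave"}},
}
truths, reliability = discover_truths(claims)
```

On this toy input, `book1` ends up with two accepted values and `book2` with one, and source `C` receives a lower reliability than `A` and `B`. A fixed threshold is the simplest way to allow a varying number of truths; the abstract instead describes using the number of true values itself as a clue, which this sketch does not attempt.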
GIT-Mol: A Multi-modal Large Language Model for Molecular Science with Graph, Image, and Text
Large language models have made significant strides in natural language
processing, paving the way for innovative applications including molecular
representation and generation. However, most existing single-modality
approaches cannot capture the abundant and complex information in molecular
data. Here, we introduce GIT-Mol, a multi-modal large language model that
integrates the structure Graph, Image, and Text information, including the
Simplified Molecular Input Line Entry System (SMILES) and molecular captions.
To facilitate the integration of multi-modal molecular data, we propose
GIT-Former, a novel architecture capable of mapping all modalities into a
unified latent space. Our study develops an innovative any-to-language
molecular translation strategy and achieves a 10%-15% improvement in molecular
captioning, a 5%-10% accuracy increase in property prediction, and a 20% boost
in molecule generation validity compared to baseline or single-modality models.
Comment: 16 pages, 5 figure
Zapping index: Using smile to measure advertisement zapping likelihood
In marketing and advertising research, 'zapping' is defined as the action of a viewer who stops watching a commercial. Researchers analyze users' behavior in order to prevent zapping, which helps advertisers design effective commercials. Since emotions can be used to engage consumers, in this paper we leverage automated facial expression analysis to understand consumers' zapping behavior. Firstly, we provide an accurate moment-to-moment smile detection algorithm. Secondly, we formulate a binary classification problem (zapping/non-zapping) based on real-world scenarios, and adopt smile response as the feature to predict zapping. Thirdly, to cope with the lack of a metric in advertising evaluation, we propose a new metric called the Zapping Index (ZI). ZI is a moment-to-moment measurement of a user's zapping probability. It gauges not only the reaction of a user, but also the user's preference for commercials. Finally, extensive experiments are performed to provide insights, and we make recommendations that will be useful to both advertisers and advertisement publishers.
Prospects for Theranostics in Neurosurgical Imaging: Empowering Confocal Laser Endomicroscopy Diagnostics via Deep Learning
Confocal laser endomicroscopy (CLE) is an advanced optical fluorescence
imaging technology that has the potential to increase intraoperative precision,
extend resection, and tailor surgery for malignant invasive brain tumors
because of its subcellular dimension resolution. Despite its promising
diagnostic potential, interpreting the gray tone fluorescence images can be
difficult for untrained users. In this review, we provide a detailed
description of bioinformatical analysis methodology of CLE images that begins
to assist the neurosurgeon and pathologist to rapidly connect on-the-fly
intraoperative imaging, pathology, and surgical observation into a
conclusionary system within the concept of theranostics. We present an overview
and discuss deep learning models for automatic detection of the diagnostic CLE
images and discuss various training regimes and ensemble modeling effect on the
power of deep learning predictive models. Two major approaches reviewed in this
paper include the models that can automatically classify CLE images into
diagnostic/nondiagnostic, glioma/nonglioma, tumor/injury/normal categories and
models that can localize histological features on the CLE images using weakly
supervised methods. We also briefly review advances in the deep learning
approaches used for CLE image analysis in other organs. Significant advances in
speed and precision of automated diagnostic frame selection would augment the
diagnostic potential of CLE, improve operative workflow and integration into
brain tumor surgery. Such technology and bioinformatics analytics lend
themselves to improved precision, personalization, and theranostics in brain
tumor treatment.
Comment: See the final version published in Frontiers in Oncology here:
https://www.frontiersin.org/articles/10.3389/fonc.2018.00240/ful
eX-ViT: A Novel eXplainable Vision Transformer for Weakly Supervised Semantic Segmentation
Recently, vision transformer models have become prominent models for a range
of vision tasks. These models, however, are usually opaque with weak feature
interpretability. Moreover, there is no method currently built for an
intrinsically interpretable transformer, which is able to explain its reasoning
process and provide a faithful explanation. To close these crucial gaps, we
propose a novel vision transformer dubbed the eXplainable Vision Transformer
(eX-ViT), an intrinsically interpretable transformer model that is able to
jointly discover robust interpretable features and perform the prediction.
Specifically, eX-ViT is composed of the Explainable Multi-Head Attention
(E-MHA) module, the Attribute-guided Explainer (AttE) module and the
self-supervised attribute-guided loss. The E-MHA tailors explainable attention
weights that are able to learn semantically interpretable representations from
local patches in terms of model decisions with noise robustness. Meanwhile,
AttE is proposed to encode discriminative attribute features for the target
object through diverse attribute discovery, which constitutes faithful evidence
for the model's predictions. In addition, a self-supervised attribute-guided
loss is developed for our eX-ViT, which aims at learning enhanced
representations through the attribute discriminability mechanism and attribute
diversity mechanism, to localize diverse and discriminative attributes and
generate more robust explanations. As a result, we can uncover faithful and
robust interpretations with diverse attributes through the proposed eX-ViT
- …