Multi-task Neural Network for Non-discrete Attribute Prediction in Knowledge Graphs
Many popular knowledge graphs such as Freebase, YAGO or DBPedia maintain a
list of non-discrete attributes for each entity. Intuitively, these attributes
such as height, price or population count are able to richly characterize
entities in knowledge graphs. This additional source of information may help to
alleviate the inherent sparsity and incompleteness problems that are prevalent
in knowledge graphs. Unfortunately, many state-of-the-art relational learning
models ignore this information due to the challenging nature of dealing with
non-discrete data types in the inherently binary-natured knowledge graphs. In
this paper, we propose a novel multi-task neural network approach for both
encoding and prediction of non-discrete attribute information in a relational
setting. Specifically, we train a neural network for triplet prediction along
with a separate network for attribute value regression. Via multi-task
learning, we are able to learn representations of entities, relations and
attributes that encode information about both tasks. Moreover, such attributes
are not only central to many predictive tasks as an information source but also
as a prediction target. Therefore, models that are able to encode, incorporate
and predict such information in a relational learning context are highly
attractive as well. We show that our approach outperforms many state-of-the-art
methods for the tasks of relational triplet classification and attribute value
prediction. Comment: Accepted at CIKM 201
Relational Collaborative Filtering: Modeling Multiple Item Relations for Recommendation
Existing item-based collaborative filtering (ICF) methods leverage only the
relation of collaborative similarity. Nevertheless, there exist multiple
relations between items in real-world scenarios. Distinct from the
collaborative similarity that implies co-interaction patterns from the user
perspective, these relations reveal fine-grained knowledge on items from
different perspectives of meta-data, functionality, etc. However, how to
incorporate multiple item relations is less explored in recommendation
research. In this work, we propose Relational Collaborative Filtering (RCF), a
general framework to exploit multiple relations between items in recommender
systems. We find that both the relation type and the relation value are crucial
in inferring user preference. To this end, we develop a two-level hierarchical
attention mechanism to model user preference. The first-level attention
discriminates which types of relations are more important, and the second-level
attention considers the specific relation values to estimate the contribution
of a historical item in recommending the target item. To make the item
embeddings reflect the relational structure between items, we further
formulate a task to preserve the item relations, and jointly train it with the
recommendation task of preference modeling. Empirical results on two real
datasets demonstrate the strong performance of RCF. Furthermore, we also
conduct qualitative analyses to show the benefits of explanations brought by
the modeling of multiple item relations.
Deep Structural Knowledge Exploitation and Synergy for Estimating Node Importance Value on Heterogeneous Information Networks
The node importance estimation problem has conventionally been studied with
homogeneous network topology analysis. To deal with network heterogeneity, a
few recent methods employ graph neural models to automatically learn diverse
sources of information. However, a major concern is that their fully
adaptive learning process may lead to insufficient information
exploration, thereby formulating the problem as isolated node value
prediction with underperformance and less interpretability. In this work, we
propose a novel learning framework: SKES. Different from previous automatic
learning designs, SKES exploits heterogeneous structural knowledge to enrich
the informativeness of node representations. Based on a sufficiently
uninformative reference, SKES estimates the importance value for any input
node, by quantifying its disparity against the reference. This establishes an
interpretable node importance computation paradigm. Furthermore, SKES dives
deep into the understanding that "nodes with similar characteristics are prone
to have similar importance values" whilst guaranteeing that such
informativeness disparity between any different nodes is orderly reflected by
the embedding distance of their associated latent features. Extensive
experiments on three widely-evaluated benchmarks demonstrate the performance
superiority of SKES over several recent competing methods. Comment: Accepted by AAAI 202
USimAgent: Large Language Models for Simulating Search Users
Owing to its advantages in cost-efficiency and reproducibility, user
simulation has become a promising solution to the user-centric evaluation of
information retrieval systems. Nonetheless, accurately simulating user search
behaviors has long been a challenge, because users' actions in search are
highly complex and driven by intricate cognitive processes such as learning,
reasoning, and planning. Recently, Large Language Models (LLMs) have
demonstrated remarkable potential in simulating human-level intelligence and have
been used in building autonomous agents for various tasks. However, the
potential of using LLMs in simulating search behaviors has not yet been fully
explored. In this paper, we introduce an LLM-based user search behavior
simulator, USimAgent. The proposed simulator can simulate users' querying,
clicking, and stopping behaviors during search, and thus, is capable of
generating complete search sessions for specific search tasks. Empirical
investigation on a real user behavior dataset shows that the proposed simulator
outperforms existing methods in query generation and is comparable to
traditional methods in predicting user clicks and stopping behaviors. These
results not only validate the effectiveness of using LLMs for user simulation
but also shed light on the development of more robust and generic user
simulators.
Systematic Analysis of Impact of Sampling Regions and Storage Methods on Fecal Gut Microbiome and Metabolome Profiles.
The contribution of human gastrointestinal (GI) microbiota and metabolites to host health has recently become much clearer. However, many confounding factors can influence the accuracy of gut microbiome and metabolome studies, resulting in inconsistencies in published results. In this study, we systematically investigated the effects of fecal sampling regions and storage and retrieval conditions on gut microbiome and metabolite profiles from three healthy children. Our analysis indicated that compared to homogenized and snap-frozen samples (standard control [SC]), different sampling regions did not affect microbial community alpha diversity, while a total of 22 of 176 identified metabolites varied significantly across different sampling regions. In contrast, storage conditions significantly influenced the microbiome and metabolome. Short-term room temperature storage had a minimal effect on the microbiome and metabolome profiles. Sample storage in RNALater showed a significant level of variation in both microbiome and metabolome profiles, independent of the storage or retrieval conditions. The effect of RNALater on the metabolome was stronger than the effect on the microbiome, and individual variability between study participants outweighed the effect of RNALater on the microbiome. We conclude that homogenizing stool samples was critical for metabolomic analysis but not necessary for microbiome analysis. Short-term room temperature storage had a minimal effect on the microbiome and metabolome profiles and is recommended for short-term fecal sample storage. In addition, our study indicates that the use of RNALater as a storage medium of stool samples for microbial and metabolomic analyses is not recommended.
IMPORTANCE: The gastrointestinal microbiome and metabolome can provide a new angle to understand the development of health and disease. Stool samples are most frequently used for large-scale cohort studies. Standardized procedures for stool sample handling and storage can be a determining factor for performing microbiome or metabolome studies. In this study, we focused on the effects of stool sampling regions and stool sample storage conditions on variations in the gut microbiome composition and metabolome profile.
Interaction Embeddings for Prediction and Explanation in Knowledge Graphs
Knowledge graph embedding aims to learn distributed representations for
entities and relations, and has proven effective in many applications.
Crossover interactions --- bi-directional effects between entities and
relations --- help select related information when predicting a new triple, but
haven't been formally discussed before. In this paper, we propose CrossE, a
novel knowledge graph embedding which explicitly simulates crossover
interactions. It not only learns one general embedding for each entity and
relation as most previous methods do, but also generates multiple triple
specific embeddings for both of them, named interaction embeddings. We evaluate
embeddings on typical link prediction tasks and find that CrossE achieves
state-of-the-art results on complex and more challenging datasets. Furthermore,
we evaluate embeddings from a new perspective --- giving explanations for
predicted triples, which is important for real applications. In this work, an
explanation for a triple is regarded as a reliable closed-path between the head
and the tail entity. Compared to other baselines, we show experimentally that
CrossE, benefiting from interaction embeddings, is more capable of generating
reliable explanations to support its predictions. Comment: This paper is accepted by WSDM201
Interactions between Exposure to Environmental Polycyclic Aromatic Hydrocarbons and DNA Repair Gene Polymorphisms on Bulky DNA Adducts in Human Sperm
BACKGROUND: Nucleotide excision repair (NER) and base excision repair (BER) are the primary mechanisms for repair of bulky adducts caused by chemical agents, such as PAHs. It is expected that polymorphisms in NER or BER genes may modulate individual susceptibility to PAHs exposure. Here, we evaluate the effects of PAHs exposure and polymorphisms in NER and BER pathway, alone or combined, on polycyclic aromatic hydrocarbon-DNA (PAH-DNA) adducts in human sperm. METHODOLOGY/PRINCIPAL FINDINGS: Sperm PAH-DNA adducts were measured by immunofluorescent assay using flow cytometry in a sample of 465 infertile adults. Polymorphisms of XPA, XPD, ERCC1, XPF, and XRCC1 were determined by polymerase chain reaction (PCR) and restriction fragment length polymorphism (RFLP) techniques. The PAHs exposure was detected as urinary 1-hydroxypyrene (1-OHP) levels. In multivariate models adjusted for potential confounders, we observed that XRCC1 5'pUTR -T/C, Arg194Trp, Arg399Gln polymorphisms were associated with increased sperm adduct levels. Furthermore, the stratified analysis indicated that adverse effects of XRCC1 Arg194Trp, Arg399Gln polymorphisms on PAH-DNA adducts were detected only in the high PAHs exposure group. CONCLUSIONS/SIGNIFICANCE: These findings provided the first evidence that polymorphisms of XRCC1 may modify sperm PAH-DNA adduct levels and may be useful biomarkers to identify individuals susceptible to DNA damage resulting from PAHs exposure.
Infection and Infertility
Infection is a multifactorial process, which can be induced by a virus, bacterium, or parasite. It may cause many diseases, including obesity, cancer, and infertility. In this chapter, we focus our attention on the association of infection and fertility alteration. Numerous studies have suggested that genetic polymorphisms influencing infection are associated with infertility. We therefore also review the genetic influence on infection and risk of infertility.
Contextual Graph Attention for Answering Logical Queries over Incomplete Knowledge Graphs
Recently, several studies have explored methods for using KG embedding to
answer logical queries. These approaches either treat embedding learning and
query answering as two separate learning tasks, or fail to deal with the
variability of contributions from different query paths. We propose to
leverage a graph attention mechanism to handle the unequal contribution of
different query paths. However, commonly used graph attention assumes that the
center node embedding is provided, which is unavailable in this task since the
center node is to be predicted. To solve this problem we propose a multi-head
attention-based end-to-end logical query answering model, called Contextual
Graph Attention model (CGA), which uses an initial neighborhood aggregation
layer to generate the center embedding, and the whole model is trained jointly
on the original KG structure as well as the sampled query-answer pairs. We also
introduce two new datasets, DB18 and WikiGeo19, which are rather large in size
compared to the existing datasets and contain many more relation types, and use
them to evaluate the performance of the proposed model. Our results show that
the proposed CGA with fewer learnable parameters consistently outperforms the
baseline models on both datasets as well as the Bio dataset. Comment: 8 pages, 3 figures, camera ready version of article accepted to K-CAP
2019, Marina del Rey, California, United State