320 research outputs found

Multi-task Neural Network for Non-discrete Attribute Prediction in Knowledge Graphs

Author: Bollacker Kurt D.
Diederik
Guo Shu
Gupta Abhijeet
Hoffart Johannes
Jenatton Rodolphe
Ji Guoliang
Li Ya
Lin Yankai
Liu Pengfei
Luu Anh Tuan
Mikolov Tomas
Nickel Maximilian
Qiu Xipeng
Tay Yi
Wang Zhen
Xiao Han
Xie Ruobing
Zhong Huaping
Publication date: 16/08/2017

Many popular knowledge graphs such as Freebase, YAGO or DBPedia maintain a list of non-discrete attributes for each entity. Intuitively, these attributes, such as height, price or population count, are able to richly characterize entities in knowledge graphs. This additional source of information may help to alleviate the inherent sparsity and incompleteness problems that are prevalent in knowledge graphs. Unfortunately, many state-of-the-art relational learning models ignore this information due to the challenging nature of dealing with non-discrete data types in the inherently binary-natured knowledge graphs. In this paper, we propose a novel multi-task neural network approach for both encoding and prediction of non-discrete attribute information in a relational setting. Specifically, we train a neural network for triplet prediction along with a separate network for attribute value regression. Via multi-task learning, we are able to learn representations of entities, relations and attributes that encode information about both tasks. Moreover, such attributes are central to many predictive tasks not only as an information source but also as a prediction target. Therefore, models that are able to encode, incorporate and predict such information in a relational learning context are highly attractive as well. We show that our approach outperforms many state-of-the-art methods for the tasks of relational triplet classification and attribute value prediction. Comment: Accepted at CIKM 201
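
The multi-task idea in this abstract (one loss for triplet prediction, one for attribute value regression, both reading the same entity representation) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the scorer, the regressor, the `weight` parameter and the toy vectors are hypothetical stand-ins, not the networks used in the paper.

```python
import math

def triplet_score(head, relation, tail):
    """Hypothetical triplet scorer: sigmoid of a sum of elementwise products."""
    s = sum(h * r * t for h, r, t in zip(head, relation, tail))
    return 1.0 / (1.0 + math.exp(-s))

def attribute_value(entity, attribute):
    """Hypothetical attribute regressor: dot product of entity and attribute vectors."""
    return sum(e * a for e, a in zip(entity, attribute))

def multitask_loss(triplet, label, attr_example, target, weight=0.5):
    """Binary cross-entropy on the triplet plus weighted squared error on the attribute."""
    head, relation, tail = triplet
    entity, attribute = attr_example
    p = triplet_score(head, relation, tail)
    eps = 1e-12  # guards against log(0)
    bce = -(label * math.log(p + eps) + (1 - label) * math.log(1 - p + eps))
    mse = (attribute_value(entity, attribute) - target) ** 2
    return bce + weight * mse

# The same entity vector feeds both tasks, so a gradient step on this loss
# would push information from both tasks into one representation.
entity = [0.5, -0.2, 0.1]
relation = [1.0, 1.0, 1.0]
tail = [0.4, 0.3, -0.1]
height_attr = [2.0, 0.0, 1.0]
loss = multitask_loss((entity, relation, tail), 1, (entity, height_attr), 1.0)
```

The single weighted sum is only one way to combine the two objectives; alternating updates between tasks would be another.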

arXiv.org e-Print Archive

Crossref

Relational Collaborative Filtering: Modeling Multiple Item Relations for Recommendation

Author: Ai Qingyao
He Xiangnan
Kang Wang-Cheng
Lee Joonseok
Lin Yankai
Rendle Steffen
Xu Kelvin
Publication date: 01/01/2019

Existing item-based collaborative filtering (ICF) methods leverage only the relation of collaborative similarity. Nevertheless, there exist multiple relations between items in real-world scenarios. Distinct from the collaborative similarity that implies co-interaction patterns from the user perspective, these relations reveal fine-grained knowledge on items from different perspectives of meta-data, functionality, etc. However, how to incorporate multiple item relations is less explored in recommendation research. In this work, we propose Relational Collaborative Filtering (RCF), a general framework to exploit multiple relations between items in recommender systems. We find that both the relation type and the relation value are crucial in inferring user preference. To this end, we develop a two-level hierarchical attention mechanism to model user preference. The first-level attention discriminates which types of relations are more important, and the second-level attention considers the specific relation values to estimate the contribution of a historical item in recommending the target item. To make the item embeddings reflect the relational structure between items, we further formulate a task to preserve the item relations, and jointly train it with the recommendation task of preference modeling. Empirical results on two real datasets demonstrate the strong performance of RCF. Furthermore, we also conduct qualitative analyses to show the benefits of explanations brought by the modeling of multiple item relations.
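
The two-level attention described in this abstract can be sketched roughly as follows: a first softmax weighs relation types, and a second softmax within each type weighs the historical items linked through that relation. The raw scores, the relation names and the item names are hypothetical inputs chosen for illustration; how RCF actually computes the scores is not stated in the abstract.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a non-empty list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def hierarchical_attention(type_scores, value_scores):
    """Combine type-level and value-level attention into one weight per historical item.

    type_scores:  {relation_type: raw score}
    value_scores: {relation_type: {item: raw score}}, each inner dict non-empty
    """
    types = list(type_scores)
    type_w = dict(zip(types, softmax([type_scores[t] for t in types])))
    contribution = {}
    for t in types:
        items = list(value_scores[t])
        item_w = softmax([value_scores[t][i] for i in items])
        for i, w in zip(items, item_w):
            # An item's weight is its within-type weight scaled by the type's weight.
            contribution[i] = contribution.get(i, 0.0) + type_w[t] * w
    return contribution

weights = hierarchical_attention(
    {"same_genre": 2.0, "same_director": 0.5},
    {"same_genre": {"movie_a": 1.0, "movie_b": 0.0},
     "same_director": {"movie_c": 0.3}},
)
```

Because each level is a softmax, the final item weights sum to one, which keeps the contribution of each historical item easy to read off.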

arXiv.org e-Print Archive

Crossref

Enlighten

Deep Structural Knowledge Exploitation and Synergy for Estimating Node Importance Value on Heterogeneous Information Networks

Author: Cao Xin
Chen Yankai
Fang Yixiang
King Irwin
Wang Qiongyan
Publication date: 18/02/2024

The node importance estimation problem has conventionally been studied with homogeneous network topology analysis. To deal with network heterogeneity, a few recent methods employ graph neural models to automatically learn diverse sources of information. However, a major concern is that their fully adaptive learning process may lead to insufficient information exploration, thereby formulating the problem as isolated node value prediction with underperformance and less interpretability. In this work, we propose a novel learning framework: SKES. Different from previous automatic learning designs, SKES exploits heterogeneous structural knowledge to enrich the informativeness of node representations. Based on a sufficiently uninformative reference, SKES estimates the importance value for any input node by quantifying its disparity against the reference. This establishes an interpretable node importance computation paradigm. Furthermore, SKES dives deep into the understanding that "nodes with similar characteristics are prone to have similar importance values" whilst guaranteeing that such informativeness disparity between any different nodes is orderly reflected by the embedding distance of their associated latent features. Extensive experiments on three widely-evaluated benchmarks demonstrate the performance superiority of SKES over several recent competing methods. Comment: Accepted by AAAI 202

arXiv.org e-Print Archive

USimAgent: Large Language Models for Simulating Search Users

Author: Gong Peiyuan
Lin Yankai
Mao Jiaxin
Wang Xingzhu
Zhang Erhan
Publication date: 14/03/2024

Due to its advantages in cost-efficiency and reproducibility, user simulation has become a promising solution to the user-centric evaluation of information retrieval systems. Nonetheless, accurately simulating user search behaviors has long been a challenge, because users' actions in search are highly complex and driven by intricate cognitive processes such as learning, reasoning, and planning. Recently, Large Language Models (LLMs) have demonstrated remarkable potential in simulating human-level intelligence and have been used in building autonomous agents for various tasks. However, the potential of using LLMs in simulating search behaviors has not yet been fully explored. In this paper, we introduce an LLM-based user search behavior simulator, USimAgent. The proposed simulator can simulate users' querying, clicking, and stopping behaviors during search, and thus is capable of generating complete search sessions for specific search tasks. Empirical investigation on a real user behavior dataset shows that the proposed simulator outperforms existing methods in query generation and is comparable to traditional methods in predicting user clicks and stopping behaviors. These results not only validate the effectiveness of using LLMs for user simulation but also shed light on the development of more robust and generic user simulators.

arXiv.org e-Print Archive

Systematic Analysis of Impact of Sampling Regions and Storage Methods on Fecal Gut Microbiome and Metabolome Profiles.

Author: Chang Hang
Chen Minjian
Dong Tianyu
Hang Bo
He Lianping
Liang Yali
Liu Xingyin
Mao Jian-Hua
Snijders Antoine M
Wang Tingzhang
Xia Yankai
Publication venue: eScholarship, University of California
Publication date: 01/01/2020

The contribution of human gastrointestinal (GI) microbiota and metabolites to host health has recently become much clearer. However, many confounding factors can influence the accuracy of gut microbiome and metabolome studies, resulting in inconsistencies in published results. In this study, we systematically investigated the effects of fecal sampling regions and storage and retrieval conditions on gut microbiome and metabolite profiles from three healthy children. Our analysis indicated that compared to homogenized and snap-frozen samples (standard control [SC]), different sampling regions did not affect microbial community alpha diversity, while a total of 22 of 176 identified metabolites varied significantly across different sampling regions. In contrast, storage conditions significantly influenced the microbiome and metabolome. Short-term room temperature storage had a minimal effect on the microbiome and metabolome profiles. Sample storage in RNALater showed a significant level of variation in both microbiome and metabolome profiles, independent of the storage or retrieval conditions. The effect of RNALater on the metabolome was stronger than the effect on the microbiome, and individual variability between study participants outweighed the effect of RNALater on the microbiome. We conclude that homogenizing stool samples was critical for metabolomic analysis but not necessary for microbiome analysis. Short-term room temperature storage had a minimal effect on the microbiome and metabolome profiles and is recommended for short-term fecal sample storage. In addition, our study indicates that the use of RNALater as a storage medium of stool samples for microbial and metabolomic analyses is not recommended.
IMPORTANCE: The gastrointestinal microbiome and metabolome can provide a new angle to understand the development of health and disease. Stool samples are most frequently used for large-scale cohort studies. Standardized procedures for stool sample handling and storage can be a determining factor for performing microbiome or metabolome studies. In this study, we focused on the effects of stool sampling regions and stool sample storage conditions on variations in the gut microbiome composition and metabolome profile.

Directory of Open Access Journals

eScholarship - University of California

Interaction Embeddings for Prediction and Explanation in Knowledge Graphs

Author: Bordes Antoine
Glorot Xavier
Ji Guoliang
Lin Yankai
Liu Hanxiao
Nickel Maximilian
Shi Baoxu
Srivastava Nitish
Trouillon Théo
Wang Quan
Wang Zhen
Xie Ruobing
Yang Bishan
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 15/02/2019

Knowledge graph embedding aims to learn distributed representations for entities and relations, and has proven to be effective in many applications. Crossover interactions --- bi-directional effects between entities and relations --- help select related information when predicting a new triple, but have not been formally discussed before. In this paper, we propose CrossE, a novel knowledge graph embedding which explicitly simulates crossover interactions. It not only learns one general embedding for each entity and relation as most previous methods do, but also generates multiple triple-specific embeddings for both of them, named interaction embeddings. We evaluate embeddings on typical link prediction tasks and find that CrossE achieves state-of-the-art results on complex and more challenging datasets. Furthermore, we evaluate embeddings from a new perspective --- giving explanations for predicted triples, which is important for real applications. In this work, an explanation for a triple is regarded as a reliable closed-path between the head and the tail entity. Compared to other baselines, we show experimentally that CrossE, benefiting from interaction embeddings, is more capable of generating reliable explanations to support its predictions. Comment: This paper is accepted by WSDM201
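
As a rough illustration of what "triple-specific interaction embeddings" could look like, the sketch below derives a relation-specific view of the head entity and a head-specific view of the relation before scoring a candidate tail. The abstract does not give the formulation, so the elementwise products, the `tanh` combination, the `interaction_vec` parameter and all toy vectors are assumptions for illustration only.

```python
import math

def hadamard(a, b):
    """Elementwise product of two equal-length vectors."""
    return [x * y for x, y in zip(a, b)]

def interaction_score(head, relation, tail, interaction_vec):
    """Score a triple using triple-specific (interaction) embeddings.

    interaction_vec is a hypothetical per-relation vector that modulates the head.
    """
    head_i = hadamard(interaction_vec, head)   # head as seen under this relation
    rel_i = hadamard(head_i, relation)         # relation as seen with this head
    combined = [math.tanh(h + r) for h, r in zip(head_i, rel_i)]
    s = sum(c * t for c, t in zip(combined, tail))
    return 1.0 / (1.0 + math.exp(-s))          # squash to (0, 1)

head, relation, tail = [0.5, -0.3], [0.2, 0.4], [0.7, 0.1]
# The same general embeddings yield different scores under different interaction vectors.
score_a = interaction_score(head, relation, tail, [1.0, 1.0])
score_b = interaction_score(head, relation, tail, [-1.0, 1.0])
```

The point of the sketch is only that the general embeddings stay fixed while the interaction vector changes which parts of them contribute to a given prediction.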

arXiv.org e-Print Archive

Crossref

ZORA

Interactions between Exposure to Environmental Polycyclic Aromatic Hydrocarbons and DNA Repair Gene Polymorphisms on Bulky DNA Adducts in Human Sperm

Author: Aihua Gu
Guixiang Ji
Hongmei Wang
Ling Song
Shoulin Wang
Xiangguo Shi
Xinru Wang
Yan Long
Yankai Xia
Yong Zhou
Publication venue: Public Library of Science
Publication date: 05/10/2010

BACKGROUND: Nucleotide excision repair (NER) and base excision repair (BER) are the primary mechanisms for repair of bulky adducts caused by chemical agents, such as PAHs. It is expected that polymorphisms in NER or BER genes may modulate individual susceptibility to PAHs exposure. Here, we evaluate the effects of PAHs exposure and polymorphisms in NER and BER pathways, alone or combined, on polycyclic aromatic hydrocarbon-DNA (PAH-DNA) adducts in human sperm. METHODOLOGY/PRINCIPAL FINDINGS: Sperm PAH-DNA adducts were measured by immunofluorescent assay using flow cytometry in a sample of 465 infertile adults. Polymorphisms of XPA, XPD, ERCC1, XPF, and XRCC1 were determined by polymerase chain reaction (PCR) and restriction fragment length polymorphism (RFLP) techniques. The PAHs exposure was detected as urinary 1-hydroxypyrene (1-OHP) levels. In multivariate models adjusted for potential confounders, we observed that XRCC1 5'pUTR -T/C, Arg194Trp, Arg399Gln polymorphisms were associated with increased sperm adduct levels. Furthermore, the stratified analysis indicated that adverse effects of XRCC1 Arg194Trp, Arg399Gln polymorphisms on PAH-DNA adducts were detected only in the high PAHs exposure group. CONCLUSIONS/SIGNIFICANCE: These findings provided the first evidence that polymorphisms of XRCC1 may modify sperm PAH-DNA adduct levels and may be useful biomarkers to identify individuals susceptible to DNA damage resulting from PAHs exposure.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Infection and Infertility

Author: Chen Yiqiu
Gu Hao
Sha Jiahao
Tang Qiuqin
Wang Xinru
Wu Wei
Xia Yankai
Publication venue: 'IntechOpen'
Publication date: 29/06/2016

Infection is a multifactorial process, which can be induced by a virus, bacterium, or parasite. It may cause many diseases, including obesity, cancer, and infertility. In this chapter, we focus our attention on the association of infection and fertility alteration. Numerous studies have suggested that genetic polymorphisms influencing infection are associated with infertility. So we also review the genetic influence on infection and risk of infertility

IntechOpen

Contextual Graph Attention for Answering Logical Queries over Incomplete Knowledge Graphs

Author: Bahdanau Dzmitry
Kipf Thomas N
Liang Chen
Lin Yankai
Mai Gengchen
Veličković Petar
Wang Meng
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 23/09/2019

Recently, several studies have explored methods for using KG embedding to answer logical queries. These approaches either treat embedding learning and query answering as two separate learning tasks, or fail to deal with the variability of contributions from different query paths. We propose to leverage a graph attention mechanism to handle the unequal contribution of different query paths. However, commonly used graph attention assumes that the center node embedding is provided, which is unavailable in this task since the center node is to be predicted. To solve this problem we propose a multi-head attention-based end-to-end logical query answering model, called Contextual Graph Attention model (CGA), which uses an initial neighborhood aggregation layer to generate the center embedding, and the whole model is trained jointly on the original KG structure as well as the sampled query-answer pairs. We also introduce two new datasets, DB18 and WikiGeo19, which are rather large in size compared to the existing datasets and contain many more relation types, and use them to evaluate the performance of the proposed model. Our results show that the proposed CGA, with fewer learnable parameters, consistently outperforms the baseline models on both datasets as well as on the Bio dataset. Comment: 8 pages, 3 figures, camera ready version of article accepted to K-CAP 2019, Marina del Rey, California, United State
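
The core trick described in this abstract (the center node is unknown, so an initial aggregation of the query paths stands in for it, and attention is then computed against that stand-in) can be sketched as below. This is a single-head simplification under stated assumptions: mean pooling as the aggregation layer, dot-product attention scores and the toy path embeddings are all hypothetical choices, not details taken from the abstract.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a non-empty list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def contextual_attention(path_embeddings):
    """Predict a center embedding from query-path embeddings (single head).

    Step 1: mean-pool the paths to get an initial center estimate.
    Step 2: attend over the paths using that estimate as the query.
    """
    n = len(path_embeddings)
    dim = len(path_embeddings[0])
    center = [sum(p[d] for p in path_embeddings) / n for d in range(dim)]
    weights = softmax([dot(center, p) for p in path_embeddings])
    refined = [sum(w * p[d] for w, p in zip(weights, path_embeddings))
               for d in range(dim)]
    return refined, weights

paths = [[1.0, 0.0], [0.8, 0.2], [-1.0, 0.0]]
refined, weights = contextual_attention(paths)
```

Paths that agree with the pooled estimate receive larger weights, so an outlying path such as the third one above contributes less to the refined center than it would under plain averaging.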

arXiv.org e-Print Archive

Crossref

Explore Bristol Research