151 research outputs found
DsMtGCN: A Direction-sensitive Multi-task framework for Knowledge Graph Completion
To address the inherent incompleteness of knowledge graphs (KGs), a number of
knowledge graph completion (KGC) models have been proposed to predict missing
links from known triples. Among them, several works have achieved more
advanced results by exploiting the structural information of KGs with Graph
Convolutional Networks (GCNs). However, we observe that existing GCN-based
models simply average the entity embeddings aggregated from neighbors in
different directions to complete single tasks, ignoring the specific
requirements of the forward and backward sub-tasks. In this paper, we propose a
Direction-sensitive Multi-task GCN (DsMtGCN) to make full use of direction
information: multi-head self-attention is applied to combine embeddings from
different directions specifically for each entity and sub-task, geometric
constraints are imposed to adjust the distribution of embeddings, and the
traditional binary cross-entropy loss is modified to reflect triple
uncertainty. Competitive experimental results on several benchmark datasets
verify the effectiveness of our model.
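The direction-sensitive fusion described above can be illustrated with a minimal sketch. This is not the authors' implementation: the two-view stacking, the identity Q/K/V projections, and the final averaging are all illustrative assumptions standing in for learned parameters.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def combine_directions(fwd, bwd, n_heads=2):
    """Fuse the forward- and backward-aggregated embeddings of one entity
    with multi-head self-attention over the two direction views, instead of
    averaging them.  Identity Q/K/V projections are an illustrative shortcut;
    real models would use learned, sub-task-specific projections."""
    views = [fwd, bwd]                     # the two direction-specific views
    d = len(fwd)
    dh = d // n_heads
    fused = []
    for h in range(n_heads):
        lo, hi = h * dh, (h + 1) * dh
        qs = [v[lo:hi] for v in views]     # Q = K = V = slice (assumption)
        out_views = []
        for q in qs:
            w = softmax([dot(q, k) / math.sqrt(dh) for k in qs])
            out = [sum(wi * k[i] for wi, k in zip(w, qs)) for i in range(dh)]
            out_views.append(out)
        # average the two attended views into this head's output
        fused.extend([(a + b) / 2 for a, b in zip(*out_views)])
    return fused
```

When the two direction views disagree, the attention weights (rather than a fixed 1/2 average) decide how much each view contributes per head.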
Contextual Dictionary Lookup for Knowledge Graph Completion
Knowledge graph completion (KGC) aims to resolve the incompleteness of
knowledge graphs (KGs) by predicting missing links from known triples, and
numerous knowledge graph embedding (KGE) models have been proposed to perform
KGC by learning embeddings. Nevertheless, most existing embedding models map
each relation into a unique vector, overlooking its specific fine-grained
semantics under different entities. Additionally, the few available
fine-grained semantic models rely on clustering algorithms, resulting in
limited performance and applicability due to their cumbersome two-stage
training process. In this paper, we present a novel method utilizing
contextual dictionary lookup, enabling conventional embedding models to learn
the fine-grained semantics of relations in an end-to-end manner. More
specifically, we represent each relation using a dictionary that contains
multiple latent semantics. The composition of a given entity and the
dictionary's central semantics serves as the context for generating a lookup,
thus determining the fine-grained semantics of the relation adaptively. The
proposed loss function optimizes both the central and fine-grained semantics
simultaneously to ensure their semantic consistency. Besides, we introduce two
metrics to assess the validity and accuracy of the dictionary lookup
operation. We extend several KGE models with the method, achieving substantial
performance improvements on widely used benchmark datasets.
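The lookup mechanism can be sketched as follows. This is a simplified reading of the abstract, not the paper's method: the additive entity-plus-central composition and the dot-product scoring are illustrative assumptions.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def contextual_relation(entity, rel_dict):
    """rel_dict holds k latent semantic vectors for one relation.  The
    context -- here simply entity + central semantics, an illustrative
    composition -- scores each latent sense, and the softmax-weighted sum
    is the fine-grained relation embedding selected for this entity."""
    d = len(entity)
    k = len(rel_dict)
    central = [sum(v[i] for v in rel_dict) / k for i in range(d)]
    context = [e + c for e, c in zip(entity, central)]
    scores = [sum(c * v[i] for i, c in enumerate(context)) for v in rel_dict]
    weights = softmax(scores)
    return [sum(w * v[i] for w, v in zip(weights, rel_dict)) for i in range(d)]
```

Because the weights depend on the entity, the same relation resolves to different fine-grained embeddings under different entities, with no separate clustering stage.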
Towards an Understanding of Large Language Models in Software Engineering Tasks
Large Language Models (LLMs) have drawn widespread attention and research due
to their astounding performance in tasks such as text generation and reasoning.
Derivative products, like ChatGPT, have been extensively deployed and highly
sought after. Meanwhile, the evaluation and optimization of LLMs in software
engineering tasks, such as code generation, have become a research focus.
However, there is still a lack of systematic research on the application and
evaluation of LLMs in the field of software engineering. Therefore, this paper
is the first to comprehensively investigate and collate the research and
products combining LLMs with software engineering, aiming to answer two
questions: (1) What are the current integrations of LLMs with software
engineering? (2) Can LLMs effectively handle software engineering tasks? To
find the answers, we have collected related literature as extensively as
possible from seven mainstream databases, and selected 123 papers for analysis.
We have categorized these papers in detail and reviewed the current research
status of LLMs from the perspective of seven major software engineering tasks,
hoping this will help researchers better grasp the research trends and address
the issues when applying LLMs. Meanwhile, we have also organized and presented
papers with evaluation content to reveal the performance and effectiveness of
LLMs in various software engineering tasks, providing guidance for researchers
and developers seeking to optimize them.
When ChatGPT Meets Smart Contract Vulnerability Detection: How Far Are We?
With the development of blockchain technology, smart contracts have become an
important component of blockchain applications. Despite their crucial role, the
development of smart contracts may introduce vulnerabilities and potentially
lead to severe consequences, such as financial losses. Meanwhile, large
language models, represented by ChatGPT, have gained great attention,
showcasing strong capabilities in code analysis tasks. In this paper, we
present an empirical study to investigate the performance of ChatGPT in
identifying smart contract vulnerabilities. Initially, we evaluated ChatGPT's
effectiveness using a publicly available smart contract dataset. Our findings
show that while ChatGPT achieves a high recall rate, its precision in
pinpointing smart contract vulnerabilities is limited. Furthermore, ChatGPT's
performance varies when detecting different vulnerability types. We delved into
the root causes of the false positives generated by ChatGPT and categorized
them into four groups. Second, by comparing ChatGPT with other state-of-the-art
smart contract vulnerability detection tools, we found that ChatGPT's F-score
is lower than others for 3 out of the 7 vulnerabilities. In the case of the
remaining 4 vulnerabilities, ChatGPT exhibits a slight advantage over these
tools. Finally, we analyzed the limitations of ChatGPT in smart contract
vulnerability detection, revealing that its robustness in this field needs to
be improved in two respects: its uncertainty in answering questions, and the
limited length of code it can analyze. In general, our research provides
insights into the strengths and weaknesses of employing large language models,
specifically ChatGPT, for the detection of smart contract vulnerabilities.
A Novel Two-Layer DAG-based Reactive Protocol for IoT Data Reliability in Metaverse
Many applications, e.g., digital twins, rely on sensing data from Internet of
Things (IoT) networks, which is used to infer event(s) and initiate actions to
affect an environment. This gives rise to concerns relating to data integrity
and provenance. One possible solution to address these concerns is to employ
blockchain. However, blockchain has high resource requirements, thereby making
it unsuitable for use on resource-constrained IoT devices. To this end, this
paper proposes a novel approach, called two-layer directed acyclic graph
(2LDAG), whereby IoT devices only store a digital fingerprint of data generated
by their neighbors. Further, it proposes a novel proof-of-path (PoP) protocol
that allows an operator or digital twin to verify data in an on-demand manner.
The simulation results show that 2LDAG has storage and communication costs
that are, respectively, two and three orders of magnitude lower than both
traditional blockchains and blockchains that use a DAG structure. Moreover,
2LDAG achieves consensus even when 49% of nodes are malicious.
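The core storage idea of 2LDAG, as described above, is that devices keep only digests of neighbors' data, which an operator can later check on demand. The sketch below captures that idea only; the class structure, names, and acceptance rule are illustrative assumptions, not the paper's proof-of-path protocol.

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """Digest stored in place of the raw data (the space saving in 2LDAG)."""
    return hashlib.sha256(data).hexdigest()

class Device:
    """IoT node that keeps only fingerprints of its neighbors' data,
    never the data itself (structure is illustrative)."""
    def __init__(self, name):
        self.name = name
        self.neighbor_fps = {}          # neighbor id -> set of fingerprints

    def witness(self, neighbor, data: bytes):
        self.neighbor_fps.setdefault(neighbor, set()).add(fingerprint(data))

def verify_on_demand(claimed_data: bytes, origin, devices):
    """Sketch of an on-demand, proof-of-path style check: the claimed data
    is accepted only if every device that witnessed traffic from `origin`
    holds a matching fingerprint."""
    fp = fingerprint(claimed_data)
    witnesses = [d for d in devices if origin in d.neighbor_fps]
    return bool(witnesses) and all(fp in d.neighbor_fps[origin] for d in witnesses)
```

Any tampering with the claimed data changes its digest, so verification fails without any device ever having stored the original payload.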
High serum mannose in colorectal cancer: a novel biomarker of lymph node metastasis and poor prognosis
Background: Lymph node status is an important prognostic indicator and significantly influences treatment decisions for colorectal cancer (CRC). The objective of this study was to evaluate the ability of serum monosaccharides to predict lymph node metastasis (LNM) and prognosis.
Methods: High-performance anion-exchange chromatography coupled with pulsed amperometric detection (HPAEC-PAD) was used to quantify serum monosaccharides from 252 CRC patients. Receiver operating characteristic (ROC) curves were used to evaluate the predictive performance of parameters. Predictors of LNM were evaluated by univariate and multivariate analyses. The prognostic role of the factors was evaluated by survival analysis.
Results: The levels of serum mannose (Man) and galactose (Gal) were significantly increased in patients with LNM (p < 0.0001 and p = 0.0017, respectively). The area under the curve (AUC) for Man was 0.8140, which was higher than that for carcinoembryonic antigen (CEA) (AUC = 0.6523). Univariate and multivariate analyses demonstrated histologic grade (G3) (odds ratio [OR] = 2.60, p = 0.043), histologic grade (mucin-producing subtype) (OR = 3.38, p = 0.032), lymphovascular invasion (LVI) (OR = 2.42, p < 0.01), CEA (> 5 ng/ml) (OR = 1.85, p = 0.042), and high Man (OR = 2.65, p = 0.006) to be independent risk factors for LNM. The survival analysis showed that high serum Man was an independent risk factor for poor prognosis in CRC patients (HR = 1.75, p = 0.004).
Conclusions: Man is superior to CEA in predicting LNM in CRC patients, and Man is expected to be a predictor of LNM in CRC. High serum Man is associated with poor prognosis in CRC patients.
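An AUC such as the 0.8140 reported for mannose can be read as the probability that a randomly chosen LNM-positive patient has a higher marker level than a randomly chosen LNM-negative one. A minimal sketch of that computation (the toy values below are hypothetical, not the study's data):

```python
def auc(pos_scores, neg_scores):
    """Mann-Whitney form of ROC AUC: the fraction of positive/negative
    pairs ranked correctly, counting ties as half a win."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            wins += 1.0 if p > n else (0.5 if p == n else 0.0)
    return wins / (len(pos_scores) * len(neg_scores))
```

An AUC of 1.0 means the marker separates the groups perfectly; 0.5 means it carries no discriminative information.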
Phased Treatment Strategies for Cerebral Ischemia Based on Glutamate Receptors
Extracellular glutamate accumulation following cerebral ischemia leads to overactivation of glutamate receptors, thereby resulting in intracellular Ca2+ overload and excitotoxic neuronal injury. Multiple attempts have been made to counteract such effects by reducing glutamate receptor function, but none have been successful. In this minireview, we present the available evidence regarding the role of all types of ionotropic and metabotropic glutamate receptors in cerebral ischemia and propose phased treatment strategies based on glutamate receptors in both the acute and post-acute phases of cerebral ischemia, which may help realize the clinical application of glutamate receptor antagonists.
What's Behind the Mask: Understanding Masked Graph Modeling for Graph Autoencoders
Recent years have witnessed the emergence of a promising self-supervised
learning strategy referred to as masked autoencoding. However, there is a lack
of theoretical understanding of how masking matters for graph autoencoders
(GAEs). In this work, we present masked graph autoencoder (MaskGAE), a
self-supervised learning framework for graph-structured data. Different from
standard GAEs, MaskGAE adopts masked graph modeling (MGM) as a principled
pretext task - masking a portion of edges and attempting to reconstruct the
missing part with partially visible, unmasked graph structure. To understand
whether MGM can help GAEs learn better representations, we provide both
theoretical and empirical evidence to comprehensively justify the benefits of
this pretext task. Theoretically, we establish close connections between GAEs
and contrastive learning, showing that MGM significantly improves the
self-supervised learning scheme of GAEs. Empirically, we conduct extensive
experiments on a variety of graph benchmarks, demonstrating the superiority of
MaskGAE over several state-of-the-art methods on both link prediction and node
classification tasks.
Comment: KDD 2023 research track. Code available at
https://github.com/EdisonLeeeee/MaskGA
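The MGM pretext task described above splits the edge set so the encoder sees only the visible part and is trained to reconstruct the masked part. A minimal sketch of that split (uniform random masking is one possible strategy; MaskGAE itself also studies structured masking, which is not shown here):

```python
import random

def mask_edges(edges, mask_ratio=0.3, seed=0):
    """Partition an edge list into (visible, masked): the autoencoder is
    given only the visible edges and must reconstruct the masked ones."""
    rng = random.Random(seed)
    shuffled = list(edges)
    rng.shuffle(shuffled)
    k = int(len(shuffled) * mask_ratio)
    return shuffled[k:], shuffled[:k]      # (visible, masked)
```

In training, the visible edges define the message-passing graph while the masked edges serve as positive reconstruction targets.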
Language Agents for Detecting Implicit Stereotypes in Text-to-image Models at Scale
The recent surge in the research of diffusion models has accelerated the
adoption of text-to-image models in various Artificial Intelligence Generated
Content (AIGC) commercial products. While these exceptional AIGC products are
gaining increasing recognition and sparking enthusiasm among consumers, the
questions regarding whether, when, and how these models might unintentionally
reinforce existing societal stereotypes remain largely unaddressed. Motivated
by recent advancements in language agents, here we introduce a novel agent
architecture tailored for stereotype detection in text-to-image models. This
versatile agent architecture is capable of accommodating free-form detection
tasks and can autonomously invoke various tools to facilitate the entire
process, from generating corresponding instructions and images, to detecting
stereotypes. We build the stereotype-relevant benchmark based on multiple
open-text datasets, and apply this architecture to commercial products and
popular open source text-to-image models. We find that these models often
display serious stereotypes when it comes to certain prompts about personal
characteristics, sociocultural context, and crime-related aspects. In summary,
these empirical findings underscore the pervasive existence of stereotypes
across social dimensions, including gender, race, and religion, which not only
validate the effectiveness of our proposed approach, but also emphasize the
critical necessity of addressing potential ethical risks in the burgeoning
realm of AIGC. As AIGC continues its rapid expansion, with new models and
plugins emerging daily in staggering numbers, the challenge lies in the timely
detection and mitigation of potential biases within these models.
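The end-to-end flow the agent automates, from instruction generation through image generation to detection, can be sketched as a plain tool-invocation loop. The tool names and signatures below are hypothetical placeholders, not the paper's API:

```python
def run_stereotype_detection(topic, tools):
    """Hypothetical agent loop: each stage's output feeds the next tool."""
    instructions = tools["gen_instructions"](topic)          # prompts to probe
    images = [tools["gen_image"](p) for p in instructions]   # model under test
    return [tools["detect"](img) for img in images]          # per-image verdict
```

In the actual architecture the agent decides autonomously which tool to invoke next; the fixed three-stage sequence here is a simplification for illustration.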
- …