151 research outputs found
DsMtGCN: A Direction-sensitive Multi-task framework for Knowledge Graph Completion
To address the inherent incompleteness of knowledge graphs (KGs), a number of
knowledge graph completion (KGC) models have been proposed to predict missing
links from known triples. Among them, several works have achieved more
advanced results by exploiting the structural information of KGs with Graph
Convolutional Networks (GCNs). However, we observe that existing GCN-based
models simply average the entity embeddings aggregated from neighbors in
different directions to complete single tasks, ignoring the specific
requirements of the forward and backward sub-tasks. In this paper, we propose a
Direction-sensitive Multi-task GCN (DsMtGCN) to make full use of direction
information: multi-head self-attention is applied to combine embeddings from
different directions specifically for each entity and sub-task, geometric
constraints are imposed to adjust the distribution of embeddings, and the
traditional binary cross-entropy loss is modified to reflect triple
uncertainty. Competitive experimental results on several benchmark datasets
verify the effectiveness of our model.
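The direction-sensitive fusion described above can be illustrated with a minimal sketch. This is not the authors' implementation: the two-view stacking, the identity Q/K/V projections, and the final averaging are all illustrative assumptions standing in for learned parameters.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def combine_directions(fwd, bwd, n_heads=2):
    """Fuse the forward- and backward-aggregated embeddings of one entity
    with multi-head self-attention over the two direction views, instead of
    averaging them.  Identity Q/K/V projections are an illustrative shortcut;
    real models would use learned, sub-task-specific projections."""
    views = [fwd, bwd]                     # the two direction-specific views
    d = len(fwd)
    dh = d // n_heads
    fused = []
    for h in range(n_heads):
        lo, hi = h * dh, (h + 1) * dh
        qs = [v[lo:hi] for v in views]     # Q = K = V = slice (assumption)
        out_views = []
        for q in qs:
            w = softmax([dot(q, k) / math.sqrt(dh) for k in qs])
            out = [sum(wi * k[i] for wi, k in zip(w, qs)) for i in range(dh)]
            out_views.append(out)
        # average the two attended views into this head's output
        fused.extend([(a + b) / 2 for a, b in zip(*out_views)])
    return fused
```

When the two direction views disagree, the attention weights (rather than a fixed 1/2 average) decide how much each view contributes per head.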
Contextual Dictionary Lookup for Knowledge Graph Completion
Knowledge graph completion (KGC) aims to resolve the incompleteness of
knowledge graphs (KGs) by predicting missing links from known triples, and
numerous knowledge graph embedding (KGE) models have been proposed to perform
KGC by learning embeddings. Nevertheless, most existing embedding models map
each relation into a unique vector, overlooking its specific fine-grained
semantics under different entities. Additionally, the few available
fine-grained semantic models rely on clustering algorithms, resulting in
limited performance and applicability due to their cumbersome two-stage
training process. In this paper, we present a novel method utilizing
contextual dictionary lookup, enabling conventional embedding models to learn
the fine-grained semantics of relations in an end-to-end manner. More
specifically, we represent each relation using a dictionary that contains
multiple latent semantics. The composition of a given entity and the
dictionary's central semantics serves as the context for generating a lookup,
thus determining the fine-grained semantics of the relation adaptively. The
proposed loss function optimizes both the central and fine-grained semantics
simultaneously to ensure their semantic consistency. Besides, we introduce two
metrics to assess the validity and accuracy of the dictionary lookup
operation. We extend several KGE models with the method, achieving substantial
performance improvements on widely used benchmark datasets.
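The lookup mechanism can be sketched as follows. This is a simplified reading of the abstract, not the paper's method: the additive entity-plus-central composition and the dot-product scoring are illustrative assumptions.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def contextual_relation(entity, rel_dict):
    """rel_dict holds k latent semantic vectors for one relation.  The
    context -- here simply entity + central semantics, an illustrative
    composition -- scores each latent sense, and the softmax-weighted sum
    is the fine-grained relation embedding selected for this entity."""
    d = len(entity)
    k = len(rel_dict)
    central = [sum(v[i] for v in rel_dict) / k for i in range(d)]
    context = [e + c for e, c in zip(entity, central)]
    scores = [sum(c * v[i] for i, c in enumerate(context)) for v in rel_dict]
    weights = softmax(scores)
    return [sum(w * v[i] for w, v in zip(weights, rel_dict)) for i in range(d)]
```

Because the weights depend on the entity, the same relation resolves to different fine-grained embeddings under different entities, with no separate clustering stage.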
Towards an Understanding of Large Language Models in Software Engineering Tasks
Large Language Models (LLMs) have drawn widespread attention and research due
to their astounding performance in tasks such as text generation and reasoning.
Derivative products, like ChatGPT, have been extensively deployed and highly
sought after. Meanwhile, the evaluation and optimization of LLMs in software
engineering tasks, such as code generation, have become a research focus.
However, there is still a lack of systematic research on the application and
evaluation of LLMs in the field of software engineering. Therefore, this paper
is the first to comprehensively investigate and collate the research and
products combining LLMs with software engineering, aiming to answer two
questions: (1) What are the current integrations of LLMs with software
engineering? (2) Can LLMs effectively handle software engineering tasks? To
find the answers, we have collected related literature as extensively as
possible from seven mainstream databases, and selected 123 papers for analysis.
We have categorized these papers in detail and reviewed the current research
status of LLMs from the perspective of seven major software engineering tasks,
hoping this will help researchers better grasp the research trends and address
the issues when applying LLMs. Meanwhile, we have also organized and presented
papers with evaluation content to reveal the performance and effectiveness of
LLMs in various software engineering tasks, providing guidance for researchers
and developers seeking to optimize them.
When ChatGPT Meets Smart Contract Vulnerability Detection: How Far Are We?
With the development of blockchain technology, smart contracts have become an
important component of blockchain applications. Despite their crucial role, the
development of smart contracts may introduce vulnerabilities and potentially
lead to severe consequences, such as financial losses. Meanwhile, large
language models, represented by ChatGPT, have gained great attention,
showcasing strong capabilities in code analysis tasks. In this paper, we
present an empirical study to investigate the performance of ChatGPT in
identifying smart contract vulnerabilities. Initially, we evaluated ChatGPT's
effectiveness using a publicly available smart contract dataset. Our findings
show that while ChatGPT achieves a high recall rate, its precision in
pinpointing smart contract vulnerabilities is limited. Furthermore, ChatGPT's
performance varies when detecting different vulnerability types. We delved into
the root causes of the false positives generated by ChatGPT and categorized
them into four groups. Second, by comparing ChatGPT with other state-of-the-art
smart contract vulnerability detection tools, we found that ChatGPT's F-score
is lower than others for 3 out of the 7 vulnerabilities. In the case of the
remaining 4 vulnerabilities, ChatGPT exhibits a slight advantage over these
tools. Finally, we analyzed the limitations of ChatGPT in smart contract
vulnerability detection, revealing that its robustness in this field needs to
be improved in two respects: its uncertainty in answering questions, and the
limited length of code it can analyze. In general, our research provides
insights into the strengths and weaknesses of employing large language models,
specifically ChatGPT, for the detection of smart contract vulnerabilities.
A Novel Two-Layer DAG-based Reactive Protocol for IoT Data Reliability in Metaverse
Many applications, e.g., digital twins, rely on sensing data from Internet of
Things (IoT) networks, which is used to infer event(s) and initiate actions to
affect an environment. This gives rise to concerns relating to data integrity
and provenance. One possible solution to address these concerns is to employ
blockchain. However, blockchain has high resource requirements, thereby making
it unsuitable for use on resource-constrained IoT devices. To this end, this
paper proposes a novel approach, called two-layer directed acyclic graph
(2LDAG), whereby IoT devices only store a digital fingerprint of data generated
by their neighbors. Further, it proposes a novel proof-of-path (PoP) protocol
that allows an operator or digital twin to verify data in an on-demand manner.
The simulation results show that 2LDAG has storage and communication costs
that are, respectively, two and three orders of magnitude lower than both
traditional blockchains and blockchains that use a DAG structure. Moreover,
2LDAG achieves consensus even when 49% of nodes are malicious.
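The core storage idea of 2LDAG, as described above, is that devices keep only digests of neighbors' data, which an operator can later check on demand. The sketch below captures that idea only; the class structure, names, and acceptance rule are illustrative assumptions, not the paper's proof-of-path protocol.

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """Digest stored in place of the raw data (the space saving in 2LDAG)."""
    return hashlib.sha256(data).hexdigest()

class Device:
    """IoT node that keeps only fingerprints of its neighbors' data,
    never the data itself (structure is illustrative)."""
    def __init__(self, name):
        self.name = name
        self.neighbor_fps = {}          # neighbor id -> set of fingerprints

    def witness(self, neighbor, data: bytes):
        self.neighbor_fps.setdefault(neighbor, set()).add(fingerprint(data))

def verify_on_demand(claimed_data: bytes, origin, devices):
    """Sketch of an on-demand, proof-of-path style check: the claimed data
    is accepted only if every device that witnessed traffic from `origin`
    holds a matching fingerprint."""
    fp = fingerprint(claimed_data)
    witnesses = [d for d in devices if origin in d.neighbor_fps]
    return bool(witnesses) and all(fp in d.neighbor_fps[origin] for d in witnesses)
```

Any tampering with the claimed data changes its digest, so verification fails without any device ever having stored the original payload.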
High serum mannose in colorectal cancer: a novel biomarker of lymph node metastasis and poor prognosis
Background: Lymph node status is an important prognostic indicator and significantly influences treatment decisions for colorectal cancer (CRC). The objective of this study was to evaluate the ability of serum monosaccharides to predict lymph node metastasis (LNM) and prognosis.
Methods: High-performance anion-exchange chromatography coupled with pulsed amperometric detection (HPAEC-PAD) was used to quantify serum monosaccharides from 252 CRC patients. Receiver operating characteristic (ROC) curves were used to evaluate the predictive performance of parameters. Predictors of LNM were evaluated by univariate and multivariate analyses. The prognostic role of the factors was evaluated by survival analysis.
Results: The levels of serum mannose (Man) and galactose (Gal) were significantly increased in patients with LNM (p < 0.0001 and p = 0.0017, respectively). The area under the curve (AUC) for Man was 0.8140, which was higher than that for carcinoembryonic antigen (CEA) (AUC = 0.6523). Univariate and multivariate analyses demonstrated histologic grade (G3) (odds ratio [OR] = 2.60, p = 0.043), histologic grade (mucin-producing subtype) (OR = 3.38, p = 0.032), lymphovascular invasion (LVI) (OR = 2.42, p < 0.01), CEA (> 5 ng/ml) (OR = 1.85, p = 0.042), and high Man (OR = 2.65, p = 0.006) to be independent risk factors for LNM. The survival analysis showed that high serum Man was an independent risk factor for poor prognosis in CRC patients (HR = 1.75, p = 0.004).
Conclusions: Man is superior to CEA in predicting LNM in CRC patients, and Man is expected to be a predictor of LNM in CRC. High serum Man is associated with poor prognosis in CRC patients.
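An AUC such as the 0.8140 reported for mannose can be read as the probability that a randomly chosen LNM-positive patient has a higher marker level than a randomly chosen LNM-negative one. A minimal sketch of that computation (the toy values below are hypothetical, not the study's data):

```python
def auc(pos_scores, neg_scores):
    """Mann-Whitney form of ROC AUC: the fraction of positive/negative
    pairs ranked correctly, counting ties as half a win."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            wins += 1.0 if p > n else (0.5 if p == n else 0.0)
    return wins / (len(pos_scores) * len(neg_scores))
```

An AUC of 1.0 means the marker separates the groups perfectly; 0.5 means it carries no discriminative information.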
Phased Treatment Strategies for Cerebral Ischemia Based on Glutamate Receptors
Extracellular glutamate accumulation following cerebral ischemia leads to overactivation of glutamate receptors, thereby resulting in intracellular Ca2+ overload and excitotoxic neuronal injury. Multiple attempts have been made to counteract such effects by reducing glutamate receptor function, but none have been successful. In this minireview, we present the available evidence regarding the role of all types of ionotropic and metabotropic glutamate receptors in cerebral ischemia and propose phased treatment strategies based on glutamate receptors in both the acute and post-acute phases of cerebral ischemia, which may help realize the clinical application of glutamate receptor antagonists.
What's Behind the Mask: Understanding Masked Graph Modeling for Graph Autoencoders
Recent years have witnessed the emergence of a promising self-supervised
learning strategy referred to as masked autoencoding. However, there is a lack
of theoretical understanding of how masking matters for graph autoencoders
(GAEs). In this work, we present masked graph autoencoder (MaskGAE), a
self-supervised learning framework for graph-structured data. Different from
standard GAEs, MaskGAE adopts masked graph modeling (MGM) as a principled
pretext task - masking a portion of edges and attempting to reconstruct the
missing part with partially visible, unmasked graph structure. To understand
whether MGM can help GAEs learn better representations, we provide both
theoretical and empirical evidence to comprehensively justify the benefits of
this pretext task. Theoretically, we establish close connections between GAEs
and contrastive learning, showing that MGM significantly improves the
self-supervised learning scheme of GAEs. Empirically, we conduct extensive
experiments on a variety of graph benchmarks, demonstrating the superiority of
MaskGAE over several state-of-the-art methods on both link prediction and node
classification tasks.
Comment: KDD 2023 research track. Code available at
https://github.com/EdisonLeeeee/MaskGA
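The MGM pretext task described above splits the edge set so the encoder sees only the visible part and is trained to reconstruct the masked part. A minimal sketch of that split (uniform random masking is one possible strategy; MaskGAE itself also studies structured masking, which is not shown here):

```python
import random

def mask_edges(edges, mask_ratio=0.3, seed=0):
    """Partition an edge list into (visible, masked): the autoencoder is
    given only the visible edges and must reconstruct the masked ones."""
    rng = random.Random(seed)
    shuffled = list(edges)
    rng.shuffle(shuffled)
    k = int(len(shuffled) * mask_ratio)
    return shuffled[k:], shuffled[:k]      # (visible, masked)
```

In training, the visible edges define the message-passing graph while the masked edges serve as positive reconstruction targets.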
Language Agents for Detecting Implicit Stereotypes in Text-to-image Models at Scale
The recent surge in the research of diffusion models has accelerated the
adoption of text-to-image models in various Artificial Intelligence Generated
Content (AIGC) commercial products. While these exceptional AIGC products are
gaining increasing recognition and sparking enthusiasm among consumers, the
questions regarding whether, when, and how these models might unintentionally
reinforce existing societal stereotypes remain largely unaddressed. Motivated
by recent advancements in language agents, here we introduce a novel agent
architecture tailored for stereotype detection in text-to-image models. This
versatile agent architecture is capable of accommodating free-form detection
tasks and can autonomously invoke various tools to facilitate the entire
process, from generating corresponding instructions and images, to detecting
stereotypes. We build the stereotype-relevant benchmark based on multiple
open-text datasets, and apply this architecture to commercial products and
popular open source text-to-image models. We find that these models often
display serious stereotypes when it comes to certain prompts about personal
characteristics, sociocultural context, and crime-related aspects. In summary,
these empirical findings underscore the pervasive existence of stereotypes
across social dimensions, including gender, race, and religion, which not only
validate the effectiveness of our proposed approach, but also emphasize the
critical necessity of addressing potential ethical risks in the burgeoning
realm of AIGC. As AIGC continues its rapid expansion, with new models and
plugins emerging daily in staggering numbers, the challenge lies in the timely
detection and mitigation of potential biases within these models.
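The end-to-end flow the agent automates, from instruction generation through image generation to detection, can be sketched as a plain tool-invocation loop. The tool names and signatures below are hypothetical placeholders, not the paper's API:

```python
def run_stereotype_detection(topic, tools):
    """Hypothetical agent loop: each stage's output feeds the next tool."""
    instructions = tools["gen_instructions"](topic)          # prompts to probe
    images = [tools["gen_image"](p) for p in instructions]   # model under test
    return [tools["detect"](img) for img in images]          # per-image verdict
```

In the actual architecture the agent decides autonomously which tool to invoke next; the fixed three-stage sequence here is a simplification for illustration.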
- …