Inductive Relation Prediction from Relational Paths and Context with Hierarchical Transformers
Relation prediction on knowledge graphs (KGs) is a key research topic.
Dominant embedding-based methods mainly focus on the transductive setting and
lack the inductive ability to generalize to new entities for inference.
Existing methods for inductive reasoning mostly mine the connections between
entities, i.e., relational paths, without considering the nature of head and
tail entities contained in the relational context. This paper proposes a novel
method that captures both connections between entities and the intrinsic nature
of entities, by simultaneously aggregating RElational Paths and cOntext with a
unified hieRarchical Transformer framework, namely REPORT. REPORT relies solely
on relation semantics and can naturally generalize to the fully-inductive
setting, where KGs for training and inference have no common entities. In the
experiments, REPORT performs consistently better than all baselines on almost
all eight version subsets of two fully-inductive datasets. Moreover, REPORT
is interpretable, as it provides each element's contribution to the prediction
results.
Comment: Accepted by ICASSP 2023 (Oral)
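As a rough, hedged sketch of the two ingredients the abstract names (relational paths connecting head and tail, and the relational context around each entity), the snippet below extracts both from a toy knowledge graph. The graph, the entity and relation names, and the helper functions are illustrative assumptions, not the REPORT implementation.

```python
# Illustrative sketch (not the REPORT code): for a candidate triple (h, r, t),
# gather (1) relational paths, i.e. relation sequences along paths from h to t,
# and (2) the relational context, i.e. relations incident to h and t.
import networkx as nx

# Toy knowledge graph as (head, tail, relation) triples; names are made up.
triples = [
    ("alice", "acme", "works_at"),
    ("acme", "berlin", "located_in"),
    ("alice", "berlin", "lives_in"),
    ("bob", "acme", "works_at"),
]

G = nx.MultiDiGraph()
for h, t, r in triples:
    G.add_edge(h, t, relation=r)

def relational_paths(graph, head, tail, cutoff=3):
    """Relation sequences along simple paths from head to tail, up to cutoff hops."""
    paths = []
    for edge_path in nx.all_simple_edge_paths(graph, head, tail, cutoff=cutoff):
        paths.append([graph.edges[u, v, k]["relation"] for u, v, k in edge_path])
    return paths

def relational_context(graph, entity):
    """Relations incident to an entity, a crude proxy for its intrinsic 'nature'."""
    outgoing = [d["relation"] for _, _, d in graph.out_edges(entity, data=True)]
    incoming = [d["relation"] for _, _, d in graph.in_edges(entity, data=True)]
    return outgoing + incoming

print(relational_paths(G, "alice", "berlin"))  # e.g. [['works_at', 'located_in'], ['lives_in']]
print(relational_context(G, "alice"))          # ['works_at', 'lives_in']
```

Both kinds of evidence are expressed purely in the relation vocabulary, which is why such features can, in principle, carry over to test KGs whose entities were never seen during training.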
Copyright Violations and Large Language Models
Language models may memorize more than just facts, including entire chunks of
texts seen during training. Fair use exemptions to copyright laws typically
allow for limited use of copyrighted material without permission from the
copyright holder, but generally only for extracting information from copyrighted
materials, rather than for {\em verbatim} reproduction. This work explores the
issue of copyright violations and large language models through the lens of
verbatim memorization, focusing on possible redistribution of copyrighted text.
We present experiments with a range of language models over a collection of
popular books and coding problems, providing a conservative characterization of
the extent to which language models can redistribute these materials. Overall,
this research highlights the need for further examination and the potential
impact on future developments in natural language processing to ensure
adherence to copyright regulations. Code is at
\url{https://github.com/coastalcph/CopyrightLLMs}.
Comment: EMNLP 2023
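The abstract does not spell out the measurement protocol, but the basic notion of verbatim memorization can be illustrated with a minimal sketch, assuming one compares a model continuation against the original passage and reports the longest character span they share. The toy strings and the helper name below are illustrative, not the code in the linked repository.

```python
# Minimal sketch of quantifying verbatim overlap between a source passage and a
# model's continuation; not the paper's actual experimental protocol.
from difflib import SequenceMatcher

def longest_verbatim_span(source: str, generation: str) -> str:
    """Longest contiguous character span shared by the source text and the generation."""
    m = SequenceMatcher(None, source, generation, autojunk=False)
    match = m.find_longest_match(0, len(source), 0, len(generation))
    return source[match.a : match.a + match.size]

# Toy stand-ins for a copyrighted passage and a model continuation.
book_passage = ("It was the best of times, it was the worst of times, "
                "it was the age of wisdom, it was the age of foolishness.")
model_output = "it was the worst of times, it was the age of wisdom, it was the"

span = longest_verbatim_span(book_passage, model_output)
print(len(span), repr(span))  # length (in characters) of the longest copied span
```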
Random Entity Quantization for Parameter-Efficient Compositional Knowledge Graph Representation
Representation Learning on Knowledge Graphs (KGs) is essential for downstream
tasks. The dominant approach, KG Embedding (KGE), represents entities with
independent vectors and faces the scalability challenge. Recent studies propose
an alternative way for parameter efficiency, which represents entities by
composing entity-corresponding codewords matched from predefined small-scale
codebooks. We refer to the process of obtaining corresponding codewords of each
entity as entity quantization, for which previous works have designed
complicated strategies. Surprisingly, this paper shows that simple random
entity quantization can achieve similar results to current strategies. We
analyze this phenomenon and reveal that entity codes, the quantization outcomes
for expressing entities, have higher entropy at the code level and higher Jaccard
distance at the codeword level under random entity quantization. Therefore,
different entities become more easily distinguished, facilitating effective KG
representation. The above results show that current quantization strategies are
not critical for KG representation, and there is still room for improvement in
entity distinguishability beyond current strategies. The code to reproduce our
results is available at https://github.com/JiaangL/RandomQuantization.
Comment: Accepted to EMNLP 2023
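As a minimal sketch of the setup described above (not the authors' implementation; the codebook size, the number of codewords per entity, and the function names are assumptions for illustration), random entity quantization assigns each entity a random subset of a shared codebook, and distinguishability can then be probed with the pairwise Jaccard distance between the resulting codeword sets.

```python
# Illustrative sketch of random entity quantization and a distinguishability probe;
# see https://github.com/JiaangL/RandomQuantization for the authors' code.
import random

CODEBOOK_SIZE = 1000   # size of the shared codeword codebook (assumed value)
CODES_PER_ENTITY = 10  # codewords sampled per entity (assumed value)

def random_quantize(num_entities, seed=0):
    """Assign each entity a random set of codeword indices from the shared codebook."""
    rng = random.Random(seed)
    return [frozenset(rng.sample(range(CODEBOOK_SIZE), CODES_PER_ENTITY))
            for _ in range(num_entities)]

def jaccard_distance(a, b):
    return 1.0 - len(a & b) / len(a | b)

# Entity codes; each entity's representation would then be composed from the
# embeddings of its codewords (e.g. by pooling), which is omitted here.
codes = random_quantize(num_entities=500)

pairs = [(i, j) for i in range(len(codes)) for j in range(i + 1, len(codes))]
avg_dist = sum(jaccard_distance(codes[i], codes[j]) for i, j in pairs) / len(pairs)
print(f"average pairwise Jaccard distance: {avg_dist:.3f}")  # near 1.0: codes rarely overlap
```

A high average Jaccard distance means that randomly drawn entity codes rarely overlap, which is one concrete reading of the abstract's claim that entities become easier to distinguish.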
Large Language Models Converge on Brain-Like Word Representations
One of the greatest puzzles of all time is how understanding arises from
neural mechanics. Our brains are networks of billions of biological neurons
transmitting chemical and electrical signals along their connections. Large
language models are networks of millions or billions of digital neurons,
implementing functions that read the output of other functions in complex
networks. The failure to see how meaning would arise from such mechanics has
led many cognitive scientists and philosophers to various forms of dualism --
and many artificial intelligence researchers to dismiss large language models
as stochastic parrots or JPEG-like compressions of text corpora. We show that
human-like representations arise in large language models. Specifically, the
larger neural language models get, the more their representations are
structurally similar to neural response measurements from brain imaging.
Comment: Work in process
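The abstract speaks of structural similarity between model representations and brain responses; one common way to operationalize this is a representational-similarity-style analysis, though whether that matches the paper's exact metric is an assumption here. The sketch below uses synthetic stand-ins for both the word embeddings and the imaging data.

```python
# Generic representational-similarity-style comparison; synthetic stand-ins only,
# not the paper's analysis pipeline or data.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_words = 50
model_embeddings = rng.normal(size=(n_words, 768))  # stand-in for LM word vectors
brain_responses = rng.normal(size=(n_words, 200))   # stand-in for imaging features

# One pairwise-dissimilarity vector per system, over the same word set.
rdm_model = pdist(model_embeddings, metric="correlation")
rdm_brain = pdist(brain_responses, metric="correlation")

# Structural similarity: rank correlation between the two dissimilarity structures.
rho, _ = spearmanr(rdm_model, rdm_brain)
print(f"representational similarity (Spearman rho): {rho:.3f}")
```

With real, word-aligned embeddings and brain measurements, an increase in this correlation as models grow larger would be one way to express the scaling trend the abstract reports.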
Pyrazole compounds and thiazole compounds as protein kinase inhibitors, pharmaceutical composition and the use thereof
A compound of formula (I), wherein A, B, D, X, Y, R1, R2, R3, m, p, and q are defined herein. Also disclosed is a method for inhibiting FMS-like tyrosine kinase 3, aurora kinase, or vascular endothelial growth factor receptor.
Pyrrolidine compounds
A compound of the following formula, wherein R, R1, R2, R3, R4, R5, R6, R7, R8, and X are as defined herein. Also disclosed is a method for inhibiting the activity of fibroblast activation protein, or for treating cancer or inflammatory conditions, with such a compound.
- …