140 research outputs found

Active Relation Discovery: Towards General and Label-aware Open Relation Extraction

Author: Chen Xi
Kim Hong-Gee
Li Yangning
Li Yinghui
Shen Ying
Zheng Hai-Tao
Publication date: 08/11/2022

Open Relation Extraction (OpenRE) aims to discover novel relations from open domains. Previous OpenRE methods mainly suffer from two problems: (1) Insufficient capacity to discriminate between known and novel relations. When extending conventional test settings to a more general setting where test data might also come from seen classes, existing approaches have a significant performance decline. (2) Secondary labeling must be performed before practical application. Existing methods cannot label human-readable and meaningful types for novel relations, which is urgently required by the downstream tasks. To address these issues, we propose the Active Relation Discovery (ARD) framework, which utilizes relational outlier detection for discriminating known and novel relations and involves active learning for labeling novel relations. Extensive experiments on three real-world datasets show that ARD significantly outperforms previous state-of-the-art methods on both conventional and our proposed general OpenRE settings. The source code and datasets will be available for reproducibility.

Comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.

arXiv.org e-Print Archive

Bidirectional End-to-End Learning of Retriever-Reader Paradigm for Entity Linking

Author: Huang Fei
Huang Shen
Jiang Yong
Li Yangning
Li Yinghui
Lu Xingyu
Shen Ying
Xie Pengjun
Zheng Hai-Tao
Publication date: 03/07/2023

Entity Linking (EL) is a fundamental task for Information Extraction and Knowledge Graphs. The general form of EL (i.e., end-to-end EL) aims to first find mentions in the given input document and then link the mentions to corresponding entities in a specific knowledge base. Recently, the paradigm of retriever-reader promotes the progress of end-to-end EL, benefiting from the advantages of dense entity retrieval and machine reading comprehension. However, the existing study only trains the retriever and the reader separately in a pipeline manner, which ignores the benefit that the interaction between the retriever and the reader can bring to the task. To advance the retriever-reader paradigm to perform more perfectly on end-to-end EL, we propose BEER^2, a Bidirectional End-to-End training framework for Retriever and Reader. Through our designed bidirectional end-to-end training, BEER^2 guides the retriever and the reader to learn from each other, make progress together, and ultimately improve EL performance. Extensive experiments on benchmarks of multiple domains demonstrate the effectiveness of our proposed BEER^2.

Comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.

arXiv.org e-Print Archive

A Survey of Natural Language Generation

Author: Chen Miaoxin
Dong Chenhe
Gong Haifan
Li Junxin
Li Yinghui
Shen Ying
Yang Min
Publication venue: Association for Computing Machinery (ACM)
Publication date: 02/08/2022

This paper offers a comprehensive review of the research on Natural Language Generation (NLG) over the past two decades, especially in relation to data-to-text generation and text-to-text generation deep learning methods, as well as new applications of NLG technology. This survey aims to (a) give the latest synthesis of deep learning research on the NLG core tasks, as well as the architectures adopted in the field; (b) detail meticulously and comprehensively various NLG tasks and datasets, and draw attention to the challenges in NLG evaluation, focusing on different evaluation methods and their relationships; (c) highlight some future emphasis and relatively recent research issues that arise due to the increasing synergy between NLG and other artificial intelligence areas, such as computer vision, text and computational creativity.

Comment: Accepted by ACM Computing Survey (CSUR) 202

arXiv.org e-Print Archive

Towards All-around Knowledge Transferring: Learning From Task-irrelevant Labels

Author: Ding Ning
Li Yinghui
Liu Ruiyang
Shen Ying
Tao Linmi
Zhang ZiHao
Zheng Hai-Tao
Publication date: 17/11/2020

Deep neural models have hitherto achieved significant performances on numerous classification tasks, but meanwhile require sufficient manually annotated data. Since it is extremely time-consuming and expensive to annotate adequate data for each classification task, learning an empirically effective model with generalization on small datasets has received increased attention. Existing efforts mainly focus on transferring task-relevant knowledge from other similar data to tackle the issue. These approaches have yielded remarkable improvements, yet neglect the fact that the task-irrelevant features could bring out massive negative transfer effects. To date, no large-scale studies have been performed to investigate the impact of task-irrelevant features, let alone the utilization of such features. In this paper, we first propose Task-Irrelevant Transfer Learning (TIRTL) to exploit task-irrelevant features, which mainly are extracted from task-irrelevant labels. Particularly, we suppress the expression of task-irrelevant information and facilitate the learning process of classification. We also provide a theoretical explanation of our method. In addition, TIRTL does not conflict with those that have previously exploited task-relevant knowledge and can be well combined to enable the simultaneous utilization of task-relevant and task-irrelevant features for the first time. In order to verify the effectiveness of our theory and method, we conduct extensive experiments on facial expression recognition and digit recognition tasks. Our source code will also be available in the future for reproducibility.

arXiv.org e-Print Archive

Accelerating Primal Solution Findings for Mixed Integer Programs Based on Solution Prediction

Author: Ding Jian-Ya
Li Shengyin
Shen Lei
Song Le
Wang Bing
Xu Yinghui
Zhang Chao
Publication date: 09/09/2019

Mixed Integer Programming (MIP) is one of the most widely used modeling techniques for combinatorial optimization problems. In many applications, a similar MIP model is solved on a regular basis, maintaining remarkable similarities in model structures and solution appearances but differing in formulation coefficients. This offers the opportunity for machine learning methods to explore the correlations between model structures and the resulting solution values. To address this issue, we propose to represent an MIP instance using a tripartite graph, based on which a Graph Convolutional Network (GCN) is constructed to predict solution values for binary variables. The predicted solutions are used to generate a local branching type cut which can be either treated as a global (invalid) inequality in the formulation resulting in a heuristic approach to solve the MIP, or as a root branching rule resulting in an exact approach. Computational evaluations on 8 distinct types of MIP problems show that the proposed framework improves the primal solution finding performance significantly on a state-of-the-art open-source MIP solver.

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Automatic Context Pattern Generation for Entity Set Expansion

Author: Cao Yunbo
Huang Shulin
Li Yangning
Li Yinghui
Liu Ruiyang
Shen Ying
Zhang Xinwei
Zheng Hai-Tao
Zhou Qingyu
Publication date: 18/07/2022

Entity Set Expansion (ESE) is a valuable task that aims to find entities of the target semantic class described by given seed entities. Various NLP and IR downstream applications have benefited from ESE due to its ability to discover knowledge. Although existing bootstrapping methods have achieved great progress, most of them still rely on manually pre-defined context patterns. A non-negligible shortcoming of the pre-defined context patterns is that they cannot be flexibly generalized to all kinds of semantic classes, and we call this phenomenon "semantic sensitivity". To address this problem, we devise a context pattern generation module that utilizes autoregressive language models (e.g., GPT-2) to automatically generate high-quality context patterns for entities. In addition, we propose GAPA, a novel ESE framework that leverages the aforementioned GenerAted PAtterns to expand target entities. Extensive experiments and detailed analyses on three widely used datasets demonstrate the effectiveness of our method. All the codes of our experiments will be available for reproducibility.

Comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.

arXiv.org e-Print Archive

Towards Real-World Writing Assistance: A Chinese Character Checking Benchmark with Faked and Misspelled Characters

Author: Chen Shaoshen
Huang Haojing
Jiang Yong
Li Yangning
Li Yinghui
Li Zhongli
Shen Ying
Xu Zishan
Zheng Hai-Tao
Zhou Qingyu
Publication date: 19/11/2023

Writing assistance is an application closely related to human life and is also a fundamental Natural Language Processing (NLP) research field. Its aim is to improve the correctness and quality of input texts, with character checking being crucial in detecting and correcting wrong characters. From the perspective of the real world where handwriting occupies the vast majority, characters that humans get wrong include faked characters (i.e., untrue characters created due to writing errors) and misspelled characters (i.e., true characters used incorrectly due to spelling errors). However, existing datasets and related studies only focus on misspelled characters mainly caused by phonological or visual confusion, thereby ignoring faked characters which are more common and difficult. To break through this dilemma, we present Visual-C^3, a human-annotated Visual Chinese Character Checking dataset with faked and misspelled Chinese characters. To the best of our knowledge, Visual-C^3 is the first real-world visual and the largest human-crafted dataset for the Chinese character checking scenario. Additionally, we also propose and evaluate novel baseline methods on Visual-C^3. Extensive empirical results and analyses show that Visual-C^3 is high-quality yet challenging. The Visual-C^3 dataset and the baseline methods will be publicly available to facilitate further research in the community.

Comment: Work in progress

arXiv.org e-Print Archive

SNARE Zippering Is Suppressed by a Conformational Constraint that Is Removed by v-SNARE Splitting

Author: Liu Yinghui
Rathore Shailendra S
Shen Jingshi
Stowell Michael
Wan Chun
Yu Haijia
Publication date: 01/01/2021

Intracellular vesicle fusion is catalyzed by soluble N-ethylmaleimide-sensitive factor attachment protein receptors (SNAREs). Vesicle-anchored v-SNAREs pair with target membrane-associated t-SNAREs to form trans-SNARE complexes, releasing free energy to drive membrane fusion. However, trans-SNARE complexes are unable to assemble efficiently unless activated by Sec1/Munc18 (SM) proteins. Here, we demonstrate that SNAREs become fully active when the v-SNARE is split into two fragments, eliminating the requirement of SM protein activation. Mechanistically, v-SNARE splitting accelerates the zippering of trans-SNARE complexes, mimicking the stimulatory function of SM proteins. Thus, SNAREs possess the full potential to drive efficient membrane fusion but are suppressed by a conformational constraint. This constraint is removed by SM protein activation or v-SNARE splitting. We suggest that ancestral SNAREs originally evolved to be fully active in the absence of SM proteins. Later, a conformational constraint coevolved with SM proteins to achieve the vesicle fusion specificity demanded by complex endomembrane systems.

CU Scholar Institutional Repository

SeqGPT: An Out-of-the-box Large Language Model for Open Domain Sequence Understanding

Author: Cai Jiong
Huang Fei
Huang Shen
Jiang Chengyue
Jiang Yong
Li Yangning
Li Yinghui
Liu Wei
Lou Chao
Tu Kewei
Wang Xiaobin
Xie Pengjun
Yu Tianyu
Zhang Ningyu
Zheng Hai-Tao
Publication date: 21/08/2023

Large language models (LLMs) have shown impressive ability for open-domain NLP tasks. However, LLMs are sometimes too footloose for natural language understanding (NLU) tasks which always have restricted output and input format. Their performances on NLU tasks are highly related to prompts or demonstrations and are shown to be poor at performing several representative NLU tasks, such as event extraction and entity typing. To this end, we present SeqGPT, a bilingual (i.e., English and Chinese) open-source autoregressive model specially enhanced for open-domain natural language understanding. We express all NLU tasks with two atomic tasks, which define fixed instructions to restrict the input and output format but still "open" for arbitrarily varied label sets. The model is first instruction-tuned with extremely fine-grained labeled data synthesized by ChatGPT and then further fine-tuned by 233 different atomic tasks from 152 datasets across various domains. The experimental results show that SeqGPT has decent classification and extraction ability, and is capable of performing language understanding tasks on unseen domains. We also conduct empirical studies on the scaling of data and model size as well as on the transfer across tasks. Our model is accessible at https://github.com/Alibaba-NLP/SeqGPT.

Comment: Initial version of SeqGPT

arXiv.org e-Print Archive