Unsupervised Summarization by Jointly Extracting Sentences and Keywords
We present RepRank, an unsupervised graph-based ranking model for extractive
multi-document summarization in which the similarities between words, between
sentences, and between words and sentences can be estimated by the distances
between their vector representations in a unified vector space. To obtain
desirable representations, we propose a self-attention-based learning method
that represents a sentence as the weighted sum of its word embeddings, with
the weights concentrated on those words that better reflect the content
of a document. We show that salient sentences and keywords can be extracted in
a joint and mutual reinforcement process using our learned representations, and
prove that this process always converges to a unique solution leading to
improvement in performance. A variant of absorbing random walk and the
corresponding sampling-based algorithm are also described to avoid redundancy
and increase diversity in the summaries. Experimental results on multiple
benchmark datasets show that RepRank achieves the best or comparable
performance in ROUGE.
Comment: 10 pages (including 2 pages of references), 1 figure
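The weighted-sum sentence representation described above can be sketched minimally as follows; the document-level query vector `doc_vec` and the softmax weighting are assumptions for illustration, not RepRank's exact attention formulation:

```python
import numpy as np

def sentence_representation(word_vecs, doc_vec):
    """Represent a sentence as a weighted sum of its word embeddings,
    concentrating weight on words most similar to a document-level
    vector. A minimal sketch, not the paper's exact mechanism."""
    scores = word_vecs @ doc_vec                 # per-word relevance scores
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                     # softmax attention weights
    return weights @ word_vecs                   # attention-weighted sum
```

Sentences represented this way live in the same space as word embeddings, which is what lets word-to-sentence similarity be measured by simple vector distance.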
SparseGAN: Sparse Generative Adversarial Network for Text Generation
It is still a challenging task to learn a neural text generation model under
the framework of generative adversarial networks (GANs) since the entire
training process is not differentiable. The existing training strategies either
suffer from unreliable gradient estimations or imprecise sentence
representations. Inspired by the principle of sparse coding, we propose
SparseGAN, which generates semantically interpretable but sparse sentence
representations as inputs to the discriminator. The key idea is to treat
an embedding matrix as an over-complete dictionary, and use a linear
combination of very few selected word embeddings to approximate the output
feature representation of the generator at each time step. With such
semantically rich representations, we not only reduce unnecessary noise for
efficient adversarial training, but also make the entire training process fully
differentiable. Experiments on multiple text generation datasets yield
performance improvements, especially in sequence-level metrics such as BLEU.
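The idea of approximating a generator output with a linear combination of very few dictionary atoms can be sketched with greedy matching pursuit; this is a simplified, non-differentiable stand-in for the paper's sparse-coding procedure, shown only to make the "over-complete dictionary" intuition concrete:

```python
import numpy as np

def sparse_approximation(feature, embeddings, k=3):
    """Approximate a feature vector as a linear combination of at most k
    word embeddings from an over-complete dictionary, via greedy matching
    pursuit. A sketch of the sparse-coding idea, not SparseGAN's actual
    (differentiable) procedure."""
    residual = feature.astype(float).copy()
    coeffs = np.zeros(len(embeddings))
    sq_norms = (embeddings ** 2).sum(axis=1)
    for _ in range(k):
        corr = embeddings @ residual              # correlation with residual
        i = int(np.argmax(np.abs(corr)))          # best-matching embedding
        c = corr[i] / sq_norms[i]                 # least-squares coefficient
        coeffs[i] += c
        residual -= c * embeddings[i]             # remove explained component
    return coeffs, embeddings.T @ coeffs          # sparse codes, reconstruction
```

Because the reconstruction is a combination of real word embeddings, the discriminator sees inputs that stay close to the embedding manifold rather than arbitrary continuous vectors.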
Improving Coreference Resolution by Leveraging Entity-Centric Features with Graph Neural Networks and Second-order Inference
One of the major challenges in coreference resolution is how to make use of
entity-level features defined over clusters of mentions rather than mention
pairs. However, coreferent mentions usually spread far apart in an entire text,
which makes it extremely difficult to incorporate entity-level features. We
propose a graph neural network-based coreference resolution method that can
capture the entity-centric information by encouraging the sharing of features
across all mentions that probably refer to the same real-world entity. Mentions
are linked to each other via edges that model how likely two linked mentions
point to the same entity. In such graphs, features can be shared between
mentions by message-passing operations in an entity-centric manner. A global
inference algorithm using up to second-order features is also
presented to optimally cluster mentions into consistent groups. Experimental
results show that our graph neural network-based method, combined with the
second-order decoding algorithm (named GNNCR), achieves close to
state-of-the-art performance on the English CoNLL-2012 Shared Task dataset.
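The entity-centric feature sharing can be sketched as link-weighted message passing over mention representations; the mixing coefficient `alpha` and the row normalization below are illustrative assumptions, not the paper's exact GNN update:

```python
import numpy as np

def entity_centric_sharing(mention_feats, link_probs, steps=2, alpha=0.5):
    """Mix each mention's features with a link-probability-weighted average
    of all mentions' features, so that likely-coreferent mentions become
    more similar. A minimal message-passing sketch."""
    A = link_probs / link_probs.sum(axis=1, keepdims=True)  # row-stochastic
    h = mention_feats.astype(float)
    for _ in range(steps):
        h = (1 - alpha) * h + alpha * (A @ h)    # one round of message passing
    return h
```

After a few rounds, mentions connected by high-probability edges carry near-identical representations, which is precisely the entity-level signal that mention-pair models lack.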
Self-tuning fuzzy controller for air-conditioning systems
Master's thesis (Master of Engineering)
SPARQL Query Mediation for Data Integration
The Semantic Web provides a set of promising technologies that make sophisticated data integration much easier, because data on the Semantic Web can be connected by links and complex queries can be executed against the resulting datasets of linked data. Although Semantic Web techniques offer RDF/OWL to support schematic mappings between diverse data sources, large-scale data integration is still severely hampered by various types of data-level semantic heterogeneity among the data sources. In this paper, we show that SPARQL queries intended to execute over multiple heterogeneous data sources can be mediated automatically.
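As a toy illustration of the kind of rewriting such a mediator performs, the sketch below substitutes mediated-schema predicate URIs with source-local ones before a query is dispatched. The prefixes and predicate names are invented for the example, and a real mediator would operate on the parsed query algebra and also handle data-level conversions (units, currencies, formats) rather than plain string replacement:

```python
def mediate_query(query, predicate_map):
    """Rewrite a SPARQL query for one source by replacing mediated-schema
    predicates with the source's local predicates. A string-level sketch
    only; not the paper's mediation algorithm."""
    for mediated, local in predicate_map.items():
        query = query.replace(mediated, local)
    return query

# Hypothetical mediated query and a per-source predicate mapping
query = "SELECT ?p WHERE { ?p med:price ?v . }"
mapping = {"med:price": "srcA:listCost"}
```

A separate mapping table per source lets one mediated query be specialized for each endpoint it is sent to.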