RSpell: Retrieval-augmented Framework for Domain Adaptive Chinese Spelling Check
Chinese Spelling Check (CSC) refers to the detection and correction of
spelling errors in Chinese texts. In practical application scenarios, it is
important for CSC models to be able to correct errors across different domains.
In this paper, we propose a retrieval-augmented spelling
check framework called RSpell, which searches corresponding domain terms and
incorporates them into CSC models. Specifically, we employ pinyin fuzzy
matching to search for terms, which are combined with the input and fed into
the CSC model. Then, we introduce an adaptive process control mechanism to
dynamically adjust the impact of external knowledge on the model. Additionally,
we develop an iterative strategy for the RSpell framework to enhance reasoning
capabilities. We conducted experiments on CSC datasets in three domains: law,
medicine, and official document writing. The results show that RSpell
achieves state-of-the-art performance in both zero-shot and fine-tuning
scenarios, demonstrating the effectiveness of the retrieval-augmented CSC
framework. Our code is available at https://github.com/47777777/Rspell
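The retrieval step is described only at a high level above; the sketch below illustrates what pinyin-based fuzzy matching against a domain lexicon could look like. The `pypinyin` package, the similarity threshold, and the toy legal-term list are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of pinyin-based fuzzy term retrieval (an illustration only,
# not the authors' implementation). Assumes the third-party `pypinyin` package;
# the lexicon and threshold are toy values.
from difflib import SequenceMatcher
from pypinyin import lazy_pinyin


def pinyin_key(text):
    """Convert a Chinese string to a toneless pinyin string, e.g. '医生' -> 'yisheng'."""
    return "".join(lazy_pinyin(text))


def retrieve_terms(query, lexicon, threshold=0.8):
    """Return domain terms whose pinyin is similar to the (possibly misspelled) query."""
    q_key = pinyin_key(query)
    scored = []
    for term in lexicon:
        score = SequenceMatcher(None, q_key, pinyin_key(term)).ratio()
        if score >= threshold:
            scored.append((score, term))
    return [term for _, term in sorted(scored, reverse=True)]


# Toy usage: the misspelled span is looked up against a legal-domain lexicon; the
# retrieved terms would then be combined with the input and fed into the CSC model.
legal_terms = ["诉讼时效", "抗诉", "仲裁协议"]
print(retrieve_terms("诉颂时效", legal_terms))  # -> ['诉讼时效'] (homophone match)
```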
Semantic Role Labeling as Dependency Parsing: Exploring Latent Tree Structures Inside Arguments
Semantic role labeling (SRL) is a fundamental yet challenging task in the NLP
community. Recent SRL works mainly fall into two lines: 1) BIO-based and 2)
span-based. Despite their ubiquity, both share an intrinsic drawback: they do
not consider internal argument structures, which potentially hinders the
model's expressiveness. The key challenge is that arguments are flat
structures, with no predetermined subtree realizations for the words inside
them. To remedy
this, in this paper, we propose to regard flat argument spans as latent
subtrees, accordingly reducing SRL to a tree parsing task. In particular, we
equip our formulation with a novel span-constrained TreeCRF to make tree
structures span-aware and further extend it to the second-order case. We
conduct extensive experiments on CoNLL05 and CoNLL12 benchmarks. Results reveal
that our methods perform better than all previous syntax-agnostic
works, achieving new state-of-the-art results under both the end-to-end and
gold-predicates settings. Comment: COLING 2022
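The core idea above is to treat a flat argument span as a set of latent subtrees and sum over them. The sketch below illustrates that marginalization with a simple constituency-style inside algorithm over assumed span scores; it is a simplification for intuition only, not the paper's span-constrained dependency TreeCRF or its second-order extension.

```python
# Simplified illustration of marginalising over the latent internal structures
# of a flat argument span with an inside-style dynamic program. Span scores are
# assumed inputs; this is not the paper's exact formulation.
import math


def logsumexp(values):
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def inside_log_partition(span_score, left, right):
    """Log-partition over all binary bracketings of the argument span [left, right).

    span_score[i][j] is an (assumed) model score for treating words i..j-1 as a unit.
    """
    width = right - left
    # inside[i][j] = log sum over all latent subtrees covering words i..j-1
    # (indices relative to the argument span)
    inside = [[0.0] * (width + 1) for _ in range(width + 1)]
    for i in range(width):
        inside[i][i + 1] = span_score[left + i][left + i + 1]
    for length in range(2, width + 1):
        for i in range(0, width - length + 1):
            j = i + length
            splits = [inside[i][k] + inside[k][j] for k in range(i + 1, j)]
            inside[i][j] = span_score[left + i][left + j] + logsumexp(splits)
    return inside[0][width]


# Toy usage: a 4-word argument with uniform zero scores; all 5 binary
# bracketings are summed over, so the result is log(5).
scores = [[0.0] * 5 for _ in range(5)]
print(inside_log_partition(scores, 0, 4))
```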
Syntax-aware Neural Semantic Role Labeling
Semantic role labeling (SRL), also known as shallow semantic parsing, is an
important yet challenging task in NLP. Motivated by the close correlation
between syntactic and semantic structures, traditional discrete-feature-based
SRL approaches make heavy use of syntactic features. In contrast,
deep-neural-network-based approaches usually encode the input sentence as a
word sequence without considering the syntactic structures. In this work, we
investigate several previous approaches for encoding syntactic trees, and
conduct a thorough study of whether extra syntax-aware representations are beneficial
for neural SRL models. Experiments on the benchmark CoNLL-2005 dataset show
that syntax-aware SRL approaches can effectively improve performance over a
strong baseline with external word representations from ELMo. With the extra
syntax-aware representations, our approaches achieve new state-of-the-art results of 85.6
F1 (single model) and 86.6 F1 (ensemble) on the test data, outperforming the
corresponding strong baselines with ELMo by 0.8 and 1.0, respectively. A
detailed error analysis is conducted to gain further insights into the
investigated approaches. Comment: AAAI 2019
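As a rough illustration of what "extra syntax-aware representations" can mean in practice, the sketch below concatenates a dependency-relation embedding with a contextual word vector before a BiLSTM role scorer. The dimensions, the relation vocabulary, and the plain PyTorch modules are assumptions for illustration, not the paper's architecture.

```python
# Minimal sketch of injecting a syntax-aware representation into a neural SRL
# encoder: a dependency-relation embedding for each word is concatenated with a
# contextual word vector before a BiLSTM. Sizes and modules are illustrative.
import torch
import torch.nn as nn


class SyntaxAwareSRLEncoder(nn.Module):
    def __init__(self, word_dim=100, rel_vocab=50, rel_dim=32, hidden=200, num_roles=20):
        super().__init__()
        self.rel_embed = nn.Embedding(rel_vocab, rel_dim)   # dependency-relation embedding
        self.encoder = nn.LSTM(word_dim + rel_dim, hidden,
                               batch_first=True, bidirectional=True)
        self.role_scorer = nn.Linear(2 * hidden, num_roles)  # per-token role scores

    def forward(self, word_repr, rel_ids):
        # word_repr: (batch, seq_len, word_dim) contextual word vectors
        # rel_ids:   (batch, seq_len) index of each word's relation to its head
        syntax = self.rel_embed(rel_ids)                      # (batch, seq_len, rel_dim)
        states, _ = self.encoder(torch.cat([word_repr, syntax], dim=-1))
        return self.role_scorer(states)                       # (batch, seq_len, num_roles)


# Toy usage with random inputs.
enc = SyntaxAwareSRLEncoder()
scores = enc(torch.randn(2, 6, 100), torch.randint(0, 50, (2, 6)))
print(scores.shape)  # torch.Size([2, 6, 20])
```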