358 research outputs found

Dependency Grammar Induction with Neural Lexicalization and Big Training Data

Author: Han Wenjuan
Jiang Yong
Tu Kewei
Publication date: 01/01/2017

We study the impact of big models (in terms of the degree of lexicalization) and big data (in terms of the training corpus size) on dependency grammar induction. We experimented with L-DMV, a lexicalized version of the Dependency Model with Valence, and L-NDMV, our lexicalized extension of the Neural Dependency Model with Valence. We find that L-DMV only benefits from very small degrees of lexicalization and moderate sizes of training corpora. L-NDMV can benefit from big training data and lexicalization of greater degrees, especially when enhanced with good model initialization, and it achieves a result that is competitive with the current state-of-the-art. Comment: EMNLP 201

arXiv.org e-Print Archive

Crossref

Fabrication and Application of Reactive Acrylic Resin Monoliths Using Thermally Induced Phase Separation

Author: Han Wenjuan
カンブンケン
Publication venue: Springer Publishing Company

Osaka University Knowledge Archive

On the Robustness of Question Rewriting Systems to Questions of Varying Hardness

Author: Han Wenjuan
Ng Hwee Tou
Ye Hai
Publication date: 12/11/2023

In conversational question answering (CQA), the task of question rewriting (QR) in context aims to rewrite a context-dependent question into an equivalent self-contained question that gives the same answer. In this paper, we are interested in the robustness of a QR system to questions varying in rewriting hardness or difficulty. Since there is a lack of questions classified based on their rewriting hardness, we first propose a heuristic method to automatically classify questions into subsets of varying hardness, by measuring the discrepancy between a question and its rewrite. To find out what makes questions hard or easy for rewriting, we then conduct a human evaluation to annotate the rewriting hardness of questions. Finally, to enhance the robustness of QR systems to questions of varying hardness, we propose a novel learning framework for QR that first trains a QR model independently on each subset of questions of a certain level of hardness, then combines these QR models as one joint model for inference. Experimental results on two datasets show that our framework improves the overall performance compared to the baselines. Comment: ACL'22, main, long pape

arXiv.org e-Print Archive

VGStore: A Multimodal Extension to SPARQL for Querying RDF Scene Graph

Author: Han Wenjuan
Li Yanzeng
Zheng Zilong
Zou Lei
Publication date: 07/09/2022

Semantic Web technology has successfully facilitated many RDF models with rich data representation methods. It also has the potential to represent and store multimodal knowledge bases such as multimodal scene graphs. However, most existing query languages, especially SPARQL, barely explore implicit multimodal relationships such as semantic similarity and spatial relations. We first explored this issue by organizing a large-scale scene graph dataset, namely Visual Genome, in an RDF graph database. Based on the proposed RDF-stored multimodal scene graph, we extended SPARQL queries to answer questions that require relational reasoning about color, spatial relations, etc. A further demo (i.e., VGStore) shows the effectiveness of customized queries and of displaying multimodal data. Comment: ISWC 2022 Posters, Demos, and Industry Track

arXiv.org e-Print Archive

Modeling Instance Interactions for Joint Information Extraction with Neural High-Order Conditional Random Field

Author: Han Wenjuan
Jia Zixia
Tu Kewei
Yan Zhaohui
Zheng Zilong
Publication date: 28/05/2023

Prior works on joint Information Extraction (IE) typically model instance (e.g., event triggers, entities, roles, relations) interactions by representation enhancement, type dependency scoring, or global decoding. We find that the previous models generally consider binary type dependency scoring of a pair of instances, and leverage local search such as beam search to approximate global solutions. To better integrate cross-instance interactions, in this work, we introduce a joint IE framework (CRFIE) that formulates joint IE as a high-order Conditional Random Field. Specifically, we design binary factors and ternary factors to directly model interactions between not only pairs of instances but also triplets. Then, these factors are utilized to jointly predict labels of all instances. To address the intractability problem of exact high-order inference, we incorporate a high-order neural decoder that is unfolded from a mean-field variational inference method, which achieves consistent learning and inference. The experimental results show that our approach achieves consistent improvements on three IE tasks compared with our baseline and prior work.

arXiv.org e-Print Archive

Vasculogenic mimicry contributes to lymph node metastasis of laryngeal squamous cell carcinoma

Author: Cai Wenjuan
Han Chunrong
Lin Peng
Sun Baocun
Wang Wei
Zhao Xiulan
Publication venue: BioMed Central
Publication date: 01/01/2010

Abstract. Background: Survival of laryngeal squamous cell carcinoma (LSCC) patients has remained unchanged over recent years due to its uncontrolled recurrence and local lymph node metastasis. Vasculogenic mimicry (VM) is an alternative type of blood supply related to more aggressive tumor biology and increased tumor-related mortality. This study aimed to investigate the unique role of VM in the progression of LSCC. Methods: We reviewed clinical pathological data of 203 cases of LSCC both prospectively and retrospectively. VM and endothelium-dependent vessels (EDV) were detected by immunohistochemistry and double staining to compare their clinical pathological significance in LSCC. Survival analyses were performed to assess their prognostic significance as well. Results: Both VM and EDV existed as types of blood supply in LSCC. VM was related to pTNM stage, lymph node metastasis, and pathology grade. In contrast, EDV was related to location, pTNM stage, T stage, and distant metastasis. Univariate analysis showed VM, pTNM stage, T classification, nodal status, histopathological grade, tumor size, and radiotherapy to be related to overall survival (OS), while VM, location, tumor size, and radiotherapy were related to disease-free survival (DFS). Multivariate analysis indicated that VM, but not EDV, was an adverse predictor for both OS and DFS. Conclusions: VM existed in LSCC. It contributed to the progression of LSCC by promoting lymph node metastasis. It is an independent predictor of a poor prognosis of LSCC.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central