5 research outputs found

Neural-Hidden-CRF: A Robust Weakly-Supervised Sequence Labeler

Author: Chen Pengpeng
Chen Zhijun
Mao Qianren
Sun Hailong
Xu Chunyi
Zhang Wanhao
Publication date: 28/09/2023

We propose a neuralized undirected graphical model called Neural-Hidden-CRF to solve the weakly-supervised sequence labeling problem. Under the umbrella of probabilistic undirected graph theory, the proposed Neural-Hidden-CRF embedded with a hidden CRF layer models the variables of word sequence, latent ground truth sequence, and weak label sequence with the global perspective that undirected graphical models particularly enjoy. In Neural-Hidden-CRF, we can capitalize on the powerful language model BERT or other deep models to provide rich contextual semantic knowledge to the latent ground truth sequence, and use the hidden CRF layer to capture the internal label dependencies. Neural-Hidden-CRF is conceptually simple and empirically powerful. It obtains new state-of-the-art results on one crowdsourcing benchmark and three weak-supervision benchmarks, including outperforming the recent advanced model CHMM by 2.80 F1 points and 2.23 F1 points in average generalization and inference performance, respectively.
Comment: 13 pages, 4 figures, accepted by SIGKDD-202

arXiv.org e-Print Archive

Bipartite Graph Pre-training for Unsupervised Extractive Summarization with Graph Convolutional Auto-Encoders

Author: Gu Xiaolei
He Shizhu
Li Bo
Li Jianxin
Li Jiarui
Mao Qianren
Zhao Shaobo
Publication date: 29/10/2023

Pre-trained sentence representations are crucial for identifying significant sentences in unsupervised document extractive summarization. However, the traditional two-step paradigm of pre-training and sentence ranking creates a gap due to differing optimization objectives. To address this issue, we argue that utilizing pre-trained embeddings derived from a process specifically designed to optimize cohesive and distinctive sentence representations helps rank significant sentences. To do so, we propose a novel graph pre-training auto-encoder to obtain sentence embeddings by explicitly modelling intra-sentential distinctive features and inter-sentential cohesive features through sentence-word bipartite graphs. These pre-trained sentence representations are then utilized in a graph-based ranking algorithm for unsupervised summarization. Our method delivers superior performance for unsupervised summarization frameworks by providing summary-worthy sentence representations. It surpasses heavy BERT- or RoBERTa-based sentence representations in downstream tasks.
Comment: Accepted by the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023)

arXiv.org e-Print Archive

Automated Timeline Length Selection for Flexible Timeline Summarization

Author: Li Jianxin
Li Xi
Mao Qianren
Peng Hao
Wang Zheng
Zhu Hongdong
Publication date: 28/05/2021

By producing summaries for long-running events, timeline summarization (TLS) underpins many information retrieval tasks. Successful TLS requires identifying an appropriate set of key dates (the timeline length) to cover. However, doing so is challenging because the right length can change from one topic to another. Existing TLS solutions rely on either an event-agnostic fixed length or an expert-supplied setting. Neither strategy is desirable for real-life TLS scenarios. A fixed, event-agnostic setting ignores the diversity of events and their development and hence can lead to low-quality TLS. Relying on expert-crafted settings is neither scalable nor sustainable for processing many dynamically changing events. This paper presents a better TLS approach that automatically and dynamically determines the timeline length. We achieve this by employing the established elbow method from the machine learning community to automatically find the minimum number of dates within the time series needed to generate concise and informative summaries. We applied our approach to four TLS datasets in English and Chinese and compared it against three prior methods. Experimental results show that our approach delivers summaries comparable to or even better than those of state-of-the-art TLS methods, and it achieves this without expert involvement.
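As a rough illustration of the elbow idea mentioned in this abstract, the sketch below picks the point on a decreasing cost curve that lies farthest from the straight line joining its two endpoints. This is one common elbow heuristic and not necessarily the criterion the paper uses; the function name `elbow_index`, the notion of a per-length "cost", and the numbers are all assumptions made up for the example.

```python
from typing import Sequence


def elbow_index(costs: Sequence[float]) -> int:
    """Return the index of the elbow in a cost curve.

    The elbow is taken as the point farthest from the chord joining the
    first and last points (a common heuristic, assumed here; the paper's
    exact criterion is not given in the abstract).
    """
    n = len(costs)
    if n < 3:
        raise ValueError("need at least three points to locate an elbow")
    x0, y0 = 0.0, float(costs[0])
    x1, y1 = float(n - 1), float(costs[-1])
    dx, dy = x1 - x0, y1 - y0
    norm = (dx * dx + dy * dy) ** 0.5  # chord length, non-zero since n >= 3
    best_i, best_d = 0, -1.0
    for i, c in enumerate(costs):
        # Perpendicular distance from the point (i, c) to the chord.
        d = abs(dy * (i - x0) - dx * (c - y0)) / norm
        if d > best_d:
            best_i, best_d = i, d
    return best_i


# Hypothetical costs for candidate timeline lengths 1..8 (made-up numbers).
candidate_lengths = list(range(1, 9))
costs = [100.0, 60.0, 35.0, 20.0, 17.0, 15.0, 14.0, 13.5]
chosen_length = candidate_lengths[elbow_index(costs)]
```

With these made-up costs the curve flattens after the fourth candidate, so `chosen_length` is 4. The result depends on how the two axes are scaled, so in practice the costs would usually be normalized first.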

arXiv.org e-Print Archive

Event prediction based on evolutionary event ontology knowledge

Author: Bahdanau
Bizer
Bobrow
Chambers
Che
Cho
Decroos
Ding
Dongxiao He
Duchi
Esteban
Gaugaz
Gmati
Granroth-Wilding
Hao Peng
Hashimoto
Hashimoto
Hossny
Jans
Ji
Jianxin Li
Jin
Le
Lee
Lei
Li
Lihong Wang
Lin
Liu
Liu
Liu
Martino
Miller
Min He
Modi
Okawa
Pan
Peng
Pichotta
Pichotta
Qianren Mao
Qiu
Radinsky
Rajpurkar
Rajpurkar
Ristea
Rodrigues
Schank
Shu Guo
Sowriraghavan
Tang
Tran
Tran
Wang
Wang
Wang
Xi Li
Zhang
Zhang
Zhao
Publication venue: 'Elsevier BV'

Crossref

DYRK1B-dependent autocrine-to-paracrine shift of Hedgehog signaling by mutant RAS

Author: A Steg
AF Hezel
B Stecca
C Guerra
C Guerra
Carmen Guerra
CJ Haycraft
D Huangfu
DM Berman
DN Watkins
ES Seeley
G Feldmann
G Feldmann
H Nakashima
H Tian
J Mao
J Svard
J Taipale
JK Chen
JP De La O
JP Morton
K Jin
K Kasai
K Quint
KE Galvin
L Schneider
LL Rubin
M Heidenblad
M Lauth
M Lauth
M Lauth
M Malumbres
M Varjosalo
Mariano Barbacid
Matthias Lauth
MS Ikram
N Habbe
NB Prasad
O Nolan-Stevaux
Qianren Jin
RL Yauch
Rune Toftgård
S Dennler
S Jones
S Schubbert
SK Nielsen
SP Thayer
SR Hingorani
T Shimokawa
Takashi Shimokawa
Ulrica Tostar
V Fendrich
Volker Fendrich
Y Wakabayashi
Z Ji
Åsa Bergström
Publication venue: 'Springer Science and Business Media LLC'

Crossref