31 research outputs found
Constituent Structure for Filipino: Induction through Probabilistic Approaches
PACLIC / The University of the Philippines Visayas Cebu College, Cebu City, Philippines / November 20-22, 2008
Increasing the Recall of Corpus Annotation Error Detection
Proceedings of the Sixth International Workshop on Treebanks and Linguistic Theories.
Editors: Koenraad De Smedt, Jan Hajič and Sandra Kübler.
NEALT Proceedings Series, Vol. 1 (2007), 19-30.
© 2007 The editors and contributors.
Published by the Northern European Association for Language Technology (NEALT), http://omilia.uio.no/nealt
Electronically published at Tartu University Library (Estonia), http://hdl.handle.net/10062/4476
Tree Transformer: Integrating Tree Structures into Self-Attention
Pre-training a Transformer on large-scale raw text and fine-tuning it on the
desired task has achieved state-of-the-art results on diverse NLP tasks.
However, it is unclear what the learned attention captures: the attention
computed by the attention heads does not seem to match human intuitions about
hierarchical structure. This paper proposes Tree Transformer, which adds an
extra constraint to the attention heads of the bidirectional Transformer encoder
in order to encourage the attention heads to follow tree structures. The tree
structures can be automatically induced from raw text by our proposed
"Constituent Attention" module, which is implemented simply as self-attention
between adjacent words. With a training procedure identical to BERT's, the
experiments demonstrate the effectiveness of Tree Transformer in inducing tree
structures, improving language modeling, and learning more explainable
attention scores.
Comment: accepted by EMNLP 2019
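To make the mechanism described in the abstract concrete, the sketch below reconstructs the gist of the "Constituent Attention" idea: adjacent words attend to each other, the two directed probabilities are merged into a link strength, and products of link strengths over a span yield a prior that gates ordinary self-attention. This is a minimal PyTorch sketch of that reading, not the authors' released code; the function constituent_prior, the weight matrices, and the wrap-around handling at sequence edges are illustrative assumptions, and the published model's propagation of link strengths across layers is omitted here.

```python
import torch
import torch.nn.functional as F

def constituent_prior(h, w_q, w_k, eps=1e-9):
    """Sketch of a constituent prior from adjacent-word attention.

    h: (batch, seq_len, dim) token representations.
    Returns C: (batch, seq_len, seq_len), where C[b, i, j] approximates
    the probability that tokens i..j belong to one constituent, as a
    product of link probabilities between neighbouring tokens.
    """
    q, k = h @ w_q, h @ w_k
    d = q.size(-1)

    # Score each token against its right neighbour only: s(i, i+1).
    # The roll wraps at the sequence edge; a real implementation would
    # mask that boundary position.
    s_right = (q * k.roll(-1, dims=1)).sum(-1) / d ** 0.5  # (B, T)
    s_left = s_right.roll(1, dims=1)          # s(i-1, i), seen from token i

    # Each token normalises over its two neighbours.
    p = F.softmax(torch.stack([s_left, s_right], dim=-1), dim=-1)
    p_right = p[..., 1]                       # p(i attends to i+1)
    p_left_next = p[..., 0].roll(-1, dims=1)  # p(i+1 attends to i), at i

    # Link strength between i and i+1: geometric mean of both directions.
    link = torch.sqrt(p_right * p_left_next + eps)          # (B, T)

    # C[i, j] = product of links between i and j, via prefix sums of logs.
    pc = F.pad(torch.cumsum(torch.log(link + eps), dim=1), (1, 0))
    log_c = pc.unsqueeze(1) - pc.unsqueeze(2)  # log_c[b,i,j] = pc[j]-pc[i]
    return torch.exp(-log_c.abs())[:, :h.size(1), :h.size(1)]

# Usage: gate ordinary scaled dot-product attention with the prior.
B, T, D = 2, 8, 64
h = torch.randn(B, T, D)
w_q, w_k = torch.randn(D, D) / D ** 0.5, torch.randn(D, D) / D ** 0.5
C = constituent_prior(h, w_q, w_k)
attn = F.softmax((h @ w_q) @ (h @ w_k).transpose(1, 2) / D ** 0.5, dim=-1)
attn = attn * C                               # constrain to constituents
attn = attn / attn.sum(-1, keepdim=True)      # renormalise each row
```

The prefix-sum-of-logs step computes all pairwise span products in a single O(T^2) broadcast rather than looping over spans, which is why the prior stays cheap enough to apply at every attention layer.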