Explaining Character-Aware Neural Networks for Word-Level Prediction: Do They Discover Linguistic Rules?
Character-level features are currently used in different neural network-based natural language processing algorithms. However, little is known about the character-level patterns those models learn. Moreover, models are often compared only quantitatively, while a qualitative analysis is missing. In this paper, we investigate which character-level patterns neural networks learn and whether those patterns coincide with manually defined word segmentations and annotations. To that end, we extend the contextual decomposition technique (Murdoch et al. 2018) to convolutional neural networks, which allows us to compare convolutional neural networks and bidirectional long short-term memory networks. We evaluate and compare these models on the task of morphological tagging across three morphologically different languages and show that these models implicitly discover understandable linguistic rules. Our implementation can be found at https://github.com/FredericGodin/ContextualDecomposition-NLP. Accepted at EMNLP 2018.
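The core idea of contextual decomposition can be sketched for a single convolutional filter: split each output position's pre-activation into a part traceable to a chosen character span and a remainder, then share the nonlinearity between the two parts. The sketch below is a simplified illustration under our own assumptions (the function name, the proportional ReLU split, and the toy shapes are ours), not the authors' implementation; their repository linked above contains the real one.

```python
import numpy as np

def cd_conv1d(x, w, b, span):
    """Contextual decomposition of one 1D conv filter (a sketch).

    x: (seq_len, emb) character embeddings
    w: (width, emb) filter weights; b: scalar bias
    span: (start, end) indices of the characters whose contribution we trace
    Returns per-position (relevant, irrelevant) post-ReLU contributions
    such that relevant + irrelevant equals the filter's actual output.
    """
    seq_len, _ = x.shape
    width = w.shape[0]
    n_out = seq_len - width + 1
    rel = np.zeros(n_out)
    irr = np.zeros(n_out)
    for t in range(n_out):
        for k in range(width):
            # each input position's linear contribution is attributable
            contrib = float(x[t + k] @ w[k])
            if span[0] <= t + k < span[1]:
                rel[t] += contrib
            else:
                irr[t] += contrib
    # Linearize the ReLU: split the activation between the two parts in
    # proportion to their magnitudes (a simplification of Murdoch et al.).
    z = rel + irr + b
    act = np.maximum(z, 0.0)
    total = np.abs(rel) + np.abs(irr)
    denom = np.where(total > 0, total, 1.0)
    rel_out = act * np.abs(rel) / denom
    irr_out = act - rel_out
    return rel_out, irr_out
```

By construction the two parts always sum to the filter's true activation, so the "relevant" share can be read as the span's contribution to the prediction.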
Natural Language Processing
The subject of Natural Language Processing can be considered in both broad and narrow senses. In the broad sense, it covers processing issues at all levels of natural language understanding, including speech recognition, syntactic and semantic analysis of sentences, reference to the discourse context (including anaphora, inference of referents, and more extended relations of discourse coherence and narrative structure), conversational inference and implicature, and discourse planning and generation. In the narrower sense, it covers the syntactic and semantic processing of sentences to deliver semantic objects suitable for referring, inferring, and the like. Of course, the results of inference and reference may under some circumstances play a part in processing in the narrow sense. But the processes that are characteristic of these other modules are not the primary concern.
Larger-first partial parsing
Larger-first partial parsing is a primarily top-down approach to partial parsing that is opposite to current easy-first, or primarily bottom-up, strategies. A rich partial tree structure is captured by an algorithm that assigns a hierarchy of structural tags to each of the input tokens in a sentence. Part-of-speech tags are first assigned to the words in a sentence by a part-of-speech tagger. A cascade of Deterministic Finite State Automata then uses this part-of-speech information to identify syntactic relations primarily in a descending order of their size. The cascade is divided into four specialized sections: (1) a Comma Network, which identifies syntactic relations associated with commas; (2) a Conjunction Network, which partially disambiguates phrasal conjunctions and fully disambiguates clausal conjunctions; (3) a Clause Network, which identifies non-comma-delimited clauses; and (4) a Phrase Network, which identifies the remaining base phrases in the sentence. Each automaton is capable of adding one or more levels of structural tags to the tokens in a sentence. The larger-first approach is compared against a well-known easy-first approach. The results indicate that this larger-first approach is capable of (1) producing a more detailed partial parse than an easy-first approach; (2) providing better containment of attachment ambiguity; (3) handling overlapping syntactic relations; and (4) achieving a higher accuracy than the easy-first approach. The automata of each network were developed by an empirical analysis of several sources and are presented here in detail.
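The larger-first idea (identify larger structures before smaller ones) can be illustrated as a cascade of patterns applied to a POS-tag string, from clause level down to base phrases. The regexes, tag inventory, and labels below are invented for illustration; the actual system uses cascades of deterministic finite-state automata organized into the four networks described above, not two regexes.

```python
import re

# Toy "larger-first" cascade: stages run from larger structures (clauses)
# down to base phrases -- the reverse of easy-first chunking.
CASCADE = [
    ("CLAUSE", re.compile(r"\bDT NN VBD DT NN\b")),  # crude clause shape
    ("NP",     re.compile(r"\bDT NN\b")),            # base noun phrase
]

def larger_first(tags):
    """Assign a hierarchy of structural labels to each token position."""
    text = " ".join(tags)
    labels = [[] for _ in tags]
    for name, pat in CASCADE:
        for m in pat.finditer(text):
            # map character offsets back to token indices by counting spaces
            start = text[: m.start()].count(" ")
            end = start + m.group().count(" ") + 1
            for i in range(start, end):
                labels[i].append(name)
    return labels
```

Because the clause stage runs first, each token accumulates its labels from largest to smallest, yielding the layered structural tags the approach describes.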
Automatic part-of-speech tagging: adapting the Brill tagger to French and modifying the approach
Honour roll (Tableau d'honneur) of the Faculté des études supérieures et postdoctorales, 2004-2005. Automatic part-of-speech tagging is an area where much remains to be done. Very good taggers exist for English, but those available to the French-speaking community are much less effective. We therefore trained the Brill tagger on French and then improved its results. Moreover, whatever technique is used, certain problems remain unsolved: unknown words are still difficult to tag correctly. We attempted to find solutions to this problem. In sum, we made a series of modifications to Brill's approach and evaluated their impact on performance. These modifications raised the accuracy on unknown French words from 70.7% to 78.6%. We have thus noticeably improved performance, although much work remains before the handling of unknown French words is satisfactory.
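In Brill's framework, the unknown-word problem discussed above is typically attacked with ordered transformation rules conditioned on affixes: start from a default tag, then let each matching rule rewrite it. A minimal sketch with invented rules follows (real Brill rules are learned from a corpus and also condition on the current tag):

```python
# Illustrative affix rules for French unknown words; these are made up
# for the sketch, not the rules learned in the thesis.
RULES = [
    ("suffix", "ment", "ADV"),   # doucement -> adverb
    ("suffix", "tion", "NOUN"),  # modification -> noun
    ("suffix", "er",   "VERB"),  # catégoriser -> infinitive verb
]

def guess_tag(word, default="NOUN"):
    """Start from a default tag, then apply ordered rules in sequence."""
    tag = default
    for kind, affix, new_tag in RULES:
        if kind == "suffix" and word.endswith(affix):
            tag = new_tag
        elif kind == "prefix" and word.startswith(affix):
            tag = new_tag
    return tag
```

Because later rules can overwrite earlier guesses, rule ordering matters, which is exactly what transformation-based learning optimizes.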
Designing Service-Oriented Chatbot Systems Using a Construction Grammar-Driven Natural Language Generation System
Service-oriented chatbot systems are used to inform users in a conversational manner about a particular service or product on a website. Our research shows that current systems are time-consuming to build and not very accurate or satisfying to users. We find that natural language understanding and natural language generation methods are central to creating an efficient and useful system. In this thesis we investigate current and past methods in this research area and place particular emphasis on Construction Grammar and its computational implementation. Our research shows that users have strong emotive reactions to how these systems behave, so we also investigate the human-computer interaction component. We present three systems (KIA, John and KIA2) and carry out extensive user tests on all of them, as well as comparative tests. KIA is built using existing methods, John is built with the user in mind, and KIA2 is built using the construction grammar method. We found that the construction grammar approach performs well in service-oriented chatbot systems, and that users preferred it over the other systems.
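The construction-grammar generation idea can be caricatured as pairing a meaning with a form template whose slots are filled from a semantic frame. The constructions, frame format, and slot names below are invented for illustration; the thesis's actual implementation is far richer than a template lookup.

```python
# Each "construction" pairs a meaning predicate with a form template.
# All entries here are illustrative, not taken from the KIA2 system.
CONSTRUCTIONS = [
    {"meaning": "availability",
     "form": "The {product} is available in {options}."},
    {"meaning": "greeting",
     "form": "Hello! How can I help you with the {product}?"},
]

def generate(frame):
    """Pick the construction whose meaning matches the frame, fill its slots."""
    for cxn in CONSTRUCTIONS:
        if cxn["meaning"] == frame["meaning"]:
            return cxn["form"].format(**frame["args"])
    return "Sorry, I don't know how to say that."
```

A frame such as `{"meaning": "availability", "args": {"product": "widget", "options": "red or blue"}}` then yields a complete service reply, keeping form and meaning explicitly paired as Construction Grammar prescribes.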