Lexical typology: a programmatic sketch
The present paper is an attempt to lay the foundation for Lexical Typology as a new kind of linguistic typology. The goal of Lexical Typology is to investigate crosslinguistically significant patterns of interaction between lexicon and grammar.
Natural language processing
Beginning with the basic issues of NLP, this chapter aims to chart the major research activities in this area since the last ARIST chapter in 1996 (Haas, 1996), including: (i) natural language text processing systems, such as text summarization, information extraction, and information retrieval, including domain-specific applications; (ii) natural language interfaces; (iii) NLP in the context of the WWW and digital libraries; and (iv) evaluation of NLP systems.
Automated system for the creation and replenishment of users' electronic lexicographical resources
This article proposes a solution to improve the efficiency of automated generation of electronic lexicographical resources based on the processing of strongly structured electronic information arrays. The automated information system developed for creating and replenishing lexicographical resources is described, and several of its supporting subsystems are characterized. The effectiveness of the information system is then evaluated.
Marrying Universal Dependencies and Universal Morphology
The Universal Dependencies (UD) and Universal Morphology (UniMorph) projects each present schemata for annotating the morphosyntactic details of language. Each project also provides corpora of annotated text in many languages: UD at the token level and UniMorph at the type level. As each corpus is built by different annotators, language-specific decisions hinder the goal of universal schemata. With compatible tags, each project's annotations could be used to validate the other's. Additionally, the availability of both type- and token-level resources would be a boon to tasks such as parsing and homograph disambiguation. To ease this interoperability, we present a deterministic mapping from Universal Dependencies v2 features into the UniMorph schema. We validate our approach by lookup in the UniMorph corpora and find a macro-average of 64.13% recall. We also note incompatibilities due to paucity of data on either side. Finally, we present a critical evaluation of the foundations, strengths, and weaknesses of the two annotation projects.
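The deterministic mapping described above can be sketched as a lookup table from UD v2 feature-value pairs into UniMorph tags. The correspondence table below is a toy illustration under assumed feature names, not the paper's actual (far larger, language-aware) mapping:

```python
# Minimal sketch of a deterministic UD-v2 -> UniMorph feature mapping.
# The table is illustrative only; the real mapping covers many more
# features and handles language-specific annotation decisions.
UD_TO_UNIMORPH = {
    ("Number", "Sing"): "SG",
    ("Number", "Plur"): "PL",
    ("Tense", "Past"): "PST",
    ("Tense", "Pres"): "PRS",
    ("Person", "1"): "1",
    ("Person", "2"): "2",
    ("Person", "3"): "3",
    ("VerbForm", "Fin"): "FIN",
}

def ud_to_unimorph(ud_feats: str) -> str:
    """Convert a UD feature string (e.g. 'Number=Sing|Tense=Past')
    into a semicolon-joined UniMorph tag bundle."""
    tags = []
    for pair in ud_feats.split("|"):
        name, value = pair.split("=")
        tag = UD_TO_UNIMORPH.get((name, value))
        if tag is not None:  # skip features absent from the toy table
            tags.append(tag)
    return ";".join(tags)

print(ud_to_unimorph("Number=Sing|Tense=Past|Person=3"))  # SG;PST;3
```

Because the table lookup is a pure function of the UD feature-value pair, the mapping is deterministic: the same UD bundle always yields the same UniMorph bundle, which is what makes cross-validation between the two corpora possible.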
Recognition and Arabic-French translation of Named Entities: the case of sport places
The recognition of Arabic Named Entities (NEs) is a problem in various domains of Natural Language Processing (NLP), such as machine translation. Indeed, NE translation gives access to multilingual information, but it does not always produce the expected result, especially when the NE contains a person name. For this reason, and in order to improve translation, parts of the NE can be transliterated. In this context, we propose a method that integrates translation and transliteration. We use the linguistic platform NooJ, which is based on local grammars and transducers, and focus on the sport domain. We first suggest a refinement of the typological model presented at the MUC conferences, then describe the integration of an Arabic transliteration module into the translation system. Finally, we detail our method and report the results of its evaluation.
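The combination of lexicon-based translation with a transliteration fallback can be sketched as follows. The paper implements this with NooJ local grammars and transducers; the dictionary-based character mapping and the sample lexicon below are purely illustrative assumptions:

```python
# Toy Arabic-to-Latin transliteration by character mapping. The paper uses
# NooJ transducers; this dictionary-based sketch is only illustrative and
# covers a handful of characters.
ARABIC_TO_LATIN = {
    "م": "m", "ح": "h", "د": "d", "ع": "a", "ل": "l", "ي": "i", "ا": "a",
}

def transliterate(word: str) -> str:
    """Map each Arabic character to a Latin equivalent, leaving
    unmapped characters unchanged."""
    return "".join(ARABIC_TO_LATIN.get(ch, ch) for ch in word)

def translate_ne(tokens, lexicon):
    """Combine translation and transliteration: look each token up in a
    bilingual Arabic-French lexicon; fall back to transliteration for
    out-of-lexicon tokens such as person names."""
    return " ".join(lexicon.get(t, transliterate(t)) for t in tokens)

# 'ملعب' (stadium) is found in the hypothetical lexicon; the person name
# 'محمد' is not, so it is transliterated instead.
print(translate_ne(["ملعب", "محمد"], {"ملعب": "stade"}))  # stade mhmd
```

The key design point is the per-token fallback: translation and transliteration are not alternatives applied to the whole NE, but are interleaved token by token, which is what handles mixed NEs such as "stadium of [person name]".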
Automatic indexing and retrieval as a tool to improve information and technology transfer
During the last 20 years, linguistic data processing has mainly been seen as a tool to describe the regularities (or detect the irregularities) of a given natural language, especially in handling large textual databases ("corpora"). A second motivation for using a computer was to test theories or models of a language system (or a part of it) by means of a simulation program. As a result of both strategies, the "Saarbrücken Text Analysis System" has been implemented. At present, a very large lexical database is available for analysing written German texts morphologically and syntactically; the syntactic parser is able to handle any German sentence with more than 90% "correct" results. On the other hand, large (textual) databases in different fields (e.g. law, patent specifications, medicine) are growing rapidly. Therefore, a computer-aided indexing system ("Computergestützte Texterschließung": CTX) has been developed at Regensburg and Saarbrücken University to improve natural-language-oriented access to textual data ("free text") by applying linguistic strategies to information retrieval processes.
Main results of feasibility studies, especially in the field of German Patent Documentation, are presented
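The idea of applying linguistic strategies to free-text indexing can be illustrated with a minimal sketch: tokens are reduced to a crude stem before being stored in an inverted index, so that inflectional variants retrieve the same documents. CTX itself rests on full German morphological analysis; the toy suffix rules below are an assumption for illustration only:

```python
# Minimal sketch of linguistically informed free-text indexing: each token
# is reduced to a crude stem before entering an inverted index. CTX uses
# full German morphological analysis; this rule set is purely illustrative.
from collections import defaultdict

def crude_stem(token: str) -> str:
    """Strip a few common German inflectional endings (toy rules)."""
    for suffix in ("ungen", "ung", "en", "er", "e"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def build_index(docs):
    """Map each stem to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[crude_stem(token)].add(doc_id)
    return index

docs = {"d1": "Erfindungen und Erfindung", "d2": "Patente"}
index = build_index(docs)
print(sorted(index["erfind"]))  # ['d1']
```

Indexing stems rather than surface forms is what lets a free-text query match morphological variants ("Erfindung", "Erfindungen") without the searcher enumerating them, which is the retrieval improvement the abstract describes.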