344,750 research outputs found
Introduction to the special issue on cross-language algorithms and applications
With the increasingly global nature of our everyday interactions, the need for multilingual technologies to support efficient and efective information access and communication cannot be overemphasized. Computational modeling of language has been the focus of
Natural Language Processing, a subdiscipline of Artificial Intelligence. One of the current challenges for this discipline is to design methodologies and algorithms that are cross-language in order to create multilingual technologies rapidly. The goal of this JAIR special
issue on Cross-Language Algorithms and Applications (CLAA) is to present leading research in this area, with emphasis on developing unifying themes that could lead to the development of the science of multi- and cross-lingualism. In this introduction, we provide the reader with the motivation for this special issue and summarize the contributions of the papers that have been included. The selected papers cover a broad range of cross-lingual technologies including machine translation, domain and language adaptation for sentiment
analysis, cross-language lexical resources, dependency parsing, information retrieval and knowledge representation. We anticipate that this special issue will serve as an invaluable resource for researchers interested in topics of cross-lingual natural language processing.Postprint (published version
On staying grounded and avoiding Quixotic dead ends
The 15 articles in this special issue on The Representation of Concepts illustrate the rich variety of theoretical positions and supporting research that characterize the area. Although much agreement exists among contributors, much disagreement exists as well, especially about the roles of grounding and abstraction in conceptual processing. I first review theoretical approaches raised in these articles that I believe are Quixotic dead ends, namely, approaches that are principled and inspired but likely to fail. In the process, I review various theories of amodal symbols, their distortions of grounded theories, and fallacies in the evidence used to support them. Incorporating further contributions across articles, I then sketch a theoretical approach that I believe is likely to be successful, which includes grounding, abstraction, flexibility, explaining classic conceptual phenomena, and making contact with real-world situations. This account further proposes that (1) a key element of grounding is neural reuse, (2) abstraction takes the forms of multimodal compression, distilled abstraction, and distributed linguistic representation (but not amodal symbols), and (3) flexible context-dependent representations are a hallmark of conceptual processing
Natural language processing
Beginning with the basic issues of NLP, this chapter aims to chart the major research activities in this area since the last ARIST Chapter in 1996 (Haas, 1996), including: (i) natural language text processing systems - text summarization, information extraction, information retrieval, etc., including domain-specific applications; (ii) natural language interfaces; (iii) NLP in the context of www and digital libraries ; and (iv) evaluation of NLP systems
Sequence Mining and Pattern Analysis in Drilling Reports with Deep Natural Language Processing
Drilling activities in the oil and gas industry have been reported over
decades for thousands of wells on a daily basis, yet the analysis of this text
at large-scale for information retrieval, sequence mining, and pattern analysis
is very challenging. Drilling reports contain interpretations written by
drillers from noting measurements in downhole sensors and surface equipment,
and can be used for operation optimization and accident mitigation. In this
initial work, a methodology is proposed for automatic classification of
sentences written in drilling reports into three relevant labels (EVENT,
SYMPTOM and ACTION) for hundreds of wells in an actual field. Some of the main
challenges in the text corpus were overcome, which include the high frequency
of technical symbols, mistyping/abbreviation of technical terms, and the
presence of incomplete sentences in the drilling reports. We obtain
state-of-the-art classification accuracy within this technical language and
illustrate advanced queries enabled by the tool.Comment: 7 pages, 14 figures, technical repor
CHR as grammar formalism. A first report
Grammars written as Constraint Handling Rules (CHR) can be executed as
efficient and robust bottom-up parsers that provide a straightforward,
non-backtracking treatment of ambiguity. Abduction with integrity constraints
as well as other dynamic hypothesis generation techniques fit naturally into
such grammars and are exemplified for anaphora resolution, coordination and
text interpretation.Comment: 12 pages. Presented at ERCIM Workshop on Constraints, Prague, Czech
Republic, June 18-20, 200
Efficient deep processing of japanese
We present a broad coverage Japanese grammar written in the HPSG formalism with MRS semantics. The grammar is created for use in real world applications, such that robustness and performance issues play an important role. It is connected to a POS tagging and word segmentation tool. This grammar is being developed in a multilingual context, requiring MRS structures that are easily comparable across languages
- …