Search CORE

854 research outputs found

Natural language processing

Author: Adams
Amsler
Bangalore
Barker
Benoît
Bian
Bondale
Carrick
Ceric
Chandrasekar
Chang
Charniak
Chen
Chowdhury
Chowdhury
Costantino
Cowie
Craven
Craven
Craven
Dogru
Evans
Feldman
Fernandez
Gaizauskas
Glasgow
Haas
Hayes
Hayes
Hedlund
Herath
Ide
Isahara
Jelinek
Jeong
Jurafsky
Kazakov
Kehler
Khoo
Kim
King
Lange
Lee
Lehmam
Lehtokangas
Lewis
Liddy
Liddy
Lovis
Ma
Magnini
Mani
Manning
Marquez
Martinez
Martinez
McMurchie
Meyer
Mihalcea
Mock
Moens
Morin
Narita
Nerbonne
Oard
Ogura
Oudet
Owei
Paris
Pasero
Pedersen
Perez-Carballo
Petreley
Pirkola
Poesio
Rosenfield
Roux
Say
Scarlett
Schenker
Silber
Smeaton
Smeaton
Smith
Sokol
Song
Sparck Jones
Staab
Stock
Tolle
Trybula
Tsuda
Vickery
Waldrop
Warner
Weigard
Wilks
Wong
Yang
Yang
Zadrozny
Zweigenbaum
Publication venue: 'Wiley'
Publication date: 01/01/2003
Field of study

Beginning with the basic issues of NLP, this chapter aims to chart the major research activities in this area since the last ARIST Chapter in 1996 (Haas, 1996), including: (i) natural language text processing systems - text summarization, information extraction, information retrieval, etc., including domain-specific applications; (ii) natural language interfaces; (iii) NLP in the context of www and digital libraries ; and (iv) evaluation of NLP systems

Crossref

University of Strathclyde Institutional Repository

OPUS - University of Technology Sydney

ASPICO: Advanced Scientific Portal for International Cooperation on Digital Cultural Conten

Author: Andrès Frédéric
Godard Jérôme
Ono Kinji
Publication venue: Institute of Information Theories and Applications FOI ITHEA
Publication date: 01/01/2004
Field of study

* This research is partially supported by a grant (bourse Lavoisier) from the French Ministry of Foreign Affairs (Ministère des Affaires Etrangères).In this paper, we present the development of an advanced open source multi-lingual cooperative portal system (ASPICO) dedicated to semantic management, and to cooperative exchange for research and education purpose on digital cultural projects. Advantages of using ASPICO include greater flexibility for digital resource management, generic and systematic ontology-based metadata management, and better semantic access and delivery based on an innovative Information Modeling for Adaptive Management (IMAM)

Bulgarian Digital Mathematics Library at IMI-BAS

Ontology-based Information Extraction with SOBA

Author: Buitelaar Paul
Cimiano Philipp
Racioppa Stefania
Siegel Melanie
Publication venue
Publication date: 20/12/2011
Field of study

In this paper we describe SOBA, a sub-component of the SmartWeb multi-modal dialog system. SOBA is a component for ontologybased information extraction from soccer web pages for automatic population of a knowledge base that can be used for domainspecific question answering. SOBA realizes a tight connection between the ontology, knowledge base and the information extraction component. The originality of SOBA is in the fact that it extracts information from heterogeneous sources such as tabular structures, text and image captions in a semantically integrated way. In particular, it stores extracted information in a knowledge base, and in turn uses the knowledge base to interpret and link newly extracted information with respect to already existing entities

Hochschulschriftenserver - Universität Frankfurt am Main

PLuTO: MT for online patent translation

Author: Sheridan Páraic
Tinsley John
Way Andy
Publication venue: Association for Machine Translation in the Americas
Publication date: 01/01/2010
Field of study

PLuTO – Patent Language Translation Online – is a partially EU-funded commercialization project which specializes in the automatic retrieval and translation of patent documents. At the core of the PLuTO framework is a machine translation (MT) engine through which web-based translation services are offered. The fully integrated PLuTO architecture includes a translation engine coupling MT with translation memories (TM), and a patent search and retrieval engine. In this paper, we first describe the motivating factors behind the provision of such a service. Following this, we give an overview of the PLuTO framework as a whole, with particular emphasis on the MT components, and provide a real world use case scenario in which PLuTO MT services are exploited

CiteSeerX

Irish Universities

DCU Online Research Access Service

Semi-Supervised Named Entity Recognition:\ud Learning to Recognize 100 Entity Types with Little Supervision\ud

Author: Nadeau David
Publication venue
Publication date: 28/11/2007
Field of study

Named Entity Recognition (NER) aims to extract and to classify rigid designators in text such as proper names, biological species, and temporal expressions. There has been growing interest in this field of research since the early 1990s. In this thesis, we document a trend moving away from handcrafted rules, and towards machine learning approaches. Still, recent machine learning approaches have a problem with annotated data availability, which is a serious shortcoming in building and maintaining large-scale NER systems. \ud \ud In this thesis, we present an NER system built with very little supervision. Human supervision is indeed limited to listing a few examples of each named entity (NE) type. First, we introduce a proof-of-concept semi-supervised system that can recognize four NE types. Then, we expand its capacities by improving key technologies, and we apply the system to an entire hierarchy comprised of 100 NE types. \ud \ud Our work makes the following contributions: the creation of a proof-of-concept semi-supervised NER system; the demonstration of an innovative noise filtering technique for generating NE lists; the validation of a strategy for learning disambiguation rules using automatically identified, unambiguous NEs; and finally, the development of an acronym detection algorithm, thus solving a rare but very difficult problem in alias resolution. \ud \ud We believe semi-supervised learning techniques are about to break new ground in the machine learning community. In this thesis, we show that limited supervision can build complete NER systems. On standard evaluation corpora, we report performances that compare to baseline supervised systems in the task of annotating NEs in texts. \u

CogPrints Cognitive Sciences Eprint Archive

The six challenges of the Semantic Web

Author: Benjamins R.
Contreras Jesús
Corcho Oscar
Gómez-Pérez A.
Publication venue: Facultad de Informática (UPM)
Publication date: 01/04/2002
Field of study

The Semantic Web has attracted a diverse, but significant, community of researchers, institutes and companies, all sharing the belief that one day the Semantic Web will have as big an impact on life as currently the WWW/Internet has. We share that vision, based on the ever-increasing need to reduce information overload, and to increase task delegation to software agents. However, there is still a long way to go before the Semantic Web dream comes true. In this paper, we identify some of the major challenges the community faces in the coming years, and we outline solution directions. The major challenges concern: (i) the availability of content, (ii) ontology availability, development and evolution, (iii) scalability, (iv) multilinguality, (v) visualization to reduce information overload, and (vi) stability of Semantic Web languages. We will also say some words on the economic impact of the Semantic Web

Archivo Digital UPM

All mixed up? Finding the optimal feature set for general readability prediction and its application to English and Dutch

Author: De Clercq Orphée
Hoste Veronique
Publication venue: 'MIT Press - Journals'
Publication date: 01/01/2016
Field of study

Readability research has a long and rich tradition, but there has been too little focus on general readability prediction without targeting a specific audience or text genre. Moreover, though NLP-inspired research has focused on adding more complex readability features there is still no consensus on which features contribute most to the prediction. In this article, we investigate in close detail the feasibility of constructing a readability prediction system for English and Dutch generic text using supervised machine learning. Based on readability assessments by both experts and a crowd, we implement different types of text characteristics ranging from easy-to-compute superficial text characteristics to features requiring a deep linguistic processing, resulting in ten different feature groups. Both a regression and classification setup are investigated reflecting the two possible readability prediction tasks: scoring individual texts or comparing two texts. We show that going beyond correlation calculations for readability optimization using a wrapper-based genetic algorithm optimization approach is a promising task which provides considerable insights in which feature combinations contribute to the overall readability prediction. Since we also have gold standard information available for those features requiring deep processing we are able to investigate the true upper bound of our Dutch system. Interestingly, we will observe that the performance of our fully-automatic readability prediction pipeline is on par with the pipeline using golden deep syntactic and semantic information

Crossref

Ghent University Academic Bibliography

Using HLT for acquiring, retrieving and publishing knowledge in AKT:position paper

Author: Bontcheva Kalina
Brewster Christopher
Ciravegna Fabio
Cunningham Hamish
Gaizauskas Robert
Guthrie Louise
Wilks Yorick
Publication venue
Publication date: 01/01/2001
Field of study

AKT is a major research project applying a variety of technologies to knowledge management. Knowledge is a dynamic, ubiquitous resource, which is to be found equally in an expert's head, under terabytes of data, or explicitly stated in manuals. AKT will extend knowledge management technologies to exploit the potential of the semantic web, covering the use of knowledge over its entire lifecycle, from acquisition to maintenance and deletion. In this paper we discuss how HLT will be used in AKT and how the use of HLT will affect different areas of KM, such as knowledge acquisition, retrieval and publishing

Aston Publications Explorer