New Methods, Current Trends and Software Infrastructure for NLP
The increasing use of `new methods' in NLP, which the NeMLaP conference
series exemplifies, occurs in the context of a wider shift in the nature and
concerns of the discipline. This paper begins with a short review of this
context and significant trends in the field. The review motivates and leads to
a set of requirements for support software of general utility for NLP research
and development workers. A freely-available system designed to meet these
requirements is described (called GATE - a General Architecture for Text
Engineering). Information Extraction (IE), in the sense defined by the Message
Understanding Conferences (ARPA \cite{Arp95}), is an NLP application in which
many of the new methods have found a home (Hobbs \cite{Hob93}; Jacobs ed.
\cite{Jac92}). An IE system based on GATE is also available for research
purposes, and this is described. Lastly, we review related work.
Comment: 12 pages, LaTeX, uses nemlap.sty (included)
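The abstract describes GATE as a general architecture in which reusable processing components operate over documents. GATE's actual API is Java-based; the following Python sketch only illustrates the pipeline-of-annotators idea behind such architectures, and all component and class names here are invented for the example, not GATE's API.

```python
# Illustrative sketch of a text-engineering pipeline: each component reads a
# document, adds standoff annotations (start, end, type), and passes it on.
from dataclasses import dataclass, field

@dataclass
class Document:
    text: str
    annotations: list = field(default_factory=list)  # (start, end, type)

def tokenizer(doc):
    # Annotate each whitespace-separated word as a Token.
    pos = 0
    for word in doc.text.split():
        start = doc.text.index(word, pos)
        doc.annotations.append((start, start + len(word), "Token"))
        pos = start + len(word)
    return doc

def uppercase_tagger(doc):
    # Toy downstream component: tag capitalised tokens as candidate names.
    for start, end, kind in list(doc.annotations):
        if kind == "Token" and doc.text[start].isupper():
            doc.annotations.append((start, end, "Name"))
    return doc

def run_pipeline(doc, components):
    for component in components:
        doc = component(doc)
    return doc

doc = run_pipeline(Document("GATE supports Text Engineering"),
                   [tokenizer, uppercase_tagger])
```

The point of the architecture is that components only share the document and its annotations, so they can be swapped or re-ordered without changing each other.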
SWI-Prolog and the Web
Where Prolog is commonly seen as a component in a Web application that is
either embedded or communicates using a proprietary protocol, we propose an
architecture where Prolog communicates to other components in a Web application
using the standard HTTP protocol. By avoiding embedding in external Web servers
development and deployment become much easier. To support this architecture, in
addition to the transfer protocol, we must also support parsing, representing
and generating the key Web document types such as HTML, XML and RDF.
This paper motivates the design decisions in the libraries and extensions to
Prolog for handling Web documents and protocols. The design has been guided by
the requirement to handle large documents efficiently. The described libraries
support a wide range of Web applications ranging from HTML and XML documents to
Semantic Web RDF processing.
To appear in Theory and Practice of Logic Programming (TPLP).
Comment: 31 pages, 24 figures and 2 tables.
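The architectural claim above is that the language runtime should act as a first-class HTTP peer, exchanging standard document types, rather than being embedded in an external web server. SWI-Prolog provides dedicated libraries for this; as a language-neutral illustration only, the same shape can be sketched with Python's standard library (the handler and XML payload are invented for the example):

```python
# Sketch of the proposed architecture: the component speaks plain HTTP itself
# and replies with a standard document type (here, XML), so any other
# component can talk to it without embedding or proprietary protocols.
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class ComponentHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"<?xml version='1.0'?><answer><value>42</value></answer>"
        self.send_response(200)
        self.send_header("Content-Type", "text/xml")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo quiet
        pass

# Port 0 asks the OS for any free port; serve in a background thread.
server = HTTPServer(("127.0.0.1", 0), ComponentHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Any HTTP client is now a valid peer of the component.
url = f"http://127.0.0.1:{server.server_port}/query"
with urllib.request.urlopen(url) as resp:
    reply = resp.read().decode()
server.shutdown()
```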
Design issues in the production of hyper-books and visual-books
This paper describes an ongoing research project in the area of electronic books. After a brief overview of the state of the art in this field, two new forms of electronic book are presented: hyper-books and visual-books. A flexible environment allows them to be produced in a semi-automatic way starting from different sources: electronic texts (as input for hyper-books) and paper books (as input for visual-books). The translation process is driven by the philosophy of preserving the book metaphor in order to guarantee that electronic information is presented in a familiar way. Another important feature of our research is that hyper-books and visual-books are conceived not as isolated objects but as entities within an electronic library, which inherits most of the features of a paper-based library but introduces a number of new properties resulting from its non-physical nature.
Multiple hierarchies : new aspects of an old solution
In this paper, we present the Multiple Annotation approach, which solves two problems: the problem of annotating overlapping structures, and the problem that arises when documents must be annotated according to different, possibly heterogeneous tag sets. This approach has many advantages: it is based on XML, the modeling of alternative annotations is possible, each level can be viewed separately, and new levels can be added at any time. The files can be regarded as an interrelated unit, with the text serving as the implicit link. Two representations of the information contained in the multiple files (one in Prolog and one in XML) are described. These representations serve as a basis for several applications.
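The key idea above is that each annotation level lives in its own XML file while the shared text, addressed by character offsets, is the implicit link between levels, so overlapping structures never have to nest inside one tree. A minimal Python sketch of that standoff scheme (the element and attribute names are illustrative, not the paper's schema):

```python
# Two independent annotation levels over the same text; offsets into the
# shared text are the implicit link between the levels.
import xml.etree.ElementTree as ET

text = "Dogs bark loudly"

def level(name, spans):
    # One XML tree per annotation level, holding standoff segments.
    root = ET.Element("level", name=name)
    for start, end, label in spans:
        ET.SubElement(root, "seg", start=str(start), end=str(end), type=label)
    return root

syntax = level("syntax", [(0, 4, "NP"), (5, 16, "VP")])
prosody = level("prosody", [(0, 9, "phrase"), (10, 16, "stress")])

def spans_at(root, offset):
    """All segment types in one level that cover a character offset."""
    return [seg.get("type") for seg in root.iter("seg")
            if int(seg.get("start")) <= offset < int(seg.get("end"))]

# Character 5 ("bark") is inside the VP on one level and inside the prosodic
# phrase on the other -- an overlap a single XML tree could not encode.
combined = spans_at(syntax, 5) + spans_at(prosody, 5)
```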
Development of Use Cases, Part I
For determining requirements and constructs appropriate for a Web query language, or in fact any language, use cases are of the essence. The W3C has published two sets of use cases for XML and RDF query languages. In this article, solutions for these use cases are presented using Xcerpt, a novel Web and Semantic Web query language that combines access to standard Web data such as XML documents with access to Semantic Web metadata such as RDF resource descriptions, with reasoning abilities and rules familiar from logic programming. To the best knowledge of the authors, this is the first in-depth study of how to solve use cases for accessing XML and RDF in a single language. Integrated access to data and metadata has been recognized by industry and academia as one of the key challenges in data processing for the next decade. This article is a contribution towards addressing this challenge by demonstrating, along practical and recognized use cases, the usefulness of reasoning abilities, rules, and semistructured query languages for accessing both data (XML) and metadata (RDF).
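Xcerpt itself is not runnable here, but the kind of integrated query the article studies can be sketched in Python: select data from an XML document and join it with metadata held as RDF-style triples. The sample document, predicate names, and triples below are all invented for the example.

```python
# Integrated access to data (XML) and metadata (RDF-style triples):
# one query joins structure from the document with facts about it.
import xml.etree.ElementTree as ET

xml_doc = ET.fromstring("""
<library>
  <book id="b1"><title>Logic Programming</title></book>
  <book id="b2"><title>Web Queries</title></book>
</library>""")

# RDF metadata as (subject, predicate, object) triples about the books.
triples = [
    ("b1", "dc:creator", "Kowalski"),
    ("b2", "dc:creator", "Bry"),
    ("b1", "dc:subject", "logic"),
]

def creators_of(title_substring):
    """Join XML data with RDF metadata: matching titles -> their creators."""
    ids = [b.get("id") for b in xml_doc.iter("book")
           if title_substring in b.findtext("title")]
    return [o for s, p, o in triples if p == "dc:creator" and s in ids]

result = creators_of("Logic")
```

In Xcerpt the same join would be expressed declaratively by a rule whose body matches both the XML data and the RDF metadata; the procedural Python version only makes the data flow explicit.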
Special Libraries, Spring 1995
Volume 86, Issue 2
The NASA Astrophysics Data System: Data Holdings
Since its inception in 1993, the ADS Abstract Service has become an
indispensable research tool for astronomers and astrophysicists worldwide. In
those seven years, much effort has been directed toward improving both the
quantity and the quality of references in the database. From the original
database of approximately 160,000 astronomy abstracts, our dataset has grown
almost tenfold to approximately 1.5 million references covering astronomy,
astrophysics, planetary sciences, physics, optics, and engineering. We collect
and standardize data from approximately 200 journals and present the resulting
information in a uniform, coherent manner. With the cooperation of journal
publishers worldwide, we have been able to place scans of full journal articles
on-line back to the first volumes of many astronomical journals, and we are
able to link to the current versions of articles, abstracts, and datasets for
essentially all of the current astronomy literature. The trend toward
electronic publishing in the field, the use of electronic submission of
abstracts for journal articles and conference proceedings, and the increasingly
prominent use of the World Wide Web to disseminate information have enabled the
ADS to build a database unparalleled in other disciplines.
The ADS can be accessed at http://adswww.harvard.edu
Comment: 24 pages, 1 figure, 6 tables, 3 appendices
Topic Map Generation Using Text Mining
Starting from text corpus analysis with linguistic and statistical analysis algorithms, an infrastructure for text mining is described which uses collocation analysis as a central tool. This text mining method may be applied across different domains as well as languages. Some examples taken from large reference databases motivate its applicability to knowledge management using declarative standards of information structuring and description. The ISO/IEC Topic Map standard is introduced as a candidate for rich metadata description of information resources, and it is shown how text mining can be used for automatic topic map generation.