40,022 research outputs found
A modular methodology for converting large, complex books into usable, accessible and standards-compliant ebooks
This report describes the methodology used for ebook creation for the Glasgow Digital Library (GDL), and provides detailed instructions on how the same methodology could be used elsewhere. The document includes a description and explanation of the processes for ebook creation followed by a tutorial
Topic Map Generation Using Text Mining
Starting from text corpus analysis with linguistic and statistical analysis algorithms, an infrastructure for text mining is described which uses collocation analysis as a central tool. This text mining method may be applied to different domains as well as languages. Some examples taken form large reference databases motivate the applicability to knowledge management using declarative standards of information structuring and description. The ISO/IEC Topic Map standard is introduced as a candidate for rich metadata description of information resources and it is shown how text mining can be used for automatic topic map generation
A study of systems implementation languages for the POCCNET system
The results are presented of a study of systems implementation languages for the Payload Operations Control Center Network (POCCNET). Criteria are developed for evaluating the languages, and fifteen existing languages are evaluated on the basis of these criteria
NASA automatic subject analysis technique for extracting retrievable multi-terms (NASA TERM) system
Current methods for information processing and retrieval used at the NASA Scientific and Technical Information Facility are reviewed. A more cost effective computer aided indexing system is proposed which automatically generates print terms (phrases) from the natural text. Satisfactory print terms can be generated in a primarily automatic manner to produce a thesaurus (NASA TERMS) which extends all the mappings presently applied by indexers, specifies the worth of each posting term in the thesaurus, and indicates the areas of use of the thesaurus entry phrase. These print terms enable the computer to determine which of several terms in a hierarchy is desirable and to differentiate ambiguous terms. Steps in the NASA TERMS algorithm are discussed and the processing of surrogate entry phrases is demonstrated using four previously manually indexed STAR abstracts for comparison. The simulation shows phrase isolation, text phrase reduction, NASA terms selection, and RECON display
Artequakt: Generating tailored biographies from automatically annotated fragments from the web
The Artequakt project seeks to automatically generate narrativebiographies of artists from knowledge that has been extracted from the Web and maintained in a knowledge base. An overview of the system architecture is presented here and the three key components of that architecture are explained in detail, namely knowledge extraction, information management and biography construction. Conclusions are drawn from the initial experiences of the project and future progress is detailed
- ā¦