7 research outputs found

Computer-aided Document Indexing System

Author: Bojana Dalbelo Bašić
Igor Vukmirović
Jan Šnajder
Mladen Kolar
Publication venue: 'University of Zagreb - University Computing Centre'
Publication date: 01/01/2005

An enormous number of documents are being produced that have to be stored, searched and accessed. Document indexing represents an efficient way to tackle this problem. Contributing to the document indexing process, we developed the Computer-Aided Document Indexing System (CADIS) that applies controlled vocabulary keywords from the EUROVOC thesaurus. The main contribution of this paper is the introduction of the special CADIS internal data structure that copes with the morphological complexity of the Croatian language. The CADIS internal data structure ensures efficient statistical analysis of input documents and quick visual feedback generation, which helps index documents more quickly, accurately and uniformly than manual indexing.

Crossref

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

TMT: Object-Oriented Text Classification Library

Author: Frane Šarić
Bojana Dalbelo Bašić
Artur Šilić
Jan Šnajder
Publication date: 06/03/2020

CiteSeerX

Usage-driven Maintenance of Knowledge Organization Systems

Author: Eckert Kai
Publication venue: Universität Mannheim
Publication date: 01/01/2012

Knowledge Organization Systems (KOS) are typically used as background knowledge for document indexing in information retrieval. They have to be maintained and adapted constantly to reflect changes in the domain and the terminology. In this thesis, approaches are provided that support the maintenance of hierarchical knowledge organization systems, like thesauri, classifications, or taxonomies, by making information about the usage of KOS concepts available to the maintainer. The central contribution is the ICE-Map Visualization, a treemap-based visualization on top of a generalized statistical framework that is able to visualize almost arbitrary usage information. The proper selection of an existing KOS for available documents and the evaluation of a KOS for different indexing techniques by means of the ICE-Map Visualization is demonstrated. For the creation of a new KOS, an approach based on crowdsourcing is presented that uses feedback from Amazon Mechanical Turk to relate terms hierarchically. The extension of an existing KOS with new terms derived from the documents to be indexed is performed with a machine-learning approach that relates the terms to existing concepts in the hierarchy. The features are derived from text snippets in the result list of a web search engine. For the splitting of overpopulated concepts into new subconcepts, an interactive clustering approach is presented that is able to propose names for the new subconcepts. The implementation of a framework is described that integrates all approaches of this thesis and contains the reference implementation of the ICE-Map Visualization. It is extendable and supports the implementation of evaluation methods that build on other evaluations. Additionally, it supports the visualization of the results and the implementation of new visualizations. An important building block for practical applications is the simple linguistic indexer that is presented as a minor contribution. It is knowledge-poor and works without any training. This thesis applies computer science approaches in the domain of information science. The introduction describes the foundations in information science; in the conclusion, the focus is set on the relevance for practical applications, especially regarding the handling of different qualities of KOSs due to automatic and semiautomatic maintenance.

MAnnheim DOCument Server

In-house indexing of periodical literature : a study of university libraries in Kenya

Author: Matanji Peter Hezron Marisia
Publication date: 01/03/2012

The present study investigated identification, access and usage of periodicals in university libraries in Kenya, with a view to recommending a tool for assisting users to identify information. Using questionnaires completed by 316 university library users and 27 librarians, backed by participant observations, document analysis and interviews, the study found that usage of periodicals was low, as most users browse through periodicals to identify information, a method that is not effective. In-house indexing was investigated and found to be an effective tool in facilitating access to relevant information. The study recommends establishment of in-house indexing programs and databases in university libraries; formulation of consistent indexing policies to achieve quality indexing; and that indexing should be focused on both content and user requirements by specifying points of view and study methodologies to enhance retrieval of relevant information.
Information Science. M. A. (Information Science)

Unisa Institutional Repository

Informação e/ou conhecimento : as duas faces de Jano : atas [Information and/or knowledge: the two faces of Janus: proceedings]

Author: Cerveira Elisa
Ribeiro Fernanda
Publication venue: Porto : Universidade do Porto. Faculdade de Letras. CETAC.MEDIA
Publication date: 01/01/2013

Repositório Aberto da Universidade do Porto