5 research outputs found

    A software processing chain for evaluating thesaurus quality

    Thesauri are knowledge models commonly used for information classification and retrieval. Their structure is defined by standards that describe the main features their concepts and relations must have. However, following these standards requires deep knowledge of the field the thesaurus is going to cover, as well as experience in thesaurus creation. To help in this task, this paper describes a software processing chain whose validation components evaluate the quality of the main thesaurus features.

    An automatic method for reporting the quality of thesauri

    Thesauri are knowledge models commonly used for information classification and retrieval whose structure is defined by standards such as ISO 25964. However, when creators do not follow the specifications correctly, they construct models with inadequate concepts or relations that offer limited usability. This paper describes a process that automatically analyzes the thesaurus properties and relations with respect to the ISO 25964 specification and suggests corrections for potential problems. It performs a lexical and syntactic analysis of the concept labels, and a structural and semantic analysis of the relations. The process has been tested with the Urbamet and Gemet thesauri, and the results have been analyzed to determine how well the proposed process works.

    Assessing Knowledge Organization Systems from a gender perspective: Wikipedia Taxonomy and Wikidata Ontologies

    The aim is to develop a comprehensive framework for assessing knowledge organization systems (KOS), including the taxonomy of Wikipedia and the ontologies of Wikidata, with a specific focus on enhancing management and retrieval with a gender non-binary perspective. This study employs heuristic and inspection methods to assess Wikipedia's Knowledge Organization Systems, ensuring compliance with international standards. It evaluates the efficiency of retrieving non-masculine gender-related articles using the Catalan Wikipedian category scheme, identifying limitations. Additionally, a novel assessment of Wikidata ontologies examines their structure and coverage of gender-related properties, comparing them to Wikipedia's taxonomy for advantages and enhancements. This study evaluates Wikipedia's taxonomy and Wikidata's ontologies, establishing evaluation criteria for gender-based categorization and exploring their structural effectiveness. The evaluation process suggests that Wikidata ontologies may offer a viable solution to address Wikipedia's categorization challenges. The assessment of Wikipedia categories (taxonomy) based on Knowledge Organization System standards leads to the conclusion that there is ample room for improvement, not only in matters concerning gender identity but also in the overall knowledge organization system, to enhance search and retrieval for users. These findings bear relevance for the design of tools to support information retrieval on knowledge-rich websites, as they assist users in exploring topics and concepts.

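    Arcabouço de arquitetura da informação para ciclo de vida de projeto de vocabulário controlado: uma aplicação em Engenharia de Software [Information architecture framework for the controlled vocabulary project life cycle: an application in Software Engineering]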

    Doctoral thesis, Universidade de Brasília, Faculdade de Ciência da Informação, Programa de Pós-Graduação em Ciência da Informação, 2017. The research that resulted in this thesis investigated the development and evaluation processes of controlled vocabularies. The thesis includes the following elements: the results of a literature review on information architecture, information retrieval, information organization and information representation; a proposed framework for the controlled vocabulary project life cycle; and an example of using elements of this framework to build a prototype controlled vocabulary in the Software Engineering domain. The proposed framework is composed of a reference architecture, a domain model, a quality model and a list of activities. Among the elements of the proposed quality model, there is a list of quality characteristics of controlled vocabularies. The proposed models are partially aligned with existing semantic tools.