234 research outputs found

    27 pawns ready for action: A multi-indicator methodology and evaluation of thesaurus management tools from a LOD perspective

    Purpose – The purpose of this paper is to propose a methodology for assessing thesauri and other controlled vocabulary management tools that can represent content using the Simple Knowledge Organization System (SKOS) data model, and their use in a Linked Open Data (LOD) paradigm. It analyses a selected set of tools in order to prove the validity of the method. Design/methodology/approach – A set of 27 criteria grouped in five evaluation indicators is proposed and applied to ten vocabulary management applications which are compliant with the SKOS data model. Previous studies of controlled vocabulary management software are gathered and analyzed, to compare the evaluation parameters used and the results obtained for each tool. Findings – The results indicate that the tool that obtains the highest score in every indicator is PoolParty. The second and third tools are, respectively, TemaTres and Intelligent Theme Manager, although they score lower in most of the evaluation items. The use of a broad set of criteria to evaluate vocabulary management tools gives satisfactory results. The set of five indicators and 27 criteria proposed here represents a useful evaluation system in the selection of current and future tools to manage vocabularies. Research limitations/implications – The paper only assesses the ten most important/well-known software tools applied for thesaurus and vocabulary management until October 2016. However, the evaluation criteria could be applied to new software that could appear in the future to create/manage SKOS vocabularies in compliance with LOD standards. Originality/value – The originality of this paper lies in the proposed indicators and criteria to evaluate vocabulary management tools. Those criteria and indicators can also be valuable for future software that might appear. The indicators are also applied to the most exhaustive and qualified list of this kind of tools. The paper will help designers, information architects, metadata librarians, and other staff involved in the design of digital information systems, to choose the right tool to manage their vocabularies in a LOD/vocabulary scenario

    Towards a Similarity Algorithm for Controlled Vocabularies Within the Digital Humanities

    With the growing amount and increasing complexity of data and metadata in the Digital Humanities, the use of semantic tools such as controlled vocabularies and taxonomies becomes more and more important to gain new research insights. Their use enables new research possibilities by introducing machine-readable semantic links and standardised data and metadata. A validation and recommender system that ensures the quick development of high-quality vocabularies is essential in such a scientific workflow. The basis of this system is a similarity algorithm. State-of-the-art algorithms and editors for controlled vocabularies do not meet the special requirements of the Digital Humanities domain. Therefore, this work proposes to fill the research gap in the Digital Humanities domain with a similarity algorithm and a recommender and validation system for controlled vocabularies. The methodology and evaluation for achieving this goal as well as preliminary results are presented in this contribution

    A Semantic Web methodological framework to evaluate the support of integrity in thesaurus tools

    12 p. With the Semantic Web, thesauri recover a relevant role supporting semantic searches and other added-value services. Thesaurus standards define what constructs a thesaurus can have and the integrity rules it must comply with. Thesaurus editors can be helped in their work if thesaurus tools offer them support for integrity, warning when integrity rules are violated and/or helping them to correct these mistakes. The most recent thesaurus standard is ISO 25964, which supersedes ISO 2788, evolving towards concept-based thesauri, better aligned with the Semantic Web approach than the term-based thesauri of ISO 2788. However, the W3C recommendation for KOS (Knowledge Organization System) representation in the Semantic Web context is SKOS, which in fact predates ISO 25964. This paper focuses on thesaurus integrity and the evolution from ISO 2788 to ISO 25964. Its effect on integrity issues is analyzed. A methodological proposal for evaluating integrity support in thesaurus tools, arising from the results of this work, is presented. Its target audience is professionals in charge of thesaurus editing. Besides being adapted to the most recent thesaurus standard, ISO 25964, it also includes the comparison of ISO standards with SKOS. The paper is completed with the presentation of the results of applying it to three thesaurus tools

    Let's Agree to Disagree: On the Evaluation of Vocabulary Alignment


    Improving search engines with open Web-based SKOS vocabularies

    Dissertation submitted for the degree of Master in Computer Engineering. The volume of digital information keeps growing, and even though organizations are making more of this information available, without the proper tools users have great difficulties in retrieving documents about subjects of interest. Good information retrieval mechanisms are crucial for answering user information needs. Nowadays, search engines are unavoidable: they are an essential feature in document management systems. However, achieving good relevancy is a difficult problem, particularly when dealing with specific technical domains where vocabulary mismatch problems can be prejudicial. Numerous research works found that exploiting the lexical or semantic relations of terms in a collection attenuates this problem. In this dissertation, we aim to improve search results and user experience by investigating the use of potentially connected Web vocabularies in information retrieval engines. In the context of open Web-based SKOS vocabularies we propose a query expansion framework implemented in a widely used IR system (Lucene/Solr), and evaluated using standard IR evaluation datasets. The components described in this thesis were applied in the development of a new search system that was integrated with a rapid applications development tool in the context of an internship at Quidgest S.A. Fundação para a Ciência e Tecnologia - ImTV research project, in the context of the UTAustin-Portugal collaboration (UTA-Est/MAI/0010/2009); QSearch project (FCT/Quidgest

    Validation Framework for RDF-based Constraint Languages

    In this thesis, a validation framework is introduced that makes it possible to consistently execute RDF-based constraint languages on RDF data and to formulate constraints of any type. The framework reduces the representation of constraints to the absolute minimum, is based on formal logics, consists of a small lightweight vocabulary, ensures consistency regarding validation results, and enables constraint transformations for each constraint type across RDF-based constraint languages

    Thesauri and Semantic Web: Discussion of the Evolution of Thesauri toward their Integration With the Semantic Web

    15 p. Thesauri are Knowledge Organization Systems (KOS), that arise from the consensus of wide communities. They have been in use for many years and are regularly updated. Whereas in the past thesauri were designed for information professionals for indexing and searching, today there is a demand for conceptual vocabularies that enable inferencing by machines. The development of the Semantic Web has brought a new opportunity for thesauri, but thesauri also face the challenge of proving that they add value to it. The evolution of thesauri toward their integration with the Semantic Web is examined. Elements and structures in the thesaurus standard, ISO 25964, and SKOS (Simple Knowledge Organization System), the Semantic Web standard for representing KOS, are reviewed and compared. Moreover, the integrity rules of thesauri are contrasted with the axioms of SKOS. How SKOS has been applied to represent some real thesauri is taken into account. Three thesauri are chosen for this aim: AGROVOC, EuroVoc and the UNESCO Thesaurus. Based on the results of this comparison and analysis, the benefits that Semantic Web technologies offer to thesauri, how thesauri can contribute to the Semantic Web, and the challenges that would help to improve their integration with the Semantic Web are discussed.

    The Darwin Core extension for genebanks opens up new opportunities for sharing genebank datasets

    Darwin Core (DwC) defines a standard set of terms to describe the primary biodiversity data. Primary biodiversity data are data records derived from direct observation of species occurrences in nature or describing specimens in biological collections. The Darwin Core terms can be seen as an extension to the standard Dublin Core metadata terms. The new Darwin Core extension for genebanks declares the additional terms required for describing genebank datasets, and is based on established standards from the plant genetic resources community. The Global Biodiversity Information Facility (GBIF) provides an information infrastructure for biodiversity data including a suite of software tools for data publishing, distributed data access, and the capture of biodiversity data. The Darwin Core extension for genebanks is a key component that provides access for the genebanks and the plant genetic resources community to the GBIF informatics infrastructure including the new toolkits for data exchange. This paper provides one of the first examples and guidelines for how to create extensions to the Darwin Core standard