
    A language for defining the correlation between proprietary digital library metadata and Dublin Core metadata, and its use in Z39.50 servers

    Dissertation (master's) - Universidade Federal de Santa Catarina, Centro Tecnológico, Programa de Pós-Graduação em Ciência da Computação. The number of organisations building their own digital libraries keeps growing, since digital libraries offer an efficient way to organise the storage of a collection and make information easy to retrieve. The knowledge stored in digital libraries gains recognised value as it becomes available for research, so interoperability between digital libraries is an important problem to solve. Some initiatives address it, such as the Z39.50 protocol and the Dublin Core metadata standard, but many proprietary digital library implementations do not adopt them. Solutions exist today that allow interoperability standards to be retrofitted onto proprietary libraries, but installing them is complex and can even require recoding. The goal of this dissertation is to propose a solution that makes it easier to adapt Z39.50 servers to proprietary digital libraries built on relational databases. To this end, an XML-based language was defined for expressing the correlation between proprietary digital library metadata and Dublin Core metadata, and a procedure was defined for using this language in Z39.50 servers.
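
    The dissertation's correlation language itself is not reproduced in the abstract. As a rough illustration of the idea, the sketch below applies a hypothetical XML correlation file (element names, database columns and Dublin Core targets are all invented for this example) to a relational row to produce a Dublin Core record; it is not the syntax actually defined in the thesis.

```python
# Hypothetical sketch: applying an XML-defined correlation between
# proprietary database fields and Dublin Core elements.
# The mapping syntax below is an assumption for illustration only;
# it does not reproduce the dissertation's actual language.
import xml.etree.ElementTree as ET

MAPPING_XML = """
<correlation table="acervo">
  <map source="titulo" dc="title"/>
  <map source="autor"  dc="creator"/>
  <map source="ano"    dc="date"/>
</correlation>
"""

def load_mapping(xml_text: str) -> dict:
    """Read <map source=... dc=.../> entries into a {source: dc} dict."""
    root = ET.fromstring(xml_text)
    return {m.get("source"): m.get("dc") for m in root.findall("map")}

def to_dublin_core(row: dict, mapping: dict) -> dict:
    """Translate one proprietary database row into Dublin Core fields."""
    return {f"dc:{dc}": row[src] for src, dc in mapping.items() if src in row}

if __name__ == "__main__":
    mapping = load_mapping(MAPPING_XML)
    row = {"titulo": "Bibliotecas digitais", "autor": "Silva, A.", "ano": "2004"}
    print(to_dublin_core(row, mapping))
    # {'dc:title': 'Bibliotecas digitais', 'dc:creator': 'Silva, A.', 'dc:date': '2004'}
```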

    Integration of distributed terminology resources to facilitate subject cross-browsing for library portal systems

    With the increase in the number of distributed library information resources, users may have to interact with different user interfaces, learn to switch their mental models between these interfaces, and familiarise themselves with the controlled vocabularies used by different resources. For this reason, library professionals have developed library portals to integrate these distributed information resources and assist end-users in cross-accessing them via a single access point in their own library. There are two important subject-based services that a library portal system might be able to provide. The first is a federated search service, a process whereby a user can input a query to cross-search a number of information resources. The second is a subject cross-browsing service, which can offer a knowledge navigation tree linking the subject schemes used by distributed resources. However, the development of subject cross-searching and cross-browsing services has been impeded by the heterogeneity of the KOS (Knowledge Organisation Systems) used by different information resources: without mappings between the different KOS, it is impossible to offer a subject cross-browsing service for a library portal system. [Continues.]
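
    As a loose illustration of what a subject cross-browsing service relies on, the sketch below links equivalent concepts from different (hypothetical) subject schemes and walks those links from a single entry point. The scheme names, notations and mapping table are invented; a real portal would obtain them from a terminology service rather than a hard-coded list.

```python
# Hypothetical sketch of inter-KOS subject mappings supporting cross-browsing.
# Scheme names and terms are illustrative assumptions, not a real mapping set.
from collections import defaultdict

# (scheme, term) pairs judged equivalent enough to link in a navigation tree.
EQUIVALENCES = [
    (("DDC", "025.04"), ("LCSH", "Digital libraries")),
    (("LCSH", "Digital libraries"), ("UNESCO", "Electronic libraries")),
]

def build_index(pairs):
    """Index each concept against every concept it is mapped to."""
    index = defaultdict(set)
    for a, b in pairs:
        index[a].add(b)
        index[b].add(a)
    return index

def cross_browse(index, start):
    """Follow mappings transitively so one entry point reaches all schemes."""
    seen, stack = set(), [start]
    while stack:
        concept = stack.pop()
        if concept not in seen:
            seen.add(concept)
            stack.extend(index.get(concept, ()))
    return seen

if __name__ == "__main__":
    idx = build_index(EQUIVALENCES)
    for scheme, term in sorted(cross_browse(idx, ("DDC", "025.04"))):
        print(scheme, "->", term)
```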

    Intelligent Information Access to Linked Data - Weaving the Cultural Heritage Web

    The subject of the dissertation is an information alignment experiment between two cultural heritage information systems (ALAP): the Perseus Digital Library and Arachne. In modern societies, information integration is gaining importance for many tasks such as business decision making or even catastrophe management. It is beyond doubt that information available in digital form can offer users new ways of interaction. In the humanities and cultural heritage communities, too, more and more information is being published online, but in many situations the way that information has been made publicly available is disruptive to the research process due to its heterogeneity and distribution. Integrated information will therefore be a key factor in pursuing successful research, and the need for information alignment is widely recognized. ALAP is an attempt to integrate information from Perseus and Arachne, not only at the schema level but also by performing entity resolution. To that end, technical peculiarities and philosophical implications of the concepts of identity and co-reference are discussed. Multiple approaches to information integration and entity resolution are presented and evaluated. The methodology used to implement ALAP is rooted mainly in the fields of information retrieval and knowledge discovery. First, an exploratory analysis was performed on both information systems to get a first impression of the data. After that, (semi-)structured information from both systems was extracted and normalized. A clustering algorithm was then used to reduce the number of entity comparisons needed, and a thorough matching was finally performed within the different clusters. ALAP helped to identify challenges and highlighted the opportunities that arise when attempting to align cultural heritage information systems.
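
    The abstract outlines an extract, normalise, cluster and match pipeline without giving its implementation; the sketch below is a minimal illustration of that shape using a trivial blocking key and string similarity. The field names, blocking rule and threshold are assumptions, not the choices actually made in ALAP.

```python
# Minimal sketch of an extract -> normalise -> block -> match pipeline.
# Field names, the blocking key and the similarity threshold are invented.
from collections import defaultdict
from difflib import SequenceMatcher

def normalise(record: dict) -> dict:
    return {k: v.strip().lower() for k, v in record.items()}

def block_key(record: dict) -> str:
    """Cheap blocking key: first three letters of the name field."""
    return record.get("name", "")[:3]

def similarity(a: dict, b: dict) -> float:
    return SequenceMatcher(None, a.get("name", ""), b.get("name", "")).ratio()

def resolve(perseus: list[dict], arachne: list[dict], threshold: float = 0.85):
    """Compare only records sharing a blocking key; keep likely co-references."""
    blocks = defaultdict(lambda: ([], []))
    for r in map(normalise, perseus):
        blocks[block_key(r)][0].append(r)
    for r in map(normalise, arachne):
        blocks[block_key(r)][1].append(r)
    matches = []
    for left, right in blocks.values():
        for a in left:
            for b in right:
                if similarity(a, b) >= threshold:
                    matches.append((a, b))
    return matches

if __name__ == "__main__":
    print(resolve([{"name": "Temple of Apollo "}], [{"name": "temple of apollo"}]))
```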

    Exploring multi-granular documentation strategies for the representation, discovery and use of geographic information

    This thesis explores how digital representations of geography and Geographic Information (GI) may be described, and how these descriptions facilitate the use of the resources they depict. More specifically, it critically examines existing geospatial documentation practices and aims to identify opportunities for refinement, whether the documentation is used to signpost the data assets described, to manage and maintain information assets, or to assist in resource interpretation and discrimination. Documentation of GI can therefore facilitate its utilisation; it can reasonably be expected that by refining documentation practices, GI can be better exploited. The underpinning theme connecting the individual papers of the thesis is multi-granular documentation. GI may be recorded at varying degrees of granularity, yet traditional documentation efforts have predominantly focussed on a solitary level (that of the geospatial data layer). Developing documentation practices that account for other granularities permits the description of GI at different levels of detail and can further assist in realising its potential through better discovery, interpretation and use. One of the aims of the current work is to establish the merit of such multi-granular practices. Over the course of four research papers and a short research article, proprietary as well as open source software approaches are presented, providing proof-of-concept and conceptual solutions that aim to enhance GI utilisation through improved documentation practices. Presented in the context of an existing body of research, the proposed approaches focus on the technological infrastructure supporting data discovery, the automation of documentation processes, and the implications of describing geospatial information resources of varying granularity. Each paper successively contributes to the notion that geospatial resources are potentially better exploited when documentation practices account for the multi-granular aspects of GI and the varying ways in which such documentation may be used. In establishing the merit of multi-granular documentation, the current work nevertheless recognises that instituting a comprehensive documentation strategy at several granularities may be unrealistic for some geospatial applications: pragmatically, the level of effort required would be excessive, making universal adoption impractical. Considering, however, the ever-expanding volumes of geospatial data gathered and the demand for ways of managing and maintaining the usefulness of potentially unwieldy repositories, improved documentation practices are required. A system of hierarchical documentation, of self-documenting information, would provide for information discovery and retrieval from such expanding resource pools at multiple granularities, improve the accessibility of GI and, ultimately, its utilisation.
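
    One way to picture the self-documenting, multi-granular idea is metadata attached at several levels (collection, layer, feature) with lookups falling back to coarser levels. The sketch below is purely illustrative; the level names and fields are assumptions and do not reproduce the documentation strategies proposed in the thesis.

```python
# Hypothetical sketch of hierarchical, multi-granular documentation:
# metadata attached at collection, layer and feature level, with lookups
# falling back to coarser levels when a finer one says nothing.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Node:
    level: str                      # "collection", "layer" or "feature"
    metadata: dict = field(default_factory=dict)
    parent: Optional["Node"] = None

    def describe(self, key: str):
        """Resolve a metadata key at the finest level that defines it."""
        node = self
        while node is not None:
            if key in node.metadata:
                return node.level, node.metadata[key]
            node = node.parent
        return None

collection = Node("collection", {"licence": "OGL", "crs": "EPSG:27700"})
layer = Node("layer", {"theme": "hydrology"}, parent=collection)
feature = Node("feature", {"accuracy_m": 2.5}, parent=layer)

print(feature.describe("accuracy_m"))  # ('feature', 2.5)
print(feature.describe("crs"))         # ('collection', 'EPSG:27700')
```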

    Interoperability between heterogeneous and distributed biodiversity data sources in structured data networks

    The extensive capturing of biodiversity data and storing them in heterogeneous information systems that are accessible on the internet across the globe has created many interoperability problems. One is that the data providers are independent of one another and may run systems developed on different platforms, at different times, using different software products, to respond to different information needs. A second arises from the data modelling used to convert real-world data into a computerised data structure, which is not governed by a universal standard. Most importantly, interoperation between these disparate data sources is needed to obtain accurate and useful information for further analysis and decision making. A software representation based on a universal, single data definition structure for depicting a biodiversity entity would be ideal, but this is not necessarily possible when integrating data from independently developed systems. The different perspectives on the real-world entity taken by independent modelling teams result in different terminologies and different definitions and representations of attributes and operations for the same real-world entity. The research in this thesis is concerned with designing and developing an interoperable, flexible framework that allows data integration between various distributed and heterogeneous biodiversity data sources that adopt XML standards for data communication. In particular, the problems of scope and representational heterogeneity among the various XML data schemas are addressed. To demonstrate this research, a prototype system called BUFFIE (Biodiversity Users' Flexible Framework for Interoperability Experiments) was designed using a hybrid of object-oriented and functional design principles. The system accepts query information from the user in a web form and builds an XML query. This request query is enriched and made more specific to each data provider using provider information stored in a repository. The requests are sent to the different heterogeneous data resources across the internet using the HTTP protocol. The responses received are in varied XML formats and are integrated using knowledge mapping rules defined in XSLT and XML. The XML mappings are derived from a biodiversity domain knowledge base defined for schema mappings of different data exchange protocols. The integrated results are presented to users or client programs for further analysis. The main results of this thesis are: (1) a framework model that allows interoperation between the heterogeneous data source systems; (2) enriched querying that improves the accuracy of responses by finding the correct information among autonomous, distributed and heterogeneous data resources; and (3) a methodology that provides a foundation for extensibility, as any new network data standard in XML can be added to the existing protocols. The presented approach shows that (1) semi-automated mapping and integration of datasets from heterogeneous and autonomous data providers is feasible, and (2) query enriching and data integration allow useful data to be queried and harvested from various providers for further analysis.
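
    The integration step, in which varied provider responses are normalised with XSLT mapping rules, can be pictured with the short sketch below (requires lxml). The sample responses, element names and stylesheets are invented stand-ins for BUFFIE's actual protocols and knowledge-base mappings.

```python
# Minimal sketch, under assumed element names, of integrating heterogeneous
# provider responses into one common record format using XSLT rules.
from lxml import etree

RESPONSE_A = b"<obs><sciName>Puma concolor</sciName><where>Chile</where></obs>"
RESPONSE_B = b"<record><taxon>Puma concolor</taxon><country>Peru</country></record>"

XSLT_A = b"""<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/obs">
    <occurrence><name><xsl:value-of select="sciName"/></name>
      <locality><xsl:value-of select="where"/></locality></occurrence>
  </xsl:template>
</xsl:stylesheet>"""

XSLT_B = b"""<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/record">
    <occurrence><name><xsl:value-of select="taxon"/></name>
      <locality><xsl:value-of select="country"/></locality></occurrence>
  </xsl:template>
</xsl:stylesheet>"""

def integrate(responses_with_rules):
    """Apply each provider's mapping rule and collect the unified records."""
    merged = etree.Element("results")
    for xml_bytes, xslt_bytes in responses_with_rules:
        transform = etree.XSLT(etree.fromstring(xslt_bytes))
        merged.append(transform(etree.fromstring(xml_bytes)).getroot())
    return etree.tostring(merged, pretty_print=True).decode()

print(integrate([(RESPONSE_A, XSLT_A), (RESPONSE_B, XSLT_B)]))
```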

    Quality information systems in the aquatic sciences: semantic interoperability

    The main purpose of this thesis is to lay the groundwork for a model of a quality information system for the aquatic sciences through the study of quality information systems and distributed information systems. To that end, the research studies and evaluates distributed quality information systems and their standards in order to apply them to the aquatic sciences. Specifically, the study focuses on the key factors of distributed systems, information exchange and information retrieval through knowledge organisation systems (KOS), as mechanisms for ensuring the semantic interoperability and information quality of a distributed information system. Three specific objectives are pursued across the chapters. Chapters 1 and 2 establish the basis for a future model of a distributed quality information system (SIQ) in a scientific setting, drawing on literature reviews and case studies (a comparison of semantic and non-semantic information systems) to analyse the background, characteristics, evolution and trends of quality information systems. Chapters 3, 4, 5 and 6 study and characterise the main elements that make distributed quality information systems in the aquatic sciences possible: an information architecture based on cross-indexing and cross-browsing (the subject-gateway model); the formats and standards for describing bibliographic records and digital objects used in the aquatic sciences and related fields; semantic interoperability as the key element for ensuring information exchange and compatibility between the information systems that may be integrated into a distributed quality information system; and, finally, the contextual study of "quality information" (Chapter 6), the defining characteristic of quality information systems, in which the protocol for analysing and evaluating distributed quality information systems is established and which forms the core of the development of an SIQ for the aquatic sciences. Chapters 7 and 8 present an experimental study demonstrating the effectiveness of semantic interoperability methods in quality information systems for the aquatic sciences. Finally, Chapter 9 summarises the general conclusions drawn from the studies, the results of this thesis, and possible future work.
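
    A very small illustration of the kind of KOS-based mechanism discussed above: a thesaurus of preferred and alternative terms used to expand a query before it is dispatched to distributed catalogues. The vocabulary entries and structure are invented for this example and are not drawn from the thesis.

```python
# Hypothetical sketch: thesaurus-based query expansion for semantic
# interoperability across distributed aquatic-science catalogues.
# The vocabulary entries below are invented for illustration.
THESAURUS = {
    "salmonids": {"pref": "Salmonidae", "alt": ["salmon", "trout", "char"]},
    "eutrophication": {"pref": "eutrophication",
                       "alt": ["nutrient enrichment", "algal bloom"]},
}

def expand(query: str) -> set[str]:
    """Return the query plus every preferred and alternative form known for it."""
    terms = {query}
    entry = THESAURUS.get(query.lower())
    if entry:
        terms.add(entry["pref"])
        terms.update(entry["alt"])
    return terms

if __name__ == "__main__":
    print(expand("salmonids"))
    # {'salmonids', 'Salmonidae', 'salmon', 'trout', 'char'}
```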

    Interoperability of Enterprise Software and Applications
