13 research outputs found

    Hypermedia-based discovery for source selection using low-cost linked data interfaces

    Evaluating federated Linked Data queries requires consulting multiple sources on the Web. Before a client can execute queries, it must discover data sources and determine which ones are relevant. Federated query execution research focuses on the actual execution, while data source discovery is often only marginally discussed, even though it has a strong impact on selecting sources that contribute to the query results. Therefore, the authors introduce a discovery approach for Linked Data interfaces based on hypermedia links and controls, and apply it to federated query execution with Triple Pattern Fragments. In addition, the authors identify quantitative metrics to evaluate this discovery approach. This article describes generic evaluation measures and results for their concrete approach. With low-cost data summaries as seed, interfaces to eight large real-world datasets can discover each other within 7 minutes. Hypermedia-based client-side querying shows a promising gain of up to 50% in execution time, but demands algorithms that visit a higher number of interfaces to improve result completeness.

    Pervasive brain monitoring and data sharing based on multi-tier distributed computing and linked data technology

    EEG-based brain-computer interfaces (BCIs) face grand challenges in their real-world applications. The technical difficulties in developing truly wearable multi-modal BCI systems that are capable of making reliable real-time predictions of users' cognitive states under dynamic real-life situations may appear at times almost insurmountable. Fortunately, recent advances in miniature sensors, wireless communication and distributed computing technologies have offered promising ways to bridge these chasms. In this paper, we report our attempt to develop a pervasive on-line BCI system by employing state-of-the-art technologies such as multi-tier fog and cloud computing, semantic Linked Data search and adaptive prediction/classification models. To verify our approach, we implement a pilot system using wireless dry-electrode EEG headsets and MEMS motion sensors as the front-end devices, Android mobile phones as the personal user interfaces, compact personal computers as the near-end fog servers and the computer clusters hosted by the Taiwan National Center for High-performance Computing (NCHC) as the far-end cloud servers. We succeeded in conducting synchronous multi-modal global data streaming in March and then running a multi-player on-line BCI game in September 2013. We are currently working with the ARL Translational Neuroscience Branch and the UCSD Movement Disorder Center to use our system in real-life personal stress and in-home Parkinson's disease patient monitoring experiments. We shall proceed to develop a necessary BCI ontology and add automatic semantic annotation and progressive model refinement capability to our system.

    Querying over Federated SPARQL Endpoints - A State of the Art Survey

    Technical report. The increasing amount of Linked Data and its inherently distributed nature have, in recent years, attracted significant attention from researchers and practitioners who want to search this data. Inspired by research results from traditional distributed databases, different approaches for managing federation over SPARQL Endpoints have been introduced. SPARQL is the standardised query language for RDF, the default data model used in Linked Data deployments, and SPARQL Endpoints are a popular access mechanism provided by many Linked Open Data (LOD) repositories. In this paper, we first give an overview of the federation framework infrastructure and then proceed with a comparison of existing SPARQL federation frameworks. Finally, we highlight shortcomings in existing frameworks, which we hope will help spawn new research directions. Science Foundation Ireland (Lion 2); European Commission FP7 Support Action (LATC), project no. 256975, Intelligent Information Management (ICT-2009.4.3). Non-peer-reviewed.

    FESHYD: busca federada sobre bases de dados RDF híbridas [FeSHyD: federated search over hybrid RDF databases]

    Advisor: Carmem Satie Hara. Co-advisor: Raqueline Ritter de Moura Penteado. Dissertation (master's), Universidade Federal do Paraná, Setor de Ciências Exatas, Programa de Pós-Graduação em Informática. Defense: Curitiba, 4 September 2020. Includes references: p. 66-67. Area of concentration: Computer Science.
    Abstract: In the Semantic Web, data is made available in RDF format and queried using the SPARQL language. Most query processors consider only federated RDF databases or only proprietary databases. Federated databases consist of a set of autonomous repositories, while proprietary databases allow unrestricted access both to the data and to internal query processing. If a query involves data from autonomous third-party databases as well as data from the proprietary database, there are two alternatives for processing it: (i) treat the proprietary database as a component of the federated database; or (ii) rely on user intervention to integrate the proprietary and federated data. Although both alternatives allow data integration, they do not exploit optimizations made possible by unrestricted access to the proprietary database. This dissertation addresses the issue by proposing a third alternative, called FeSHyD, which processes SPARQL queries over both federated and distributed proprietary databases. FeSHyD generates an optimized query plan that is executed in parallel by all servers that compose the proprietary database. During plan generation, the optimization involves methods for selecting sources and for ordering the blocks that compose the query plan, so that the proprietary database is explored before subqueries are submitted to third-party databases. During query processing, the servers of the proprietary database send these subqueries directly to the third-party databases, without relying on a central control point. The system was implemented, and experimental results show that it reduces query processing time by up to 45% compared with the alternative of treating the proprietary database as a component of a federated database. Keywords: federated search, SPARQL query, distributed hybrid databases, distributed system integration, source selection, subquery ordering.

    Serviços de integração de dados para aplicações biomédicas [Data integration services for biomedical applications]

    Doctorate in Informatics (MAP-i). In the last decades, the field of biomedical science has fostered unprecedented scientific advances. Research is stimulated by the constant evolution of information technology, delivering novel and diverse bioinformatics tools. Nevertheless, the proliferation of new and disconnected solutions has resulted in massive amounts of resources spread over heterogeneous and distributed platforms. Distinct data types and formats are generated and stored in miscellaneous repositories, posing data interoperability challenges and delaying discoveries. Data sharing and integrated access to these resources are key features for successful knowledge extraction. In this context, this thesis makes contributions towards accelerating the semantic integration, linkage and reuse of biomedical resources. The first contribution addresses the connection of distributed and heterogeneous registries. The proposed methodology creates a holistic view over the different registries, supporting semantic data representation, integrated access and querying. The second contribution addresses the integration of heterogeneous information across scientific research, aiming to enable adequate data-sharing services. The third contribution presents a modular architecture to support the extraction and integration of textual information, enabling the full exploitation of curated data. The last contribution lies in providing a web platform to accelerate the deployment of enhanced semantic information systems. All the proposed solutions were deployed and validated in the scope of rare diseases.