726 research outputs found

    From Events to Reactions: A Progress Report

    Syndicate is a new coordinated, concurrent programming language. It occupies a novel point on the spectrum between the shared-everything paradigm of threads and the shared-nothing approach of actors. Syndicate actors exchange messages and share common knowledge via a carefully controlled database that clearly scopes conversations. This approach simplifies coordination of concurrent activities. Experience in programming with Syndicate, however, suggests a need to raise the level of linguistic abstraction. Beyond letting programmers write event handlers and manage event subscriptions directly, the language will have to support a reactive style of programming. This paper presents event-oriented Syndicate programming and then describes a preliminary design for augmenting it with new reactive programming constructs. Comment: In Proceedings PLACES 2016, arXiv:1606.0540

    BioCloud Search EnGene: Surfing Biological Data on the Cloud

    The massive production and spread of biomedical data around the web introduces new challenges in identifying computational approaches that provide quality search and browsing of web resources. This paper presents BioCloud Search EnGene (BSE), a cloud application that facilitates searching and integration of the many layers of biological information offered by large-scale public genomic repositories. Grounded in the concept of a dataspace, BSE is built on top of a cloud platform that severely curtails issues associated with scalability and performance. Like popular online gene portals, BSE adopts a gene-centric approach: researchers can find the information they need through a simple “Google-like” query interface that accepts standard gene identifiers as keywords. We present the BSE architecture and functionality and discuss how our strategies help tackle big data problems in querying gene-based web resources. BSE is publicly available at: http://biocloud-unica.appspot.com/

    Supporting service discovery, querying and interaction in ubiquitous computing environments.

    In this paper, we contend that ubiquitous computing environments will be highly heterogeneous, service-rich domains. Consequently, future applications will be required to interact with multiple specialised service location and interaction protocols simultaneously. We argue that existing service discovery techniques do not provide sufficient support to address the challenges of building applications targeted at these emerging environments. This paper makes a number of contributions. Firstly, using a set of short ubiquitous computing scenarios, we identify several key limitations of existing service discovery approaches that reduce their ability to support ubiquitous computing applications. Secondly, we present a detailed analysis of requirements for providing effective support in this domain. Thirdly, we provide the design of a simple, extensible meta-service discovery architecture that uses database techniques to unify service discovery protocols and addresses several of our key requirements. Lastly, we examine the lessons learnt through the development of a prototype implementation of our architecture.

    Verification of distributed dataspace architectures

    LinkedScales: multiscale databases

    Advisor: André Santanchè. Thesis (doctorate) - Universidade Estadual de Campinas, Instituto de Computação.
    Summary: Biological and medical sciences increasingly need unified approaches to data analysis that allow exploration of the network of relationships and interactions among elements. However, essential data are often scattered across an ever-growing set of sources with multiple levels of heterogeneity among them, making integration increasingly complex. Existing integration approaches usually adopt specialized and costly strategies, requiring the production of monolithic solutions to handle specific formats and schemas. To address complexity issues, these approaches adopt ad hoc solutions that combine tools and algorithms and require manual adaptations. Non-systematic approaches hinder the reuse of common tasks and intermediate results, even when these could be useful in future analyses. In addition, it is difficult to track transformations and other provenance information, which tend to be neglected. This work proposes LinkedScales, a dataspace based on multiple levels, designed to support the progressive construction of unified views of heterogeneous sources. LinkedScales systematizes the multiple integration steps into scales, starting from raw representations (lower scales) and moving gradually toward ontology-like structures (higher scales). LinkedScales defines a data model and a systematic, on-demand integration process through transformations in a graph database. Intermediate results are encapsulated in reusable scales, and transformations between scales are tracked in an orthogonal provenance graph that connects objects across scales. Subsequently, queries to the dataspace can consider objects in the scales and the orthogonal provenance graph. Practical applications of LinkedScales are addressed through two case studies, one in the biology domain -- addressing an organism-centric analysis scenario -- and another in the medical domain -- focusing on evidence-based medicine data.
    Abstract: Biological and medical sciences increasingly need a unified, network-driven approach for exploring relationships and interactions among data elements. Nevertheless, essential data is frequently scattered across sources with multiple levels of heterogeneity. Existing data integration approaches usually adopt specialized, heavyweight strategies, requiring a costly upfront effort to produce monolithic solutions for handling specific formats and schemas. Furthermore, such ad hoc strategies hamper the reuse of intermediary integration tasks and outcomes. This work proposes LinkedScales, a multiscale-based dataspace designed to support the progressive construction of a unified view of heterogeneous sources. It starts from raw representations (lower scales) and moves towards ontology-like structures (higher scales). LinkedScales defines a data model and a systematic, gradual integration process via operations over a graph database. Intermediary outcomes are encapsulated as reusable scales, tracking the provenance of inter-scale operations. Later, queries can combine both scale data and orthogonal provenance information. Practical applications of LinkedScales are discussed through two case studies in the biology domain -- addressing an organism-centric analysis scenario -- and the medical domain -- focusing on evidence-based medicine data.
    Doctorate; Computer Science; Doctor in Computer Science; 141353/2015-5; CAPES; CNP

    Intersection schemas as a dataspace integration technique

    This paper introduces the concept of Intersection Schemas in the field of heterogeneous data integration and dataspaces. We introduce a technique for incrementally integrating heterogeneous data sources by specifying semantic overlaps between sets of extensional schemas using bidirectional schema transformations, and automatically combining them into a global schema at each iteration of the integration process. We propose an incremental data integration methodology that uses this technique and that aims to reduce the amount of up-front effort required. Such approaches to data integration are often described as pay-as-you-go. A demonstrator of our technique is described, which uses a new graphical user tool implemented with the AutoMed heterogeneous data integration system. A case study is also described, and our technique and integration methodology are compared with a classical data integration strategy.

    Dynamic digital factories for agile supply chains: An architectural approach

    Digital factories comprise a multi-layered integration of various activities across the factory and product lifecycles. A central aspect of a digital factory is enabling the product lifecycle stakeholders to collaborate through the use of software solutions. The digital factory thus expands beyond the company boundaries and offers the opportunity to collaborate on business processes affecting the whole supply chain. This paper discusses an interoperability architecture for digital factories. To this end, it analyses the key requirements for enabling a scalable factory architecture characterized by access to services, aggregation of data, and orchestration of production processes. The paper then reviews the state of the art with respect to these requirements and proposes an architectural framework combining features of both service-oriented and data-sharing architectures. The framework is exemplified through a case study.

    Integration of Biological Sources: Exploring the Case of Protein Homology

    Data integration is a key issue in the domain of bioinformatics, which deals with huge amounts of heterogeneous biological data that grow and change rapidly. This paper serves as an introduction to the field of bioinformatics and the biological concepts it deals with, and an exploration of the integration problems a bioinformatics scientist faces. We examine ProGMap, an integrated protein homology system used by bioinformatics scientists at Wageningen University, and several use cases related to protein homology. A key issue we identify is the huge manual effort required to unify source databases into a single resource. Uncertain databases are able to contain several possible worlds, and it has been proposed that they can be used to significantly reduce initial integration efforts. We propose several directions for future work where uncertain databases can be applied to bioinformatics, with the goal of furthering the cause of bioinformatics integration.

    Cost-based Optimization of Multistore Query Plans

    Multistores are data management systems that enable query processing across different and heterogeneous databases; besides the distribution of data, complexity factors like schema heterogeneity and data replication must be resolved through integration and data fusion activities. Our multistore solution relies on a dataspace to provide the user with an integrated view of the available data and enables the formulation and execution of GPSJ queries. In this paper, we propose a technique to optimize the execution of GPSJ queries by formulating and evaluating different execution plans on the multistore. In particular, we outline different strategies to carry out joins and data fusion by relying on different schema representations; then, a self-learning black-box cost model is used to estimate execution times and select the most efficient plan. The experiments assess the effectiveness of the cost model in choosing the best execution plan for the given queries and use multiple multistore benchmarks to investigate the factors that influence the performance of different plans.