
    LinkedScales: multiscale databases

    Advisor: André Santanchè. PhD thesis, Universidade Estadual de Campinas, Instituto de Computação. Abstract: Biological and medical sciences increasingly need a unified, network-driven approach for exploring relationships and interactions among data elements. Nevertheless, essential data is frequently scattered across sources with multiple levels of heterogeneity. Existing data integration approaches usually adopt specialized, heavyweight strategies, requiring a costly upfront effort to produce monolithic solutions for handling specific formats and schemas. Furthermore, such ad hoc strategies hamper the reuse of intermediary integration tasks and outcomes. This work proposes LinkedScales, a multiscale-based dataspace designed to support the progressive construction of a unified view of heterogeneous sources. It departs from raw representations (lower scales) and moves gradually towards ontology-like structures (higher scales). LinkedScales defines a data model and a systematic, on-demand integration process via operations over a graph database. Intermediary outcomes are encapsulated as reusable scales, and inter-scale transformations are tracked in an orthogonal provenance graph that connects objects across scales. Later, queries over the dataspace can combine both scale data and orthogonal provenance information. Practical applications of LinkedScales are discussed through two case studies: one in the biology domain, addressing an organism-centric analysis scenario, and one in the medical domain, focusing on evidence-based medicine data. PhD in Computer Science. Funding: CNPq grant 141353/2015-5; CAPES.
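
    To make the model above concrete, here is a minimal, hedged sketch in Python using networkx (the attribute names, operation labels, and example objects are invented for illustration; the thesis's actual schema and operators are not reproduced here). It stores objects tagged with a scale and records inter-scale transformations in an orthogonal provenance graph, so a query can walk back from an ontology-like entity to its raw origins:

        import networkx as nx

        # One graph holds the scale data; a second, "orthogonal" graph holds
        # provenance. Attribute names (scale, operation) are illustrative.
        scales = nx.MultiDiGraph()
        provenance = nx.DiGraph()

        def add_object(obj_id, scale, **attrs):
            """Register an object at a given integration scale (0 = raw)."""
            scales.add_node(obj_id, scale=scale, **attrs)

        def transform(src_id, dst_id, dst_scale, operation):
            """Derive dst from src at a higher scale, recording provenance."""
            add_object(dst_id, dst_scale)
            provenance.add_edge(src_id, dst_id, operation=operation)

        # A raw CSV row (scale 0) is gradually promoted to an ontology-like
        # entity (scale 2); the path taken is kept in the provenance graph.
        add_object("row:42", scale=0, source="specimens.csv")
        transform("row:42", "record:42", dst_scale=1, operation="parse")
        transform("record:42", "taxon:Apis_mellifera", dst_scale=2,
                  operation="entity-linking")

        # Queries can combine scale data with orthogonal provenance, e.g.
        # recover the raw-scale origins of an ontology-level entity:
        origins = [n for n in nx.ancestors(provenance, "taxon:Apis_mellifera")
                   if scales.nodes[n]["scale"] == 0]
        print(origins)  # ['row:42']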

    Exact maximal reduction of stochastic reaction networks by species lumping

    Motivation: Stochastic reaction networks are a widespread model for describing biological systems where the presence of noise is relevant, such as in cell regulatory processes. Unfortunately, in all but the simplest models the resulting discrete state-space representation hinders analytical tractability and makes numerical simulations expensive. Reduction methods can lower complexity by computing model projections that preserve the dynamics of interest to the user. Results: We present an exact lumping method for stochastic reaction networks with mass-action kinetics. It hinges on an equivalence relation between the species, resulting in a reduced network where the dynamics of each macro-species is stochastically equivalent to the sum of the original species in its equivalence class, for any choice of the initial state of the system. Furthermore, by an appropriate encoding of kinetic parameters as additional species, the method can establish equivalences that do not depend on specific parameter values. The method is supported by an efficient algorithm that computes the largest species equivalence, and thus the maximal lumping. The effectiveness and scalability of our lumping technique, as well as the physical interpretability of the resulting reductions, are demonstrated in several models of signaling pathways and epidemic processes on complex networks. Availability: The algorithms for species equivalence have been implemented in the software tool ERODE, freely available for download from https://www.erode.eu.
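
    To make the equivalence notion concrete, here is a toy illustration (this is not the paper's maximal-lumping algorithm, which is implemented in ERODE; the epidemic-style network and rate values are invented). Two infected species that obey identical mass-action rules form an equivalence class, and the aggregated propensities of the full model match those of the lumped model in every state with I = I1 + I2:

        # Toy mass-action network with two interchangeable infected species:
        #   S + I1 -> 2 I1  (rate beta)     S + I2 -> 2 I2  (rate beta)
        #   I1 -> R         (rate gamma)    I2 -> R         (rate gamma)
        # {I1, I2} is a species equivalence class, lumped into I = I1 + I2.

        def propensities_full(x, beta=0.1, gamma=1.0):
            return {"inf1": beta * x["S"] * x["I1"],
                    "inf2": beta * x["S"] * x["I2"],
                    "rec1": gamma * x["I1"],
                    "rec2": gamma * x["I2"]}

        def propensities_lumped(x, beta=0.1, gamma=1.0):
            return {"inf": beta * x["S"] * x["I"],
                    "rec": gamma * x["I"]}

        full = {"S": 100, "I1": 3, "I2": 7, "R": 0}
        lumped = {"S": 100, "I": 10, "R": 0}   # I = I1 + I2

        pf, pl = propensities_full(full), propensities_lumped(lumped)
        # Summed propensities agree, so the lumped chain is stochastically
        # equivalent with respect to the sum I1 + I2, whatever the state.
        assert pf["inf1"] + pf["inf2"] == pl["inf"]
        assert pf["rec1"] + pf["rec2"] == pl["rec"]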

    Exact and approximate role assignment for multi-layer networks

    The concept of role equivalence has been applied in social network analysis for decades. Early definitions recognized two social actors as role equivalent if they have identical relationships to the same other actors. Although this rather strong equivalence requirement has been relaxed in different ways, it is often challenging to detect interesting, non-trivial role equivalences, especially in social networks derived from empirical data. Multi-layer networks (MLNs) are increasingly gaining popularity for modelling collective adaptive systems, for example, engineered cyber-physical systems or animal collectives. Multiplex networks, a special case of MLNs, transparently and compactly describe such complex interactions (social, biological, transportation), where nodes can be connected by links of different types. In this work, we first propose a novel notion of exact and approximate role equivalence for multiplex MLNs. We then implement and experimentally evaluate the algorithm on a suite of real-world case studies. Results demonstrate that our notion of approximate role assignment not only obtains non-trivial partitions over both nodes and layers, but also provides a fine-grained hierarchy of role equivalences that is impossible to obtain by (combining) existing role detection techniques. We demonstrate the latter by interpreting in detail the case study of the Florentine families, a classical benchmark from the literature.
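
    As a rough computational sketch of exact role equivalence on a multiplex network (this uses a standard colour-refinement-style fixpoint applied layer by layer; the paper's exact and approximate definitions may differ, and the two-layer example is invented):

        from collections import defaultdict

        def partition_of(colour):
            groups = defaultdict(set)
            for v, c in colour.items():
                groups[c].add(v)
            return frozenset(frozenset(g) for g in groups.values())

        def role_partition(nodes, layers):
            """Refine roles until stable: two nodes share a role iff, layer
            by layer, they see the same multiset of neighbour roles."""
            colour = {v: 0 for v in nodes}
            while True:
                sig = {v: tuple(tuple(sorted(colour[u] for u in layer.get(v, ())))
                                for layer in layers)
                       for v in nodes}
                fresh = {s: i for i, s in enumerate(sorted(set(sig.values())))}
                new_colour = {v: fresh[sig[v]] for v in nodes}
                if partition_of(new_colour) == partition_of(colour):
                    return new_colour
                colour = new_colour

        # Two layers (say, "marriage" and "business" ties) over three nodes;
        # a and b end up role-equivalent, while c forms its own class.
        marriage = {"a": {"c"}, "b": {"c"}, "c": {"a", "b"}}
        business = {"a": {"c"}, "b": {"c"}, "c": {"a", "b"}}
        print(role_partition(["a", "b", "c"], [marriage, business]))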

    Executable cancer models: successes and challenges

    Making decisions on how best to treat cancer patients requires the integration of different data sets, including genomic profiles, tumour histopathology, radiological images, proteomic analysis and more. This wealth of biological information calls for novel strategies to integrate it in a meaningful, predictive and experimentally verifiable way. In this Perspective we explain how executable computational models meet this need. Such models provide a means for comprehensive data integration, can be experimentally validated, are readily interpreted both biologically and clinically, and have the potential to predict effective therapies for different cancer types and subtypes. We explain what executable models are and how they can be used to represent the dynamic biological behaviours inherent in cancer, and demonstrate how such models, when coupled with automated reasoning, facilitate our understanding of the mechanisms by which oncogenic signalling pathways regulate tumours. We explore how executable models have impacted the field of cancer research and argue that extending them to represent a tumour in a specific patient (that is, an avatar) will pave the way for improved personalized treatments and precision medicine. Finally, we highlight some of the ongoing challenges in developing executable models and stress that effective cross-disciplinary efforts are key to forward progress in the field.
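
    As a minimal illustration of what "executable" means here, one common class of executable models is Boolean networks; the toy wiring below is invented for illustration and is not a curated pathway:

        # Toy Boolean-network model of a signalling fragment: a growth factor
        # GF activates RAS, RAS activates ERK, and active ERK inhibits the
        # tumour suppressor P53 (illustrative wiring only).
        rules = {
            "GF":  lambda s: s["GF"],        # external input, held constant
            "RAS": lambda s: s["GF"],
            "ERK": lambda s: s["RAS"],
            "P53": lambda s: not s["ERK"],
        }

        def step(state):
            """Synchronous update: every node applies its rule at once."""
            return {node: bool(rule(state)) for node, rule in rules.items()}

        def run_to_attractor(state, max_steps=50):
            seen = []
            while state not in seen and len(seen) < max_steps:
                seen.append(state)
                state = step(state)
            return state                     # a revisited state: an attractor

        # In-silico experiment: with the growth signal on, the model predicts
        # that P53 is eventually switched off.
        print(run_to_attractor({"GF": True, "RAS": False,
                                "ERK": False, "P53": True}))
        # -> {'GF': True, 'RAS': True, 'ERK': True, 'P53': False}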

    Barrier elision for production parallel programs

    Large scientific code bases are often composed of several layers of runtime libraries, implemented in multiple programming languages. In such situations, programmers often choose conservative synchronization patterns, leading to suboptimal performance. In this paper, we present context-sensitive dynamic optimizations that elide barriers which are redundant during the program execution. In our technique, we perform data race detection alongside the program to identify redundant barriers in their calling contexts; after an initial learning phase, we elide all future instances of barriers occurring in the same calling context. We present an automatic on-the-fly optimization and a multi-pass guided optimization. We apply our techniques to NWChem, a 6-million-line computational chemistry code written in C/C++/Fortran that uses several runtime libraries such as Global Arrays, ComEx, DMAPP, and MPI. Our technique elides a surprisingly high fraction of barriers (as many as 63%) in production runs. This redundancy elimination translates to application speedups as high as 14% on 2048 cores. Our techniques also provided valuable insight into the application's behavior, later used by NWChem developers. Overall, we demonstrate the value of holistic context-sensitive analyses that consider the domain science in conjunction with the associated runtime software stack.
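
    The learning-then-eliding idea can be sketched schematically (the actual system instruments compiled parallel runtimes; here the race detector is a stub, and the calling-context extraction is simplified):

        import traceback

        verdict = {}  # calling context -> True if the barrier proved redundant

        def calling_context(depth=4):
            frames = traceback.extract_stack()[:-2]   # drop helper frames
            return tuple((f.filename, f.lineno) for f in frames[-depth:])

        def races_detected():
            """Stub for the data race detection run alongside the program."""
            return False

        def barrier(sync_fn):
            ctx = calling_context()
            if verdict.get(ctx):   # learned earlier: redundant in this context
                return             # elide the barrier
            sync_fn()              # conservative default: synchronize
            if ctx not in verdict:
                verdict[ctx] = not races_detected()   # learn for next time

        # Hypothetical usage in an mpi4py program: barrier(comm.Barrier)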

    A diversity-aware computational framework for systems biology

    The abstract is in the attachment.

    Formal Techniques for Component-based Design of Embedded Systems

    Embedded systems have become ubiquitous, from avionics and automotive over consumer electronics to medical devices. Failures may entail material damage or compromise the safety of human beings. At the same time, shorter product cycles, together with the fast-growing complexity of the systems to be designed, create a tremendous need for rigorous design techniques. The goal of component-based construction is to build complex systems from simpler components that are well understood and can be (re)used so as to accelerate the design process. This document presents a summary of the formal techniques for component-based design of embedded systems that I have (co-)developed.

    Toward a formal theory for computing machines made out of whatever physics offers: extended version

    Approaching limitations of digital computing technologies have spurred research in neuromorphic and other unconventional approaches to computing. Here we argue that if we want to systematically engineer computing systems that are based on unconventional physical effects, we need guidance from a formal theory that is different from the symbolic-algorithmic theory of today's computer science textbooks. We propose a general strategy for developing such a theory, and within that general view, a specific approach that we call "fluent computing". In contrast to Turing, who modeled computing processes from a top-down perspective as symbolic reasoning, we adopt the scientific paradigm of physics and model physical computing systems bottom-up by formalizing what can ultimately be measured in any physical substrate. This leads to an understanding of computing as the structuring of processes, while classical models of computing systems describe the processing of structures. Comment: 76 pages. This is an extended version of a perspective article with the same title that will appear in Nature Communications soon after this manuscript goes public on arXiv.

    Visual region understanding: unsupervised extraction and abstraction

    The ability to gain a conceptual understanding of the world in uncontrolled environments is the ultimate goal of vision-based computer systems. Technological societies today are heavily reliant on surveillance and security infrastructure, robotics, medical image analysis, visual data categorisation and search, and smart device user interaction, to name a few. Of all the complex problems tackled by computer vision today in the context of these technologies, the one that lies closest to the original goals of the field is the subarea of unsupervised scene analysis or scene modelling. However, its common use of low-level features does not provide a good balance between generality and discriminative ability, both a result and a symptom of the sensory and semantic gaps existing between low-level computer representations and high-level human descriptions. In this research we explore a general framework that addresses the fundamental problem of universal unsupervised extraction of semantically meaningful visual regions and their behaviours. For this purpose we address issues related to (i) spatial and spatiotemporal segmentation for region extraction, (ii) region shape modelling, and (iii) the online categorisation of visual object classes and the spatiotemporal analysis of their behaviours. Under this framework we propose (a) a unified region merging method and spatiotemporal region reduction, (b) shape representation by the optimisation and novel simplification of contour-based growing neural gases, and (c) a foundation for the analysis of visual object motion properties using a shape- and appearance-based nearest-centroid classification algorithm and trajectory plots for the obtained region classes. Specifically, we formulate a region merging spatial segmentation mechanism that combines and adapts features previously shown to be individually useful, namely parallel region growing, the best-merge criterion, a time-adaptive threshold, and region reduction techniques. For spatiotemporal region refinement we consider both scalar intensity differences and vector optical flow. To model the shapes of the visual regions thus obtained, we adapt the growing neural gas for rapid region contour representation and propose a contour simplification technique. A fast unsupervised nearest-centroid online learning technique then groups observed region instances into classes, for which we are able to analyse spatial presence and spatiotemporal trajectories. The analysis results show semantic correlations to real-world object behaviour. Performance evaluations of all steps across standard metrics and datasets validate the approach.
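
    As a hedged sketch of the nearest-centroid online learning step described above (feature extraction is omitted, and the distance threshold is an invented parameter):

        import math

        THRESHOLD = 2.0          # illustrative; real features need tuning
        centroids, counts = [], []

        def classify(features):
            """Assign a region's feature vector (e.g. shape + appearance) to
            the nearest class, founding a new class if none is close enough;
            the winning centroid is updated online as a running mean."""
            best, best_d = None, float("inf")
            for i, c in enumerate(centroids):
                d = math.dist(features, c)
                if d < best_d:
                    best, best_d = i, d
            if best is None or best_d > THRESHOLD:
                centroids.append(list(features))
                counts.append(1)
                return len(centroids) - 1
            counts[best] += 1
            centroids[best] = [m + (x - m) / counts[best]
                               for m, x in zip(centroids[best], features)]
            return best

        print(classify([0.0, 0.0]))   # 0: founds the first class
        print(classify([0.1, 0.2]))   # 0: close to the first centroid
        print(classify([9.0, 9.0]))   # 1: far away, founds a new class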

    Reversible Computation: Extending Horizons of Computing

    This open access State-of-the-Art Survey presents the main recent scientific outcomes in the area of reversible computation, focusing on those that emerged during COST Action IC1405 "Reversible Computation - Extending Horizons of Computing", a European research network that operated from May 2015 to April 2019. Reversible computation is a new paradigm that extends the traditional forwards-only mode of computation with the ability to execute in reverse, so that computation can run backwards as easily and naturally as forwards. It aims to deliver novel computing devices and software, and to enhance existing systems by equipping them with reversibility. There are many potential applications of reversible computation, including languages and software tools for reliable and recovery-oriented distributed systems, and revolutionary reversible logic gates and circuits, but these can only be realized and have a lasting effect if firm conceptual and theoretical foundations are established first.
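
    As a minimal taste of the paradigm (a textbook example, not tied to a specific chapter of the survey): the Toffoli gate is a universal reversible logic gate, and because it is a bijection on its inputs it can be run backwards as easily as forwards:

        # Toffoli (CCNOT): flip the target c iff both controls a, b are set.
        def toffoli(a, b, c):
            return a, b, c ^ (a & b)

        states = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]

        # The gate is its own inverse: applying it twice restores every input.
        for s in states:
            assert toffoli(*toffoli(*s)) == s

        # It permutes the 8 states (a bijection), so no information is lost --
        # which is precisely what allows computation to run in reverse.
        assert sorted(toffoli(*s) for s in states) == states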