    Exploiting Latent Features of Text and Graphs

    As the size and scope of online data continue to grow, new machine learning techniques become necessary to make the most of the wealth of available information. However, the models that help convert data into knowledge require nontrivial processes to make sense of large collections of text and massive online graphs. In both scenarios, modern machine learning pipelines produce embeddings, semantically rich vectors of latent features, to convert human constructs for machine understanding. In this dissertation we focus on information available within biomedical science, including human-written abstracts of scientific papers as well as machine-generated graphs of biomedical entity relationships. We present the Moliere system and our method for identifying new discoveries through the use of natural language processing and graph mining algorithms. We propose heuristic-based ranking criteria to augment Moliere, and leverage this ranking to identify a new gene-treatment target for HIV-associated Neurodegenerative Disorders. We additionally focus on the latent features of graphs, and propose a new bipartite graph embedding technique. Using our graph embedding, we advance the state of the art in hypergraph partitioning quality. With this newfound intuition about graph embeddings, we present Agatha, a deep-learning approach to hypothesis generation. This system learns data-driven ranking criteria derived from the embeddings of our large proposed biomedical semantic graph. To produce human-readable results, we additionally propose CBAG, a technique for conditional biomedical abstract generation.

    Analysis and Visualisation of Edge Entanglement in Multiplex Networks

    This thesis presents a new methodology for analyzing networks. We develop the entanglement of a multiplex network, which takes the form of a measure of intensity and homogeneity and of an abstraction, the catalyst interaction network, with which entanglement indices are associated. We then present dedicated tools for the visual analysis of complex networks that take advantage of this methodology. These tools offer a dual view of two networks, which includes a drawing algorithm and an interaction that combines brushing of a selection with multiple pre-attentive links. We close the document with a detailed presentation of applications in multiple domains.
    When it comes to understanding complex phenomena, humans need to understand the interactions that lie within them. These interactions are often captured with complex networks. However, the plurality of interactions is often flattened by traditional network models. We propose a new way to look at these phenomena through the lens of multiplex networks, in which catalysts drive the interaction through substrates. To study the entanglement of a multiplex network is to study how edges intertwine or, in other words, how catalysts interact. Our entanglement analysis yields a set of new objects that complements traditional network approaches: the entanglement homogeneity and intensity of the multiplex network, and the catalyst interaction network, with an entanglement index for each catalyst. These objects are well suited to embedding in a visual analytics framework that supports comprehension of a complex structure. We thus propose a visual setting with coordinated multiple views. We take advantage of mental mapping and visual linking to present simultaneous information about a multiplex network at three different levels of abstraction. We complement brushing and linking with a leapfrog interaction that mimics the back-and-forth process involved in users' comprehension. The method is validated and enriched through multiple applications, including assessing group cohesion in document collections and identifying particular associations in social networks.

    Visual analytics for relationships in scientific data

    Domain scientists hope to address grand scientific challenges by exploring the abundance of data generated and made available through modern high-throughput techniques. Typical scientific investigations can make use of novel visualization tools that enable dynamic formulation and fine-tuning of hypotheses and help evaluate the sensitivity of key parameters. These general tools should be applicable to many disciplines, allowing biologists to develop an intuitive understanding of the structure of coexpression networks and discover genes that reside in critical positions of biological pathways, intelligence analysts to decompose social networks, and climate scientists to model and extrapolate future climate conditions. By using a graph as a universal data representation of correlation, our novel visualization tool employs several techniques that, when used in an integrated manner, provide innovative analytical capabilities. Our tool integrates techniques such as graph layout, qualitative subgraph extraction through a novel 2D user interface, quantitative subgraph extraction using graph-theoretic algorithms or by querying an optimized B-tree, dynamic level-of-detail graph abstraction, and template-based fuzzy classification using neural networks. We demonstrate our system using real-world workflows from several large-scale studies. Parallel coordinates has proven to be a scalable visualization and navigation framework for multivariate data. However, when data with thousands of variables are at hand, there is no comprehensive solution for selecting the right set of variables and ordering them to uncover important or potentially insightful patterns. We present algorithms to rank axes based upon the importance of bivariate relationships among the variables, and showcase the efficacy of the proposed system by demonstrating autonomous detection of patterns in a modern large-scale dataset of time-varying climate simulation.
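
The axis-ranking idea in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the importance of a bivariate relationship is measured, so this sketch assumes the absolute Pearson correlation and scores each variable by its total correlation with all others. The names `pearson`, `rank_axes`, and `toy_columns` are hypothetical and are not taken from the described tool.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def rank_axes(columns):
    """Order variable names by summed |r| with every other variable, strongest first.

    Assumption: |Pearson r| stands in for the 'importance' of a bivariate
    relationship; the original work may use a different measure.
    """
    names = list(columns)
    score = {
        a: sum(abs(pearson(columns[a], columns[b])) for b in names if b != a)
        for a in names
    }
    return sorted(names, key=lambda a: score[a], reverse=True)

# Toy data (made up for illustration): two strongly related variables and one unrelated.
toy_columns = {
    "temp": [1, 2, 3, 4],
    "pressure": [2, 4, 6, 8.5],
    "noise": [3, 1, 4, 1],
}
ranking = rank_axes(toy_columns)
```

With this toy data the weakly correlated `noise` variable is ranked last, so a parallel-coordinates view would place the strongly related axes first.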

    Methods and applications in social networks analysis

    The Social Network Analysis perspective has proven able to address a significant breadth of theoretical and methodological issues, as witnessed by the contributions of a growing number of scholars and the multiplication of empirical applications across a wide range of scientific fields. One disciplinary area in which this development has occurred is computational social science, by virtue of the developing field of online social networks and the leading role of information technologies in the production of scientific knowledge. The complex nature of social phenomena has reinforced the usefulness of the network perspective as a wealth of theoretical and methodological tools capable of penetrating the dimensions of that complexity. The book hosts eleven contributions that, on sound theoretical ground, present different examples of speculative and applied areas where Social Network Analysis can help to explore, interpret and predict social interaction between actors. Some of the contributions were presented at the ARS’19 Conference held in Vietri sul Mare (Salerno, Italy) on October 29-31, 2019; it was the seventh in a biennial series of meetings started in 2007 with the aim of promoting relevant results and the most recent methodological developments in Social Network Analysis.

    A survey of statistical network models

    Networks are ubiquitous in science and have become a focal point for discussion in everyday life. Formal statistical models for the analysis of network data have emerged as a major topic of interest in diverse areas of study, and most of these involve a form of graphical representation. Probability models on graphs date back to 1959. Along with empirical studies in social psychology and sociology from the 1960s, these early works generated an active network community and a substantial literature in the 1970s. This effort moved into the statistical literature in the late 1970s and 1980s, and the past decade has seen a burgeoning network literature in statistical physics and computer science. The growth of the World Wide Web and the emergence of online networking communities such as Facebook, MySpace, and LinkedIn, along with a host of more specialized professional network communities, have intensified interest in the study of networks and network data. Our goal in this review is to provide the reader with an entry point to this burgeoning literature. We begin with an overview of the historical development of statistical network modeling and then introduce a number of examples that have been studied in the network literature. Our subsequent discussion focuses on a number of prominent static and dynamic network models and their interconnections. We emphasize formal model descriptions, and pay special attention to the interpretation of parameters and their estimation. We end with a description of some open problems and challenges for machine learning and statistics. Comment: 96 pages, 14 figures, 333 references

    Visualization of modular structures in biological networks

    A Visual Analytics Approach to Debugging Cooperative, Autonomous Multi-Robot Systems' Worldviews

    Autonomous multi-robot systems, where a team of robots shares information to perform tasks that are beyond an individual robot's abilities, hold great promise for a number of applications, such as planetary exploration missions. Each robot in a multi-robot system that uses the shared-world coordination paradigm autonomously schedules which robot should perform a given task, and when, using its worldview: the robot's internal representation of its belief about both its own state and other robots' states. A key problem for operators is that robots' worldviews can fall out of sync (often due to weak communication links), leading to desynchronization of the robots' scheduling decisions and inconsistent emergent behavior (e.g., tasks not performed, or performed by multiple robots). Operators face the time-consuming and difficult task of making sense of the robots' scheduling decisions, detecting desynchronizations, and pinpointing the cause by comparing every robot's worldview. To address these challenges, we introduce MOSAIC Viewer, a visual analytics system that helps operators (i) make sense of the robots' schedules and (ii) detect and conduct a root cause analysis of the robots' desynchronized worldviews. Over a year-long partnership with roboticists at the NASA Jet Propulsion Laboratory, we conduct a formative study to identify the necessary system design requirements and a qualitative evaluation with 12 roboticists. We find that MOSAIC Viewer is faster and easier to use than the users' current approaches, and that it allows them to stitch together low-level details to form a high-level understanding of the robots' schedules and to detect and pinpoint the cause of the desynchronized worldviews. Comment: To appear in IEEE Conference on Visual Analytics Science and Technology (VAST) 202

    Knowledge-based generalization of metabolic models (Généralisation de modèles métaboliques par connaissances)

    Genome-scale metabolic models describe the relationships between thousands of reactions and biochemical molecules, and are used to improve our understanding of an organism’s metabolism. They have found applications in the pharmaceutical, chemical and bioremediation industries. The complexity of metabolic models hampers many tasks that are important during model inference, such as model comparison, analysis, curation and refinement by human experts. The abundance of detail in large-scale networks can mask errors and important organism-specific adaptations. It is therefore important to find levels of abstraction that are comfortable for human experts. These abstract levels should highlight the essential model structure and the divergences from it, such as alternative paths or missing reactions, while hiding inessential details. To address this issue, we defined a knowledge-based generalization that allows for the production of higher-level abstract views of metabolic network models. We developed a theoretical method that groups similar metabolites and reactions based on the network structure and the knowledge extracted from metabolite ontologies, and then compresses the network based on this grouping: metabolites are grouped into equivalence classes, and the reactions linking those classes are factored. We implemented our method as a Python library that is available for download from metamogen.gforge.inria.fr. To validate our method, we applied it to 1,286 metabolic models from the Path2Model project and showed that it helps to detect organism- and domain-specific adaptations, as well as to compare models. Based on discussions with users about how they navigate metabolic networks, we defined a 3-level representation of metabolic networks: the full-model level, the generalized level and the compartment level. We combined our model generalization method with the zooming user interface (ZUI) paradigm and developed Mimoza, a user-centric tool for zoomable navigation and knowledge-based exploration of metabolic networks that produces this 3-level representation. Mimoza is available both as an online tool and for download at mimoza.bordeaux.inria.fr.
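
The grouping-and-compression step described in the abstract above can be sketched on toy data. This is only an illustration under stated assumptions: each metabolite is mapped to a single ontology class through a hand-written dictionary, and reactions whose class-level substrates and products coincide are merged. The function `generalize` and the toy names are hypothetical; this is not the API of the authors' Python library, and the actual method also takes network structure into account.

```python
from collections import defaultdict

def generalize(reactions, ontology_class):
    """Merge reactions that become identical once metabolites are replaced by their class.

    reactions: dict of reaction name -> (set of substrates, set of products)
    ontology_class: dict of metabolite -> class name (hypothetical mapping);
                    metabolites without an entry are kept as they are.
    Returns a dict of (class-level substrates, class-level products) -> reaction names.
    """
    def lift(metabolites):
        return frozenset(ontology_class.get(m, m) for m in metabolites)

    merged = defaultdict(list)
    for name, (substrates, products) in reactions.items():
        merged[(lift(substrates), lift(products))].append(name)
    return dict(merged)

# Toy network (made up for illustration): two chain-length variants of one step
# plus an unrelated reaction.
toy_reactions = {
    "r1": ({"hexanoyl-CoA"}, {"hexenoyl-CoA"}),
    "r2": ({"octanoyl-CoA"}, {"octenoyl-CoA"}),
    "r3": ({"glucose"}, {"glucose-6-phosphate"}),
}
toy_classes = {
    "hexanoyl-CoA": "acyl-CoA",
    "octanoyl-CoA": "acyl-CoA",
    "hexenoyl-CoA": "enoyl-CoA",
    "octenoyl-CoA": "enoyl-CoA",
}
generalized = generalize(toy_reactions, toy_classes)
```

Here `r1` and `r2` collapse into a single generalized reaction from `acyl-CoA` to `enoyl-CoA`, while `r3` stays on its own, which mirrors how a generalized view hides chain-length variants but keeps distinct steps visible.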