49 research outputs found

    Workshop Notes: International Workshop "What can FCA do for Artificial Intelligence?" (FCA4AI 2015)

    This volume includes the proceedings of the fourth edition of the FCA4AI workshop ("What can FCA do for Artificial Intelligence?"), co-located with the IJCAI 2015 Conference in Buenos Aires (Argentina). Formal Concept Analysis (FCA) is a mathematically well-founded theory aimed at data analysis and classification. FCA allows one to build a concept lattice and a system of dependencies (implications) which can be used for many AI needs, e.g. knowledge discovery, learning, knowledge representation, reasoning, ontology engineering, as well as information retrieval and text processing. There are many "natural links" between FCA and AI, and the present workshop is organized to discuss these links and, more generally, to improve the links between knowledge discovery based on FCA and knowledge management in artificial intelligence.
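To make the notion of a concept lattice concrete, the following is a minimal sketch that enumerates the formal concepts of a tiny context by brute force. It is an illustration only, not an algorithm taken from the workshop proceedings; the function name `formal_concepts` and the toy animal context are assumptions made for this example, and the subset enumeration is exponential, so it only suits very small inputs.

```python
from itertools import combinations


def formal_concepts(context):
    """Return all formal concepts (extent, intent) of a small formal context.

    `context` maps each object to the set of attributes it has.
    Hypothetical helper for illustration; brute force over object subsets.
    """
    objects = list(context)
    all_attrs = set().union(*context.values())

    def intent(objs):
        # Attributes shared by every object in objs (all attributes if objs is empty).
        shared = set(all_attrs)
        for o in objs:
            shared &= context[o]
        return frozenset(shared)

    def extent(attrs):
        # Objects that have every attribute in attrs.
        return frozenset(o for o in objects if attrs <= context[o])

    found = set()
    for r in range(len(objects) + 1):
        for subset in combinations(objects, r):
            b = intent(subset)   # close the object set to an intent
            a = extent(b)        # and the intent back to its extent
            found.add((a, b))    # (a, b) is a concept: intent(a) == b, extent(b) == a
    return found


if __name__ == "__main__":
    toy_context = {
        "duck": {"flies", "swims", "lays_eggs"},
        "eagle": {"flies", "lays_eggs"},
        "trout": {"swims", "lays_eggs"},
    }
    # Print concepts from the largest extent (top of the lattice) downwards.
    for ext, itt in sorted(formal_concepts(toy_context), key=lambda c: -len(c[0])):
        print(sorted(ext), sorted(itt))
```

Ordering the resulting concepts by inclusion of their extents gives the concept lattice; practical FCA tools use dedicated algorithms rather than this exhaustive enumeration.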

    Towards Performance Portable Graph Algorithms

    In today's data-driven world, our computational resources have become heterogeneous, making the processing of large-scale graphs in an architecture-agnostic manner crucial. Traditionally, hand-optimized high-performance computing (HPC) solutions have been studied and used to implement highly efficient and scalable graph algorithms. In recent years, several graph processing and management systems have also been proposed. Hand-optimized HPC approaches require high levels of expertise, and graph processing frameworks suffer from limited expressibility and performance. Portability is a major concern for both approaches. The main thesis of this work is that block-based graph algorithms offer a compromise between efficient parallelism and architecture-agnostic algorithm design for a wide class of graph problems. This dissertation seeks to prove this thesis by focusing on three pillars: data/computation partitioning, block-based algorithm design, and performance portability. In this dissertation, we first show how we can partition the computation and the data to design efficient block-based algorithms for solving graph merging and triangle counting problems. Then, generalizing from our experiences, we propose PGAbB, an algorithmic framework for implementing block-based graph algorithms on shared-memory, heterogeneous machines. PGAbB aims to maximally leverage different architectures by implementing a task-based execution on top of a block-based programming model. In this talk we will discuss PGAbB's programming model, algorithmic optimizations for scheduling, and load-balancing strategies for graph problems on real-world and synthetic inputs. Ph.D. dissertation.
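As a rough illustration of what partitioning a graph computation into blocks can look like, the sketch below counts triangles by splitting the oriented adjacency structure into 2D blocks and treating each triple of blocks as an independent task. This is not PGAbB's API or the dissertation's algorithm; the function names, the contiguous vertex partitioning, and the sequential task loop are all assumptions made for this example.

```python
from collections import defaultdict


def build_blocks(edges, block_size):
    """Group edges, oriented low id -> high id, into 2D blocks.

    blocks[(i, j)][u] is the set of neighbours v > u of vertex u,
    where u lies in vertex block i and v in vertex block j.
    """
    blocks = defaultdict(dict)
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        lo, hi = min(u, v), max(u, v)
        key = (lo // block_size, hi // block_size)
        blocks[key].setdefault(lo, set()).add(hi)
    return blocks


def count_block_task(blocks, i, j, k):
    """Count triangles u < v < w with u in block i, v in block j, w in block k."""
    edges_uv = blocks.get((i, j), {})
    edges_uw = blocks.get((i, k), {})
    edges_vw = blocks.get((j, k), {})
    total = 0
    for u, mids in edges_uv.items():
        tops_u = edges_uw.get(u)
        if not tops_u:
            continue
        for v in mids:
            tops_v = edges_vw.get(v)
            if tops_v:
                # Common higher neighbours of u and v close a triangle.
                total += len(tops_u & tops_v)
    return total


def count_triangles(edges, num_vertices, block_size):
    """Sum the independent block tasks; each triangle falls in exactly one task."""
    blocks = build_blocks(edges, block_size)
    nblocks = (num_vertices + block_size - 1) // block_size
    return sum(
        count_block_task(blocks, i, j, k)
        for i in range(nblocks)
        for j in range(i, nblocks)
        for k in range(j, nblocks)
    )


if __name__ == "__main__":
    k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    print(count_triangles(k4, num_vertices=4, block_size=2))
```

Because every task reads at most three blocks and writes only its own counter, the tasks could in principle be scheduled on different devices; how PGAbB actually schedules and balances such tasks is described in the dissertation itself.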

    Data Integration in the Life Sciences: Scientific Workflows, Provenance, and Ranking

    Biological research is a science which derives its findings from the proper analysis of experiments. Today, a large variety of experiments are carried out in hundreds of labs around the world, and their results are reported in a myriad of different databases, websites, publications etc., using different formats, conventions, and schemas. Providing uniform access to these diverse and distributed databases is the aim of data integration solutions, which have been designed and implemented within the bioinformatics community for more than 20 years. However, the perception of the problem of data integration research in the life sciences has changed: while early approaches concentrated on handling schema-dependent queries over heterogeneous and distributed databases, current research emphasizes instances rather than schemas, tries to place the human back into the loop, and intertwines data integration and data analysis. Transparency (providing users with the illusion that they are using a centralized database and thus completely hiding the original databases) was one of the main goals of federated databases. It is not a target anymore. Instead, users want to know exactly which data from which source was used in which way in studies (provenance). The old model of "first integrate, then analyze" is replaced by a new, process-oriented paradigm: "integration is analysis - and analysis is integration". This paradigm change gives rise to some important research trends. First, the process of integration itself, i.e., the integration workflow, is becoming a research topic in its own right. Scientific workflows actually implement the paradigm "integration is analysis". A second trend is the growing importance of sensible ranking, because data sets keep growing and it becomes increasingly difficult for the biologist user to distinguish relevant data from large and noisy data sets.
This HDR thesis outlines my contributions to the field of data integration in the life sciences. More precisely, my work takes place in the first two contexts mentioned above, namely, scientific workflows and biological data ranking. The reported results were obtained from 2005 to late 2014, first as a postdoctoral fellow at the University of Pennsylvania (Dec 2005 to Aug 2007) and then as an Associate Professor at Université Paris-Sud (LRI, UMR CNRS 8623, Bioinformatics team) and Inria (Saclay-Ile-de-France, AMIB team 2009-2014).

    Fundamentals

    Volume 1 establishes the foundations of this new field. It goes through all the steps from data collection, their summary and clustering, to different aspects of resource-aware learning, i.e., hardware, memory, energy, and communication awareness. Machine learning methods are inspected with respect to resource requirements and how to enhance scalability on diverse computing architectures ranging from embedded systems to large computing clusters

    Unweaving complex reactivity: graph-based tools to handle chemical reaction networks

    The molecular-level insights gathered through "in silico" studies have become an essential asset for the elucidation and understanding of complex reaction mechanisms. Indeed, the applicability of computational chemistry has widened substantially due to the vast increase in computational power over the last decades. As a result, not only have the accuracy of the applied methods and the size of the target systems increased, but also the level of detail attained in the mechanistic description. However, deeper descriptions of chemical systems, which most often rely on automation techniques that make it easy to explore larger parts of the chemical space, come at the cost of added complexity, rendering the results much harder to interpret. Throughout this Thesis, we have proposed, developed and tested a collection of tools that simplify the processing of this kind of complex chemical reaction network (CRN), in order to provide new insights into reactive and catalytic processes. All of these tools employ graphs to model the target CRNs, so that the methods of Graph Theory (e.g. path searches, isomorphisms...) can be used in a chemical context.
The tools that are discussed include amk-tools, a library for the interactive visualization of automatically discovered reaction networks; gTOFfee, for the application of the energy span model to compute the turnover frequency of computationally characterized catalytic cycles; and OntoRXN, an ontology for the description of CRNs in a semantic manner, integrating network topology and calculation information in a single, highly structured entity organized according to the principles of "Semantic Data".

    The 10th Jubilee Conference of PhD Students in Computer Science


    Dagstuhl News January - December 2011

    "Dagstuhl News" is a publication edited especially for the members of the Foundation "Informatikzentrum Schloss Dagstuhl" to thank them for their support. The News gives a summary of the scientific work being done in Dagstuhl. Each Dagstuhl Seminar is presented by a short abstract describing the contents and scientific highlights of the seminar as well as the perspectives or challenges of the research topic.