    Advances in Big Data Bio Analytics

    Delivering effective data analytics is crucial to interpreting the multitude of biological datasets currently generated by an ever-increasing number of high-throughput techniques. Logic programming has much to offer in this area. Here, we detail advances that highlight two strengths of logical formalisms in developing data analytic solutions in biological settings: access to large relational databases, and the building of analytical pipelines that collect graph information from multiple sources. We present significant advances to the bio_db package, which provides biological databases as Prolog facts that can be served either by in-memory loading or via database backends. These advances include modularising the underlying architecture and incorporating datasets from a second organism (mouse). In addition, we introduce a number of data analytics tools that operate on these datasets and are bundled in the analysis package bio_analytics. The emphasis in both packages is on ease of installation and use. We highlight the general architecture of our component-based approach. An experimental graphical user interface via SWISH is also available for local installation. Finally, we advocate that biological data analytics is a fertile area which can drive further innovation in applied logic programming.


    Accessing biological data as Prolog facts

    It has been argued before that Prolog is a strong candidate for research and code development in bioinformatics and computational biology. This position has been based on both the intrinsic strengths of Prolog and recent advances in its technologies. Here we strengthen the case for the deployment and penetration of Prolog into bioinformatics by introducing bio_db, a comprehensive and extensible system for working with biological data. Our library packages high-quality, publicly available biological databases that are routinely used in tasks such as (a) the translation between biological products and (b) product-to-product interactions, which can be visualised as graphs. The library allows easy access to these data in five formats: Prolog fact files, Prolog quick load files, Berkeley DB data files, and RocksDB and SQLite databases. In addition, the library introduces two innovative features that are pertinent to data analytics in general. First, it supports both on-demand downloading of prepacked data files and reconstruction from the latest data files of the curated databases. Second, by employing code hot-swapping, the library delivers the data (a) transparently to the user and (b) in the familiar format of Prolog facts.