4,658 research outputs found

    BacillOndex: An Integrated Data Resource for Systems and Synthetic Biology

    BacillOndex is an extension of the Ondex data integration system, providing a semantically annotated, integrated knowledge base for the model Gram-positive bacterium Bacillus subtilis. This application allows a user to mine a variety of B. subtilis data sources and analyse the resulting integrated dataset, which contains data about genes, gene products and their interactions. The data can be analysed either manually, by browsing with Ondex, or computationally via a Web services interface. We describe the process of creating a BacillOndex instance, and the use of the system for the analysis of single nucleotide polymorphisms in B. subtilis Marburg. The Marburg strain is the progenitor of the widely used laboratory strain B. subtilis 168. We identified 27 SNPs with predictable phenotypic effects, including genetic traits for known phenotypes. We conclude that BacillOndex is a valuable tool for the systems-level investigation of, and hypothesis generation about, this important biotechnology workhorse. Such understanding contributes to our ability to construct synthetic genetic circuits in this organism.

    A Role for Bottom-Up Synthetic Cells in the Internet of Bio-Nano Things?

    The potential role of bottom-up Synthetic Cells (SCs) in the Internet of Bio-Nano Things (IoBNT) is discussed. In particular, this perspective paper focuses on the growing interest in networks of biological and/or artificial objects at the micro- and nanoscale (cells and subcellular parts, microelectrodes, microvessels, etc.), whereby communication takes place in an unconventional manner, i.e., via chemical signaling. The resulting “molecular communication” (MC) scenario paves the way to the development of innovative technologies that have the potential to impact biotechnology, nanomedicine, and related fields. The scenario that relies on the interconnection of natural and artificial entities is briefly introduced, highlighting how Synthetic Biology (SB) plays a central role. SB allows the construction of various types of SCs that can be designed, tailored, and programmed according to specific predefined requirements. In particular, “bottom-up” SCs are briefly described by commenting on the principles of their design and fabrication and their features (in particular, the capacity to exchange chemicals with other SCs or with natural biological cells). Although bottom-up SCs still have low complexity and thus only basic functionalities, we introduce here their potential role in the IoBNT. This perspective paper aims to stimulate interest in and discussion on the presented topics. The article also includes commentaries on MC, semantic information, minimal cognition, wetware neuromorphic engineering, and chemical social robotics, with the specific potential they can bring to the IoBNT.

    Unveiling evolutionary algorithm representation with DU maps

    Evolutionary algorithms (EAs) have proven effective in tackling problems in many different domains. However, users are often required to spend significant effort fine-tuning the EA parameters to make the algorithm work. In principle, visualization tools may be of great help in this laborious task, but current visualization tools are either EA-specific, and hence hardly available to all users, or too general to convey detailed information. In this work, we study the Diversity and Usage map (DU map), a compact visualization for analyzing a key component of every EA: the representation of solutions. In a single heat map, the DU map visualizes, for entire runs, how diverse the genotype is across the population and to what degree each gene in the genotype contributes to the solution. We demonstrate the generality of the DU map concept by applying it to six EAs that use different representations (bit and integer strings, trees, ensembles of trees, and neural networks). We present the results of an online user study about the usability of the DU map, which confirm the suitability of the proposed tool and provide important insights into our design choices. By providing a visualization tool that can be easily tailored by specifying the diversity (D) and usage (U) functions, the DU map aims to be a powerful analysis tool for EA practitioners, making EAs more transparent and hence lowering the barrier to their use.
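
    As a concrete illustration of the D and U functions, the sketch below computes one row of each half of a DU map per generation for a bit-string EA. The entropy-based diversity and the value-based usage function are illustrative assumptions for this sketch, not the paper's definitions; in practice U is representation-specific and user-supplied.

```python
import numpy as np

def gene_diversity(population):
    # Per-gene diversity D: entropy of allele frequencies across the
    # population; for bit genes this is already normalized to [0, 1].
    p1 = population.mean(axis=0)                # frequency of allele 1 per gene
    p = np.stack([p1, 1.0 - p1])
    with np.errstate(divide="ignore", invalid="ignore"):
        return -np.nansum(p * np.log2(p), axis=0)

def gene_usage(population, usage_fn):
    # Per-gene usage U: average over individuals of a user-supplied,
    # representation-specific indicator of each gene's contribution.
    return np.mean([usage_fn(ind) for ind in population], axis=0)

rng = np.random.default_rng(0)
d_rows, u_rows = [], []
for gen in range(50):
    pop = rng.integers(0, 2, size=(30, 16))     # stand-in for the evolving population
    d_rows.append(gene_diversity(pop))
    u_rows.append(gene_usage(pop, lambda ind: ind))  # toy U: the gene value itself

# Stacking the per-generation rows gives the two halves of the DU heat map,
# ready for e.g. matplotlib's imshow().
du_map = np.vstack([np.array(d_rows), np.array(u_rows)])
print(du_map.shape)  # (100, 16): 50 D rows followed by 50 U rows
```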

    The Hierarchic treatment of marine ecological information from spatial networks of benthic platforms

    Measuring biodiversity simultaneously in different locations, at different temporal scales, and over wide spatial scales is of strategic importance for improving our understanding of the functioning of marine ecosystems and for conserving their biodiversity. Monitoring networks of cabled observatories, along with other docked autonomous systems (e.g., Remotely Operated Vehicles [ROVs], Autonomous Underwater Vehicles [AUVs], and crawlers), are being conceived and established at a spatial scale capable of tracking energy fluxes across benthic and pelagic compartments, as well as across geographic ecotones. At the same time, optoacoustic imaging is sustaining an unprecedented expansion in marine ecological monitoring, enabling the acquisition of new biological and environmental data at an appropriate spatiotemporal scale. At this stage, one of the main problems for an effective application of these technologies is the processing, storage, and treatment of the acquired complex ecological information. Here, we provide a conceptual overview of the technological developments in the multiparametric generation, storage, and automated hierarchic treatment of the biological and environmental information required to capture the spatiotemporal complexity of a marine ecosystem. In doing so, we present a pipeline of ecological data acquisition and processing in distinct steps that are amenable to automation. We also give an example of computing population biomass, community richness, and biodiversity (as indicators of ecosystem functionality) with an Internet Operated Vehicle (a mobile crawler). Finally, we discuss the software requirements for such automated data processing at the level of cyber-infrastructures, with sensor calibration and control, data banking, and ingestion into large data portals.
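
    The community richness and biodiversity indicators mentioned above can be computed directly from taxon counts extracted from annotated imagery. A minimal sketch follows, using species richness and the Shannon index as stand-in indicators; the taxa and counts are hypothetical, and the paper's own indicator set may differ.

```python
import math
from collections import Counter

def richness(counts):
    # Species richness: number of taxa observed at least once.
    return sum(1 for n in counts.values() if n > 0)

def shannon_index(counts):
    # Shannon diversity H' = -sum(p_i * ln p_i) over observed taxa,
    # where p_i is the relative abundance of taxon i.
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total)
                for n in counts.values() if n > 0)

# Hypothetical counts from one batch of annotated crawler images.
obs = Counter({"Munida": 14, "Pagurus": 6, "Cerianthus": 3, "Bolocera": 1})
print(richness(obs), round(shannon_index(obs), 3))
```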

    Prediction of novel synthetic pathways for the production of desired chemicals

    Background: Several methods have been developed for predicting synthetic metabolic pathways that lead to the production of desired chemicals. In these approaches, novel pathways are predicted based on chemical structure changes, enzymatic information, and/or reaction mechanisms, but approaches that generate a huge number of predicted results are difficult to apply to real experiments. Also, some of these methods focus on specific pathways and thus cannot be expanded to the whole metabolism. Results: In the present study, we propose a system framework employing a retrosynthesis model with a prioritization scoring algorithm. This strategy allows deduction of promising novel pathways for the synthesis of a desired chemical, together with information on the enzymes involved, based on the structural changes and reaction mechanisms present in the system database. The prioritization scoring algorithm, employing the Tanimoto coefficient and the group contribution method, allows examination of structurally qualified pathways to recognize which pathway is more appropriate. In addition, the new concepts of binding site covalence, estimation of pathway distance, and organism specificity were taken into account to identify the best synthetic pathway. The parameters of these factors can be evolutionarily optimized whenever a newly proven synthetic pathway is registered. As proofs of concept, novel synthetic pathways were predicted for the production of isobutanol, 3-hydroxypropionate, and butyryl-CoA. The predictions were highly reliable: experimentally verified synthetic pathways were listed within the top 0.089% of the identified pathway candidates. Conclusions: The system framework developed in this study is expected to be useful for the in silico design of novel metabolic pathways to be employed for the efficient production of chemicals, fuels and materials.
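
    The Tanimoto coefficient used in the prioritization step measures structural similarity between a candidate intermediate and the target chemical. Below is a minimal sketch over binary molecular fingerprints represented as sets of "on" bit positions; the fingerprints shown are hypothetical, and the actual scoring also folds in group-contribution and other terms not reproduced here.

```python
def tanimoto(fp_a, fp_b):
    # Tanimoto coefficient between two binary fingerprints given as
    # sets of 'on' bit positions: |A ∩ B| / |A ∪ B|.
    a, b = set(fp_a), set(fp_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0

# Hypothetical fingerprints for a target chemical and two intermediates.
target = {1, 4, 7, 9, 12}
cand1 = {1, 4, 7, 12}           # structurally close to the target
cand2 = {2, 5, 9}               # mostly dissimilar

print(tanimoto(target, cand1))  # 0.8 -> ranked higher by the scoring step
print(tanimoto(target, cand2))  # ~0.143
```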

    Scientific discovery as a combinatorial optimisation problem: How best to navigate the landscape of possible experiments?

    A considerable number of areas of bioscience, including gene and drug discovery, metabolic engineering for the biotechnological improvement of organisms, and the processes of natural and directed evolution, are best viewed in terms of a ‘landscape’ representing a large search space of possible solutions or experiments, populated by a considerably smaller number of actual solutions that then emerge. This is what makes these problems ‘hard’, but it also means they can be treated as combinatorial optimisation problems, best attacked by the heuristic methods known from that field. Such landscapes, which may also represent or include multiple objectives, are effectively modelled in silico, with modern active learning algorithms, such as those based on Darwinian evolution, providing guidance, using existing knowledge, as to what is the ‘best’ experiment to do next. An awareness, and the application, of these methods can thereby enhance the scientific discovery process considerably. This analysis fits comfortably with an emerging epistemology that sees scientific reasoning, the search for solutions, and scientific discovery as Bayesian processes.
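
    To make the landscape-navigation idea concrete, the sketch below runs a simple evolutionary search over a toy fitness landscape in which candidate ‘experiments’ are bit strings. The landscape and the truncation-selection scheme are illustrative assumptions for this sketch, not a method taken from the paper.

```python
import random

random.seed(1)
L = 20                                        # an experiment encoded as L binary choices

def fitness(x):
    # Toy rugged landscape standing in for the quality of an experiment:
    # mostly additive, plus one epistatic interaction between two loci.
    return sum(x) + 3 * (x[0] == x[-1] == 1)

def mutate(x, rate=1 / L):
    # Flip each bit independently with a small probability.
    return [b ^ (random.random() < rate) for b in x]

# Evolutionary navigation: keep the best half, refill with mutants of survivors.
pop = [[random.randint(0, 1) for _ in range(L)] for _ in range(20)]
for gen in range(50):
    pop.sort(key=fitness, reverse=True)
    survivors = pop[:10]
    pop = survivors + [mutate(random.choice(survivors)) for _ in range(10)]

best = max(pop, key=fitness)
print(fitness(best))  # approaches the optimum of L + 3 = 23
```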

    A Study of Geometric Semantic Genetic Programming with Linear Scaling

    Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science.
    Machine Learning (ML) is a scientific discipline that endeavors to enable computers to learn without the need for explicit programming. Evolutionary Algorithms (EAs), a subset of ML algorithms, mimic Darwin’s Theory of Evolution by using natural selection mechanisms (i.e., survival of the fittest) to evolve a group of individuals (i.e., possible solutions to a given problem). Genetic Programming (GP) is the most recent type of EA, and it evolves computer programs (i.e., individuals) to map a set of input data into known expected outputs. Geometric Semantic Genetic Programming (GSGP) extends this concept by allowing individuals to evolve and vary in the semantic space, where the output vectors are located, rather than being constrained by syntax-based structures. Linear Scaling (LS) is a method that was introduced to facilitate the task GP faces of searching for the best function matching a set of known data. GSGP and LS have both, independently, been shown to outperform standard GP for symbolic regression. GSGP uses Geometric Semantic Operators (GSOs), different from the standard ones, without altering the fitness, while LS modifies the fitness without altering the genetic operators. To the best of our knowledge, there has been no prior use of the combined methodology of GSGP and LS for classification problems. Furthermore, despite the fact that they have been used together in one practical regression application, a methodological evaluation of the advantages and disadvantages of integrating these methods for regression or classification problems has never been performed. In this dissertation, a study of a system that integrates both GSGP and LS (GSGP-LS) is presented. The performance of the proposed method, GSGP-LS, was tested on six hand-tailored regression benchmarks, nine real-life regression problems, and three real-life classification problems. The obtained results indicate that GSGP-LS outperforms GSGP in the majority of the cases, confirming the expected benefit of this integration. However, for some particularly hard regression datasets, GSGP-LS overfits the training data and is outperformed by GSGP on unseen data. This contradicts the idea that LS is always beneficial for GP, warning practitioners about its risk of overfitting in some specific cases.
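
    Linear Scaling as discussed here fits a slope and intercept that map a program's raw outputs onto the targets before fitness is computed, so an individual is rewarded for getting the shape of the target function right. A minimal sketch follows, using the standard least-squares form b = cov(t, y)/var(y), a = mean(t) - b*mean(y); it illustrates the general technique and is not code from the dissertation.

```python
import numpy as np

def linear_scaling(y, t):
    # Optimal intercept a and slope b minimizing sum((t - (a + b*y))**2)
    # for program outputs y (semantics) and targets t.
    y, t = np.asarray(y, float), np.asarray(t, float)
    var = np.var(y)
    b = np.cov(t, y, bias=True)[0, 1] / var if var > 0 else 0.0
    a = t.mean() - b * y.mean()
    return a, b

def scaled_rmse(y, t):
    # Fitness of a GP individual after scaling its outputs to the targets.
    y, t = np.asarray(y, float), np.asarray(t, float)
    a, b = linear_scaling(y, t)
    return float(np.sqrt(np.mean((t - (a + b * y)) ** 2)))

# A program whose raw outputs are off only by an affine transform gets
# perfect scaled fitness: targets t = 3*y + 2 give RMSE 0 after scaling.
y = np.array([0.0, 1.0, 2.0, 3.0])
print(scaled_rmse(y, 3 * y + 2))  # 0.0
```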