53 research outputs found

    Optimal Part and Module Selection for Synthetic Gene Circuit Design Automation

    No full text
    An integral challenge in synthetic circuit design is the selection of optimal parts to populate a given circuit topology, so that the resulting circuit behavior best approximates the desired one. In some cases, it is also possible to reuse multipart constructs, or modules, that have already been built and experimentally characterized. Efficient part and module selection algorithms are essential to systematically search the solution space, and their significance will only increase in the coming years due to the projected explosion in part libraries and circuit complexity. Here, we address this problem by introducing a structured abstraction methodology and a dynamic programming-based algorithm that guarantees optimal part selection. In addition, we provide three extensions, based on symmetry checks, information look-ahead, and branch-and-bound techniques, that reduce the running time and space requirements. We have evaluated the proposed methodology on a benchmark of 11 circuits, a database of 73 parts, and 304 experimentally constructed modules, with encouraging results. This work represents a fundamental departure from traditional heuristic-based methods for part and module selection and is a step toward maximizing efficiency in synthetic circuit design and construction.

    A Parts Database with Consensus Parameter Estimation for Synthetic Circuit Design

    No full text
    Mathematical modeling and numerical simulation are crucial to support design decisions in synthetic biology. Accurate estimation of parameter values is key, as direct experimental measurements are difficult and time-consuming. Insufficient data, incompatible measurements, and specialized models that lack universal parameters make this task challenging. Here, we have created a database (PAMDB) that integrates data from 135 publications containing 118 circuits and 165 genetic parts of the bacterium Escherichia coli. We used a succinct, universal model formulation to describe the part behavior in each circuit. We introduce a constrained consensus inference method, which we used to infer the values of the model parameters, and we evaluated its performance through cross-validation on a benchmark of 23 circuits. We discuss these results and summarize the challenges in data integration and parameter inference. This work provides a resource and a methodology that can be used as a point of reference for synthetic circuit modeling.

    Fast and Accurate Circuit Design Automation through Hierarchical Model Switching

    No full text
    In computer-aided biological design, the trifecta of characterized part libraries, accurate models, and optimal design parameters is crucial for producing reliable designs. As the number of parts and the model complexity increase, however, it becomes exponentially more difficult for any optimization method to search the solution space, creating a trade-off that hampers efficient design. To address this issue, we present a hierarchical computer-aided design architecture that uses a two-step approach for biological design. First, a simple model of low computational complexity is used to predict circuit behavior and assess candidate circuit branches through branch-and-bound methods. Then, a complex, nonlinear circuit model is used for a fine-grained search of the reduced solution space, thus achieving more accurate results. Evaluation with a benchmark of 11 circuits and a library of 102 experimental designs with known characterization parameters demonstrates a speed-up of 3 orders of magnitude compared to other design methods that provide optimality guarantees.

    Comparison of computational efficiency of five protein inference methods over six datasets.

    No full text
    Each method was run three times on a computer with two Intel E5-2630 v3 2.4 GHz CPUs with eight cores and 64 GB of RDIMM RAM. PLP: ProteinLP; MSB: MSBayesPro; PL: ProteinLasso; HMD: HumanMD dataset; HEKC: HumanEKC dataset.

    Highly informative genes on a genetic interaction network.

    No full text
    (A) Genes are grouped into five separate modules that are distinct from the core network. Ontology of pathways and compositions of transporter complexes are based on EcoCyc for E. coli K-12 MG1655. Green edges represent genetic interactions identified in [47]. Histograms show the frequencies of MI genes for different classifiers across the 5 pathway modules. (B) A higher-resolution representation of the biosynthesis and transporter complex pathways that are highly enriched in a number of classifiers. Genes shown are the top-ranked in each classification task. The node color denotes the classification task for which the gene is highly informative (task legend on the upper right of the figure).

    Data distribution of Total Gross Sales post-filtration.

    No full text
    Non-normalized (top left), min-max normalization (top right), quantile normalization (bottom left), and z-score normalization (bottom right) are shown above, with red and blue dots representing the 95% confidence interval and the mean, respectively. Inset plots show the entire dataset, while the main plots zoom in to the 95% confidence interval range.
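
The three normalizations named in this caption can be illustrated with a minimal sketch. The function names, the sample values, and the rank-based reading of "quantile normalization" (mapping each value to its empirical quantile) are assumptions made for illustration; the caption does not say which implementation the authors used.

```python
from bisect import bisect_right
from statistics import mean, pstdev


def min_max(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Center on the mean and divide by the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def quantile_rank(values):
    """Map each value to its empirical quantile in (0, 1] (assumed variant)."""
    ordered = sorted(values)
    n = len(ordered)
    # bisect_right counts how many values are <= v, so ties share a quantile.
    return [bisect_right(ordered, v) / n for v in values]


# Hypothetical sample with one large value, standing in for skewed sales data.
sample = [10.0, 20.0, 30.0, 100.0]
print(min_max(sample))
print(z_score(sample))
print(quantile_rank(sample))
```

With a skewed sample like this one, min-max scaling compresses most points toward zero, while the rank-based mapping spreads them evenly. That difference is the kind of contrast the four panels would make visible.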

    Fig 2 -

    No full text
    (A) Tables showing the organization of the dataset, with (B) PCA and t-SNE visualizations of the data with min-max normalization post-filtration. K-means clustering is shown and was conducted on the PCA data. Outliers are removed, and only data points with audit yield greater than $0 are shown for visualization purposes.

    DeepPep overview.

    No full text
    DeepPep takes as input a set of strings holding the sequences of all the proteins that match an observed peptide. (A) To train the model for a specific peptide, each protein sequence string is converted to a binary vector, with ones where the peptide sequence matches the protein sequence and zeros everywhere else. (B) A CNN is then trained to predict the peptide probability, that is, the probability that the peptide identified from the mass spectra through a database search is the correct one. (C) The effect of removing a protein on a peptide probability is then calculated for all proteins and all peptides. (D) Finally, we score proteins based on the differential change in the CNN when each protein is present or absent.
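
Step (A) of this caption can be sketched in a few lines. This is an illustration under stated assumptions and not DeepPep's actual code: the function name `encode_peptide_matches` and the example sequences are hypothetical, and marking every occurrence (including overlapping ones) is an assumption the caption does not spell out.

```python
def encode_peptide_matches(protein: str, peptide: str) -> list[int]:
    """Return a 0/1 vector over the protein, with 1 at positions covered by the peptide."""
    encoding = [0] * len(protein)
    if not peptide:
        return encoding
    start = protein.find(peptide)
    while start != -1:
        # Mark every residue covered by this occurrence of the peptide.
        for i in range(start, start + len(peptide)):
            encoding[i] = 1
        # Advance by one position so overlapping occurrences are also found.
        start = protein.find(peptide, start + 1)
    return encoding


# Hypothetical protein and peptide strings, for illustration only.
print(encode_peptide_matches("MKTAYIAKQR", "AYI"))
```

Each protein that matches the peptide would yield one such vector. The caption indicates that these binary vectors form the CNN input in step (B).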