47 research outputs found

    DIY Archives: Enhancing Access to Collections via Free, Open-Source Platforms

    Presentation from the MARAC conference in Boston, MA on March 18-21, 2015. S. 24 - DIY Archives: Enhancing Access to Collections via Free, Open-Source Platforms.

    Engineering Proteinase Inhibitor Genes For Plant Defense Against Predators

    Small proteinaceous inhibitors (Mr < 20,000) of the digestive serine proteinases of animals and microorganisms are found as moderately abundant proteins in storage organs and leaves of many plant genera. The proteins are powerful inhibitors of the digestive enzymes of plant predators and therefore are considered to be part of the array of defensive chemicals of plants. Proteinase inhibitor genes show excellent promise, using DNA technology, to manipulate plant genomes to express these biologically active proteins in order to improve natural defense systems. Members of two unrelated families of serine proteinase inhibitors found in tomato and potato plants, called Inhibitor I (monomer Mr 8000) and Inhibitor II (monomer Mr 12,300), are under both environmental and developmental regulation in different tissues of the plants. Genes coding for wound-inducible Inhibitors I and II have been isolated from both tomato and potato genomes and characterized. Tobacco plants have been transformed with the chimeric genes containing wound-inducible promoters fused with the reporter gene, chloramphenicol acetyl transferase, to assess promoter function and signal transmission. Transacting factors that regulate their expression in response to wounding are also being identified and purified. Intact genes are being employed to transform agriculturally important crop plants to determine their potential usefulness to enhance defensive capabilities of plants against herbivores and pathogens.

    Unsupervised Bayesian linear unmixing of gene expression microarrays

    Background: This paper introduces a new constrained model and the corresponding algorithm, called unsupervised Bayesian linear unmixing (uBLU), to identify biological signatures from high dimensional assays like gene expression microarrays. The basis for uBLU is a Bayesian model for the data samples which are represented as an additive mixture of random positive gene signatures, called factors, with random positive mixing coefficients, called factor scores, that specify the relative contribution of each signature to a specific sample. The particularity of the proposed method is that uBLU constrains the factor loadings to be non-negative and the factor scores to be probability distributions over the factors. Furthermore, it also provides estimates of the number of factors. A Gibbs sampling strategy is adopted here to generate random samples according to the posterior distribution of the factors, factor scores, and number of factors. These samples are then used to estimate all the unknown parameters. Results: Firstly, the proposed uBLU method is applied to several simulated datasets with known ground truth and compared with previous factor decomposition methods, such as principal component analysis (PCA), non-negative matrix factorization (NMF), Bayesian factor regression modeling (BFRM), and the gradient-based algorithm for general matrix factorization (GB-GMF). Secondly, we illustrate the application of uBLU on a real time-evolving gene expression dataset from a recent viral challenge study in which individuals have been inoculated with influenza A/H3N2/Wisconsin. We show that the uBLU method significantly outperforms the other methods on the simulated and real data sets considered here. Conclusions: The results obtained on synthetic and real data illustrate the accuracy of the proposed uBLU method when compared to other factor decomposition methods from the literature (PCA, NMF, BFRM, and GB-GMF). The uBLU method identifies an inflammatory component closely associated with clinical symptom scores collected during the study. Using a constrained model allows recovery of all the inflammatory genes in a single factor.
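
The mixing model described in this abstract can be illustrated with a small simulation. This is only a sketch of the forward model (non-negative factors, factor scores that sum to one per sample), not the uBLU Gibbs sampler itself. The function name `simulate_mixture`, the exponential draws, and the Gaussian noise term are assumptions made for illustration and are not taken from the paper.

```python
import random

def simulate_mixture(n_genes=5, n_factors=2, n_samples=4, noise_sd=0.01, seed=0):
    """Simulate samples as additive mixtures of non-negative factors (illustrative only)."""
    rng = random.Random(seed)
    # Non-negative factor loadings: one gene signature per factor (assumed exponential draws).
    factors = [[rng.expovariate(1.0) for _ in range(n_genes)]
               for _ in range(n_factors)]
    scores, samples = [], []
    for _ in range(n_samples):
        # Factor scores: non-negative and normalized to sum to one,
        # so each sample's scores form a probability distribution over the factors.
        raw = [rng.expovariate(1.0) for _ in range(n_factors)]
        total = sum(raw)
        a = [r / total for r in raw]
        # Each gene's value is the score-weighted sum of the factors plus assumed Gaussian noise.
        y = [sum(a[k] * factors[k][g] for k in range(n_factors))
             + rng.gauss(0.0, noise_sd)
             for g in range(n_genes)]
        scores.append(a)
        samples.append(y)
    return factors, scores, samples

factors, scores, samples = simulate_mixture()
```

Data simulated this way has a known ground truth, which is the kind of setting the abstract describes for comparing decomposition methods.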

    Biomarker discovery in heterogeneous tissue samples - taking the in-silico deconfounding approach

    Background: For heterogeneous tissues, such as blood, measurements of gene expression are confounded by relative proportions of cell types involved. Conclusions have to rely on estimation of gene expression signals for homogeneous cell populations, e.g. by applying micro-dissection, fluorescence activated cell sorting, or in-silico deconfounding. We studied feasibility and validity of a non-negative matrix decomposition algorithm using experimental gene expression data for blood and sorted cells from the same donor samples. Our objective was to optimize the algorithm regarding detection of differentially expressed genes and to enable its use for classification in the difficult scenario of reversely regulated genes. This would be of importance for the identification of candidate biomarkers in heterogeneous tissues. Results: Experimental data and simulation studies involving noise parameters estimated from these data revealed that for valid detection of differential gene expression, quantile normalization and use of non-log data are optimal. We demonstrate the feasibility of predicting proportions of constituting cell types from gene expression data of single samples, as a prerequisite for a deconfounding-based classification approach. Classification cross-validation errors with and without using deconfounding results are reported as well as sample-size dependencies. Implementation of the algorithm, simulation and analysis scripts are available. Conclusions: The deconfounding algorithm without decorrelation using quantile normalization on non-log data is proposed for biomarkers that are difficult to detect, and for cases where confounding by varying proportions of cell types is the suspected reason. In this case, a deconfounding ranking approach can be used as a powerful alternative to, or complement of, other statistical learning approaches to define candidate biomarkers for molecular diagnosis and prediction in biomedicine, in realistically noisy conditions and with moderate sample sizes.
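
The abstract above names quantile normalization as a preprocessing step but does not describe an implementation. The following is a generic sketch of the standard technique, assuming one list of raw (non-log) values per sample and simple positional tie-breaking. The function name `quantile_normalize` is hypothetical and this is not the authors' script.

```python
def quantile_normalize(samples):
    """Quantile-normalize equal-length samples (one list of values per sample)."""
    n = len(samples[0])
    # Reference distribution: mean of the k-th smallest value across all samples.
    sorted_samples = [sorted(s) for s in samples]
    reference = [sum(s[k] for s in sorted_samples) / len(samples)
                 for k in range(n)]
    normalized = []
    for s in samples:
        # Rank the values within the sample (ties broken by position),
        # then replace each value with the reference value at its rank.
        order = sorted(range(n), key=lambda i: s[i])
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = reference[rank]
        normalized.append(out)
    return normalized

normalized = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
```

After this step every sample has the same set of values, so differences between samples reflect only the ordering of genes within each sample.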

    Spatio-Temporal Dynamics of Yeast Mitochondrial Biogenesis: Transcriptional and Post-Transcriptional mRNA Oscillatory Modules

    Examples of metabolic rhythms have recently emerged from studies of budding yeast. High density microarray analyses have produced a remarkably detailed picture of cycling gene expression that could be clustered according to metabolic functions. We developed a model-based approach for the decomposition of expression to analyze these data and to identify functional modules which, expressed sequentially and periodically, contribute to the complex and intricate mitochondrial architecture. This approach revealed that mitochondrial spatio-temporal modules are expressed during periodic spikes and specific cellular localizations, which cover the entire oscillatory period. For instance, assembly factors (32 genes) and translation regulators (47 genes) are expressed earlier than the components of the amino-acid synthesis pathways (31 genes). In addition, we could correlate the expression modules identified with particular post-transcriptional properties. Thus, mRNAs of modules expressed “early” are mostly translated in the vicinity of mitochondria under the control of the Puf3p mRNA-binding protein. This last spatio-temporal module concerns mostly mRNAs coding for basic elements of mitochondrial construction: assembly and regulatory factors. Prediction that unknown genes from this module code for important elements of mitochondrial biogenesis is supported by experimental evidence. More generally, these observations underscore the importance of post-transcriptional processes in mitochondrial biogenesis, highlighting close connections between nuclear transcription and cytoplasmic site-specific translation.

    Gene set-based module discovery in the breast cancer transcriptome

    Background: Although microarray-based studies have revealed a global view of gene expression in cancer cells, we still have little knowledge about the regulatory mechanisms underlying the transcriptome. Several computational methods applied to yeast data have recently succeeded in identifying expression modules, which are defined as co-expressed gene sets under common regulatory mechanisms. However, such module discovery methods have not been applied to cancer transcriptome data. Results: In order to decode oncogenic regulatory programs in cancer cells, we developed a novel module discovery method termed EEM by extending a previously reported module discovery method, and applied it to breast cancer expression data. Starting from seed gene sets prepared based on cis-regulatory elements, ChIP-chip data, and gene locus information, EEM identified 10 principal expression modules in breast cancer based on their expression coherence. Moreover, EEM depicted their activity profiles, which predict regulatory programs in each subtype of breast tumors. For example, our analysis revealed that the expression module regulated by the Polycomb repressive complex 2 (PRC2) is downregulated in triple negative breast cancers, suggesting similarity of transcriptional programs between stem cells and aggressive breast cancer cells. We also found that the activity of the PRC2 expression module is negatively correlated with the expression of EZH2, a component of PRC2 which belongs to the E2F expression module. E2F-driven EZH2 overexpression may be responsible for the repression of the PRC2 expression modules in triple negative tumors. Furthermore, our network analysis predicts regulatory circuits in breast cancer cells. Conclusion: These results demonstrate that the gene set-based module discovery approach is a powerful tool to decode regulatory programs in cancer cells.