
    Finding Biomarker Signatures in Pooled Sample Designs: A Simulation Framework for Methodological Comparisons

    Detection of discriminating patterns in gene expression data can be accomplished by using various methods of statistical learning. It has been proposed that sample pooling in this context would have negative effects; however, pooling cannot always be avoided. We propose a simulation framework to explicitly investigate the parameters of patterns, experimental design, noise, and choice of method in order to find out which effects on classification performance are to be expected. We use a two-group classification task and simulated gene expression data with independent differentially expressed genes as well as bivariate linear patterns and the combination of both. Our results show a clear increase of prediction error with pool size. For pooled training sets, powered partial least squares discriminant analysis outperforms discriminant analysis, random forests, and support vector machines with linear or radial kernel for two of three simulated scenarios. The proposed simulation approach can be implemented to systematically investigate a number of additional scenarios of practical interest.
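The pooling step of such a simulation can be illustrated with a minimal sketch. It assumes standard-normal noise, a fixed mean shift for the differentially expressed genes, and pooling modelled as the arithmetic mean of individual profiles; the function names and parameters are hypothetical and are not taken from the paper's framework.

```python
import numpy as np

def simulate_groups(n_per_group, n_genes, n_de, effect, rng):
    """Draw two groups of profiles; the first n_de genes are shifted in group 1.

    Assumption: independent N(0, 1) noise per gene, not the paper's noise model.
    """
    x0 = rng.normal(0.0, 1.0, size=(n_per_group, n_genes))
    x1 = rng.normal(0.0, 1.0, size=(n_per_group, n_genes))
    x1[:, :n_de] += effect
    return x0, x1

def pool(samples, pool_size):
    """Average consecutive blocks of pool_size samples.

    Samples that do not fill a complete pool are dropped.
    """
    n_pools = samples.shape[0] // pool_size
    trimmed = samples[: n_pools * pool_size]
    return trimmed.reshape(n_pools, pool_size, -1).mean(axis=1)

rng = np.random.default_rng(0)
group0, group1 = simulate_groups(n_per_group=12, n_genes=50, n_de=5, effect=1.0, rng=rng)
pooled0 = pool(group0, pool_size=3)  # 12 individuals become 4 pooled profiles
```

Under these assumptions, pooling leaves the group mean unchanged but yields fewer training profiles with smaller between-profile variance, which is one way a classifier trained on pools and applied to individuals can be affected.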

    Biologists meet statisticians: A workshop for young scientists to foster interdisciplinary team work

    Life science and statistics have necessarily become essential partners. The need to plan complex, structured experiments, involving elaborate designs, and the need to analyse datasets in the era of systems biology and high throughput technologies has to build upon professional statistical expertise. On the other hand, conducting such analyses and also developing improved or new methods, also for novel kinds of data, has to build upon solid biological understanding and practice. However, the meeting of scientists of both fields is often hampered by a variety of communicative hurdles, which are based on field-specific working languages and cultural differences. As a step towards a better mutual understanding, we developed a workshop concept bringing together young experimental biologists and statisticians, to work as pairs and learn to value each other's competences and practise interdisciplinary communication in a casual atmosphere. The first implementation of our concept was a cooperation of the German Region of the International Biometrical Society and the Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures (short: DSMZ), Braunschweig, Germany. We collected feedback in the form of three questionnaires and oral comments, and gathered experiences for the improvement of this concept. The long-term challenge for both disciplines is the establishment of systematic schedules and strategic partnerships which use the proposed workshop concept to foster mutual understanding, to seed the necessary interdisciplinary cooperation network, and to start training the indispensable communication skills at the earliest possible phase of education.

    Expression profiling of rice cultivars differing in their tolerance to long-term drought stress

    Understanding the molecular basis of plant performance under water-limiting conditions will help to breed crop plants with a lower water demand. We investigated the physiological and gene expression response of drought-tolerant (IR57311 and LC-93-4) and drought-sensitive (Nipponbare and Taipei 309) rice (Oryza sativa L.) cultivars to 18 days of drought stress in climate chamber experiments. Drought-stressed plants grew significantly slower than the controls. Gene expression profiles were measured in leaf samples with the 20 K NSF oligonucleotide microarray. A linear model was fitted to the data to identify genes that were significantly regulated under drought stress. In all drought-stressed cultivars, 245 genes were significantly repressed and 413 genes induced. Genes differing in their expression pattern under drought stress between tolerant and sensitive cultivars were identified by the genotype × environment (G × E) interaction term. More genes were significantly drought regulated in the sensitive than in the tolerant cultivars. Localizing all expressed genes on the rice genome map, we checked which genes with a significant G × E interaction co-localized with published quantitative trait loci regions for drought tolerance. These genes are more likely to be important for drought tolerance in an agricultural environment. To identify the metabolic processes with a significant G × E effect, we adapted the analysis software MapMan for rice. We found a drought-stress-induced shift toward senescence-related degradation processes that was more pronounced in the sensitive than in the tolerant cultivars. In spite of higher growth rates and water use, more photosynthesis-related genes were down-regulated in the tolerant than in the sensitive cultivars.
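The role of the G × E interaction term can be illustrated for a single gene with an ordinary least-squares fit. This is a minimal sketch, not the microarray-specific model the authors fitted: the 0/1 coding (genotype 0 = sensitive, 1 = tolerant; environment 0 = control, 1 = drought), the function name `fit_gxe`, and the simulated values are all assumptions made for illustration.

```python
import numpy as np

def fit_gxe(y, genotype, environment):
    """OLS fit of y ~ 1 + G + E + G*E for one gene.

    Returns the four coefficients and their t-statistics; the last entry
    of each corresponds to the G x E interaction.
    """
    y = np.asarray(y, dtype=float)
    g = np.asarray(genotype, dtype=float)
    e = np.asarray(environment, dtype=float)
    X = np.column_stack([np.ones_like(g), g, e, g * e])
    beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof            # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)   # covariance of the estimates
    t_stats = beta / np.sqrt(np.diag(cov))
    return beta, t_stats

# Hypothetical data: 10 replicates per genotype/environment combination.
rng = np.random.default_rng(1)
g = np.repeat([0, 0, 1, 1], 10)
e = np.repeat([0, 1, 0, 1], 10)
y = 5.0 + 0.5 * g - 1.0 * e + 2.0 * g * e + rng.normal(0.0, 0.1, size=40)
beta, t_stats = fit_gxe(y, g, e)
```

A large interaction t-statistic indicates that the drought response of the gene differs between the two genotype classes, which is the kind of gene the abstract describes as identified by the G × E term.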

    Detection of divergent genes in microbial aCGH experiments

    BACKGROUND: Array-based comparative genome hybridization (aCGH) is a tool for rapid comparison of genomes from different bacterial strains. The purpose of such analysis is to detect highly divergent or absent genes in a sample strain compared to an index strain. Development of methods for analyzing aCGH data has primarily focused on copy number aberrations in cancer research. In microbial aCGH analyses, genes are typically ranked by log-ratios, and classification into divergent or present is done by choosing a cutoff log-ratio, either manually or by statistics calculated from the log-ratio distribution. As experimental settings vary considerably, it is not possible to develop a classical discriminant or statistical learning approach. METHODS: We introduce a more efficient method for analyzing microbial aCGH data using a finite mixture model and a data rotation scheme. Using the average posterior probabilities from the model fitted to log-ratios before and after rotation, we get a score for each gene, and demonstrate its advantages for ranking and detecting divergent genes with improved specificity and sensitivity. RESULTS: The procedure is tested and compared to other approaches on simulated data sets, as well as on four experimental validation data sets for aCGH analysis on fully sequenced strains of Staphylococcus aureus and Streptococcus pneumoniae. CONCLUSION: When tested on simulated data as well as on four different experimental validation data sets from experiments with only fully sequenced strains, our procedure outperforms the standard procedure of using a simple log-ratio cutoff for classification into present and divergent genes.
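The idea of scoring genes by a mixture-model posterior instead of a fixed log-ratio cutoff can be sketched as follows. This is an illustrative two-component Gaussian mixture fitted by EM, under the assumption that divergent genes form the component with the lower mean log-ratio; it does not implement the paper's data rotation scheme or its averaging of posteriors before and after rotation, and all names are hypothetical.

```python
import numpy as np

def normal_pdf(x, mean, sd):
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def divergence_posterior(log_ratios, n_iter=200):
    """Fit a two-component Gaussian mixture by EM and return, per gene,
    the posterior probability of the lower-mean ("divergent") component."""
    x = np.asarray(log_ratios, dtype=float)
    means = np.array([np.quantile(x, 0.05), np.median(x)])
    sds = np.array([x.std(), x.std()])
    weights = np.array([0.5, 0.5])

    def responsibilities():
        dens = np.stack([w * normal_pdf(x, m, s)
                         for w, m, s in zip(weights, means, sds)])
        return dens / (dens.sum(axis=0) + 1e-300)  # guard against 0/0

    for _ in range(n_iter):
        resp = responsibilities()                  # E-step
        nk = resp.sum(axis=1)                      # M-step
        weights = nk / x.size
        means = (resp @ x) / nk
        sds = np.sqrt((resp * (x - means[:, None]) ** 2).sum(axis=1) / nk)
        sds = np.maximum(sds, 1e-6)                # avoid collapsing components

    resp = responsibilities()
    return resp[np.argmin(means)]

# Hypothetical log-ratios: 50 divergent genes, 450 present genes.
rng = np.random.default_rng(2)
log_ratios = np.concatenate([rng.normal(-3.0, 0.5, 50), rng.normal(0.0, 0.3, 450)])
scores = divergence_posterior(log_ratios)
```

Ranking genes by `scores` replaces a hard cutoff with a probability that adapts to the log-ratio distribution of each experiment.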

    Towards Systems Biology of Heterosis: A Hypothesis about Molecular Network Structure Applied for the Arabidopsis Metabolome

    We propose a network structure-based model for heterosis, and investigate it relying on metabolite profiles from Arabidopsis. A simple feed-forward two-layer network model (the Steinbuch matrix) is used in our conceptual approach. It allows for directly relating structural network properties with biological function. Interpreting heterosis as increased adaptability, our model predicts that the biological networks involved show increasing connectivity of regulatory interactions. A detailed analysis of metabolite profile data reveals that the increasing-connectivity prediction is true for graphical Gaussian models in our data from early development. This mirrors properties of observed heterotic Arabidopsis phenotypes. Furthermore, the model predicts a limit for increasing hybrid vigor with increasing heterozygosity, a known phenomenon in the literature.

    Comparative expression profiling of E. coli and S. aureus inoculated primary mammary gland cells sampled from cows with different genetic predispositions for somatic cell score

    BACKGROUND: During the past ten years many quantitative trait loci (QTL) affecting mastitis incidence and mastitis related traits like somatic cell score (SCS) were identified in cattle. However, little is known about the molecular architecture of QTL affecting mastitis susceptibility and the underlying physiological mechanisms and genes causing mastitis susceptibility. Here, a genome-wide expression analysis was conducted to analyze molecular mechanisms of mastitis susceptibility that are affected by a specific QTL for SCS on Bos taurus autosome 18 (BTA18). Thereby, some first insights were sought into the genetically determined mechanisms of mammary gland epithelial cells influencing the course of infection. METHODS: Primary bovine mammary gland epithelial cells (pbMEC) were sampled from the udder parenchyma of cows selected for high and low mastitis susceptibility by applying a marker-assisted selection strategy considering QTL and molecular marker information of a confirmed QTL for SCS in the telomeric region of BTA18. The cells were cultured and subsequently inoculated with heat-inactivated mastitis pathogens Escherichia coli and Staphylococcus aureus, respectively. After 1, 6 and 24 h, the cells were harvested and analyzed using the microarray expression chip technology to identify differences in mRNA expression profiles attributed to genetic predisposition, inoculation and cell culture. RESULTS: Comparative analysis of co-expression profiles clearly showed a faster and stronger response after pathogen challenge in pbMEC from less susceptible animals that inherited the favorable QTL allele 'Q' than in pbMEC from more susceptible animals that inherited the unfavorable QTL allele 'q'. Furthermore, the results highlighted RELB as a functional and positional candidate gene and related non-canonical NF-kappaB signaling as a functional mechanism affected by the QTL. However, in both groups, inoculation resulted in up-regulation of genes associated with the Ingenuity pathways 'dendritic cell maturation' and 'acute phase response signaling', whereas cell culture affected biological processes involved in 'cellular development'. CONCLUSIONS: The results indicate that the complex expression profiling of pathogen challenged pbMEC sampled from cows inheriting alternative QTL alleles is suitable to study genetically determined molecular mechanisms of mastitis susceptibility in mammary epithelial cells in vitro and to highlight the most likely functional pathways and candidate genes underlying the QTL effect.

    ExprEssence - Revealing the essence of differential experimental data in the context of an interaction/regulation net-work

    BACKGROUND: Experimentalists are overwhelmed by high-throughput data and there is an urgent need to condense information into simple hypotheses. For example, large amounts of microarray and deep sequencing data are becoming available, describing a variety of experimental conditions such as gene knockout and knockdown, the effect of interventions, and the differences between tissues and cell lines. RESULTS: To address this challenge, we developed a method, implemented as a Cytoscape plugin called ExprEssence. As input we take a network of interaction, stimulation and/or inhibition links between genes/proteins, and differential data, such as gene expression data, tracking an intervention or development in time. We condense the network, highlighting those links across which the largest changes can be observed. Highlighting is based on a simple formula inspired by the law of mass action. We can interactively modify the threshold for highlighting and instantaneously visualize results. We applied ExprEssence to three scenarios describing kidney podocyte biology, pluripotency and ageing: 1) We identify putative processes involved in podocyte (de-)differentiation and validate one prediction experimentally. 2) We predict and validate the expression level of a transcription factor involved in pluripotency. 3) Finally, we generate plausible hypotheses on the role of apoptosis, cell cycle deregulation and DNA repair in ageing data obtained from the hippocampus. CONCLUSION: Reducing the size of gene/protein networks to the few links affected by large changes makes it possible to screen for putative mechanistic relationships among the genes/proteins that are involved in adaptation to different experimental conditions, yielding important hypotheses, insights and suggestions for new experiments. We note that we do not focus on the identification of 'active subnetworks'. Instead we focus on the identification of single links (which may or may not form subnetworks), and these single links are much easier to validate experimentally than submodules. ExprEssence is available at http://sourceforge.net/projects/expressence/.

    Targeted Analysis of Serum Proteins Encoded at Known Inflammatory Bowel Disease Risk Loci

    Few studies have investigated the blood proteome of inflammatory bowel disease (IBD). We characterized the serum abundance of proteins encoded at 163 known IBD risk loci and tested these proteins for their biomarker discovery potential. Based on the Human Protein Atlas (HPA) antibody availability, 218 proteins from genes mapping at 163 IBD risk loci were selected. Targeted serum protein profiles from 49 Crohn's disease (CD) patients, 51 ulcerative colitis (UC) patients, and 50 sex- and age-matched healthy individuals were obtained using multiplexed antibody suspension bead array assays. Differences in relative serum abundance levels between disease groups and controls were examined. Replication was attempted for CD-UC comparisons (including disease subtypes) by including 64 additional patients (33 CD and 31 UC). Antibodies targeting a potentially novel risk protein were validated by paired antibodies, Western blot, immuno-capture mass spectrometry, and epitope mapping. By univariate analysis, 13 proteins mostly related to neutrophil, T-cell, and B-cell activation and function were differentially expressed in IBD patients vs healthy controls, 3 in CD patients vs healthy controls and 2 in UC patients vs healthy controls (q < 0.01). Multivariate analyses further differentiated disease groups from healthy controls and CD subtypes from UC (P < 0.05). Extended characterization of an antibody targeting a novel, discriminative serum marker, the laccase (multicopper oxidoreductase) domain containing 1 (LACC1) protein, provided evidence for antibody on-target specificity. Using affinity proteomics, we identified a set of IBD-associated serum proteins encoded at IBD risk loci. These candidate proteins hold the potential to be exploited as diagnostic biomarkers of IBD.
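The abstract reports univariate results at a q-value threshold without naming the adjustment procedure. As an illustration only, the sketch below computes q-values with the Benjamini-Hochberg procedure, which is an assumption here and may differ from what the authors used; the p-values are made up for the example.

```python
import numpy as np

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)
    # Scale the i-th smallest p-value by n / i.
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    q = np.empty(n)
    q[order] = np.minimum(scaled, 1.0)
    return q

# Hypothetical per-protein p-values from a univariate test.
p_values = [0.01, 0.04, 0.03, 0.005]
q_values = benjamini_hochberg(p_values)
selected = q_values < 0.01  # same threshold form as reported in the abstract
```

With many proteins tested at once, thresholding on q-values rather than raw p-values controls the expected share of false discoveries among the proteins called differentially abundant.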