
    Knowledge-based gene expression classification via matrix factorization

    Motivation: Modern machine learning methods based on matrix decomposition techniques, such as independent component analysis (ICA) or non-negative matrix factorization (NMF), provide new and efficient analysis tools that are currently being explored for analyzing gene expression profiles. These exploratory feature extraction techniques yield expression modes (ICA) or metagenes (NMF). The extracted features are considered indicative of underlying regulatory processes. They can also be applied to the classification of gene expression datasets, by grouping samples into different categories for diagnostic purposes or by grouping genes into functional categories for further investigation of related metabolic pathways and regulatory networks. Results: In this study we focus on unsupervised matrix factorization techniques and apply ICA and sparse NMF to microarray datasets. The latter monitor the gene expression levels of human peripheral blood cells during differentiation from monocytes to macrophages. We show that these tools are able to identify relevant signatures in the deduced component matrices and extract informative sets of marker genes from these gene expression profiles. The methods rely on the joint discriminative power of a set of marker genes rather than on single marker genes. With these sets of marker genes, corroborated by leave-one-out or random forest cross-validation, the datasets could easily be classified into related diagnostic categories. The latter correspond to either monocytes versus macrophages or healthy versus Niemann Pick C disease patients. Siemens AG, Munich; DFG (Graduate College 638); DAAD (PPP Luso-Alemã and PPP Hispano-Alemanas)

    Systems Biology and the Development of Vaccines and Drugs for Malaria Treatments

    The sequencing race has ended and the functional race has already begun. Microarray technology enables simultaneous expression analysis of thousands of genes, providing a snapshot of an organism's transcriptome at an unprecedented resolution. The close correlation between gene transcription and function allows biological processes to be inferred from the assessed transcriptome profile. Among the sophisticated analytical problems in microarray technology, at the front and back ends respectively, are the selection of optimal DNA oligos and the computational analysis of gene expression. In this review paper, we analyse important methods in use today for customized oligo design. In the course of this work, we discovered that the oligo design algorithm hung on gene PFA0135w of chromosome 1 while designing oligos for the gene sequences of Plasmodium falciparum. We do not yet know the reason for this, as the algorithm runs on other sequences such as those of yeast (Saccharomyces cerevisiae) and Neurospora crassa. We conclude the paper by highlighting the procedures that make up the back-end phase and discuss their application to the development of vaccines and drugs for malaria treatment. Note that malaria is the cause of significant global morbidity and mortality, with 300-500 million cases annually. Our aims are not ends, but a means to achieve the following: to reiterate the need for experimental biologists to (i) know how to design their customized oligos and (ii) have some idea about gene expression analysis, and the need for cooperation between experimental biologists and their counterparts, the computational biologists. These will help experimental biologists to coordinate the front and back ends of the systems biology analysis of the whole genome effectively.

    Quantitative model for inferring dynamic regulation of the tumour suppressor gene p53

    Background: The availability of various "omics" datasets creates the prospect of studying genome-wide genetic regulatory networks. However, one of the major challenges of using mathematical models to infer genetic regulation from microarray datasets is the lack of information on protein concentrations and activities. Most previous research was based on the assumption that the mRNA levels of a gene are consistent with its protein activities, though this is not always the case. Therefore, a more sophisticated modelling framework, together with the corresponding inference methods, is needed to accurately estimate genetic regulation from "omics" datasets. Results: This work developed a novel approach, based on a nonlinear mathematical model, to infer genetic regulation from microarray gene expression data. Using the p53 network as a test system, we applied the nonlinear model to estimate the activities of the transcription factor (TF) p53 from the expression levels of its target genes, and to identify whether p53 activates or inhibits each of its target genes. The predicted top 317 putative p53 target genes were supported by DNA sequence analysis. A comparison between our prediction and other published predictions of p53 targets suggests that most putative p53 targets may share a common depleted or enriched sequence signal in their upstream non-coding regions. Conclusions: The proposed quantitative model can be used not only to infer the regulatory relationship between a TF and its downstream genes, but also to estimate the protein activities of a TF from the expression levels of its target genes.

    Physico-chemical foundations underpinning microarray and next-generation sequencing experiments

    Hybridization of nucleic acids on solid surfaces is a key process involved in high-throughput technologies such as microarrays and, in some cases, next-generation sequencing (NGS). A physical understanding of the hybridization process helps to determine the accuracy of these technologies. The goal of a widespread research program is to develop reliable transformations between the raw signals reported by the technologies and individual molecular concentrations from an ensemble of nucleic acids. This research has inputs from many areas, from bioinformatics and biostatistics, to theoretical and experimental biochemistry and biophysics, to computer simulations. A group of leading researchers met in Ploen, Germany, in 2011 to discuss present knowledge and limitations of our physico-chemical understanding of high-throughput nucleic acid technologies. This meeting inspired us to write this summary, which provides an overview of state-of-the-art approaches, built on physico-chemical foundations, to modeling the hybridization of nucleic acids on solid surfaces. In addition, practical application of current knowledge is emphasized.

    Gene expression patterns in anterior pituitary associated with quantitative measure of oestrous behaviour in dairy cows

    Intensive selection for high milk yield in dairy cows has raised production levels substantially but at the cost of reduced fertility, which manifests in different ways including reduced expression of oestrous behaviour. The genomic regulation of oestrous behaviour in bovines remains largely unknown. Here, we aimed to identify and study those genes that were associated with oestrous behaviour among genes expressed in the bovine anterior pituitary either at the start of the oestrous cycle or at mid-cycle (around day 12 of the cycle), or regardless of the phase of the cycle. Oestrous behaviour was recorded in each of 28 primiparous cows from 30 days in milk onwards until the day of their sacrifice (between 77 and 139 days in milk) and quantified as heat scores. An average heat score value was calculated for each cow from heat scores observed during consecutive oestrous cycles, excluding the cycle on the day of sacrifice. A microarray experiment was designed to measure gene expression in the anterior pituitary of these cows, 14 of which were sacrificed at the start of the oestrous cycle (day 0) and 14 around day 12 of the cycle (day 12). Gene expression was modelled as a function of the orthogonally transformed average heat score values using a Bayesian hierarchical mixed model on data from day 0 cows alone (analysis 1), day 12 cows alone (analysis 2) and the combined data from day 0 and day 12 cows (analysis 3). Genes whose expression patterns showed significant linear or non-linear relationships with average heat scores were identified in all three analyses (177, 142 and 118 genes, respectively). Gene ontology terms enriched among genes identified in analysis 1 revealed processes associated with expression of oestrous behaviour, whereas the terms enriched among genes identified in analyses 2 and 3 were general processes which may facilitate proper expression of oestrous behaviour at the subsequent oestrus. Studying these genes will help to improve our understanding of the genomic regulation of oestrous behaviour, ultimately leading to better management strategies and tools to improve or monitor reproductive performance in bovines.

    A novel cassette method for probe evaluation in the designed biochips

    A critical step in biochip design is the selection of probes with identical hybridisation characteristics. In this article we describe a novel method for evaluating DNA hybridisation probes that uses cassettes with multiple probes and allows the fine-tuning of biochips. Each cassette contains probes in equimolar proportions so that their hybridisation performance can be assessed in a single reaction. The model used to demonstrate this method was a series of probes developed to detect TORCH pathogens. DNA probes were designed for Toxoplasma gondii, Chlamydia trachomatis, Rubella, Cytomegalovirus, and Herpes virus, and these were used to construct the DNA cassettes. Five cassettes were constructed to detect TORCH pathogens using a variety of genes coding for membrane proteins, viral matrix protein, an early expressed viral protein, viral DNA polymerase and the repetitive gene B1 of Toxoplasma gondii. All of these probes, except that for the B1 gene, exhibited similar profiles under the same hybridisation conditions. The failure of the B1 gene probe to hybridise was not due to a position effect, which indicated that the probe was unsuitable for inclusion in the biochip. The redesigned probe for the B1 gene exhibited hybridisation properties identical to those of the other probes, making it suitable for inclusion in a biochip.

    Major transcriptome re-organisation and abrupt changes in signalling, cell cycle and chromatin regulation at neural differentiation in vivo

    Here, we exploit the spatial separation of temporal events of neural differentiation in the elongating chick body axis to provide the first analysis of transcriptome change in progressively more differentiated neural cell populations in vivo. Microarray data, validated against direct RNA sequencing, identified: (1) a gene cohort characteristic of the multi-potent stem zone epiblast, which contains neuro-mesodermal progenitors that progressively generate the spinal cord; (2) a major transcriptome reorganisation as cells then adopt a neural fate; and (3) increasing diversity as neural patterning and neuron production begin. Focussing on the transition from multi-potent to neural state cells, we capture changes in major signalling pathways, uncover novel Wnt and Notch signalling dynamics, and implicate new pathways (mevalonate pathway/steroid biogenesis and TGF beta). This analysis further predicts changes in cellular processes, cell cycle, RNA-processing and protein turnover as cells acquire neural fate. We show that these changes are conserved across species and provide biological evidence for reduced proteasome efficiency and a novel lengthening of S phase. This latter step may provide time for epigenetic events to mediate large-scale transcriptome re-organisation; consistent with this, we uncover simultaneous downregulation of major chromatin modifiers as the neural programme is established. We further demonstrate that transcription of one such gene, HDAC1, is dependent on FGF signalling, making a novel link between signals that control neural differentiation and transcription of a core regulator of chromatin organisation. Our work implicates new signalling pathways and dynamics, cellular processes and epigenetic modifiers in neural differentiation in vivo, identifying multiple new potential cellular and molecular mechanisms that direct differentiation

    Software and methods for oligonucleotide and cDNA array data analysis.

    Two HTML-based programs were developed to analyze and filter gene-expression data: 'Bullfrog' for Affymetrix oligonucleotide arrays and 'Spot' for custom cDNA arrays. The programs provide intuitive data-filtering tools through an easy-to-use interface. A background subtraction and normalization program for cDNA arrays was also built that provides an informative summary report with data-quality assessments. These programs are freeware to aid in the analysis of gene-expression results and facilitate the search for genes responsible for interesting biological processes and phenotypes