
    Knowledge-based gene expression classification via matrix factorization

    Motivation: Modern machine learning methods based on matrix decomposition techniques, such as independent component analysis (ICA) or non-negative matrix factorization (NMF), provide new and efficient analysis tools that are currently being explored for the analysis of gene expression profiles. These exploratory feature extraction techniques yield expression modes (ICA) or metagenes (NMF), which are considered indicative of underlying regulatory processes. They can also be applied to the classification of gene expression datasets, either by grouping samples into different categories for diagnostic purposes or by grouping genes into functional categories for further investigation of related metabolic pathways and regulatory networks. Results: In this study we focus on unsupervised matrix factorization techniques and apply ICA and sparse NMF to microarray datasets that monitor the gene expression levels of human peripheral blood cells during differentiation from monocytes to macrophages. We show that these tools are able to identify relevant signatures in the deduced component matrices and extract informative sets of marker genes from these gene expression profiles. The methods rely on the joint discriminative power of a set of marker genes rather than on single marker genes. With these sets of marker genes, corroborated by leave-one-out or random forest cross-validation, the datasets could easily be classified into related diagnostic categories. The latter correspond to either monocytes versus macrophages or healthy versus Niemann Pick C disease patients. Funding: Siemens AG, Munich; DFG (Graduate College 638); DAAD (PPP Luso-Alemã and PPP Hispano-Alemanas).
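
As a rough illustration of the kind of factorization the abstract refers to, the following sketch factors a small non-negative matrix X into W times H using the classic multiplicative update rules for the squared-error objective. It is not the sparse NMF variant used in the study; the function names, the toy matrix, and the genes-by-samples orientation (with columns of W playing the role of metagene-like patterns) are all assumptions made for this example.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(x, w, h):
    """Squared Frobenius norm of X - W H."""
    wh = matmul(w, h)
    return sum((xv - av) ** 2 for xr, ar in zip(x, wh) for xv, av in zip(xr, ar))


def nmf(x, rank, iterations=200, seed=0, eps=1e-9):
    """Approximate a non-negative matrix X (n x m) as W (n x rank) times H (rank x m).

    Plain multiplicative updates; eps only guards against division by zero.
    """
    rng = random.Random(seed)
    n, m = len(x), len(x[0])
    # Strictly positive random start so that the updates keep entries non-negative.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T X) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, x)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (X H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(x, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


# Hypothetical toy "expression" matrix: 4 genes (rows) x 4 samples (columns),
# built from two blocks so that a rank-2 factorization is natural.
x = [
    [5.0, 5.0, 0.0, 0.0],
    [4.0, 4.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 3.0],
    [0.0, 0.0, 2.0, 2.0],
]
w, h = nmf(x, rank=2)
```

In this orientation, the rows of H indicate how strongly each sample expresses each extracted pattern, which is the sort of low-dimensional signature one could then pass to a classifier.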

    Spectral analysis of gene expression profiles using gene networks

    Microarrays have become extremely useful for analysing genetic phenomena, but establishing a relation between microarray analysis results (typically a list of genes) and their biological significance is often difficult. Currently, the standard approach is to map the results a posteriori onto gene networks to elucidate the functions perturbed at the level of pathways. However, integrating a priori knowledge of the gene networks could help in the statistical analysis of gene expression data and in their biological interpretation. Here we propose a method to integrate knowledge of a gene network a priori into the analysis of gene expression data. The approach is based on the spectral decomposition of gene expression profiles with respect to the eigenfunctions of the graph, resulting in an attenuation of the high-frequency components of the expression profiles with respect to the topology of the graph. We show how to derive unsupervised and supervised classification algorithms of expression profiles, resulting in classifiers with biological relevance. We applied the method to the analysis of a set of expression profiles from irradiated and non-irradiated yeast strains. It performed at least as well as the usual classification but provided much more biologically relevant results and allowed a direct biological interpretation.
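
The idea of attenuating high-frequency components with respect to a graph can be sketched without an explicit eigendecomposition. The example below repeatedly applies (I - alpha L), where L is the combinatorial graph Laplacian; in the Laplacian eigenbasis this multiplies the component with eigenvalue lambda by (1 - alpha * lambda) at each step, so components that vary strongly across edges shrink fastest. This diffusion filter is a stand-in chosen for brevity, not the specific spectral transform of the paper, and the four-gene path graph and expression values are hypothetical.

```python
def laplacian_apply(adjacency, x):
    """Return L x for the combinatorial Laplacian L = D - A of an undirected graph."""
    return [
        sum(adjacency[i]) * x[i] - sum(a * xj for a, xj in zip(adjacency[i], x))
        for i in range(len(x))
    ]


def smoothness(adjacency, x):
    """Quadratic form x^T L x, i.e. the sum over edges of (x_i - x_j)^2."""
    lx = laplacian_apply(adjacency, x)
    return sum(xi * li for xi, li in zip(x, lx))


def diffuse(adjacency, x, alpha=0.1, steps=10):
    """Apply (I - alpha L) repeatedly.

    Assumes alpha times the largest Laplacian eigenvalue is at most 1, so that
    higher graph frequencies are damped more strongly than lower ones.
    """
    for _ in range(steps):
        lx = laplacian_apply(adjacency, x)
        x = [xi - alpha * li for xi, li in zip(x, lx)]
    return x


# Hypothetical gene network: a path graph g0 - g1 - g2 - g3.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
# An expression profile that alternates along the path (high graph frequency).
profile = [1.0, -1.0, 1.0, -1.0]
smoothed = diffuse(adjacency, profile)
```

The constant component (eigenvalue zero) passes through unchanged, so the mean expression level of the profile is preserved while differences between neighbouring genes are reduced.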

    Bayesian meta-analysis for identifying periodically expressed genes in fission yeast cell cycle

    The effort to identify genes with periodic expression during the cell cycle from genome-wide microarray time series data has been ongoing for a decade. However, the lack of rigorous modeling of periodic expression, as well as the lack of a comprehensive model for integrating information across genes and experiments, has impaired the accurate identification of periodically expressed genes. To address the problem, we introduce a Bayesian model to integrate multiple independent microarray data sets from three recent genome-wide cell cycle studies on fission yeast. A hierarchical model was used for data integration. To facilitate efficient Monte Carlo sampling from the joint posterior distribution, we develop a novel Metropolis-Hastings group move. A surprising finding from our integrated analysis is that more than 40% of the genes in fission yeast are significantly periodically expressed, far more than the 10-15% of genes reported in the current literature. This calls for a reconsideration of the periodically expressed gene detection problem. Comment: Published at http://dx.doi.org/10.1214/09-AOAS300 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).
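
For readers unfamiliar with Metropolis-Hastings sampling, here is a minimal random-walk sampler for a one-dimensional target. It shows only the basic accept/reject step; it does not implement the group move or the hierarchical model described in the abstract, and the Gaussian target, step size, and function name are assumptions chosen for illustration.

```python
import math
import random


def metropolis_hastings(log_target, start, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis sampler for an unnormalized 1-D log density."""
    rng = random.Random(seed)
    x = start
    log_p = log_target(x)
    samples = []
    for _ in range(n_samples):
        # Symmetric Gaussian proposal centred on the current state.
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_target(proposal)
        # Because the proposal is symmetric, the acceptance probability
        # reduces to min(1, p(proposal) / p(current)).
        accept_prob = math.exp(min(0.0, log_p_new - log_p))
        if rng.random() < accept_prob:
            x, log_p = proposal, log_p_new
        # The current state is recorded whether or not the move was accepted.
        samples.append(x)
    return samples


def log_gaussian(x):
    """Hypothetical target: unnormalized log density of a Gaussian, mean 2, unit variance."""
    return -0.5 * (x - 2.0) ** 2


samples = metropolis_hastings(log_gaussian, start=0.0, n_samples=20000)
kept = samples[2000:]  # discard an initial burn-in stretch
posterior_mean = sum(kept) / len(kept)
```

A group move in the sense of the abstract would propose changes to several parameters jointly; the single-parameter proposal above is only the simplest case of the same accept/reject logic.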

    Functional analysis and transcriptional output of the Göttingen minipig genome

    In the past decade the Göttingen minipig has gained increasing recognition as an animal model in pharmaceutical and safety research because it recapitulates many aspects of human physiology and metabolism. Genome-based comparison of drug targets together with quantitative tissue expression analysis allows rational prediction of pharmacology and cross-reactivity of human drugs in animal models, thereby improving drug attrition, which is an important challenge in the process of drug development. Here we present a new chromosome-level version of the Göttingen minipig genome together with a comparative transcriptional analysis of tissues with pharmaceutical relevance as a basis for translational research. We relied on mapping and assembly of WGS (whole-genome shotgun sequencing) derived reads to the reference genome of the Duroc pig and predict 19,228 human orthologous protein-coding genes. Genome-based prediction of the sequence of human drug targets enables the prediction of drug cross-reactivity based on conservation of binding sites. We further support the finding that the genome of Sus scrofa contains about ten times fewer pseudogenized genes than other vertebrates. Among the functional human orthologs of these minipig pseudogenes we found HEPN1, a putative tumor suppressor gene. The genomes of Sus scrofa, the Tibetan boar, the African Bushpig, and the Warthog show sequence conservation of all inactivating HEPN1 mutations, suggesting disruption before the evolutionary split of these pig species. We identify 133 Sus scrofa-specific, conserved long non-coding RNAs (lncRNAs) in the minipig genome and show that these transcripts are highly conserved in the African pigs and the Tibetan boar, suggesting functional significance. Using a new minipig-specific microarray we show high conservation of gene expression signatures in 13 tissues with biomedical relevance between humans and adult minipigs. We underline this relationship for minipig and human liver, where we could demonstrate similar expression levels for most phase I drug-metabolizing enzymes. Higher expression levels and metabolic activities were found for FMO1, AKR/CRs and for phase II drug-metabolizing enzymes in minipig as compared to human. The variability of gene expression in equivalent human and minipig tissues is considerably higher in minipig organs, which is important for study design in case a human target belongs to this variable category in the minipig. The first analysis of gene expression in multiple tissues during development from young to adult shows that the majority of transcriptional programs are concluded four weeks after birth. This finding is in line with the advanced state of human postnatal organ development at comparative age categories and further supports the minipig as a model for pediatric drug safety studies. Genome-based assessment of sequence conservation combined with gene expression data in several tissues improves the translational value of the minipig for human drug development. The genome and gene expression data presented here are important resources for researchers using the minipig as a model for biomedical research or commercial breeding. The potential impact of our data for comparative genomics, translational research, and experimental medicine is discussed.

    Gravitational waves from self-ordering scalar fields

    Gravitational waves were copiously produced in the early Universe whenever the processes taking place were sufficiently violent. The spectra of several of these gravitational wave backgrounds on subhorizon scales have been extensively studied in the literature. In this paper we analyze the shape and amplitude of the gravitational wave spectrum on scales which are superhorizon at the time of production. Such gravitational waves are expected from the self ordering of randomly oriented scalar fields which can be present during a thermal phase transition or during preheating after hybrid inflation. We find that, if the gravitational wave source acts only during a small fraction of the Hubble time, the gravitational wave spectrum at frequencies lower than the expansion rate at the time of production behaves as $\Omega_{\rm GW}(f) \propto f^3$ with an amplitude much too small to be observable by gravitational wave observatories like LIGO, LISA or BBO. On the other hand, if the source is active for a much longer time, until a given mode which is initially superhorizon ($k\eta_* \ll 1$) enters the horizon, for $k\eta \gtrsim 1$, we find that the gravitational wave energy density is frequency independent, i.e. scale invariant. Moreover, its amplitude for a GUT scale scenario turns out to be within the range and sensitivity of BBO and marginally detectable by LIGO and LISA. This new gravitational wave background can compete with the one generated during inflation, and distinguishing both may require extra information. Comment: 21 pages, 2 figures, added discussion about numerical integration and a new figure to illustrate the scale-invariance of the GW power spectrum, conclusions unchanged.

    Semianalytical calculation of the zonal-flow oscillation frequency in stellarators

    Because zonal flows can reduce turbulent transport in magnetized plasmas, understanding their dynamics is an important problem in the fusion programme. Since the pioneering work by Rosenbluth and Hinton in axisymmetric tokamaks, it is known that studying the linear and collisionless relaxation of zonal flow perturbations gives valuable information and physical insight. Recently, the problem has been investigated in stellarators and it has been found that in these devices the relaxation process exhibits a characteristic feature: a damped oscillation. The frequency of this oscillation might be a relevant parameter in the regulation of turbulent transport, and therefore its efficient and accurate calculation is important. Although an analytical expression can be derived for the frequency, its numerical evaluation is not simple and has not been exploited systematically so far. Here, a numerical method for its evaluation is considered, and the results are compared with those obtained by calculating the frequency from gyrokinetic simulations. This "semianalytical" approach for the determination of the zonal-flow frequency proves to be accurate and faster than the one based on gyrokinetic simulations. Comment: 30 pages, 14 figures.

    Gene expression time delays & Turing pattern formation systems

    The incorporation of time delays can greatly affect the behaviour of partial differential equations and dynamical systems. In addition, there is evidence that time delays in gene expression due to transcription and translation play an important role in the dynamics of cellular systems. In this paper, we investigate the effects of incorporating gene expression time delays into a one-dimensional putative reaction-diffusion pattern formation mechanism on both stationary domains and domains with spatially uniform exponential growth. While oscillatory behaviour is rare, we find that the time taken to initiate and stabilise patterns increases dramatically as the time delay is increased. In addition, we observe that on rapidly growing domains the time delay can induce a failure of the Turing instability which cannot be predicted by a naive linear analysis of the underlying equations about the homogeneous steady state. The dramatic lag in the induction of patterning, or even its complete absence on occasions, highlights the importance of considering explicit gene expression time delays in models for cellular reaction-diffusion patterning.
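
For context, the "naive linear analysis" mentioned in the abstract can be sketched for the delay-free case. For a two-species reaction-diffusion system linearized about a homogeneous steady state, a perturbation with wavenumber k grows at a rate given by the eigenvalues of J - k^2 D, where J is the reaction Jacobian and D the diagonal diffusion matrix. The sketch below scans wavenumbers for a positive growth rate. The Jacobian entries and diffusion coefficients are hypothetical values chosen for illustration, and the code deliberately ignores time delays and domain growth, which the abstract reports can invalidate this kind of prediction.

```python
import math


def max_growth_rate(jacobian, diffusion, wavenumbers):
    """Largest real part of the eigenvalues of J - k^2 D over the given wavenumbers.

    jacobian is ((a, b), (c, d)) and diffusion is (Du, Dv) for a two-species system.
    """
    (a, b), (c, d) = jacobian
    du, dv = diffusion
    best = -math.inf
    for k in wavenumbers:
        a_k = a - du * k * k
        d_k = d - dv * k * k
        trace = a_k + d_k
        det = a_k * d_k - b * c
        disc = trace * trace - 4.0 * det
        if disc < 0:
            # Complex-conjugate pair: both eigenvalues have real part trace / 2.
            real_part = trace / 2.0
        else:
            # Two real eigenvalues: take the larger one.
            real_part = (trace + math.sqrt(disc)) / 2.0
        best = max(best, real_part)
    return best


# Hypothetical activator-inhibitor Jacobian, stable in the absence of diffusion.
jacobian = ((1.0, -2.0), (3.0, -4.0))
wavenumbers = [0.05 * i for i in range(61)]  # k from 0 to 3

equal_diffusion = max_growth_rate(jacobian, (1.0, 1.0), wavenumbers)
fast_inhibitor = max_growth_rate(jacobian, (1.0, 20.0), wavenumbers)
```

With equal diffusion coefficients every mode decays, whereas a sufficiently fast-diffusing inhibitor makes a band of wavenumbers unstable, which is the classic delay-free Turing condition.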