71,860 research outputs found
Knowledge-based gene expression classification via matrix factorization
Motivation: Modern machine learning methods based on matrix decomposition techniques, like independent component analysis (ICA) or non-negative matrix factorization (NMF), provide new and efficient analysis tools that are currently being explored to analyze gene expression profiles. These exploratory feature extraction techniques yield expression modes (ICA) or metagenes (NMF). The extracted features are considered indicative of underlying regulatory processes. They can also be applied to the classification of gene expression datasets, either by grouping samples into different categories for diagnostic purposes or by grouping genes into functional categories for further investigation of related metabolic pathways and regulatory networks.
Results: In this study we focus on unsupervised matrix factorization techniques and apply ICA and sparse NMF to microarray datasets. These datasets monitor the gene expression levels of human peripheral blood cells during differentiation from monocytes to macrophages. We show that these tools are able to identify relevant signatures in the deduced component matrices and extract informative sets of marker genes from these gene expression profiles. The methods rely on the joint discriminative power of a set of marker genes rather than on single marker genes. With these sets of marker genes, corroborated by leave-one-out or random forest cross-validation, the datasets could easily be classified into related diagnostic categories. The latter correspond to either monocytes versus macrophages or healthy versus Niemann-Pick C disease patients.
Siemens AG, Munich; DFG (Graduate College 638); DAAD (PPP Luso - Alemã and PPP Hispano - Alemanas)
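As a rough illustration of how metagenes can be extracted, the following is a minimal sketch of plain NMF with multiplicative updates. It is not the sparse NMF variant used in the study; the function names, the random toy matrix, and the rank of 2 are all assumptions chosen for illustration.

```python
import numpy as np

def nmf(X, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor a non-negative genes-by-samples matrix X into W @ H.

    Columns of W play the role of metagenes; columns of H give the
    weight of each metagene in each sample. Plain multiplicative
    updates for the Frobenius norm, no sparsity constraint.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_samples = X.shape
    W = rng.random((n_genes, rank)) + eps
    H = rng.random((rank, n_samples)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors non-negative and does not
        # increase the reconstruction error.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def relative_error(X, W, H):
    """Frobenius reconstruction error relative to the norm of X."""
    return np.linalg.norm(X - W @ H) / np.linalg.norm(X)

# Hypothetical toy data: 50 genes, 8 samples, non-negative values.
rng = np.random.default_rng(1)
X = rng.random((50, 8))
W, H = nmf(X, rank=2)

# Assign each sample to the metagene with the largest weight.
sample_labels = H.argmax(axis=0)
```

On real data the sample assignment derived from H would still need to be validated, for example by the cross-validation mentioned in the abstract.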
Spectral analysis of gene expression profiles using gene networks
Microarrays have become extremely useful for analysing genetic phenomena, but
establishing a relation between microarray analysis results (typically a list
of genes) and their biological significance is often difficult. Currently, the
standard approach is to map a posteriori the results onto gene networks to
elucidate the functions perturbed at the level of pathways. However,
integrating a priori knowledge of the gene networks could help in the
statistical analysis of gene expression data and in their biological
interpretation. Here we propose a method to integrate a priori the knowledge of
a gene network in the analysis of gene expression data. The approach is based
on the spectral decomposition of gene expression profiles with respect to the
eigenfunctions of the graph, resulting in an attenuation of the high-frequency
components of the expression profiles with respect to the topology of the
graph. We show how to derive unsupervised and supervised classification
algorithms of expression profiles, resulting in classifiers with biological
relevance. We applied the method to the analysis of a set of expression
profiles from irradiated and non-irradiated yeast strains. It performed at
least as well as the usual classification but provided much more biologically
relevant results and allowed a direct biological interpretation.
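The idea of attenuating high-frequency components with respect to a graph can be sketched as follows. This is a minimal sketch, assuming an unnormalized graph Laplacian and an exponential filter exp(-beta * lambda); the filter choice, the parameter `beta`, the function names, and the 4-gene path graph are illustrative assumptions, not details taken from the abstract.

```python
import numpy as np

def laplacian(adjacency):
    """Unnormalized graph Laplacian L = D - A."""
    adjacency = np.asarray(adjacency, dtype=float)
    return np.diag(adjacency.sum(axis=1)) - adjacency

def graph_lowpass(adjacency, profile, beta=1.0):
    """Smooth an expression profile over a gene network.

    The profile is expanded in the eigenvectors of the Laplacian,
    each coefficient is damped by exp(-beta * eigenvalue), and the
    result is transformed back to gene space.
    """
    eigvals, eigvecs = np.linalg.eigh(laplacian(adjacency))
    coeffs = eigvecs.T @ profile            # graph Fourier transform
    damped = np.exp(-beta * eigvals) * coeffs
    return eigvecs @ damped                 # inverse transform

def roughness(adjacency, profile):
    """Sum of squared differences across edges (x^T L x)."""
    return profile @ laplacian(adjacency) @ profile

# Hypothetical network: four genes connected in a chain.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
profile = np.array([2.0, 0.0, 2.0, 0.0])
smoothed = graph_lowpass(A, profile, beta=1.0)
```

The smoothed profile varies less between neighbouring genes than the original one, while the total expression is unchanged because the constant eigenvector has eigenvalue zero.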
Bayesian meta-analysis for identifying periodically expressed genes in fission yeast cell cycle
The effort to identify genes with periodic expression during the cell cycle
from genome-wide microarray time series data has been ongoing for a decade.
However, the lack of rigorous modeling of periodic expression as well as the
lack of a comprehensive model for integrating information across genes and
experiments has impaired the effort for the accurate identification of
periodically expressed genes. To address the problem, we introduce a Bayesian
model to integrate multiple independent microarray data sets from three recent
genome-wide cell cycle studies on fission yeast. A hierarchical model was used
for data integration. In order to facilitate an efficient Monte Carlo sampling
from the joint posterior distribution, we develop a novel Metropolis--Hastings
group move. A surprising finding from our integrated analysis is that more than
40% of the genes in fission yeast are significantly periodically expressed,
well above the 10--15% of genes reported in the current literature.
This calls for a reconsideration of the periodically expressed gene detection
problem.
Comment: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/)
at http://dx.doi.org/10.1214/09-AOAS300 by the Institute of
Mathematical Statistics (http://www.imstat.org)
Functional analysis and transcriptional output of the Göttingen minipig genome
In the past decade the Göttingen minipig has gained increasing recognition as an animal model in pharmaceutical and safety research because it recapitulates many aspects of human physiology and metabolism. Genome-based comparison of drug targets together with quantitative tissue expression analysis allows rational prediction of pharmacology and cross-reactivity of human drugs in animal models, thereby reducing drug attrition, which is an important challenge in the process of drug development. Here we present a new chromosome-level version of the Göttingen minipig genome together with a comparative transcriptional analysis of tissues with pharmaceutical relevance as a basis for translational research. We relied on mapping and assembly of WGS (whole-genome shotgun sequencing) derived reads to the reference genome of the Duroc pig and predict 19,228 human orthologous protein-coding genes. Genome-based prediction of the sequence of human drug targets enables the prediction of drug cross-reactivity based on conservation of binding sites. We further support the finding that the genome of Sus scrofa contains about ten times fewer pseudogenized genes than other vertebrates. Among the functional human orthologs of these minipig pseudogenes we found HEPN1, a putative tumor suppressor gene. The genomes of Sus scrofa, the Tibetan boar, the African Bushpig, and the Warthog show sequence conservation of all inactivating HEPN1 mutations, suggesting disruption before the evolutionary split of these pig species. We identify 133 Sus scrofa-specific, conserved long non-coding RNAs (lncRNAs) in the minipig genome and show that these transcripts are highly conserved in the African pigs and the Tibetan boar, suggesting functional significance. Using a new minipig-specific microarray we show high conservation of gene expression signatures in 13 tissues with biomedical relevance between humans and adult minipigs.
We underline this relationship for minipig and human liver, where we could demonstrate similar expression levels for most phase I drug-metabolizing enzymes. Higher expression levels and metabolic activities were found for FMO1, AKR/CRs and for phase II drug-metabolizing enzymes in minipig as compared to human. The variability of gene expression in equivalent human and minipig tissues is considerably higher in minipig organs, which is important for study design in case a human target belongs to this variable category in the minipig. The first analysis of gene expression in multiple tissues during development from young to adult shows that the majority of transcriptional programs are concluded four weeks after birth. This finding is in line with the advanced state of human postnatal organ development at comparative age categories and further supports the minipig as a model for pediatric drug safety studies. Genome-based assessment of sequence conservation combined with gene expression data in several tissues improves the translational value of the minipig for human drug development. The genome and gene expression data presented here are important resources for researchers using the minipig as a model for biomedical research or commercial breeding. The potential impact of our data for comparative genomics, translational research, and experimental medicine is discussed.
Gravitational waves from self-ordering scalar fields
Gravitational waves were copiously produced in the early Universe whenever
the processes taking place were sufficiently violent. The spectra of several of
these gravitational wave backgrounds on subhorizon scales have been extensively
studied in the literature. In this paper we analyze the shape and amplitude of
the gravitational wave spectrum on scales which are superhorizon at the time of
production. Such gravitational waves are expected from the self ordering of
randomly oriented scalar fields which can be present during a thermal phase
transition or during preheating after hybrid inflation. We find that, if the
gravitational wave source acts only during a small fraction of the Hubble time,
the gravitational wave spectrum at frequencies lower than the expansion rate at
the time of production behaves as with an
amplitude much too small to be observable by gravitational wave observatories
like LIGO, LISA or BBO. On the other hand, if the source is active for a much
longer time, until a given mode which is initially superhorizon (), enters the horizon, for , we find that the gravitational
wave energy density is frequency independent, i.e. scale invariant. Moreover,
its amplitude for a GUT scale scenario turns out to be within the range and
sensitivity of BBO and marginally detectable by LIGO and LISA. This new
gravitational wave background can compete with the one generated during
inflation, and distinguishing both may require extra information.
Comment: 21 pages, 2 figures, added discussion about numerical integration and
a new figure to illustrate the scale-invariance of the GW power spectrum,
conclusions unchanged
Semianalytical calculation of the zonal-flow oscillation frequency in stellarators
Because zonal flows can reduce turbulent transport in magnetized plasmas,
understanding their dynamics is an important problem in the fusion
programme. Since the pioneering work by Rosenbluth and Hinton in axisymmetric
tokamaks, it is known that studying the linear and collisionless relaxation of
zonal flow perturbations gives valuable information and physical insight.
Recently, the problem has been investigated in stellarators and it has been
found that in these devices the relaxation process exhibits a characteristic
feature: a damped oscillation. The frequency of this oscillation might be a
relevant parameter in the regulation of turbulent transport, and therefore its
efficient and accurate calculation is important. Although an analytical
expression can be derived for the frequency, its numerical evaluation is not
simple and has not been exploited systematically so far. Here, a numerical
method for its evaluation is considered, and the results are compared with
those obtained by calculating the frequency from gyrokinetic simulations. This
"semianalytical" approach for the determination of the zonal-flow frequency
proves to be accurate, and faster than the one based on gyrokinetic simulations.
Comment: 30 pages, 14 figures
Gene expression time delays & Turing pattern formation systems
The incorporation of time delays can greatly affect the behaviour of partial differential equations and dynamical systems. In addition, there is evidence that time delays in gene expression due to transcription and translation play an important role in the dynamics of cellular systems. In this paper, we investigate the effects of incorporating gene expression time delays into a one-dimensional putative reaction-diffusion pattern formation mechanism on both stationary domains and domains with spatially uniform exponential growth. While oscillatory behaviour is rare, we find that the time taken to initiate and stabilise patterns increases dramatically as the time delay is increased. In addition, we observe that on rapidly growing domains the time delay can induce a failure of the Turing instability which cannot be predicted by a naive linear analysis of the underlying equations about the homogeneous steady state. The dramatic lag in the induction of patterning, or even its complete absence on occasions, highlights the importance of considering explicit gene expression time delays in models for cellular reaction-diffusion patterning.