4,143 research outputs found

    The Wavelet-Based Cluster Analysis for Temporal Gene Expression Data

    A variety of high-throughput methods have made it possible to generate detailed temporal expression data for a single gene or large numbers of genes. Common methods for analysis of these large data sets can be problematic. One challenge is the comparison of temporal expression data obtained from different growth conditions where the patterns of expression may be shifted in time. We propose the use of wavelet analysis to transform the data obtained under different growth conditions to permit comparison of expression patterns from experiments that have time shifts or delays. We demonstrate this approach using detailed temporal data for a single bacterial gene obtained under 72 different growth conditions. This general strategy can be applied in the analysis of data sets of thousands of genes under different conditions.

    Phenotypic Signatures Arising from Unbalanced Bacterial Growth

    Fluctuations in the growth rate of a bacterial culture during unbalanced growth are generally considered undesirable in quantitative studies of bacterial physiology. Under well-controlled experimental conditions, however, these fluctuations are not random but instead reflect the interplay between intra-cellular networks underlying bacterial growth and the growth environment. Therefore, these fluctuations could be considered quantitative phenotypes of the bacteria under a specific growth condition. Here, we present a method to identify “phenotypic signatures” by time-frequency analysis of unbalanced growth curves measured with high temporal resolution. The signatures are then applied to differentiate amongst different bacterial strains or the same strain under different growth conditions, and to identify the essential architecture of the gene network underlying the observed growth dynamics. Our method has implications both for the basic understanding of bacterial physiology and for the classification of bacterial strains.

    Clustering Time Series from Mixture Polynomial Models with Discretised Data

    Clustering time series is an active research area with applications in many fields. One common feature of time series is the likely presence of outliers. These uncharacteristic data can significantly affect the quality of the clusters formed. This paper evaluates a method of overcoming the detrimental effects of outliers. We describe some of the alternative approaches to clustering time series, then specify a particular class of model for experimentation with k-means clustering and a correlation-based distance metric. For data derived from this class of model we demonstrate that discretising the data into a binary series of above and below the median improves the clustering when the data has outliers. More specifically, we show that, firstly, discretisation does not significantly affect the accuracy of the clusters when there are no outliers and, secondly, it significantly increases the accuracy in the presence of outliers, even when the probability of an outlier is very low.
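    The median discretisation described in this abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy linear series, and the single injected outlier are all assumptions made for illustration, and the distance used here is simply one minus the Pearson correlation.

```python
import numpy as np


def discretise_about_median(series):
    """Return 1 where a value lies above the series median, else 0."""
    series = np.asarray(series, dtype=float)
    return (series > np.median(series)).astype(int)


def correlation_distance(a, b):
    """One minus the Pearson correlation of two equal-length series."""
    return 1.0 - np.corrcoef(a, b)[0, 1]


# Hypothetical toy data: a linear trend and a copy with one large outlier.
t = np.arange(20)
clean = 0.5 * t
noisy = clean.copy()
noisy[3] = 1000.0  # injected outlier (assumed value, for illustration only)

# Distance on the raw values is dominated by the single outlier.
raw_distance = correlation_distance(clean, noisy)

# Distance on the binary above/below-median series is far less affected.
binary_distance = correlation_distance(
    discretise_about_median(clean), discretise_about_median(noisy)
)

print(f"raw: {raw_distance:.3f}, discretised: {binary_distance:.3f}")
```

    In this toy case the discretised series differ in only two of twenty positions, so the two series stay close under the correlation distance even though the raw values do not.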

    BMICA-independent component analysis based on B-spline mutual information estimator

    The information-theoretic concept of mutual information provides a general framework for evaluating dependencies between variables. However, its estimation using B-Splines has not previously been used as the basis of an approach to Independent Component Analysis. In this paper we present a B-Spline estimator for mutual information to find the independent components in mixed signals. Tested on electroencephalography (EEG) signals, the resulting BMICA (B-Spline Mutual Information Independent Component Analysis) performs better than the standard Independent Component Analysis algorithms FastICA, JADE, SOBI and EFICA in similar simulations. BMICA was also found to be more reliable than the renowned FastICA.

    DYNAMIC CLUSTERING OF CELL-CYCLE MICROARRAY DATA

    The cell cycle is a crucial series of events that are repeated over time, allowing the cell to grow, duplicate, and split. Cell-cycle systems play an important role in cancer and other biological processes. Using gene expression data gained from microarray technology, it is possible to group or cluster genes that are involved in the cell cycle for the purpose of exploring their functional co-regulation. Typically, the goal of clustering methods as applied to gene expression data is to place genes with similar expression patterns or profiles into the same group or cluster for the purpose of inferring the function of unknown genes that cluster with genes of known function. Since a gene may be involved in more than one biological process at any one time, co-regulated genes may not have visually similar expression patterns. Furthermore, the time duration for genes in a biological process may differ, and the number of co-regulated patterns or biological processes shared by two genes may be unknown. Based on this reasoning, biologically realistic gene clusters gained from gene co-regulation may not be accurately identified using traditional clustering methods. By taking advantage of techniques and theories from signal processing, it is possible to cluster cell-cycle gene expression profiles using a dynamic perspective under the assumption that different spectral frequencies characterize different biological processes.

    Genome-Wide Analysis of Histone Modification Enrichments Induced by Marek's Disease Virus in Inbred Chicken Lines

    Covalent histone modifications constitute a complex network of transcriptional regulation involved in diverse biological processes ranging from stem cell differentiation to immune response. The advent of modern sequencing technologies enables one to query the locations of histone modifications across the genome in an efficient manner. However, inherent biases in the technology and diverse enrichment patterns complicate data analysis. Marek's disease (MD) is an acute, lymphoma-inducing disease of chickens with disease outcomes affected by multiple host and environmental factors. Inbred chicken lines 6₃ and 7₂ share the same major histocompatibility complex haplotype, but have contrasting responses to MD. This dissertation presents novel methods for analysis of genome-wide histone modification data and application of new and existing methods to the investigation of epigenetic effects of MD on these lines. First, we present WaveSeq, a novel algorithm for detection of significant enrichments in ChIP-Seq data. WaveSeq implements a distribution-free approach by combining the continuous wavelet transform with Monte Carlo sampling techniques for effective peak detection. WaveSeq outperformed existing tools, particularly for diffuse histone modification peaks, demonstrating that restrictive distributional assumptions are not necessary for accurate ChIP-Seq peak detection. Second, we investigated latent MD in thymus tissues by profiling H3K4me3 and H3K27me3 in infected and control birds from lines 6₃ and 7₂. Several genes associated with MD, e.g. MX1 and CTLA-4, along with those linked with human cancers, showed line-specific and condition-specific enrichments. One of the first studies of histone modifications in chickens, our work demonstrated that MD induced widespread epigenetic variations. Finally, we analyzed the temporal evolution of histone modifications at distinct phases of MD progression in the bursa of Fabricius. Genes involved in several important pathways, e.g. apoptosis and MAPK signaling, and various immune-related miRNAs showed differential histone modifications in the promoter region. Our results indicated heightened inflammation in the susceptible line during early cytolytic MD, while resistant birds showed recuperative symptoms during early MD and epigenetic silencing during latent infection. Thus, although further elucidation of underlying mechanisms is necessary, this work provided the first definitive evidence of the epigenetic effects of MD.