
    Model selection and efficiency testing for normalization of cDNA microarray data

    In this study we present two novel normalization schemes for cDNA microarrays. They are based on iterative local regression and optimization of model parameters by generalized cross-validation. Permutation tests assessing the efficiency of normalization demonstrated that the proposed schemes have an improved ability to remove systematic errors and to reduce variability in microarray data. The analysis also reveals that, without parameter optimization, local regression is frequently insufficient to remove systematic errors in microarray data.

    A novel neural network approach to cDNA microarray image segmentation

    Copyright © 2013 Elsevier. Microarray technology has become a great source of information for biologists to understand the workings of DNA, which is one of the most complex codes in nature. Microarray images typically contain several thousand small spots, each of which represents a different gene in the experiment. One of the key steps in extracting information from a microarray image is segmentation, whose aim is to identify which pixels within an image represent which gene. This task is greatly complicated by noise within the image and a wide degree of variation in the values of the pixels belonging to a typical spot. Many methods have been proposed in the past for the segmentation of microarray images. In this paper, a new method utilizing a series of artificial neural networks, which are based on multi-layer perceptron (MLP) and Kohonen networks, is proposed. The proposed method is applied to a set of real-world cDNA images. Quantitative comparisons between the proposed method and the commercial software GenePix® are carried out in terms of the peak signal-to-noise ratio (PSNR). The method is shown not only to deliver results comparable and even superior to existing techniques but also to have a faster run time. This work was funded in part by the National Natural Science Foundation of China under Grants 61174136 and 61104041, the Natural Science Foundation of Jiangsu Province of China under Grant BK2011598, the International Science and Technology Cooperation Project of China under Grant No. 2011DFA12910, the Engineering and Physical Sciences Research Council (EPSRC) of the U.K. under Grant GR/S27658/01, the Royal Society of the U.K., and the Alexander von Humboldt Foundation of Germany.
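The abstract above compares segmentations in terms of PSNR but does not say how the reference image is formed. As a minimal sketch, the standard PSNR definition can be written as follows; the function name `psnr`, the nested-list image representation, and the 16-bit default for `max_value` are assumptions made for illustration, not details from the paper.

```python
import math


def psnr(reference, test, max_value=65535.0):
    """Peak signal-to-noise ratio (in dB) between two equally sized images.

    Images are lists of rows of pixel values. max_value is the largest
    possible pixel value (assumed 16-bit here; use 255.0 for 8-bit images).
    """
    if len(reference) != len(test) or any(
        len(r) != len(t) for r, t in zip(reference, test)
    ):
        raise ValueError("images must have the same shape")

    n_pixels = sum(len(row) for row in reference)
    # Mean squared error over all pixels.
    mse = sum(
        (a - b) ** 2
        for ref_row, test_row in zip(reference, test)
        for a, b in zip(ref_row, test_row)
    ) / n_pixels

    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher PSNR means the test image is closer to the reference, so it is only meaningful when both segmentation results are scored against the same reference.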

    Analysis of the singular value decomposition as a tool for processing microarray expression data

    We give two informative derivations of a spectral algorithm for clustering and partitioning a bipartite graph. In the first case we begin with a discrete optimization problem that relaxes into a tractable continuous analogue. In the second case we use the power method to derive an iterative interpretation of the algorithm. Both versions reveal a natural approach for re-scaling the edge weights and help to explain the performance of the algorithm in the presence of outliers. Our motivation for this work is in the analysis of microarray data from bioinformatics, and we give some numerical results for a publicly available acute leukemia data set.
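The abstract above does not spell out the rescaling it derives. The following is a minimal sketch of one common form of spectral bipartite partitioning, assuming the edge weights are rescaled by the square roots of the row and column sums and that the second singular vector is found by power iteration with deflation. The function name `bipartite_spectral_partition` and the choice of start vector are hypothetical, and the sketch assumes a non-negative weight matrix with positive row and column sums and rank of at least two.

```python
import math


def bipartite_spectral_partition(W, iterations=200):
    """Split the rows and columns of a non-negative weight matrix into two groups.

    W[i][j] is the edge weight between row node i (e.g. a gene) and
    column node j (e.g. a sample). Returns (row_groups, col_groups) as 0/1 labels.
    """
    m, n = len(W), len(W[0])
    row_sum = [sum(row) for row in W]
    col_sum = [sum(W[i][j] for i in range(m)) for j in range(n)]

    # Rescaled weights: A[i][j] = W[i][j] / sqrt(row_sum[i] * col_sum[j]).
    A = [
        [W[i][j] / math.sqrt(row_sum[i] * col_sum[j]) for j in range(n)]
        for i in range(m)
    ]

    # The leading right singular vector of A is known in closed form
    # (singular value 1), so it can be projected out directly.
    total = sum(col_sum)
    v1 = [math.sqrt(c / total) for c in col_sum]

    def deflate(v):
        dot = sum(a * b for a, b in zip(v, v1))
        return [a - dot * b for a, b in zip(v, v1)]

    # Power iteration on A^T A, restricted to the complement of v1,
    # converges to the second right singular vector.
    v = deflate([float(j + 1) for j in range(n)])
    for _ in range(iterations):
        u = [sum(A[i][j] * v[j] for j in range(n)) for i in range(m)]
        v = deflate([sum(A[i][j] * u[i] for i in range(m)) for j in range(n)])
        norm = math.sqrt(sum(a * a for a in v))
        v = [a / norm for a in v]

    u = [sum(A[i][j] * v[j] for j in range(n)) for i in range(m)]

    # Undo the scaling and split each side by sign.
    row_groups = [int(u[i] / math.sqrt(row_sum[i]) >= 0) for i in range(m)]
    col_groups = [int(v[j] / math.sqrt(col_sum[j]) >= 0) for j in range(n)]
    return row_groups, col_groups
```

The 0/1 labels are arbitrary up to a swap, because the sign of a singular vector is not determined; only the grouping itself is meaningful. A start vector that happens to be orthogonal to the second singular vector would slow convergence, so a different start may be needed for some inputs.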

    Statistical monitoring of weak spots for improvement of normalization and ratio estimates in microarrays

    BACKGROUND: Several aspects of microarray data analysis are dependent on identification of genes expressed at or near the limits of detection. For example, regression-based normalization methods rely on the premise that most genes in compared samples are expressed at similar levels and therefore require accurate identification of nonexpressed genes (additive noise) so that they can be excluded from the normalization procedure. Moreover, key regulatory genes can maintain stringent control of a given response at low expression levels. If arbitrary cutoffs are used for distinguishing expressed from nonexpressed genes, some of these key regulatory genes may be unnecessarily excluded from the analysis. Unfortunately, no accurate method for differentiating additive noise from genes expressed at low levels is currently available. RESULTS: We developed a multistep procedure for analysis of mRNA expression data that robustly identifies the additive noise in a microarray experiment. This analysis is predicated on the fact that additive noise signals can be accurately identified by both distribution and statistical analysis. CONCLUSIONS: Identification of additive noise in this manner allows exclusion of noncorrelated weak signals from regression-based normalization of compared profiles, thus maximizing the accuracy of these methods. Moreover, genes expressed at very low levels can be clearly identified because their expression distribution is stable and distinguishable from the random pattern of additive noise.

    maigesPack: A Computational Environment for Microarray Data Analysis

    Microarray technology is still an important way to assess gene expression in molecular biology, mainly because it measures expression profiles for thousands of genes simultaneously, which makes it a good option for some studies focused on systems biology. One of its main problems is the complexity of the experimental procedure, which presents several sources of variability and hinders statistical modeling. So far, there is no standard protocol for the generation and evaluation of microarray data. To ease the analysis process, this paper presents an R package, named maigesPack, that helps with data organization. In addition, it makes the data analysis process more robust, reliable and reproducible. maigesPack also aggregates several data analysis procedures reported in the literature, for instance: cluster analysis, differential expression, supervised classifiers, relevance networks and functional classification of gene groups or gene networks.

    A robust two-way semi-linear model for normalization of cDNA microarray data

    BACKGROUND: Normalization is a basic step in microarray data analysis. A proper normalization procedure ensures that the intensity ratios provide meaningful measures of relative expression values. METHODS: We propose a robust semiparametric method in a two-way semi-linear model (TW-SLM) for normalization of cDNA microarray data. This method does not make the usual assumptions underlying some of the existing methods. For example, it does not assume that: (i) the percentage of differentially expressed genes is small; or (ii) the numbers of up- and down-regulated genes are about the same, as required in the LOWESS normalization method. We conduct simulation studies to evaluate the proposed method and use a real data set from a specially designed microarray experiment to compare the performance of the proposed method with that of the LOWESS normalization approach. RESULTS: The simulation results show that the proposed method performs better than the LOWESS normalization method in terms of mean square errors for estimated gene effects. The results of analysis of the real data set also show that the proposed method yields more consistent results between the direct and the indirect comparisons and can also detect more differentially expressed genes than the LOWESS method. CONCLUSIONS: Our simulation studies and the real data example indicate that the proposed robust TW-SLM method works at least as well as the LOWESS method and works better when the underlying assumptions for the LOWESS method are not satisfied. Therefore, it is a powerful alternative to the existing normalization methods.

    Robust Likelihood-Based Survival Modeling with Microarray Data

    Gene expression data can be associated with various clinical outcomes. In particular, these data can be of importance in discovering survival-associated genes for medical applications. As alternatives to traditional statistical methods, sophisticated methods and software programs have been developed to overcome the difficulty posed by the high dimensionality of microarray data. Nevertheless, new algorithms and software programs are needed that include practical functions, such as the discovery of multiple sets of survival-associated genes and the incorporation of risk factors, and that run in the R environment, with which many statisticians are familiar. For survival modeling with microarray data, we have developed a software program (called rbsurv) which can be used conveniently and interactively in the R environment. This program selects survival-associated genes based on the partial likelihood of the Cox model and separates training and validation sets of samples for robustness. It can discover multiple sets of genes by iterative forward selection rather than one large set of genes. It also allows adjustment for risk factors in microarray survival modeling. The rbsurv package can thus be used to conveniently discover survival-associated genes with microarray data.
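The abstract above says genes are selected on the basis of the Cox partial likelihood. The following is a minimal sketch of that quantity for a single covariate (for example, one gene's expression), assuming no tied event times. It is written in Python for illustration only and is not the rbsurv implementation, which is an R package; the function name `cox_partial_loglik` is hypothetical.

```python
import math


def cox_partial_loglik(beta, times, events, x):
    """Cox partial log-likelihood for a single covariate, assuming no tied event times.

    times  : follow-up time for each sample
    events : 1 if the event was observed, 0 if the sample was censored
    x      : covariate value for each sample (e.g. expression of one gene)
    """
    loglik = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored samples contribute only through risk sets
        # Risk set: all samples still under observation at time t_i.
        risk = sum(
            math.exp(beta * x[j]) for j, t_j in enumerate(times) if t_j >= t_i
        )
        loglik += beta * x[i] - math.log(risk)
    return loglik
```

A gene whose coefficient raises this value well above the value at `beta = 0` fits the survival times better than no gene at all, which is the kind of comparison a forward-selection procedure relies on.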

    Evaluation of normalization methods for microarray data

    BACKGROUND: Microarray technology allows the monitoring of expression levels for thousands of genes simultaneously. This novel technique helps us to understand gene regulation as well as gene-by-gene interactions more systematically. In microarray experiments, however, many undesirable systematic variations are observed. Even in replicated experiments, some variations are commonly observed. Normalization is the process of removing some sources of variation which affect the measured gene expression levels. Although a number of normalization methods have been proposed, it has been difficult to decide which methods perform best. Normalization plays an important role in the early stages of microarray data analysis, and the results of subsequent analyses are highly dependent on it. RESULTS: In this paper, we use the variability among the replicated slides to compare the performance of normalization methods. We also compare normalization methods with regard to bias and mean square error using simulated data. CONCLUSIONS: Our results show that intensity-dependent normalization often performs better than global normalization methods, and that linear and nonlinear normalization methods perform similarly. These conclusions are based on analysis of 36 cDNA microarrays of 3,840 genes obtained in an experiment to search for changes in gene expression profiles during neuronal differentiation of cortical stem cells. Simulation studies confirm our findings.
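The abstract above contrasts global and intensity-dependent normalization without giving formulas. The sketch below illustrates the distinction under common assumptions: two-channel intensities are converted to log-ratios M and mean log-intensities A, global normalization subtracts the median of M, and the linear intensity-dependent variant subtracts a least-squares line fitted to M as a function of A. The function names are hypothetical, and the paper's exact procedures may differ.

```python
import math
from statistics import mean, median


def ma_values(red, green):
    """Convert two-channel intensities to log-ratios M and mean log-intensities A."""
    m = [math.log2(r / g) for r, g in zip(red, green)]
    a = [0.5 * math.log2(r * g) for r, g in zip(red, green)]
    return m, a


def global_normalize(m):
    """Global normalization: shift every log-ratio by the same constant (the median)."""
    center = median(m)
    return [value - center for value in m]


def linear_intensity_normalize(m, a):
    """Intensity-dependent normalization: subtract a least-squares line M ~ A."""
    a_bar, m_bar = mean(a), mean(m)
    slope = sum((x - a_bar) * (y - m_bar) for x, y in zip(a, m)) / sum(
        (x - a_bar) ** 2 for x in a
    )
    intercept = m_bar - slope * a_bar
    return [y - (intercept + slope * x) for x, y in zip(a, m)]
```

When the dye bias grows with intensity, a single global shift leaves a trend in the normalized log-ratios, whereas the fitted line removes it. A nonlinear variant would replace the line with a local regression curve.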

    Whole-transciptome analysis of [psi+] budding yeast via cDNA microarrays

    Introduction: Prions of yeast present a novel analytical challenge in terms of both initial characterization and in vitro manipulation as models for human disease research. Presently, few robust analysis strategies have been successfully implemented which enable the efficient study of prion behavior in vivo. This study sought to evaluate the utilization of conventional dual-channel cDNA microarrays for the surveillance of transcriptomic regulation patterns by the [PSI+] yeast prion relative to an identical prion-deficient yeast variant, [psi-]. Methods: A data analysis and normalization workflow strategy was developed and applied to cDNA array images, yielding quality-regulated expression ratios for a subset of genes exhibiting statistical congruence across multiple experimental repetitions and nested hybridization events. The significant gene list was analyzed using classical analytical approaches including several clustering-based methods and singular value decomposition. To add biological meaning to the differential expression data in hand, functional annotation using the Gene Ontology as well as several pathway-mapping approaches was conducted. Finally, the expression patterns observed were queried against all publicly curated microarray data performed using S. cerevisiae in order to discover similar expression behavior across a vast array of experimental conditions. Results: These data collectively implicate a low level of overall genomic regulation as a result of the [PSI+] state, where the maximum statistically significant degree of differential expression was less than ±1 Log2(FC) in all cases. Notwithstanding, the [PSI+] differential expression was localized to several specific classes of structural elements and cellular functions, implying that under homeostatic conditions significant up- or down-regulation is likely unnecessary but possible in those specific systems if environmental conditions warrant it. As a result of these findings, additional work pertaining to this system should include controlled insult to both yeast variants of differing environmental properties to promote a potential [PSI+] regulatory response, coupled with co-surveillance of these conditions using transcriptomic and proteomic analysis methodologies.