66 research outputs found

    Analyzing Exon Structure with PCA and ICA of Short-Time Fourier Transform

    Abstract We use principal component analysis (PCA

    A permutation-based multiple testing method for time-course microarray experiments

    BACKGROUND: Time-course microarray experiments are widely used to study the temporal profiles of gene expression. Storey et al. (2005) developed a method for analyzing time-course microarray studies that can be applied to discovering genes whose expression trajectories change over time within a single biological group, or those that follow different time trajectories among multiple groups. They estimated the expression trajectories of each gene using natural cubic splines under the null (no time-course) and alternative (time-course) hypotheses, and used a goodness-of-fit test statistic to quantify the discrepancy. The null distribution of the statistic was approximated through a bootstrap method. Gene expression levels in microarray data are often correlated in complex ways. Accurate type I error control that adjusts for multiple testing requires the joint null distribution of the test statistics for a large number of genes. For this purpose, permutation methods have been widely used because of their computational ease and intuitive interpretation. RESULTS: In this paper, we propose a permutation-based multiple testing procedure based on the test statistic used by Storey et al. (2005). We also propose an efficient computation algorithm. Extensive simulations are conducted to investigate the performance of the permutation-based multiple testing procedure. The application of the proposed method is illustrated using the Caenorhabditis elegans dauer developmental data. CONCLUSION: Our method is computationally efficient and applicable for identifying genes whose expression levels are time-dependent in a single biological group and for identifying the genes for which the time profile depends on the group in a multi-group setting.
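The single-group case described in the abstract above can be illustrated with a minimal sketch. It is not the authors' algorithm: a cubic polynomial basis stands in for the natural cubic splines, the statistic is the relative drop in residual sum of squares between the null (flat) and alternative (time-dependent) fits, and multiplicity is handled with a single-step max-statistic adjustment. The function names `fit_stat` and `maxt_adjusted_pvalues` are hypothetical.

```python
import numpy as np

def fit_stat(expr, design):
    """Per-gene goodness-of-fit statistic (RSS0 - RSS1) / RSS1.

    expr is a genes x samples matrix; design is a samples x p matrix
    for the alternative (time-dependent) model, including an intercept.
    """
    # Null model: a flat trajectory, i.e. the gene's mean expression.
    centered = expr - expr.mean(axis=1, keepdims=True)
    rss0 = np.sum(centered ** 2, axis=1)
    # Alternative model: least-squares fit on the time basis, all genes at once.
    coef, *_ = np.linalg.lstsq(design, expr.T, rcond=None)
    resid = expr.T - design @ coef
    rss1 = np.sum(resid ** 2, axis=0)
    return (rss0 - rss1) / rss1

def maxt_adjusted_pvalues(expr, times, degree=3, n_perm=500, seed=0):
    """Single-step max-statistic adjusted p-values from sample permutations."""
    rng = np.random.default_rng(seed)
    # Assumption: polynomial basis in standardized time, not natural splines.
    t = (times - times.mean()) / times.std()
    design = np.vander(t, degree + 1)
    observed = fit_stat(expr, design)
    exceed = np.zeros(expr.shape[0])
    for _ in range(n_perm):
        # Permute samples jointly for all genes so that the correlation
        # among genes is preserved in the permutation null distribution.
        perm = rng.permutation(expr.shape[1])
        null_max = fit_stat(expr[:, perm], design).max()
        exceed += null_max >= observed
    return (exceed + 1) / (n_perm + 1)
```

Permuting whole sample columns, rather than each gene separately, is what lets the permutation distribution reflect the joint null distribution the abstract refers to.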

    Prediction of a time-to-event trait using genome wide SNP data

    BACKGROUND: A common objective of many high-throughput genome projects is to discover genomic markers associated with traits and to develop statistical models that predict the traits of future patients from their marker values. RESULTS: In this paper, we present a prediction method for time-to-event traits using genome-wide single-nucleotide polymorphisms (SNPs). We also propose a MaxTest for the association between a time-to-event trait and a SNP that accounts for the SNP's possible genetic models. The proposed MaxTest can help screen out nonprognostic SNPs and identify the genetic models of prognostic SNPs. The performance of the proposed method is evaluated through simulations. CONCLUSIONS: In conjunction with the MaxTest, the proposed method provides more parsimonious prediction models but includes more prognostic SNPs than some naive prediction methods. The proposed method is demonstrated with real GWAS data.

    Multiple testing for gene sets from microarray experiments

    BACKGROUND: A key objective in many microarray association studies is the identification of individual genes associated with clinical outcome. It is often of additional interest to identify sets of genes, known a priori to have similar biologic function, that are associated with the outcome. RESULTS: In this paper, we propose a general permutation-based framework for gene set testing that controls the false discovery rate (FDR) while accounting for the dependency among the genes within and across gene sets. The application of the proposed method is demonstrated using three public microarray data sets. The performance of our proposed method is contrasted with that of two existing methods, Gene Set Enrichment Analysis (GSEA) and Gene Set Analysis (GSA). CONCLUSIONS: Our simulations show that the proposed method controls the FDR at the desired level. Through simulations and case studies, we observe that our method performs better than GSEA and GSA, especially when the number of prognostic gene sets is large.

    A copula method for modeling directional dependence of genes

    BACKGROUND: Genes interact with each other as basic building blocks of life, forming a complicated network. The relationship between groups of genes with different functions can be represented as gene networks. With the deposition of huge microarray data sets in public domains, the study of gene networks is now possible. In recent years, there has been an increasing interest in the reconstruction of gene networks from gene expression data. Recent work includes linear models, Boolean network models, and Bayesian networks. Among them, Bayesian networks seem to be the most effective in constructing gene networks. A major problem with the Bayesian network approach is the excessive computational time. This problem is due to the iterative nature of the method, which requires a large search space. Since fitting a model using copulas does not require iterations, elicitation of priors, or complicated calculations of posterior distributions, the need for extensive search spaces can be eliminated, leading to manageable computational effort. The Bayesian network approach produces a discrete expression of conditional probabilities. Discreteness is not required in the copula approach, which uses a uniform representation of continuous random variables. Our method is able to overcome a limitation of the Bayesian network method for gene-gene interaction, i.e. information loss due to binary transformation. RESULTS: We analyzed the gene interactions for two gene data sets (one group is eight histone genes and the other group is 19 genes which include DNA polymerases, DNA helicase, type B cyclin genes, DNA primases, radiation sensitive genes, repair-related genes, replication protein A encoding gene, DNA replication initiation factor, securin gene, nucleosome assembly factor, and a subunit of the cohesin complex) by adopting a measure of directional dependence based on a copula function. We have compared our results with those from other methods in the literature. Although microarray results show a transcriptional co-regulation pattern and do not imply that the gene products are physically interactive, this tight genetic connection may suggest that each gene product has either direct or indirect connections with the other gene products. Indeed, a recent comprehensive analysis of a protein interaction map revealed that those histone genes are physically connected with each other, supporting the results obtained by our method. CONCLUSION: The results illustrate that our method can be an alternative to Bayesian networks in modeling gene interactions. One advantage of our approach is that dependence between genes is not assumed to be linear. Another advantage is that our approach can detect directional dependence. We expect that our study may help to design artificial drug candidates, which can block or activate biologically meaningful pathways. Moreover, our copula approach can be extended to investigate the effects of local environments on protein-protein interactions. The copula mutual information approach may help in proposing a new variant of ARACNE (Algorithm for the Reconstruction of Accurate Cellular Networks), an algorithm for the reconstruction of gene regulatory networks.

    SNP Selection in Genome-Wide Association Studies via Penalized Support Vector Machine with MAX Test

    One of the main objectives of a genome-wide association study (GWAS) is to develop a prediction model for a binary clinical outcome using single-nucleotide polymorphisms (SNPs), which can be used for diagnostic and prognostic purposes and for a better understanding of the relationship between the disease and the SNPs. Penalized support vector machine (SVM) methods have been widely used toward this end. However, since investigators often ignore the genetic models of SNPs, the final model loses efficiency in predicting the clinical outcome. To overcome this problem, we propose a two-stage method in which the genetic model of each SNP is identified using the MAX test and a prediction model is then fitted using a penalized SVM method. We apply the proposed method to various penalized SVMs and compare the performance of SVMs using various penalty functions. The results from simulations and real GWAS data analysis show that the proposed method performs better than prediction methods that ignore the genetic models, in terms of prediction power and selectivity.
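The first stage described in the abstract above, choosing a genetic model for each SNP with a MAX test, can be sketched as follows. The abstract does not give the exact statistic, so this sketch assumes a common form: a trend statistic (the square root of the sample size times the Pearson correlation between the coded genotype and the binary outcome) is computed under recessive, additive, and dominant codings, the largest absolute value is taken, and its p-value is obtained by permuting the outcome. The names `MODEL_SCORES`, `trend_z`, and `max_test` are hypothetical.

```python
import numpy as np

# Assumed codings of minor-allele counts 0, 1, 2 under three genetic models.
MODEL_SCORES = {
    "recessive": np.array([0.0, 0.0, 1.0]),
    "additive": np.array([0.0, 0.5, 1.0]),
    "dominant": np.array([0.0, 1.0, 1.0]),
}

def trend_z(genotype, outcome, scores):
    """Trend statistic: sqrt(n) times the correlation of score and outcome."""
    x = scores[genotype]              # genotype must hold integers 0, 1, 2
    xc = x - x.mean()
    yc = outcome - outcome.mean()
    denom = np.sqrt(np.sum(xc ** 2) * np.sum(yc ** 2))
    if denom == 0.0:
        return 0.0                    # no variation: no evidence of association
    return np.sqrt(len(x)) * np.sum(xc * yc) / denom

def max_test(genotype, outcome, n_perm=1000, seed=0):
    """Return (best-fitting model, MAX statistic, permutation p-value)."""
    rng = np.random.default_rng(seed)

    def model_stats(y):
        return {m: abs(trend_z(genotype, y, s)) for m, s in MODEL_SCORES.items()}

    observed = model_stats(outcome)
    best_model = max(observed, key=observed.get)
    t_obs = observed[best_model]
    exceed = 0
    for _ in range(n_perm):
        # Permuting the outcome breaks any genotype-outcome association.
        permuted = model_stats(rng.permutation(outcome))
        if max(permuted.values()) >= t_obs:
            exceed += 1
    return best_model, t_obs, (exceed + 1) / (n_perm + 1)
```

In a two-stage pipeline of the kind the abstract describes, the selected model would determine how each SNP is coded before it enters the penalized SVM.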

    Robust test method for time-course microarray experiments

    BACKGROUND: In a time-course microarray experiment, the expression level for each gene is observed across a number of time points in order to characterize the temporal trajectories of the gene-expression profiles. For many of these experiments, the scientific aim is the identification of genes for which the trajectories depend on an experimental or phenotypic factor. There is an extensive recent body of literature on statistical methodology for addressing this analytical problem. Most of the existing methods are based on estimating the time-course trajectories using parametric or non-parametric mean regression methods. The sensitivity of these regression methods to outliers, an issue that is well documented in the statistical literature, should be of concern when analyzing microarray data. RESULTS: In this paper, we propose a robust testing method for identifying genes whose expression time profiles depend on a factor. Furthermore, we propose a multiple testing procedure to adjust for multiplicity. CONCLUSIONS: Through an extensive simulation study, we will illustrate the performance of our method. Finally, we will report the results from applying our method to a case study and discuss potential extensions.

    Clinical Outcomes and Adverse Events of Gastric Endoscopic Submucosal Dissection of the Mid to Upper Stomach under General Anesthesia and Monitored Anesthetic Care

    Background/Aims Endoscopic submucosal dissection (ESD) of gastric tumors in the mid-to-upper stomach is a technically challenging procedure. This study compared the therapeutic outcomes and adverse events of ESD of tumors in the mid-to-upper stomach performed under general anesthesia (GA) or monitored anesthesia care (MAC). Methods Between 2012 and 2018, 674 patients underwent ESD for gastric tumors in the midbody, high body, fundus, or cardia (100 patients received GA; 574 received MAC). The outcomes of the propensity score (PS)-matched (1:1) patients receiving either GA or MAC were analyzed. Results The PS matching identified 94 patients who received GA and 94 patients who received MAC. Both groups showed high rates of en bloc resection (GA, 95.7%; MAC, 97.9%; p=0.68) and complete resection (GA, 81.9%; MAC, 84.0%; p=0.14). There were no significant differences between the rates of adverse events (GA, 16.0%; MAC, 8.5%; p=0.18) in the anesthetic groups. Logistic regression analysis indicated that the method of anesthesia did not affect the rates of complete resection or adverse events. Conclusions ESD of tumors in the mid-to-upper stomach at our high-volume center had good outcomes, regardless of the method of anesthesia. Our results demonstrate no differences between the efficacy and safety of ESD performed under MAC and GA.

    Sample size calculation for microarray experiments with blocked one-way design

    BACKGROUND: One of the main objectives of microarray analysis is to identify differentially expressed genes for different types of cells or treatments. Many statistical methods have been proposed to assess the treatment effects in microarray experiments. RESULTS: In this paper, we consider discovery of the genes that are differentially expressed among K (> 2) treatments when each set of K arrays forms a block. In this case, the array data among the K treatments tend to be correlated because of the block effect. We propose to use the blocked one-way ANOVA F-statistic to test whether each gene is differentially expressed among the K treatments. The marginal p-values are calculated using a permutation method that accounts for the block effect, and the multiplicity of the testing procedure is adjusted for by controlling the false discovery rate (FDR). We propose a sample size calculation method for microarray experiments with a blocked one-way design. With the FDR level and the effect sizes of genes specified, our formula provides a sample size for a given number of true discoveries. CONCLUSION: The calculated sample size is shown via simulations to provide an accurate number of true discoveries while controlling the FDR at the desired level.
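The test described in the abstract above can be sketched for a single gene. This is an illustration under stated assumptions, not the authors' implementation: the data are a blocks x treatments matrix with one observation per cell, the statistic is the standard randomized-block ANOVA F-ratio, and the block effect is respected by permuting treatment labels only within each block. The function names `blocked_f_stat` and `permutation_pvalue` are hypothetical, and the sample size formula itself is not reproduced here.

```python
import numpy as np

def blocked_f_stat(y):
    """Blocked one-way ANOVA F-statistic for one gene.

    y has shape (n_blocks, K): one row per block, one column per treatment.
    """
    n, k = y.shape
    grand = y.mean()
    treat_means = y.mean(axis=0)
    block_means = y.mean(axis=1)
    # Between-treatment sum of squares.
    ss_treat = n * np.sum((treat_means - grand) ** 2)
    # Residual after removing both treatment and block effects.
    resid = y - treat_means[None, :] - block_means[:, None] + grand
    ss_err = np.sum(resid ** 2)
    return (ss_treat / (k - 1)) / (ss_err / ((n - 1) * (k - 1)))

def permutation_pvalue(y, n_perm=1000, seed=0):
    """Marginal p-value from within-block permutations of treatment labels."""
    rng = np.random.default_rng(seed)
    observed = blocked_f_stat(y)
    exceed = 0
    for _ in range(n_perm):
        # Generator.permuted shuffles each row (block) independently,
        # so observations never move between blocks.
        permuted = rng.permuted(y, axis=1)
        if blocked_f_stat(permuted) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)
```

For a full analysis the marginal p-values of all genes would then go through an FDR-controlling procedure, which the abstract mentions but this sketch leaves out.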