249 research outputs found

    Feature Selection for Predicting Tumor Metastases in Microarray Experiments using Paired Design

    Among the major issues in gene expression profile classification, feature selection is an important and necessary step in creating good classification rules, given the high dimensionality of microarray data. Although different feature selection methods have been reported, no method has been proposed specifically for paired microarray experiments. In this paper, we introduce a simple procedure based on a modified t-statistic for feature selection in microarray experiments using the popular matched case-control design, and apply it to our recent study on tumor metastasis in a low-malignant group of breast cancer patients to select the genes that best predict metastases. Gene or feature selection is optimized by thresholding in a leave-one-pair-out cross-validation. Model comparison through empirical application shows that our method achieves improved efficiency with high sensitivity and specificity.
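
    The thresholding idea in this abstract can be illustrated with a minimal sketch. The abstract does not define the modified t-statistic, so the code below substitutes the ordinary paired t-statistic, and the sign-vote rule used to score the held-out pair is a hypothetical classifier chosen only to make the example self-contained. All function names are assumptions, not the authors' implementation.

```python
import numpy as np

def paired_t(diff):
    """Ordinary paired t-statistic per gene (a stand-in for the paper's
    modified statistic). diff has shape (n_pairs, n_genes)."""
    n = diff.shape[0]
    sd = diff.std(axis=0, ddof=1)
    return diff.mean(axis=0) / (sd / np.sqrt(n) + 1e-12)

def select_genes(diff, threshold):
    """Indices of genes whose absolute paired t exceeds the threshold."""
    return np.flatnonzero(np.abs(paired_t(diff)) > threshold)

def lopo_accuracy(cases, controls, threshold):
    """Leave-one-pair-out accuracy of a simple sign-vote rule (hypothetical)."""
    n = cases.shape[0]
    correct = 0
    for i in range(n):
        keep = np.arange(n) != i
        # Genes are re-selected on the training pairs only.
        t = paired_t(cases[keep] - controls[keep])
        genes = np.flatnonzero(np.abs(t) > threshold)
        if genes.size == 0:
            continue  # nothing selected: count the pair as misclassified
        # The held-out pair is correct if its case-minus-control difference
        # points the same way as the training statistics.
        held_out = cases[i, genes] - controls[i, genes]
        score = np.sum(np.sign(t[genes]) * held_out)
        correct += int(score > 0)
    return correct / n

def choose_threshold(cases, controls, grid):
    """Return the threshold in grid with the best leave-one-pair-out accuracy."""
    accs = [lopo_accuracy(cases, controls, th) for th in grid]
    return grid[int(np.argmax(accs))], accs
```

    Re-selecting genes inside each fold, rather than once on all pairs, keeps the held-out pair from influencing the gene list it is later scored on.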

    Gene Expression Meta-Analysis identifies Cytokine Pathways and 5q Aberrations involved in Metastasis of ERBB2 Amplified and Basal Breast Cancer

    Background Breast tumors have been described by molecular subtypes characterized by pervasively different gene expression profiles. The subtypes are associated with different clinical parameters and origin of precursor cells. However, the biological pathways and chromosomal aberrations that differ between the subgroups are less well characterized. The molecular subtypes are associated with different risk of metastatic recurrence of the disease. Nevertheless, the performance of these overall patterns in predicting outcome is far from optimal, suggesting that biological mechanisms that extend beyond the subgroups impact metastasis. Results We have scrutinized publicly available gene expression datasets and identified molecular subtypes in 1,394 breast tumors with outcome data. By analysis of chromosomal regions and pathways using “Gene set enrichment analysis” followed by a meta-analysis, we identified comprehensive mechanistic differences between the subgroups. Furthermore, the same approach was used to investigate mechanisms related to metastasis within the subgroups. A striking finding is that the molecular subtypes account for the majority of biological mechanisms associated with metastasis. However, some mechanisms, aside from the subtypes, were identified in a training set of 1,239 tumors and confirmed by survival analysis in two independent validation datasets from the same type of platform and consisting of very comparable node-negative patients who did not receive adjuvant medical therapy. The results show that high expression of 5q14 genes and low levels of TNFR2 pathway genes were associated with poor survival in basal-like cancers. Furthermore, low expression of 5q33 genes and interleukin-12 pathway genes was associated with poor outcome exclusively in ERBB2-like tumors. Conclusion The identified regions, genes, and pathways may be potential drug targets in future individualized treatment strategies.

    A Combinatory Approach for Selecting Prognostic Genes in Microarray Studies of Tumour Survivals

    Unlike significance analysis of gene expression, which looks for differentially regulated genes, feature selection in microarray-based prognostic gene expression analysis aims to find a subset of marker genes that are not only differentially expressed but also informative for prediction. Unfortunately, feature selection in the microarray literature is dominated by the simple heuristic univariate gene filter paradigm, which selects differentially expressed genes according to their statistical significance. We introduce a combinatory feature selection strategy that integrates differential gene expression analysis with the Gram-Schmidt process to identify prognostic genes that are both statistically significant and highly informative for predicting tumour survival outcomes. Empirical application to leukemia and ovarian cancer survival data, through within- and cross-study validation, shows that the feature space can be largely reduced while achieving improved testing performance.
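
    A rough sketch of how a univariate filter might be combined with Gram-Schmidt ranking is given below. It is an illustration under simplifying assumptions, not the authors' procedure: the outcome is treated as a plain continuous vector rather than censored survival time, the filter ranks genes by absolute Pearson correlation instead of a differential expression test, and all function names are hypothetical.

```python
import numpy as np

def univariate_filter(X, y, k):
    """Indices of the k features with the largest absolute Pearson
    correlation with y (a stand-in for a differential expression filter)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)
    corr = np.abs(Xc.T @ yc) / np.where(denom > 0, denom, np.inf)
    return np.argsort(-corr)[:k]

def gram_schmidt_rank(X, y, n_select, tol=1e-8):
    """Forward selection by Gram-Schmidt orthogonalization: repeatedly pick
    the feature best aligned with the residual outcome, then project that
    direction out of the outcome and of all remaining features."""
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)
    y = np.asarray(y, dtype=float)
    y = y - y.mean()
    remaining = list(range(X.shape[1]))
    chosen = []
    for _ in range(min(n_select, len(remaining))):
        y_norm = np.linalg.norm(y)
        if y_norm < tol:
            break  # outcome already fully explained
        cols = X[:, remaining]
        norms = np.linalg.norm(cols, axis=0)
        # Squared cosine of the angle between each feature and the residual.
        cos2 = (cols.T @ y) ** 2 / (np.maximum(norms, tol) ** 2 * y_norm ** 2)
        cos2[norms <= tol] = 0.0  # ignore features that are already redundant
        pos = int(np.argmax(cos2))
        if cos2[pos] <= 0:
            break
        best = remaining.pop(pos)
        chosen.append(best)
        q = X[:, best] / np.linalg.norm(X[:, best])
        y = y - (q @ y) * q
        X[:, remaining] -= np.outer(q, q @ X[:, remaining])
    return chosen

def combinatory_select(X, y, k_filter, n_select):
    """Filter first, then rank the survivors with Gram-Schmidt."""
    kept = univariate_filter(X, y, k_filter)
    order = gram_schmidt_rank(X[:, kept], y, n_select)
    return [int(kept[j]) for j in order]
```

    The orthogonalization step is what separates this from a pure univariate ranking: a gene that is strongly correlated with the outcome only because it duplicates an already selected gene loses its score once that gene's direction has been projected out.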

    Retrospective analysis of main and interaction effects in genetic association studies of human complex traits

    Background The etiology of multifactorial human diseases involves complex interactions between numerous environmental factors and alleles of many genes. Efficient statistical tools are needed to identify the genetic and environmental variants that affect the risk of disease development. This paper introduces a retrospective polytomous logistic regression model to measure both the main and interaction effects in genetic association studies of human discrete and continuous complex traits. In this model, combinations of genotypes at two interacting loci, or of environmental exposure and genotypes at one locus, are treated as nominal outcomes whose proportions are modeled as a function of the disease trait, assigning both main and interaction effects and with no assumption of normality in the trait distribution. Performance of our method in detecting interaction effects is compared with that of the case-only model. Results Results from our simulation study indicate that our retrospective model exhibits high power in capturing even relatively small effects with reasonable sample sizes. Application of our method to data from an association study on the catalase -262C/T promoter polymorphism and aging phenotypes detected significant main and interaction effects for age-group and allele T on individuals' cognitive functioning and produced consistent results in estimating the interaction effect as compared with the popular case-only model. Conclusion The retrospective polytomous logistic regression model can be used as a convenient tool for assessing both main and interaction effects in genetic association studies of human multifactorial diseases involving genetic and non-genetic factors as well as categorical or continuous traits.

    Efficient Sample Tracking With OpenLabFramework

    The advance of new technologies in biomedical research has led to a dramatic growth in experimental throughput. Projects therefore steadily grow in size and involve a larger number of researchers. The spreadsheets traditionally used are thus no longer suitable for keeping track of the vast numbers of samples created and need to be replaced with state-of-the-art laboratory information management systems. Such systems have been developed in large numbers, but they are often limited to specific research domains and types of data. One domain so far neglected is the management of libraries of vector clones and genetically engineered cell lines. OpenLabFramework is a newly developed web application for sample tracking, designed specifically to fill this gap, but with an open architecture allowing it to be extended to other biological materials and functional data. Its sample tracking mechanism is fully customizable and further aids productivity through support for mobile devices and barcoded labels.

    Clonal expansion and linear genome evolution through breast cancer progression from pre-invasive stages to asynchronous metastasis

    Evolution of the breast cancer genome from pre-invasive stages to asynchronous metastasis is complex and largely unexplored, yet understanding it is in high demand because it may provide novel markers for, and mechanistic insights into, cancer progression. The increasing use of personalized therapy of breast cancer necessitates knowledge of the degree of genomic concordance between different steps of malignant progression, as primary tumors are often used as surrogates of systemic disease. Based on exome sequencing, we performed copy number profiling and point mutation detection on successive steps of breast cancer progression from one breast cancer patient, including two different regions of Ductal Carcinoma In Situ (DCIS), the primary tumor and an asynchronous metastasis. We identify a remarkable landscape of somatic mutations, retained throughout breast cancer progression and with new mutational events emerging at each step. Our data, contrary to the proposed model of early dissemination of metastatic cells and parallel progression of primary tumors and metastases, provide evidence of linear progression of breast cancer with relatively late dissemination from the primary tumor. The genomic discordance between the different stages of tumor evolution in this patient emphasizes the importance of molecular profiling of metastatic tissue for directing molecularly targeted therapy at recurrence.

    A Growth Curve Model with Fractional Polynomials for Analysing Incomplete Time-Course Data in Microarray Gene Expression Studies

    Identifying the various gene expression response patterns is a challenging issue in expression microarray time-course experiments. Due to heterogeneity in the regulatory reaction among the thousands of genes tested, it is impossible to manually characterize a parametric form for each time-course pattern in a gene-by-gene manner. We introduce a growth curve model with fractional polynomials to automatically capture the various time-dependent expression patterns while efficiently handling missing values due to incomplete observations. For each gene, our procedure compares the performance of fractional polynomial models with power terms from a set of fixed values that offer a wide range of curve shapes, and suggests a best-fitting model. After a limited simulation study, the model was applied to our human in vivo irritated epidermis data with missing observations to investigate time-dependent transcriptional responses to a chemical irritant. Our method was able to identify the various nonlinear time-course expression trajectories. The integration of growth curves with fractional polynomials provides a flexible way to model different time-course patterns, together with model selection and significant gene identification strategies that can be applied in microarray-based time-course gene expression experiments with missing observations.
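
    The per-gene model comparison described in this abstract can be sketched as follows. This is a simplified illustration, not the authors' growth curve model: it fits a first-degree fractional polynomial by ordinary least squares to a single gene, drops missing observations, and picks the power with the smallest residual sum of squares. The power set used here is the conventional one for fractional polynomials and is an assumption, since the abstract only mentions "a set of fixed values"; the function names are hypothetical.

```python
import numpy as np

# Conventional fractional polynomial powers (assumed); power 0 denotes log(t).
POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)

def fp_transform(t, p):
    """Fractional polynomial transform of strictly positive time points."""
    t = np.asarray(t, dtype=float)
    return np.log(t) if p == 0 else t ** p

def fit_fp1(t, y, powers=POWERS):
    """Fit y = b0 + b1 * t^(p) for every candidate power p and return
    (best_power, coefficients, residual_sum_of_squares).
    Missing expression values (NaN) are dropped before fitting."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    observed = ~np.isnan(y)
    t, y = t[observed], y[observed]
    best = None
    for p in powers:
        design = np.column_stack([np.ones_like(t), fp_transform(t, p)])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        rss = float(np.sum((y - design @ coef) ** 2))
        if best is None or rss < best[2]:
            best = (p, coef, rss)
    return best
```

    Because every candidate model here has the same number of parameters, comparing residual sums of squares gives the same ranking as comparing an information criterion such as AIC; a second-degree fractional polynomial or a mixed-effects formulation would need a criterion that accounts for model size.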

    Genomic Analyses of Breast Cancer Progression Reveal Distinct Routes of Metastasis Emergence

    A main controversy in cancer research is whether metastatic abilities are present in the most advanced clone of the primary tumor or result from independently acquired aberrations in early disseminated cancer cells, as suggested by the linear and the parallel progression models, respectively. The genetic concordance between different steps of malignant progression is mostly unexplored, as very few studies have included cancer samples separated by both space and time. We applied whole exome sequencing and targeted deep sequencing to 26 successive samples from six patients with metastatic estrogen receptor (ER)-positive breast cancer. Our data provide support for both linear and parallel progression towards metastasis. We report for the first time evidence of metastasis-to-metastasis seeding in breast cancer. Our results point to three distinct routes of metastasis emergence. This may have profound clinical implications and provides substantial novel molecular insights into the timing and mutational evolution of breast cancer metastasis.