239 research outputs found

    Assessing Significance in High-Throughput Experiments by Sequential Goodness of Fit and q-Value Estimation

    We developed a new multiple hypothesis testing adjustment called SGoF+, implemented as a sequential goodness of fit metatest. It is a modification of a previous algorithm, SGoF, that takes advantage of the information in the distribution of p-values to fix the rejection region. The new method uses a discriminant rule based on the maximum distance between the uniform distribution of p-values and the observed one to set the null for a binomial test. This new approach shows a better power/pFDR ratio than SGoF. In fact, SGoF+ automatically sets the threshold leading to the maximum power and the minimum false non-discovery rate within the SGoF family of algorithms. Additionally, we suggest combining the information provided by SGoF+ with the estimate of the FDR incurred when rejecting a given set of nulls. We study different positive false discovery rate (pFDR) estimation methods to combine q-value estimates with the information provided by the SGoF+ method. Simulations suggest that combining the SGoF+ metatest with the q-value information is an interesting strategy for dealing with multiple testing issues. These techniques are provided in the latest version of the SGoF+ software, freely available at http://webs.uvigo.es/acraaj/SGoF.htm
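The q-value idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes that the proportion of true nulls is 1, in which case the q-value estimate reduces to the Benjamini-Hochberg adjusted p-value. The pFDR estimators actually compared by the authors are not given in the abstract, and the function name `qvalues_bh` is hypothetical.

```python
def qvalues_bh(pvalues):
    """Conservative q-value estimates, assuming the proportion of true nulls is 1."""
    n = len(pvalues)
    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(n), key=lambda i: pvalues[i])
    qvalues = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so that q-values never
    # decrease as the p-value increases.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        qvalues[i] = running_min
    return qvalues
```

Each returned value sits at the same position as its input p-value, so the estimates can be read side by side with any set of rejections produced by another method.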

    A novel approach to simulate gene-environment interactions in complex diseases

    Background: Complex diseases are multifactorial traits caused by both genetic and environmental factors. They represent the major part of human diseases and include those with the largest prevalence and mortality (cancer, heart disease, obesity, etc.). Although a large amount of information has been collected about both genetic and environmental risk factors, there are few examples of studies on their interactions in the epidemiological literature. One reason may be the incomplete knowledge of the power of statistical methods designed to search for risk factors and their interactions in these data sets. An improvement in this direction would lead to a better understanding and description of gene-environment interactions. To this end, a possible strategy is to challenge the different statistical methods against data sets where the underlying phenomenon is completely known and fully controllable, for example simulated ones. Results: We present a mathematical approach that models gene-environment interactions. With this method it is possible to generate simulated populations having gene-environment interactions of any form, involving any number of genetic and environmental factors and also allowing non-linear interactions such as epistasis. In particular, we implemented a simple version of this model in a Gene-Environment iNteraction Simulator (GENS), a tool designed to simulate case-control data sets where a one gene-one environment interaction influences the disease risk. The main aim has been to allow the input of population characteristics by using standard epidemiological measures and to implement constraints that make the simulator's behaviour biologically meaningful. Conclusions: With the multi-logistic model implemented in GENS it is possible to simulate case-control samples of complex disease where gene-environment interactions influence the disease risk. The user has full control of the main characteristics of the simulated population, and a Monte Carlo process allows random variability. A knowledge-based approach reduces the complexity of the mathematical model by using reasonable biological constraints and makes the simulation more understandable in biological terms. Simulated data sets can be used for the assessment of novel statistical methods or for the evaluation of statistical power when designing a study.
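As a rough illustration of the kind of simulation this abstract describes, the sketch below draws a case-control sample in which one gene and one environmental exposure interact in a logistic risk model. It is not the GENS implementation. The function names, the coefficient values, the Hardy-Weinberg genotype sampling, and the binary exposure are all assumptions made for the example.

```python
import math
import random

def disease_risk(genotype, exposure, b0=-3.0, b_g=0.3, b_e=0.5, b_ge=0.4):
    """Logistic disease risk with a gene-environment interaction term.

    All coefficients are hypothetical illustration values.
    """
    logit = b0 + b_g * genotype + b_e * exposure + b_ge * genotype * exposure
    return 1.0 / (1.0 + math.exp(-logit))

def simulate_case_control(n_cases, n_controls, allele_freq=0.3,
                          exposure_prob=0.4, seed=0):
    """Monte Carlo sampling of individuals until both groups are full."""
    rng = random.Random(seed)
    cases, controls = [], []
    while len(cases) < n_cases or len(controls) < n_controls:
        # Genotype = number of risk alleles (0, 1 or 2), assuming
        # Hardy-Weinberg proportions.
        genotype = int(rng.random() < allele_freq) + int(rng.random() < allele_freq)
        exposure = int(rng.random() < exposure_prob)
        affected = rng.random() < disease_risk(genotype, exposure)
        if affected and len(cases) < n_cases:
            cases.append((genotype, exposure))
        elif not affected and len(controls) < n_controls:
            controls.append((genotype, exposure))
    return cases, controls
```

Because the generating model is fully known, a sample such as `simulate_case_control(500, 500)` can be passed to a candidate statistical method to check how often it recovers the interaction term.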

    A new multitest correction (SGoF) that increases its statistical power when increasing the number of tests

    Background: The detection of true significant cases under multiple testing is becoming a fundamental issue when analyzing high-dimensional biological data. Unfortunately, known multitest adjustments lose statistical power as the number of tests increases. We propose a new multitest adjustment, based on a sequential goodness of fit metatest (SGoF), which increases its statistical power with the number of tests. The method is compared with Bonferroni and FDR-based alternatives by simulating a multitest context via two different kinds of tests: 1) one-sample t-test, and 2) homogeneity G-test. Results: It is shown that SGoF behaves especially well with small sample sizes when 1) the alternative hypothesis is weakly to moderately deviated from the null model, 2) there are widespread effects through the family of tests, and 3) the number of tests is large. Conclusion: Therefore, SGoF should become an important tool for multitest adjustment when working with high-dimensional biological data.
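The sequential goodness of fit idea can be sketched as follows. This is a simplified reading, not the authors' software. It assumes a one-sided exact binomial test as the goodness of fit test, and a threshold `gamma` and level `alpha` of 0.05 each. The function names are hypothetical.

```python
from math import comb

def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def sgof_sketch(pvalues, gamma=0.05, alpha=0.05):
    """Return the p-values declared significant by a simplified SGoF-style rule."""
    n = len(pvalues)
    # Number of tests whose p-value falls at or below the threshold gamma.
    observed = sum(p <= gamma for p in pvalues)
    rejected = 0
    # While the remaining count of small p-values is still significantly
    # larger than expected under the global null, declare one more of the
    # smallest p-values significant and re-test with the count reduced by one.
    while observed - rejected > 0 and binom_sf(observed - rejected, n, gamma) < alpha:
        rejected += 1
    return sorted(pvalues)[:rejected]
```

Under this rule the number of rejections is roughly the excess of small p-values over what the global null would explain, and that excess grows with the number of tests when effects are widespread.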

    A prospective phase II trial exploring the association between tumor microenvironment biomarkers and clinical activity of ipilimumab in advanced melanoma

    Background: Ipilimumab, a fully human monoclonal antibody that blocks cytotoxic T-lymphocyte antigen-4, has demonstrated an improvement in overall survival in two phase III trials of patients with advanced melanoma. The primary objective of the current trial was to prospectively explore candidate biomarkers from the tumor microenvironment for associations with clinical response to ipilimumab. Methods: In this randomized, double-blind, phase II biomarker study (ClinicalTrials.gov NCT00261365), 82 pretreated or treatment-naïve patients with unresectable stage III/IV melanoma were induced with 3 or 10 mg/kg ipilimumab every 3 weeks for 4 doses; at Week 24, patients could receive maintenance doses every 12 weeks. Efficacy was evaluated per modified World Health Organization response criteria and safety was assessed continuously. Candidate biomarkers were evaluated in tumor biopsies collected pretreatment and 24 to 72 hours after the second ipilimumab dose. Polymorphisms in immune-related genes were also evaluated. Results: Objective response rate, response patterns, and safety were consistent with previous trials of ipilimumab in melanoma. No associations between genetic polymorphisms and clinical activity were observed. Immunohistochemistry and histology on tumor biopsies revealed significant associations between clinical activity and high baseline expression of FoxP3 (p = 0.014) and indoleamine 2,3-dioxygenase (p = 0.012), and between clinical activity and increase in tumor-infiltrating lymphocytes (TILs) between baseline and 3 weeks after start of treatment (p = 0.005). Microarray analysis of mRNA from tumor samples taken pretreatment and post-treatment demonstrated significant increases in expression of several immune-related genes, and decreases in expression of genes implicated in cancer and melanoma. Conclusions: Baseline expression of immune-related tumor biomarkers and a post-treatment increase in TILs may be positively associated with ipilimumab clinical activity. The observed pharmacodynamic changes in gene expression warrant further analysis to determine whether treatment-emergent changes in gene expression may be associated with clinical efficacy. Further studies are required to determine the predictive value of these and other potential biomarkers associated with clinical response to ipilimumab.

    Integration of gene expression data with prior knowledge for network analysis and validation

    Background: Reconstruction of protein-protein interaction or metabolic networks based on expression data often involves in silico predictions, while on the other hand, there are unspecific networks of in vivo interactions derived from knowledge bases. We analyze networks designed to come as close as possible to data measured in vivo, both with respect to the set of nodes, which were taken to be expressed in experiment, and with respect to the interactions between them, which were taken from manually curated databases. Results: A signaling network derived from the TRANSPATH database and a metabolic network derived from KEGG LIGAND are each filtered onto expression data from breast cancer (SAGE), considering different levels of restrictiveness in edge and vertex selection. We perform several validation steps; in particular, we define pathway over-representation tests based on refined null models to recover functional modules. The prominent role of the spindle checkpoint-related pathways in breast cancer is exhibited. High-ranking key nodes cluster in functional groups retrieved from the literature. Results are consistent between several functional and topological analyses and between signaling and metabolic aspects. Conclusions: This construction involved as a crucial step the passage to a mammalian protein identifier format as well as to a reaction-based semantics of metabolism. This yielded good connectivity but also led to the need to perform benchmark tests to exclude loss of essential information. Such validation, albeit tedious due to limitations of existing methods, turned out to be informative, and in particular provided biological insights as well as information on the degrees of coherence of the networks despite fragmentation of experimental data. Key node analysis exploited the networks for potentially interesting proteins in view of drug target prediction.

    Forward-time simulation of realistic samples for genome-wide association studies

    Background: Forward-time simulations have unique advantages in power and flexibility for the simulation of genetic samples of complex human diseases because they can closely mimic the evolution of human populations carrying these diseases. However, a number of methodological and computational constraints have prevented the power of this simulation method from being fully explored in existing forward-time simulation methods. Results: Using a general-purpose forward-time population genetics simulation environment, we developed a forward-time simulation method that can be used to simulate realistic samples for genome-wide association studies. We examined the properties of this simulation method by comparing simulated samples with real data and demonstrated its wide applicability using four examples, including a simulation of case-control samples with a disease caused by multiple interacting genetic and environmental factors, a simulation of trio families affected by a disease-predisposing allele that had been subjected to either slow or rapid selective sweep, and a simulation of a structured population resulting from recent population admixture. Conclusions: Our algorithm simulates populations that closely resemble the complex structure of the human genome, while allowing the introduction of signals of natural selection. Because of its flexibility to generate different types of samples with arbitrary disease or quantitative trait models, this simulation method can simulate realistic samples to evaluate the performance of a wide variety of statistical gene mapping methods for genome-wide association studies.

    Fine-scale detection of population-specific linkage disequilibrium using haplotype entropy in the human genome

    Background: The creation of a coherent genomic map of recent selection is one of the greatest challenges towards a better understanding of human evolution and the identification of functional genetic variants. Several methods have been proposed to detect linkage disequilibrium (LD), which is indicative of natural selection, from genome-wide profiles of common genetic variations, but they are designed for large regions. Results: To find population-specific LD within small regions, we have devised an entropy-based method that utilizes differences in haplotype frequency between populations. The method has the advantages of incorporating multilocus association, conciliation with low allele frequencies, and independence from allele polarity, which are ideal for short haplotype analysis. The comparison of HapMap SNP data from African and Caucasian populations with a median resolution size of ~23 kb gave us novel candidates as well as known selection targets. Enrichment analysis for the yielded genes showed associations with diverse diseases such as cardiovascular, immunological, neurological, and skeletal and muscular diseases. A possible scenario for a selective force is discussed. In addition, we have developed a web interface (ENIGMA, available at http://gibk21.bse.kyutech.ac.jp/ENIGMA/index.html), which allows researchers to query their regions of interest for population-specific LD. Conclusion: The haplotype entropy method is powerful for detecting population-specific LD embedded in short regions and should contribute to further studies aiming to decipher the evolutionary histories of modern humans.
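The abstract does not give the exact statistic, so the following is only a minimal sketch of the general idea: compute the Shannon entropy of haplotype frequencies within a short region for each population and compare the two values. The function names and the use of a simple entropy difference are assumptions made for illustration.

```python
import math
from collections import Counter

def haplotype_entropy(haplotypes):
    """Shannon entropy (in bits) of the haplotype frequency distribution."""
    counts = Counter(haplotypes)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def entropy_difference(pop_a, pop_b):
    """Entropy of population A minus entropy of population B for one region."""
    return haplotype_entropy(pop_a) - haplotype_entropy(pop_b)
```

A region where one population carries a single dominant haplotype has entropy near zero, so a large difference between populations flags reduced haplotype diversity in only one of them. Because haplotypes are treated as unordered labels, the value does not depend on which allele is called the reference.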

    Serum CD26 is related to histopathological polyp traits and behaves as a marker for colorectal cancer and advanced adenomas

    Background: Serum CD26 (sCD26) levels were previously found diminished in colorectal cancer (CRC) patients compared to healthy donors, suggesting its potential utility for early diagnosis. Therefore we aimed to estimate the utility of the sCD26 as a biomarker for CRC and advanced adenomas in a high-risk group of patients. The relationship of this molecule with polyp characteristics was also addressed. Methods: sCD26 levels were measured by ELISA in 299 symptomatic and asymptomatic patients who had undergone a colonoscopy. Patients were diagnosed as having no colorectal pathology, non-inflammatory or inflammatory bowel disease, polyps (hyperplastic, non-advanced and advanced adenomas) or CRC. Results: At a 460 ng/mL cut-off, the sCD26 has a sensitivity and specificity of 81.8% (95% CI, 64.5-93.0%) and 72.3% (95% CI, 65.0-77.2%) for CRC regarding no or benign colorectal pathology. Clinicopathological analysis of polyps showed a relationship between the sCD26 and the grade of dysplasia and the presence of advanced adenomas. Hence, a 58.0% (95% CI, 46.5-68.9%) sensitivity detecting CRC and advanced adenomas was obtained, with a specificity of 75.5% (95% CI, 68.5-81.0%). Conclusions: Our preliminary results show that measurement of the sCD26 is a non-invasive and reasonably sensitive assay, which could be combined with others such as the faecal occult blood test for the early diagnosis and screening of CRC and advanced adenomas. Additional comparative studies in average-risk populations are necessary.

    Individual Shrink Wrapping of Zucchini Fruit Improves Postharvest Chilling Tolerance Associated with a Reduction in Ethylene Production and Oxidative Stress Metabolites

    We have studied the effect of individual shrink wrapping (ISW) on the postharvest performance of refrigerated fruit from two zucchini cultivars that differ in their sensitivity to cold storage: Sinatra (more sensitive) and Natura (more tolerant). The fruit was individually shrink wrapped before storing at 4°C for 0, 7 and 14 days. Quality parameters, ethylene and CO2 production, ethylene gene expression, and oxidative stress metabolites were assessed in shrink wrapped and non-wrapped fruit after conditioning the fruit for 6 hours at 20°C. ISW significantly decreased the postharvest deterioration of chilled zucchini in both cultivars. Weight loss was reduced to less than 1%, pitting symptoms were completely absent in ISW fruit at 7 days and were less than 25% of those of control fruits at 14 days of cold storage, and firmness loss was significantly reduced in the cultivar Sinatra. These enhancements in quality of ISW fruit were associated with a significant reduction in cold-induced ethylene production, in the respiration rate, and in the level of oxidative stress metabolites such as hydrogen peroxide and malonyldialdehyde (MDA). A detailed expression analysis of ethylene biosynthesis, perception and signaling genes demonstrated a downregulation of CpACS1 and CpACO1 in response to ISW, two genes that are upregulated by cold storage. However, the expression patterns of six other ethylene biosynthesis genes (CpACS2 to CpACS7) and five ethylene signal transduction pathway genes (CpCTR1, CpETR1, CpERS1, CpEIN3.1 and CpEIN3.2) suggest that they do not play a major role in response to cold storage and ISW packaging. In conclusion, ISW zucchini packaging resulted in improved tolerance to chilling concomitantly with a reduction in oxidative stress, respiration rate and ethylene production, as well as in the expression of ethylene biosynthesis genes, but not of those involved in ethylene perception and sensitivity. This work was supported by grants AGL2011-30568-C02/ALI from the Spanish Ministry of Science and Innovation, and AGR1423 from the Consejería de Economía, Innovación y Ciencia, Junta de Andalucía, Spain. Z.M. acknowledges FPU program scholarships from MEC, Spain. S.M. is funded by grant PTA2011-479-I from the Spanish Ministry of Science and Innovation.