
    MultiBaC: A strategy to remove batch effects between different omic data types

    Omic technologies have diversified in recent years, together with the number of strategies for integrating omic data. However, multiomic data generation is costly, and many research groups cannot afford projects in which many different omic data types are generated, at least not at the same time. As most researchers share their data in public repositories, different omic datasets of the same biological system obtained at different labs can be combined to construct a multiomic study. However, data obtained at different labs or at different times are typically subject to batch effects that need to be removed for successful data integration. While there are methods to correct batch effects on the same data types obtained in different studies, they cannot be applied to correct lab or batch effects across omics. This impairs multiomic meta-analysis. Fortunately, in many cases, at least one omics platform (i.e. gene expression) is repeatedly measured across labs, together with the additional omic modalities that are specific to each study. This creates an opportunity for batch analysis. We have developed MultiBaC (Multiomics Batch-effect Correction), a strategy to correct batch effects from multiomic datasets distributed across different labs or data acquisition events. Our strategy is based on the existence of at least one shared data type, which allows data prediction across omics. We validate this approach both on simulated data and on a case where the multiomic design is fully shared by two labs, so that batch effect correction within the same omic modality using traditional methods can be compared with the MultiBaC correction across data types. 
Finally, we apply MultiBaC to a true multiomic data integration problem to show that we are able to improve the detection of meaningful biological effects.
    The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work is part of a research project that is totally funded by Conselleria d'Educacio, Cultura i Esport (Generalitat Valenciana) through PROMETEO grants program for excellence research groups.
    Ugidos, M.; Tarazona Campos, S.; Prats-Montalbán, JM.; Ferrer, A.; Conesa, A. (2020). MultiBaC: A strategy to remove batch effects between different omic data types. Statistical Methods in Medical Research. 29(10):2851-2864. https://doi.org/10.1177/0962280220907365

    Languages cool as they expand: Allometric scaling and the decreasing need for new words

    We analyze the occurrence frequencies of over 15 million words recorded in millions of books published during the past two centuries in seven different languages. For all languages and chronological subsets of the data we confirm that two scaling regimes characterize the word frequency distributions, with only the more common words obeying the classic Zipf law. Using corpora of unprecedented size, we test the allometric scaling relation between the corpus size and the vocabulary size of growing languages to demonstrate a decreasing marginal need for new words, a feature that is likely related to the underlying correlations between words. We calculate the annual growth fluctuations of word use, which show a decreasing trend as the corpus size increases, indicating a slowdown in linguistic evolution following language expansion. This "cooling pattern" forms the basis of a third statistical regularity, which, unlike the Zipf and Heaps laws, is dynamical in nature.
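The allometric scaling relation mentioned in this abstract can be illustrated with a small sketch. It assumes the usual Heaps-law form V = k * N**b (vocabulary size V, corpus size N, exponent b below 1), which the abstract names but does not write out. The function names and the synthetic numbers are hypothetical and are not taken from the study.

```python
import math

def fit_heaps_exponent(corpus_sizes, vocab_sizes):
    """Least-squares slope of log(vocabulary size) against log(corpus size)."""
    xs = [math.log(n) for n in corpus_sizes]
    ys = [math.log(v) for v in vocab_sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def marginal_new_words(corpus_size, k, b):
    """Derivative dV/dN of V = k * N**b: new words per additional word of text."""
    return k * b * corpus_size ** (b - 1)

# Synthetic illustration only: made-up values, not data from the paper.
K_TRUE, B_TRUE = 10.0, 0.5
sizes = [10 ** e for e in range(4, 10)]
vocab = [K_TRUE * n ** B_TRUE for n in sizes]
b_hat = fit_heaps_exponent(sizes, vocab)
```

With an exponent below 1, `marginal_new_words` shrinks as the corpus grows, which is one way to read the "decreasing marginal need for new words".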

    Positive words carry less information than negative words

    We show that the frequency of word use is not only determined by the word length (Zipf 1935) and the average information content (Piantadosi 2011), but also by its emotional content. We have analyzed three established lexica of affective word usage in English, German, and Spanish, to verify that these lexica have a neutral, unbiased, emotional content. Taking into account the frequency of word usage, we find that words with a positive emotional content are more frequently used. This lends support to the Pollyanna hypothesis (Boucher 1969) that there should be a positive bias in human expression. We also find that negative words contain more information than positive words, as the informativeness of a word increases uniformly as its valence decreases. Our findings support earlier conjectures about (i) the relation between word frequency and information content, and (ii) the impact of positive emotions on communication and social links. Comment: 16 pages, 3 figures, 3 tables
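The link between word frequency and information content can be sketched as follows. The sketch assumes the simplest measure, self-information I(w) = -log2 p(w); the abstract does not state which measure the authors use, and the word counts below are hypothetical.

```python
import math

def self_information(count, total):
    """Self-information in bits of a word seen `count` times among `total` tokens."""
    return -math.log2(count / total)

# Hypothetical toy counts, chosen only to illustrate the direction of the effect.
toy_counts = {"good": 800, "bad": 200}
total = sum(toy_counts.values())
bits = {word: self_information(c, total) for word, c in toy_counts.items()}
```

Under this measure a rarer word always carries more bits, so a frequency bias toward positive words implies a higher information content for negative ones.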

    Omalizumab efficacy in cases of chronic spontaneous urticaria is not explained by the inhibition of sera activity in effector cells

    Omalizumab (OmAb) is a humanized anti-IgE antibody approved for the treatment of chronic spontaneous urticaria (CSU). OmAb's mechanism of action is known to include actions on free IgE and on pre-bound IgE. However, OmAb is equally and rapidly effective against autoimmune and non-autoimmune urticaria where IgE involvement is not clear, suggesting the involvement of additional mechanisms of action. In this study, we sought to investigate the ability of OmAb to inhibit mast cell and basophil degranulation induced by sera from CSU patients. For this purpose, we performed a comparison between the in vitro incubation of sera from CSU patients treated with OmAb and the in vivo administration of OmAb in a clinical trial. We found that OmAb added in vitro to sera from CSU patients did not modify the ability of the sera to induce cell degranulation. Similarly, the sera from patients treated with OmAb in the context of the clinical trial who had a good clinical outcome maintained the capacity to activate mast cells and basophils. Thus, we conclude that the beneficial activity of OmAb does not correlate with the ability of patient sera to induce cell degranulation.

    Health system costs of providing outpatient care for diabetes in people with TB in the Philippines

    BACKGROUND: Diabetes mellitus (DM) is a known risk factor for active TB. A key activity in the Philippines is to integrate TB services with other disease programmes, with a target of DM screening in 90% of TB cases. However, costs of providing DM outpatient services for TB patients are not well known. METHODS: We estimated the costs of providing integrated DM outpatient services within TB services from the health system perspective. Resources for outpatient DM services were valued using the bottom-up approach for capital goods, staff time and consumables. Resource quantities were obtained by interviewing 60 healthcare professionals in 11 health facilities in the Philippines. RESULTS: The mean cost per service ranged from USD0.53 for DM risk assessment to USD23.72 for oral glucose tolerance test. The cost per case detected for different algorithms varied from USD17.43 to USD80.81. The monthly cost per patient was estimated at USD8.95 to USD12.36. CONCLUSION: Our study provides the first estimates of costs for providing integrated DM outpatient services and TB care in a low- and middle-income country. The costs of DM detection in TB patients suggest that it may be useful to further investigate the cost-effectiveness and affordability of service delivery.

    Prediction of absolute risk of fragility fracture at 10 years in a Spanish population: validation of the WHO FRAX™ tool in Spain

    Background: Age-related bone loss is asymptomatic, and the morbidity of osteoporosis is secondary to the fractures that occur. Common sites of fracture include the spine, hip, forearm and proximal humerus. Fractures at the hip incur the greatest morbidity and mortality and give rise to the highest direct costs for health services. Their incidence increases exponentially with age. Independently of changes in population demography, the age- and sex-specific incidence of osteoporotic fractures appears to be increasing in developing and developed countries. This could mean more than double the expected burden of osteoporotic fractures in the next 50 years. Methods/Design: To assess the predictive power of the WHO FRAX™ tool to identify the subjects with the highest absolute risk of fragility fracture at 10 years in a Spanish population, a predictive validation study of the tool will be carried out. For this purpose, the participants recruited by 1999 will be assessed. These were referred to the scan-DXA Department from primary healthcare centres and from non-hospital and hospital consultations. Study population: Patients attended in the national health services and integrated into a FRIDEX cohort, with at least one dual-energy X-ray absorptiometry (DXA) measurement and one extensive questionnaire related to fracture risk factors. Measurements: At baseline, bone mineral density measurement using DXA, clinical fracture risk factors questionnaire, dietary calcium intake assessment, history of previous fractures, and related drugs. Follow-up by telephone interview to record fragility fractures in the 10 years, with verification in electronic medical records, and also to record the number of falls in the last year. The absolute risk of fracture will be estimated using the FRAX™ tool from the official web site. Discussion: For more than 10 years, numerous publications have recognised the importance of other risk factors for new osteoporotic fractures in addition to low BMD. The extension of a method for calculating the risk (probability) of fractures using the FRAX™ tool is foreseeable in Spain, and this would justify a study such as this to allow the necessary adjustments in calibration of the parameters included in the logarithmic formula constituted by FRAX™.

    Phage inducible islands in the gram-positive cocci

    The SaPIs are a cohesive subfamily of extremely common phage-inducible chromosomal islands (PICIs) that reside quiescently at specific att sites in the staphylococcal chromosome and are induced by helper phages to excise and replicate. They are usually packaged in small capsids composed of phage virion proteins, giving rise to very high transfer frequencies, which they enhance by interfering with helper phage reproduction. As the SaPIs represent a highly successful biological strategy, with many natural Staphylococcus aureus strains containing two or more, we assumed that similar elements would be widespread in the Gram-positive cocci. On the basis of resemblance to the paradigmatic SaPI genome, we have readily identified large cohesive families of similar elements in the lactococci and pneumococci/streptococci plus a few such elements in Enterococcus faecalis. Based on extensive ortholog analyses, we found that the PICI elements in the four different genera all represent distinct but parallel lineages, suggesting that they represent convergent evolution towards a highly successful lifestyle. We have characterized in depth the enterococcal element, EfCIV583, and have shown that it very closely resembles the SaPIs in functionality as well as in genome organization, setting the stage for expansion of the study of elements of this type. In summary, our findings greatly broaden the PICI family to include elements from at least three genera of cocci.

    Colouration in amphibians as a reflection of nutritional status: the case of tree frogs in Costa Rica

    Colouration has been considered a cue for mating success in many species; ornaments in males are often related to carotenoid mobilization towards feathers and/or skin and can signal general health and nutritional status. Several other factors, such as physiological blood parameters and body condition, may also be linked to status, but there is no substantial evidence supporting the existence of these relationships and interactions in anurans. This study evaluated how body score and blood values interact with colouration in free-ranging Agalychnis callidryas and Agalychnis annae males. We found significant associations between body condition and plasma proteins and haematocrit, as well as between body condition and colour values from the chromaticity diagram. We also demonstrated that there is a significant relation between the glucose and plasma protein values that were reflected in the ventral colours of the animals, and that haematocrit inversely affected most of those colour values. Significant differences were found between species as well as between populations of A. callidryas, suggesting that despite colour variation, there are also biochemical differences within animals from the same species located in different regions. These data provide information on underlying factors for colouration of male tree frogs in nature, and provide insights about the dynamics of several nutrients in the amphibian model and how this could affect the reproductive output of the animals.

    Use of low-dose oral theophylline as an adjunct to inhaled corticosteroids in preventing exacerbations of chronic obstructive pulmonary disease: study protocol for a randomised controlled trial.

    BACKGROUND: Chronic obstructive pulmonary disease (COPD) is associated with high morbidity, mortality, and health-care costs. An incomplete response to the anti-inflammatory effects of inhaled corticosteroids is present in COPD. Preclinical work indicates that 'low dose' theophylline improves steroid responsiveness. The Theophylline With Inhaled Corticosteroids (TWICS) trial investigates whether the addition of 'low dose' theophylline to inhaled corticosteroids has clinical and cost-effective benefits in COPD. METHOD/DESIGN: TWICS is a randomised double-blind placebo-controlled trial conducted in primary and secondary care sites in the UK. The inclusion criteria are the following: an established predominant respiratory diagnosis of COPD (post-bronchodilator forced expiratory volume in first second/forced vital capacity [FEV1/FVC] of less than 0.7), age of at least 40 years, smoking history of at least 10 pack-years, current inhaled corticosteroid use, and history of at least two exacerbations requiring treatment with antibiotics or oral corticosteroids in the previous year. A computerised randomisation system will stratify 1424 participants by region and recruitment setting (primary and secondary) and then randomly assign with equal probability to intervention or control arms. Participants will receive either 'low dose' theophylline (Uniphyllin MR 200 mg tablets) or placebo for 52 weeks. Dosing is based on pharmacokinetic modelling to achieve a steady-state serum theophylline of 1-5 mg/l. A dose of theophylline MR 200 mg once daily (or placebo once daily) will be taken by participants who do not smoke or participants who smoke but have an ideal body weight (IBW) of not more than 60 kg. A dose of theophylline MR 200 mg twice daily (or placebo twice daily) will be taken by participants who smoke and have an IBW of more than 60 kg. Participants will be reviewed at recruitment and after 6 and 12 months. 
The primary outcome is the total number of participant-reported COPD exacerbations requiring oral corticosteroids or antibiotics during the 52-week treatment period. DISCUSSION: The demonstration that 'low dose' theophylline increases the efficacy of inhaled corticosteroids in COPD by reducing the incidence of exacerbations is relevant not only to patients and clinicians but also to health-care providers, both in the UK and globally. TRIAL REGISTRATION: Current Controlled Trials ISRCTN27066620 was registered on Sept. 19, 2013, and the first subject was randomly assigned on Feb. 6, 2014.
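
The dosing rule stated in the METHOD/DESIGN part of this abstract can be written as a small decision function. This is only a sketch of the rule as the abstract words it; the function and parameter names are hypothetical and do not come from the trial protocol or its software.

```python
def daily_tablets(smokes: bool, ideal_body_weight_kg: float) -> int:
    """Number of 200 mg tablets (theophylline MR or placebo) per day.

    Twice daily only for participants who smoke AND have an ideal body
    weight above 60 kg; once daily for everyone else, as the abstract states.
    """
    if smokes and ideal_body_weight_kg > 60:
        return 2
    return 1

# Illustrative participants (hypothetical values).
examples = [
    daily_tablets(smokes=False, ideal_body_weight_kg=80.0),
    daily_tablets(smokes=True, ideal_body_weight_kg=60.0),
    daily_tablets(smokes=True, ideal_body_weight_kg=72.5),
]
```

The boundary case follows the abstract's wording: a smoker with an IBW of exactly 60 kg ("not more than 60 kg") stays on the once-daily dose.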