122 research outputs found

    Telomerecat: A ploidy-agnostic method for estimating telomere length from whole genome sequencing data.

    Telomere length is a risk factor in disease and the dynamics of telomere length are crucial to our understanding of cell replication and vitality. The proliferation of whole genome sequencing represents an unprecedented opportunity to glean new insights into telomere biology on a previously unimaginable scale. To this end, a number of approaches for estimating telomere length from whole-genome sequencing data have been proposed. Here we present Telomerecat, a novel approach to the estimation of telomere length. Previous methods have been dependent on the number of telomeres present in a cell being known, which may be problematic when analysing aneuploid cancer data and non-human samples. Telomerecat is designed to be agnostic to the number of telomeres present, making it suited for the purpose of estimating telomere length in cancer studies. Telomerecat also accounts for interstitial telomeric reads and presents a novel approach to dealing with sequencing errors. We show that Telomerecat performs well at telomere length estimation when compared to leading experimental and computational methods. Furthermore, we show that it detects expected patterns in longitudinal data, repeated measurements, and cross-species comparisons. We also apply the method to cancer cell data, uncovering an interesting relationship with the underlying telomerase genotype.

    Mutational signatures in esophageal adenocarcinoma define etiologically distinct subgroups with therapeutic relevance

    Esophageal adenocarcinoma (EAC) has a poor outcome, and targeted therapy trials have thus far been disappointing owing to a lack of robust stratification methods. Whole-genome sequencing (WGS) analysis of 129 cases demonstrated that this is a heterogeneous cancer dominated by copy number alterations with frequent large-scale rearrangements. Co-amplification of receptor tyrosine kinases (RTKs) and/or downstream mitogenic activation is almost ubiquitous; thus tailored combination RTK inhibitor (RTKi) therapy might be required, as we demonstrate in vitro. However, mutational signatures showed three distinct molecular subtypes with potential therapeutic relevance, which we verified in an independent cohort (n = 87): (i) enrichment for BRCA signature with prevalent defects in the homologous recombination pathway; (ii) dominant T>G mutational pattern associated with a high mutational load and neoantigen burden; and (iii) C>A/T mutational pattern with evidence of an aging imprint. These subtypes could be ascertained using a clinically applicable sequencing strategy (low coverage) as a basis for therapy selection.

    Flexible modelling of spatial variation in agricultural field trials with the R package INLA

    The objective of this paper was to fit different established spatial models for analysing agricultural field trials using the open-source R package INLA. Spatial variation is common in field trials, and accounting for it increases the accuracy of estimated genetic effects. However, this is still hindered by the lack of available software implementations. We compare some established spatial models and show possibilities for flexible modelling with respect to field trial design and joint modelling over multiple years and locations. We use a Bayesian framework and, for statistical inference, the integrated nested Laplace approximations (INLA) implemented in the R package INLA. The spatial models we use are the well-known independent row and column effects, separable first-order autoregressive (AR1⊗AR1) models and a Gaussian random field (Matérn) model that is approximated via the stochastic partial differential equation approach. The Matérn model can accommodate flexible field trial designs and yields interpretable parameters. We test the models in a simulation study imitating a wheat breeding programme with different levels of spatial variation, with and without genome-wide markers, and with data combined over two locations, modelling spatial and genetic effects jointly. The results show comparable predictive performance for both the AR1⊗AR1 and the Matérn models. We also present an example of fitting the models to real wheat breeding data and simulated tree breeding data with the Nelder wheel design to show the flexibility of the Matérn model and the R package INLA.
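
    As a rough illustration of the separable AR1⊗AR1 structure named in the abstract above, the sketch below builds the plot-to-plot correlation matrix of a small rectangular field as the Kronecker product of a row-wise and a column-wise AR(1) correlation matrix. It is written in plain Python rather than R, it is not taken from the paper or from the INLA package, and the field size, the two autocorrelation values and all function names are assumptions chosen only for the example.

```python
def ar1_corr(n, rho):
    """n x n AR(1) correlation matrix: entry (i, j) is rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


def kron(a, b):
    """Kronecker product of two dense matrices given as lists of lists."""
    return [
        [a_ij * b_kl for a_ij in a_row for b_kl in b_row]
        for a_row in a
        for b_row in b
    ]


# Hypothetical field layout and autocorrelations (illustrative values only).
n_rows, n_cols = 3, 4
rho_row, rho_col = 0.5, 0.8

# Separable correlation: rows ⊗ columns, with plots ordered row by row.
corr = kron(ar1_corr(n_rows, rho_row), ar1_corr(n_cols, rho_col))


def plot_index(r, c):
    """Position of the plot in row r, column c under row-major ordering."""
    return r * n_cols + c


# Correlation between the plot at (0, 0) and the plot at (2, 3).
example = corr[plot_index(0, 0)][plot_index(2, 3)]
```

    Under this separable form, the correlation between two plots is the product of a term that depends only on their row distance and a term that depends only on their column distance, so `example` equals `rho_row ** 2 * rho_col ** 3`.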

    Geographic determinants of reported human Campylobacter infections in Scotland

    Background: Campylobacteriosis is the leading cause of bacterial gastroenteritis in most developed countries. People are exposed to infection from contaminated food and environmental sources. However, the translation of these exposures into infection in the human population remains incompletely understood. This relationship is further complicated by differences in the presentation of cases, their investigation, identification, and reporting; thus, the actual differences in risk must be considered alongside the artefactual differences. Methods: Data on 33,967 confirmed Campylobacter infections in mainland Scotland between 2000 and 2006 (inclusive) that were spatially referenced to the postcode sector level were analysed. Risk factors including the Carstairs index of social deprivation, the easting and northing of the centroid of the postcode sector, measures of livestock density by species and population density were tested in univariate screening using a non-spatial generalised linear model. The NHS Health Board of the case was included as a random effect in this final model. Subsequently, a spatial generalised linear mixed model (GLMM) was constructed and age-stratified sensitivity analysis was conducted on this model. Results: The spatial GLMM included the protective effects of the Carstairs index (relative risk (RR) = 0.965, 95% confidence intervals (CIs) = 0.959, 0.971) and population density (RR = 0.945, 95% CIs = 0.916, 0.974). Following stratification by age group, population density had a significant protective effect (RR = 0.745, 95% CIs = 0.700, 0.792) for those under 15 but not for those aged 15 and older (RR = 0.982, 95% CIs = 0.951, 1.014). Once these predictors have been taken into account, three NHS Health Boards remain at significantly greater risk (Grampian, Highland and Tayside) and two at significantly lower risk (Argyll and Ayrshire and Arran). Conclusions: The less deprived and children living in rural areas are at the greatest risk of being reported as a case of Campylobacter infection. However, this analysis cannot differentiate between actual risk and heterogeneities in individual reporting behaviour; nevertheless, this paper has demonstrated that it is possible to explain the pattern of reported Campylobacter infections using both social and environmental predictors.

    Dissemination of periodic mammography and patterns of use, by birth cohort, in Catalonia (Spain)

    Background: In Catalonia (Spain) breast cancer mortality has declined since the beginning of the 1990s. The dissemination of early detection by mammography and the introduction of adjuvant treatments are among the possible causes of this decrease, and both were almost coincident in time. Thus, understanding how these procedures were incorporated into use in the general population and in women diagnosed with breast cancer is very important for assessing their contribution to the reduction in breast cancer mortality. In this work we have modeled the dissemination of periodic mammography and described repeat mammography behavior in Catalonia from 1975 to 2006. Methods: Cross-sectional data from three Catalan Health Surveys for the calendar years 1994, 2002 and 2006 were used. The dissemination of mammography by birth cohort was modeled using a mixed effects model and repeat mammography behavior was described by age and survey year. Results: For women born from 1938 to 1952, mammography clearly had a period effect, meaning that they started to have periodic mammograms in the same calendar years but at different ages. The age at which approximately 50% of the women were receiving periodic mammograms went from 57.8 years of age for women born in 1938–1942 to 37.3 years of age for women born in 1963–1967. Women in all age groups experienced an increase in periodic mammography use over time, although women in the 50–69 age group have experienced the highest increase. Currently, the target population of the Catalan Breast Cancer Screening Program, 50–69 years of age, is the group that self-reports the highest utilization of periodic mammograms, followed by the 40–49 age group. A higher proportion of women of all age groups have annual mammograms rather than biennial or irregular ones. Conclusion: Mammography in Catalonia became more widely implemented during the 1990s. We estimated when cohorts initiated periodic mammograms and how frequently women are receiving them. These two pieces of information will be entered into a cost-effectiveness model of early detection in Catalonia.

    A predictive model for the early identification of patients at risk for a prolonged intensive care unit length of stay

    Background: Patients with a prolonged intensive care unit (ICU) length of stay account for a disproportionate amount of resource use. Early identification of patients at risk for a prolonged length of stay can lead to quality enhancements that reduce ICU stay. This study developed and validated a model that identifies patients at risk for a prolonged ICU stay. Methods: We performed a retrospective cohort study of 343,555 admissions to 83 ICUs in 31 U.S. hospitals from 2002 to 2007. We examined the distribution of ICU length of stay to identify a threshold where clinicians might be concerned about a prolonged stay; this resulted in choosing a 5-day cut-point. From patients remaining in the ICU on day 5 we developed a multivariable regression model that predicted remaining ICU stay. Predictor variables included information gathered at admission, day 1, and ICU day 5. Data from 12,640 admissions during 2002-2005 were used to develop the model, and the remaining 12,904 admissions to internally validate the model. Finally, we used data on 11,903 admissions during 2006-2007 to externally validate the model. Results: The variables that had the greatest impact on remaining ICU length of stay were those measured on day 5, not at admission or during day 1. Mechanical ventilation, PaO₂:FiO₂ ratio, other physiologic components, and sedation on day 5 accounted for 81.6% of the variation in predicted remaining ICU stay. In the external validation set observed ICU stay was 11.99 days and predicted total ICU stay (5 days + day 5 predicted remaining stay) was 11.62 days, a difference of 8.7 hours. For the same patients, the difference between mean observed and mean predicted ICU stay using the APACHE day 1 model was 149.3 hours. The new model's r² was 20.2% across individuals and 44.3% across units. Conclusions: A model that uses patient data from ICU days 1 and 5 accurately predicts a prolonged ICU stay. These predictions are more accurate than those based on ICU day 1 data alone. The model can be used to benchmark ICU performance and to alert physicians to explore care alternatives aimed at reducing ICU stay.

    Transcriptomic profiling reveals three molecular phenotypes of adenocarcinoma at the gastroesophageal junction

    Cancers occurring at the gastroesophageal junction (GEJ) are classified as predominantly esophageal or gastric, which is often difficult to decipher. We hypothesized that the transcriptomic profile might reveal molecular subgroups which could help to define the tumor origin and behavior beyond anatomical location. The gene expression profiles of 107 treatment-naive, intestinal type, gastroesophageal adenocarcinomas were assessed by the Illumina-HTv4.0 beadchip. Differential gene expression (limma), unsupervised subgroup assignment (mclust) and pathway analysis (gage) were undertaken in R statistical computing and results were related to demographic and clinical parameters. Unsupervised assignment of the gene expression profiles revealed three distinct molecular subgroups, which were not associated with anatomical location, tumor stage or grade (p > 0.05). Group 1 was enriched for pathways involved in cell turnover, Group 2 was enriched for metabolic processes and Group 3 for immune-response pathways. Patients in Group 1 showed the worst overall survival (p = 0.019). Key genes for the three subtypes were confirmed by immunohistochemistry. The newly defined intrinsic subtypes were analyzed in four independent datasets of gastric and esophageal adenocarcinomas with transcriptomic data available (RNAseq data: OCCAMS cohort, n = 158; gene expression arrays: Belfast, n = 63; Singapore, n = 191; Asian Cancer Research Group, n = 300). The subgroups were represented in the independent cohorts and pooled analysis confirmed the prognostic effect of the new subtypes. In conclusion, adenocarcinomas at the GEJ comprise three distinct molecular phenotypes which do not reflect anatomical location but rather inform our understanding of the key pathways expressed.