48 research outputs found

    Preparation of name and address data for record linkage using hidden Markov models

    BACKGROUND: Record linkage refers to the process of joining records that relate to the same entity or event in one or more data collections. In the absence of a shared, unique key, record linkage involves the comparison of ensembles of partially-identifying, non-unique data items between pairs of records. Data items with variable formats, such as names and addresses, need to be transformed and normalised in order to validly carry out these comparisons. Traditionally, deterministic rule-based data processing systems have been used to carry out this pre-processing, which is commonly referred to as "standardisation". This paper describes an alternative approach to standardisation, using a combination of lexicon-based tokenisation and probabilistic hidden Markov models (HMMs). METHODS: HMMs were trained to standardise typical Australian name and address data drawn from a range of health data collections. The accuracy of the results was compared to that produced by rule-based systems. RESULTS: Training of HMMs was found to be quick and did not require any specialised skills. For addresses, HMMs produced equal or better standardisation accuracy than a widely-used rule-based system. However, accuracy was worse when used with simpler name data. Possible reasons for this poorer performance are discussed. CONCLUSION: Lexicon-based tokenisation and HMMs provide a viable and effort-effective alternative to rule-based systems for pre-processing more complex variably formatted data such as addresses. Further work is required to improve the performance of this approach with simpler data such as names. Software which implements the methods described in this paper is freely available under an open source license for other researchers to use and improve.

    The theory of expanded, extended, and enhanced opportunities for youth physical activity promotion

    Background: Physical activity interventions targeting children and adolescents (≤18 years) often focus on complex intra- and inter-personal behavioral constructs, social-ecological frameworks, or some combination of both. Recently published meta-analytical reviews and large-scale randomized controlled trials have demonstrated that these intervention approaches have largely produced minimal or no improvements in young people's physical activity levels. Discussion: In this paper, we propose that the main reason for previous studies' limited effects is that fundamental mechanisms that lead to change in youth physical activity have often been overlooked or misunderstood. Evidence from observational and experimental studies is presented to support the development of a new theory positing that the primary mechanisms of change in many youth physical activity interventions are approaches that fall into one of the following three categories: (a) the expansion of opportunities for youth to be active by the inclusion of a new occasion to be active, (b) the extension of an existing physical activity opportunity by increasing the amount of time allocated for that opportunity, and/or (c) the enhancement of existing physical activity opportunities through strategies designed to increase physical activity above routine practice. Their application and considerations for intervention design and interpretation are presented. Summary: The utility of these mechanisms, referred to as the Theory of Expanded, Extended, and Enhanced Opportunities (TEO), is demonstrated in their parsimony, logical appeal, support with empirical evidence, and the direct and immediate application to numerous settings and contexts. The TEO offers a new way to understand youth physical activity behaviors and provides a common taxonomy by which interventionists can identify appropriate targets for interventions across different settings and contexts. We believe the formalization of the TEO concepts will propel them to the forefront in the design of future intervention studies and, through their use, lead to a greater impact on youth activity behaviors than what has been demonstrated in previous studies.

    Fc-Optimized Anti-CD25 Depletes Tumor-Infiltrating Regulatory T Cells and Synergizes with PD-1 Blockade to Eradicate Established Tumors

    CD25 is expressed at high levels on regulatory T (Treg) cells and was initially proposed as a target for cancer immunotherapy. However, anti-CD25 antibodies have displayed limited activity against established tumors. We demonstrated that CD25 expression is largely restricted to tumor-infiltrating Treg cells in mice and humans. While existing anti-CD25 antibodies were observed to deplete Treg cells in the periphery, upregulation of the inhibitory Fc gamma receptor (FcγR) IIb at the tumor site prevented intra-tumoral Treg cell depletion, which may underlie the lack of anti-tumor activity previously observed in pre-clinical models. Use of an anti-CD25 antibody with enhanced binding to activating FcγRs led to effective depletion of tumor-infiltrating Treg cells, increased effector to Treg cell ratios, and improved control of established tumors. Combination with anti-programmed cell death protein-1 antibodies promoted complete tumor rejection, demonstrating the relevance of CD25 as a therapeutic target and promising substrate for future combination approaches in immune-oncology

    Fc Effector Function Contributes to the Activity of Human Anti-CTLA-4 Antibodies.

    With the use of a mouse model expressing human Fc-gamma receptors (FcγRs), we demonstrated that antibodies with isotypes equivalent to ipilimumab and tremelimumab mediate intra-tumoral regulatory T (Treg) cell depletion in vivo, increasing the CD8+ to Treg cell ratio and promoting tumor rejection. Antibodies with improved FcγR binding profiles drove superior anti-tumor responses and survival. In patients with advanced melanoma, response to ipilimumab was associated with the CD16a-V158F high affinity polymorphism. Such activity only appeared relevant in the context of inflamed tumors, explaining the modest response rates observed in the clinical setting. Our data suggest that the activity of anti-CTLA-4 in inflamed tumors may be improved through enhancement of FcγR binding, whereas poorly infiltrated tumors will likely require combination approaches

    Allele-Specific HLA Loss and Immune Escape in Lung Cancer Evolution

    Immune evasion is a hallmark of cancer. Losing the ability to present neoantigens through human leukocyte antigen (HLA) loss may facilitate immune evasion. However, the polymorphic nature of the locus has precluded accurate HLA copy-number analysis. Here, we present loss of heterozygosity in human leukocyte antigen (LOHHLA), a computational tool to determine HLA allele-specific copy number from sequencing data. Using LOHHLA, we find that HLA LOH occurs in 40% of non-small-cell lung cancers (NSCLCs) and is associated with a high subclonal neoantigen burden, APOBEC-mediated mutagenesis, upregulation of cytolytic activity, and PD-L1 positivity. The focal nature of HLA LOH alterations, their subclonal frequencies, enrichment in metastatic sites, and occurrence as parallel events suggest that HLA LOH is an immune escape mechanism that is subject to strong microenvironmental selection pressures later in tumor evolution. Characterizing HLA LOH with LOHHLA refines neoantigen prediction and may have implications for our understanding of resistance mechanisms and immunotherapeutic approaches targeting neoantigens. Development of the bioinformatics tool LOHHLA allows precise measurement of allele-specific HLA copy number, improves the accuracy in neoantigen prediction, and uncovers insights into how immune escape contributes to tumor evolution in non-small-cell lung cancer.

    Phylogenetic ctDNA analysis depicts early-stage lung cancer evolution.

    The early detection of relapse following primary surgery for non-small-cell lung cancer and the characterization of emerging subclones, which seed metastatic sites, might offer new therapeutic approaches for limiting tumour recurrence. The ability to track the evolutionary dynamics of early-stage lung cancer non-invasively in circulating tumour DNA (ctDNA) has not yet been demonstrated. Here we use a tumour-specific phylogenetic approach to profile the ctDNA of the first 100 TRACERx (Tracking Non-Small-Cell Lung Cancer Evolution Through Therapy (Rx)) study participants, including one patient who was also recruited to the PEACE (Posthumous Evaluation of Advanced Cancer Environment) post-mortem study. We identify independent predictors of ctDNA release and analyse the tumour-volume detection limit. Through blinded profiling of postoperative plasma, we observe evidence of adjuvant chemotherapy resistance and identify patients who are very likely to experience recurrence of their lung cancer. Finally, we show that phylogenetic ctDNA profiling tracks the subclonal nature of lung cancer relapse and metastasis, providing a new approach for ctDNA-driven therapeutic studies

    Inverting the model of genomics data sharing with the NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space

    The NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space (AnVIL; https://anvilproject.org) was developed to address a widespread community need for a unified computing environment for genomics data storage, management, and analysis. In this perspective, we present AnVIL, describe its ecosystem and interoperability with other platforms, and highlight how this platform and associated initiatives contribute to improved genomic data sharing efforts. The AnVIL is a federated cloud platform designed to manage and store genomics and related data, enable population-scale analysis, and facilitate collaboration through the sharing of data, code, and analysis results. By inverting the traditional model of data sharing, the AnVIL eliminates the need for data movement while also adding security measures for active threat detection and monitoring and provides scalable, shared computing resources for any researcher. We describe the core data management and analysis components of the AnVIL, which currently consists of Terra, Gen3, Galaxy, RStudio/Bioconductor, Dockstore, and Jupyter, and describe several flagship genomics datasets available within the AnVIL. We continue to extend and innovate the AnVIL ecosystem by implementing new capabilities, including mechanisms for interoperability and responsible data sharing, while streamlining access management. The AnVIL opens many new opportunities for analysis, collaboration, and data sharing that are needed to drive research and to make discoveries through the joint analysis of hundreds of thousands to millions of genomes along with associated clinical and molecular data types

    Changes in symptomatology, reinfection, and transmissibility associated with the SARS-CoV-2 variant B.1.1.7: an ecological study

    Background: The SARS-CoV-2 variant B.1.1.7 was first identified in December, 2020, in England. We aimed to investigate whether increases in the proportion of infections with this variant are associated with differences in symptoms or disease course, reinfection rates, or transmissibility. Methods: We did an ecological study to examine the association between the regional proportion of infections with the SARS-CoV-2 B.1.1.7 variant and reported symptoms, disease course, rates of reinfection, and transmissibility. Data on types and duration of symptoms were obtained from longitudinal reports from users of the COVID Symptom Study app who reported a positive test for COVID-19 between Sept 28 and Dec 27, 2020 (during which the prevalence of B.1.1.7 increased most notably in parts of the UK). From this dataset, we also estimated the frequency of possible reinfection, defined as the presence of two reported positive tests separated by more than 90 days with a period of reporting no symptoms for more than 7 days before the second positive test. The proportion of SARS-CoV-2 infections with the B.1.1.7 variant across the UK was estimated with use of genomic data from the COVID-19 Genomics UK Consortium and data from Public Health England on spike-gene target failure (a non-specific indicator of the B.1.1.7 variant) in community cases in England. We used linear regression to examine the association between reported symptoms and proportion of B.1.1.7. We assessed the Spearman correlation between the proportion of B.1.1.7 cases and number of reinfections over time, and between the number of positive tests and reinfections. We estimated incidence for B.1.1.7 and previous variants, and compared the effective reproduction number, Rt, for the two incidence estimates. Findings: From Sept 28 to Dec 27, 2020, positive COVID-19 tests were reported by 36 920 COVID Symptom Study app users whose region was known and who reported as healthy on app sign-up. We found no changes in reported symptoms or disease duration associated with B.1.1.7. For the same period, possible reinfections were identified in 249 (0·7% [95% CI 0·6–0·8]) of 36 509 app users who reported a positive swab test before Oct 1, 2020, but there was no evidence that the frequency of reinfections was higher for the B.1.1.7 variant than for pre-existing variants. Reinfection occurrences were more positively correlated with the overall regional rise in cases (Spearman correlation 0·56–0·69 for South East, London, and East of England) than with the regional increase in the proportion of infections with the B.1.1.7 variant (Spearman correlation 0·38–0·56 in the same regions), suggesting B.1.1.7 does not substantially alter the risk of reinfection. We found a multiplicative increase in the Rt of B.1.1.7 by a factor of 1·35 (95% CI 1·02–1·69) relative to pre-existing variants. However, Rt fell below 1 during regional and national lockdowns, even in regions with high proportions of infections with the B.1.1.7 variant. Interpretation: The lack of change in symptoms identified in this study indicates that existing testing and surveillance infrastructure do not need to change specifically for the B.1.1.7 variant. In addition, given that there was no apparent increase in the reinfection rate, vaccines are likely to remain effective against the B.1.1.7 variant. Funding: Zoe Global, Department of Health (UK), Wellcome Trust, Engineering and Physical Sciences Research Council (UK), National Institute for Health Research (UK), Medical Research Council (UK), Alzheimer's Society.