187 research outputs found

    An efficient record linkage scheme using graphical analysis for identifier error detection

    Integration of information on individuals (record linkage) is a key problem in healthcare delivery, epidemiology, and "business intelligence" applications. It is now common to be required to link very large numbers of records, often containing various combinations of theoretically unique identifiers, such as NHS numbers, which are both incomplete and error-prone

    The genomic basis of tumor regression in Tasmanian devils (Sarcophilus harrisii)

    Understanding the genetic basis of disease-related phenotypes, such as cancer susceptibility, is crucial for the advancement of personalized medicine. Although most cancers are somatic in origin, a small number of transmissible cancers have been documented. Two such cancers have emerged in the Tasmanian devil (Sarcophilus harrisii) and now threaten the species with extinction. Recently, cases of natural tumor regression in Tasmanian devils infected with the clonally contagious cancer have been detected. We used whole-genome sequencing and FST-based approaches to identify the genetic basis of tumor regression by comparing the genomes of seven individuals that underwent tumor regression with those of three infected individuals that did not. We found three highly differentiated candidate genomic regions containing several genes related to immune response and/or cancer risk, indicating that the genomic basis of tumor regression was polygenic. Within these genomic regions, we identified putative regulatory variation in candidate genes but no nonsynonymous variation, suggesting that natural tumor regression may be driven, at least in part, by differential host expression of key loci. Comparative oncology can provide insight into the genetic basis of cancer risk, tumor development, and the pathogenicity of cancer, particularly due to our limited ability to monitor natural, untreated tumor progression in human patients. Our results support the hypothesis that host immune response is necessary for triggering tumor regression, providing candidate genes that may translate to novel treatments in human and nonhuman cancers

    The parallel development of ODD and CD symptoms from early childhood to adolescence

    This study examined the developmental relations between symptoms of oppositional defiant disorder (ODD) and conduct disorder (CD) from early childhood to adolescence. Specifically we tested, according to parent-reported problems, whether symptoms of ODD precede the development of CD symptoms, whether ODD and CD symptoms are reciprocally associated across time, or whether ODD and CD symptoms develop parallel to each other across time. Participants were a community-based sample (at time 1: N = 485, 48% boys) assessed biannually five times from age 4 to 6 until age 12-14. The findings suggested that, with control for stability effects, baseline SES, and symptoms of attention deficit hyperactivity disorder, ODD and CD symptoms develop parallel to each other. No gender differences were obtained. We conclude that without the initial presence of CD symptoms, ODD symptoms are not developmental precursors to CD symptoms

    Genetic screening of Fabry patients with EcoTILLING and HRM technology

    Background: Anderson-Fabry disease (FD) is caused by a deficit of the α-galactosidase A enzyme, which leads to the accumulation of complex sphingolipids, especially globotriaosylceramide (Gb3), in all the cells of the body, causing the onset of a multi-systemic disease with poor prognosis in adulthood. In this article, we describe two alternative methods for screening the GLA gene, which codes for the α-galactosidase A enzyme, in subjects with probable FD in order to test analysis strategies which include or rely on initial pre-screening. Findings: We analyzed 740 samples using EcoTILLING, comparing two mismatch-specific endonucleases, CEL I and ENDO-1, while conducting a parallel screening of the same samples using HRM (High Resolution Melting). Afterwards, all samples were subjected to direct sequencing. Overall, we identified 12 different genetic variations: -10C>T, -12G>A, -30G>A, IVS2-76_80del5, D165H, C172Y, IVS4+16A>G, IVS4+68A>G, c.718_719delAA, D313Y, IVS6-22C>T, G395A. This was consistent with the high genetic heterogeneity found in FD patients and carriers. All of the mutations were detected by HRM, whereas 17% of the mutations were not found by EcoTILLING. The results obtained by EcoTILLING comparing the CEL I and ENDO-1 endonucleases were perfectly overlapping. Conclusion: On the basis of its simplicity, flexibility, repeatability, and sensitivity, we believe that HRM analysis of the GLA gene is a reliable presequencing screening tool. This method can be applied to any genomic feature to identify known and unknown genetic alterations, and it is ideal for conducting screening and population studies

    Evaluating privacy-preserving record linkage using cryptographic long-term keys and multibit trees on large medical datasets.

    Background: Integrating medical data using databases from different sources by record linkage is a powerful technique increasingly used in medical research. Under many jurisdictions, unique personal identifiers needed for linking the records are unavailable. Since sensitive attributes, such as names, have to be used instead, privacy regulations usually demand encrypting these identifiers. The corresponding set of techniques for privacy-preserving record linkage (PPRL) has received widespread attention. One recent method is based on Bloom filters. Due to superior resilience against cryptographic attacks, composite Bloom filters (cryptographic long-term keys, CLKs) are considered best practice for privacy in PPRL. Real-world performance of these techniques using large-scale data is unknown up to now. Methods: Using a large subset of Australian hospital admission data, we tested the performance of an innovative PPRL technique (CLKs using multibit trees) against a gold-standard derived from clear-text probabilistic record linkage. Linkage time and linkage quality (recall, precision and F-measure) were evaluated. Results: Clear text probabilistic linkage resulted in marginally higher precision and recall than CLKs. PPRL required more computing time but 5 million records could still be de-duplicated within one day. However, the PPRL approach required fine tuning of parameters. Conclusions: We argue that increased privacy of PPRL comes with the price of small losses in precision and recall and a large increase in computational burden and setup time. These costs seem to be acceptable in most applied settings, but they have to be considered in the decision to apply PPRL. Further research on the optimal automatic choice of parameters is needed
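
The Bloom-filter encoding and the quality measures named in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the filter size, the number of hash functions, the HMAC-based double hashing, the per-field salt, and the function names `clk_encode`, `dice`, and `linkage_quality` are all assumptions made for the example, and the multibit-tree search used in the study is not shown.

```python
import hashlib
import hmac


def bigrams(value):
    """Split a padded, lower-cased string into its set of 2-grams."""
    padded = f"_{value.strip().lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def clk_encode(fields, secret, size=1000, num_hashes=20):
    """Hash the bigrams of every field into one shared Bloom filter.

    Returns the set of bit positions that are set to 1.
    size and num_hashes are illustrative values, not tuned parameters.
    """
    bits = set()
    for idx, field in enumerate(fields):
        for gram in bigrams(field):
            # Assumption: salt each bigram with its field position.
            data = f"{idx}:{gram}".encode("utf-8")
            h1 = int.from_bytes(hmac.new(secret, data, hashlib.sha1).digest(), "big")
            h2 = int.from_bytes(hmac.new(secret, data, hashlib.sha256).digest(), "big")
            # Double hashing: derive num_hashes positions from two keyed hashes.
            for i in range(num_hashes):
                bits.add((h1 + i * h2) % size)
    return bits


def dice(a, b):
    """Dice coefficient between two sets of set-bit positions."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def linkage_quality(predicted, gold):
    """Precision, recall and F-measure of predicted record pairs against a gold standard."""
    true_pos = len(predicted & gold)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure
```

Two records would be treated as a candidate match when `dice` of their encodings exceeds a chosen threshold. That threshold is among the parameters the abstract says needed fine tuning.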

    Discovery of nucleotide polymorphisms in the Musa gene pool by Ecotilling

    Musa (banana and plantain) is an important genus for the global export market and in local markets where it provides staple food for approximately 400 million people. Hybridization and polyploidization of several (sub)species, combined with vegetative propagation and human selection, have produced a complex genetic history. We describe the application of the Ecotilling method for the discovery and characterization of nucleotide polymorphisms in diploid and polyploid accessions of Musa. We discovered over 800 novel alleles in 80 accessions. Sequencing and band evaluation show Ecotilling to be a robust and accurate platform for the discovery of polymorphisms in homologous and homeologous gene targets. In the process of validating the method, we identified two single nucleotide polymorphisms that may be deleterious for the function of a gene putatively important for phototropism. Evaluation of heterozygous polymorphism and haplotype blocks revealed a high level of nucleotide diversity in Musa accessions. We further applied a strategy for the simultaneous discovery of heterozygous and homozygous polymorphisms in diploid accessions to rapidly evaluate nucleotide diversity in accessions of the same genome type. This strategy can be used to develop hypotheses for inheritance patterns of nucleotide polymorphisms within and between genome types. We conclude that Ecotilling is suitable for diversity studies in Musa, and that it can be considered for functional genomics studies and as a tool in selecting germplasm for traditional and mutation breeding approaches

    Utilisation of an operative difficulty grading scale for laparoscopic cholecystectomy

    Background: A reliable system for grading operative difficulty of laparoscopic cholecystectomy would standardise description of findings and reporting of outcomes. The aim of this study was to validate a difficulty grading system (Nassar scale), testing its applicability and consistency in two large prospective datasets. Methods: Patient and disease-related variables and 30-day outcomes were identified in two prospective cholecystectomy databases: the multi-centre prospective cohort of 8820 patients from the recent CholeS Study and the single-surgeon series containing 4089 patients. Operative data and patient outcomes were correlated with the Nassar operative difficulty scale, using Kendall’s tau for dichotomous variables, or Jonckheere–Terpstra tests for continuous variables. A ROC curve analysis was performed to quantify the predictive accuracy of the scale for each outcome, with continuous outcomes dichotomised prior to analysis. Results: A higher operative difficulty grade was consistently associated with worse outcomes for the patients in both the reference and CholeS cohorts. The median length of stay increased from 0 to 4 days, and the 30-day complication rate from 7.6 to 24.4% as the difficulty grade increased from 1 to 4/5 (both p < 0.001). In the CholeS cohort, a higher difficulty grade was found to be most strongly associated with conversion to open and 30-day mortality (AUROC = 0.903, 0.822, respectively). On multivariable analysis, the Nassar operative difficulty scale was found to be a significant independent predictor of operative duration, conversion to open surgery, 30-day complications and 30-day reintervention (all p < 0.001). Conclusion: We have shown that an operative difficulty scale can standardise the description of operative findings by multiple grades of surgeons to facilitate audit, training assessment and research. It provides a tool for reporting operative findings, disease severity and technical difficulty and can be utilised in future research to reliably compare outcomes according to case mix and intra-operative difficulty
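
As a rough illustration of the ROC analysis mentioned in the abstract above, the area under the ROC curve for an ordinal grade against a binary outcome can be computed as the probability that a randomly chosen patient with the outcome has a higher grade than one without it, counting ties as one half. The function name `auroc` and the toy inputs in the usage note are hypothetical; the study's own statistical software and data are not reproduced here.

```python
def auroc(grades, outcomes):
    """Area under the ROC curve via pairwise comparison (ties count 0.5).

    grades   -- ordinal scores, e.g. an operative difficulty grade per patient
    outcomes -- truthy where the binary outcome occurred, falsy otherwise
    """
    positives = [g for g, o in zip(grades, outcomes) if o]
    negatives = [g for g, o in zip(grades, outcomes) if not o]
    if not positives or not negatives:
        raise ValueError("need at least one case with and one without the outcome")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

With made-up inputs, `auroc([1, 2, 3, 4], [0, 0, 1, 1])` gives perfect separation, while `auroc([1, 1, 2, 2], [0, 1, 0, 1])` gives chance-level discrimination. The pairwise loop is quadratic, so a rank-based formulation would be preferable for cohorts of several thousand patients.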

    A new mutant genetic resource for tomato crop improvement by TILLING technology

    Background: In the last decade, the availability of gene sequences of many plant species, including tomato, has encouraged the development of strategies that do not rely on genetic transformation techniques (GMOs) for imparting desired traits in crops. One of these emerging technologies is TILLING (Targeting Induced Local Lesions In Genomes), a reverse genetics tool, which is proving to be very valuable in creating new traits in different crop species. Results: To apply TILLING to tomato, a new mutant collection was generated in the genetic background of the processing tomato cultivar Red Setter by treating seeds with two different ethyl methanesulfonate (EMS) doses (0.7% and 1%). An associated phenotype database, LycoTILL, was developed and a TILLING platform was also established. The interactive and evolving database is available online to the community for phenotypic alteration inquiries. To validate the Red Setter TILLING platform, induced point mutations were searched for in 7 tomato genes with the mismatch-specific ENDO1 nuclease. In total, 9.5 kb of the tomato genome were screened and 66 nucleotide substitutions were identified. The overall mutation density was estimated at 1/322 kb and 1/574 kb for the 1% EMS and 0.7% EMS treatments, respectively. Conclusions: The mutation density estimated in our collection and its comparison with other TILLING populations demonstrate that the Red Setter genetic resource is suitable for use in high-throughput mutation discovery. The Red Setter TILLING platform is open to the research community and is publicly available via the web for requesting mutation screening services

    High-Throughput Detection of Induced Mutations and Natural Variation Using KeyPoint™ Technology

    Reverse genetics approaches rely on the detection of sequence alterations in target genes to identify allelic variants among mutant or natural populations. Current (pre-) screening methods such as TILLING and EcoTILLING are based on the detection of single base mismatches in heteroduplexes using endonucleases such as CEL 1. However, there are drawbacks in the use of endonucleases due to their relatively poor cleavage efficiency and exonuclease activity. Moreover, pre-screening methods do not reveal information about the nature of sequence changes and their possible impact on gene function. We present KeyPoint™ technology, a high-throughput mutation/polymorphism discovery technique based on massive parallel sequencing of target genes amplified from mutant or natural populations. KeyPoint combines multi-dimensional pooling of large numbers of individual DNA samples and the use of sample identification tags (“sample barcoding”) with next-generation sequencing technology. We show the power of KeyPoint by identifying two mutants in the tomato eIF4E gene based on screening more than 3000 M2 families in a single GS FLX sequencing run, and discovery of six haplotypes of tomato eIF4E gene by re-sequencing three amplicons in a subset of 92 tomato lines from the EU-SOL core collection. We propose KeyPoint technology as a broadly applicable amplicon sequencing approach to screen mutant populations or germplasm collections for identification of (novel) allelic variation in a high-throughput fashion
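
The multi-dimensional pooling idea in the abstract above can be illustrated with a generic sketch: each sample sits at one coordinate of a grid, one pool is built per coordinate value along each axis, and the set of pools in which a variant is seen narrows down the carrier. The grid layout and the function names `pool_coordinates` and `decode_positive_pools` are assumptions for illustration only; the actual KeyPoint pooling design and barcoding scheme are not described in enough detail here to reproduce.

```python
from itertools import product


def pool_coordinates(sample_index, dims):
    """Map a sample index to its pool coordinate along each grid dimension."""
    coords = []
    for size in reversed(dims):
        sample_index, coord = divmod(sample_index, size)
        coords.append(coord)
    return tuple(reversed(coords))


def decode_positive_pools(positive, dims, n_samples):
    """List sample indices consistent with the pools where a variant was seen.

    positive -- one set of positive pool indices per dimension
    """
    candidates = []
    for coords in product(*[sorted(p) for p in positive]):
        index = 0
        for coord, size in zip(coords, dims):
            index = index * size + coord
        if index < n_samples:
            candidates.append(index)
    return candidates
```

A single carrier is identified uniquely from one positive pool per dimension. When two or more carriers share a variant, the positive pools intersect at additional grid points, so the candidate list contains false positives that would need to be resolved by follow-up sequencing of the individual samples.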