37 research outputs found
GenomeChronicler: The Personal Genome Project UK Genomic Report Generator Pipeline
In recent years, there has been a significant increase in whole genome sequencing data of individual genomes produced by research projects as well as direct to consumer service providers. While many of these sources provide their users with an interpretation of the data, there is a lack of free, open tools for generating reports exploring the data in an easy to understand manner. GenomeChronicler was developed as part of the Personal Genome Project UK (PGP-UK) to address this need. PGP-UK provides genomic, transcriptomic, epigenomic and self-reported phenotypic data under an open-access model with full ethical approval. As a result, the reports generated by GenomeChronicler are intended for research purposes only and include information relating to potentially beneficial and potentially harmful variants, but without clinical curation. GenomeChronicler can be used with data from whole genome or whole exome sequencing, producing a genome report containing information on variant statistics, ancestry and known associated phenotypic traits. Example reports are available from the PGP-UK data page (personalgenomes.org.uk/data). The objective of this method is to leverage existing resources to find known phenotypes associated with the genotypes detected in each sample. The provided trait data is based primarily upon information available in SNPedia, but also collates data from ClinVar, GETevidence, and gnomAD to provide additional details on potential health implications, presence of genotype in other PGP participants and population frequency of each genotype. The analysis can be run in a self-contained environment without requiring internet access, making it a good choice for cases where privacy is essential or desired: any third party project can embed GenomeChronicler within their off-line safe-haven environments. GenomeChronicler can be run for one sample at a time, or in parallel making use of the Nextflow workflow manager. The source code is available from GitHub (https://github.com/PGP-UK/GenomeChronicler), container recipes are available for Docker and Singularity, as well as a pre-built container from SingularityHub (https://singularity-hub.org/collections/3664) enabling easy deployment in a variety of settings. Users without access to computational resources to run GenomeChronicler can access the software from the Lifebit CloudOS platform (https://lifebit.ai/cloudos) enabling the production of reports and variant calls from raw sequencing data in a scalable fashion
Pheno4J: a gene to phenotype graph database
Efficient storage and querying of large amounts of genetic and phenotypic data is crucial to contemporary clinical genetic research. This introduces computational challenges for classical relational databases, due to the sparsity and sheer volume of the data. Our Java based solution loads annotated genetic variants and well phenotyped patients into a graph database to allow fast efficient storage and querying of large volumes of structured genetic and phenotypic data. This abstracts technical problems away and lets researchers focus on the science rather than the implementation. We have also developed an accompanying webserver with end-points to facilitate querying of the database
Donor whole blood DNA methylation is not a strong predictor of acute graft versus host disease in unrelated donor allogeneic haematopoietic cell transplantation
Allogeneic hematopoietic cell transplantation (HCT) is used to treat many blood-based disorders and malignancies. While this is an effective treatment, it can result in serious adverse events, such as the development of acute graft-versus-host disease (aGVHD). This study aimed to develop a donor-specific epigenetic classifier that could be used in donor selection in HCT to reduce the incidence of aGVHD.
The discovery cohort of the study consisted of 288 donors from a population receiving HLA-A, -B, -C and -DRB1 matched unrelated donor HCT with T cell replete peripheral blood stem cell grafts for treatment of acute leukaemia or myelodysplastic syndromes after myeloablative conditioning. Donors were selected based on recipient aGVHD outcome; this cohort consisted of 144 cases with aGVHD grades III-IV and 144 controls with no aGVHD that survived at least 100 days post-HCT matched for sex, age, disease and GVHD prophylaxis.
Genome-wide DNA methylation was assessed using the Infinium Methylation EPIC BeadChip (Illumina), measuring CpG methylation at >850,000 sites across the genome. Following quality control, pre-processing and exploratory analyses, we applied a machine learning algorithm (Random Forest) to identify CpG sites predictive of aGVHD. Receiver operating characteristic (ROC) curve analysis of these sites resulted in a classifier with an encouraging area under the ROC curve (AUC) of 0.91.
To test this classifier, we used an independent validation cohort (n=288) selected using the same criteria as the discovery cohort. Different attempts to validate the classifier using the independent validation cohort failed with the AUC falling to 0.51. These results indicate that donor DNA methylation may not be a suitable predictor of aGVHD in an HCT setting involving unrelated donors, despite the initial promising results in the discovery cohort.
Our work highlights the importance of independent validation of machine learning classifiers, particularly when developing classifiers intended for clinical use
The Personal Genome Project-UK, an open access resource of human multi-omics data
Integrative analysis of multi-omics data is a powerful approach for gaining functional insights into biological and medical processes. Conducting these multifaceted analyses on human samples is often complicated by the fact that the raw sequencing output is rarely available under open access. The Personal Genome Project UK (PGP-UK) is one of few resources that recruits its participants under open consent and makes the resulting multi-omics data freely and openly available. As part of this resource, we describe the PGP-UK multi-omics reference panel consisting of ten genomic, methylomic and transcriptomic data. Specifically, we outline the data processing, quality control and validation procedures which were implemented to ensure data integrity and exclude sample mix-ups. In addition, we provide a REST API to facilitate the download of the entire PGP-UK dataset. The data are also available from two cloud-based environments, providing platforms for free integrated analysis. In conclusion, the genotype-validated PGP-UK multi-omics human reference panel described here provides a valuable new open access resource for integrated analyses in support of personal and medical genomics
Sequenceserver: A Modern Graphical User Interface for Custom BLAST Databases
Comparing newly obtained and previously known nucleotide and amino-acid sequences underpins modern biological research. BLAST is a well-established tool for such comparisons but is challenging to use on new data sets. We combined a user-centric design philosophy with sustainable software development approaches to create Sequenceserver, a tool for running BLAST and visually inspecting BLAST results for biological interpretation. Sequenceserver uses simple algorithms to prevent potential analysis errors and provides flexible text-based and visual outputs to support researcher productivity. Our software can be rapidly installed for use by individuals or on shared servers
MET is required for the recruitment of anti-tumoural neutrophils
Mutations or amplification of the MET proto-oncogene are involved in the pathogenesis of several tumours, which rely on the constitutive engagement of this pathway for their growth and survival. However, MET is expressed not only by cancer cells but also by tumour-associated stromal cells, although its precise role in this compartment is not well characterized. Here we show that MET is required for neutrophil chemoattraction and cytotoxicity in response to its ligand hepatocyte growth factor (HGF). Met deletion in mouse neutrophils enhances tumour growth and metastasis. This phenotype correlates with reduced neutrophil infiltration to both the primary tumour and metastatic sites. Similarly, Met is necessary for neutrophil transudation during colitis, skin rash or peritonitis. Mechanistically, Met is induced by tumour-derived tumour necrosis factor (TNF)-α or other inflammatory stimuli in both mouse and human neutrophils. This induction is instrumental for neutrophil transmigration across an activated endothelium and for inducible nitric oxide synthase production upon HGF stimulation. Consequently, HGF/MET-dependent nitric oxide release by neutrophils promotes cancer cell killing, which abates tumour growth and metastasis. After systemic administration of a MET kinase inhibitor, we prove that the therapeutic benefit of MET targeting in cancer cells is partly countered by the pro-tumoural effect arising from MET blockade in neutrophils. Our work identifies an unprecedented role of MET in neutrophils, suggests a potential ‘Achilles’ heel’ of MET-targeted therapies in cancer, and supports the rationale for evaluating anti-MET drugs in certain inflammatory diseases