35 research outputs found

    CEP: a conformational epitope prediction server

    The CEP server () provides a web interface to a conformational epitope prediction algorithm developed in-house. Besides conformational epitopes, the algorithm also predicts antigenic determinants and sequential epitopes. Epitopes are predicted from 3D structure data of protein antigens and can be visualized graphically. The algorithm takes a structure-based bioinformatics approach and uses the solvent accessibility of amino acids explicitly. Its accuracy was found to be 75% when evaluated using X-ray crystal structures of Ag–Ab complexes available in the PDB. This is the first and only method available for the prediction of conformational epitopes, and is an attempt to map the probable antibody-binding sites of protein antigens.

    Unsupervised correction of gene-independent cell responses to CRISPR-Cas9 targeting

    Background: Genome editing by CRISPR-Cas9 technology allows large-scale screening of gene essentiality in cancer. A confounding factor when interpreting CRISPR-Cas9 screens is the high false-positive rate in detecting essential genes within copy number amplified regions of the genome. We have developed the computational tool CRISPRcleanR which is capable of identifying and correcting gene-independent responses to CRISPR-Cas9 targeting. CRISPRcleanR uses an unsupervised approach based on the segmentation of single-guide RNA fold change values across the genome, without making any assumption about the copy number status of the targeted genes. Results: Applying our method to existing and newly generated genome-wide essentiality profiles from 15 cancer cell lines, we demonstrate that CRISPRcleanR reduces false positives when calling essential genes, correcting biases within and outside of amplified regions, while maintaining true positive rates. Established cancer dependencies and essentiality signals of amplified cancer driver genes are detectable post-correction. CRISPRcleanR reports sgRNA fold changes and normalised read counts, is therefore compatible with downstream analysis tools, and works with multiple sgRNA libraries. Conclusions: CRISPRcleanR is a versatile open-source tool for the analysis of CRISPR-Cas9 knockout screens to identify essential genes
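
The segmentation-and-correction idea in this abstract can be illustrated with a toy sketch. It is not CRISPRcleanR's implementation: the recursive binary segmentation, the function names `segment` and `correct`, the thresholds `min_size`, `min_gain` and `min_genes`, and the choice to shift qualifying segments to zero mean are all assumptions made for illustration, and the input values are invented.

```python
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def segment(values, min_size=3, min_gain=1.0, offset=0):
    """Recursively split position-sorted fold changes at the point that most
    reduces the within-segment squared error.

    Returns half-open (start, end) index ranges. Thresholds are illustrative.
    """
    n = len(values)
    best_gain, best_split = 0.0, None
    if n >= 2 * min_size:
        total = sse(values)
        for k in range(min_size, n - min_size + 1):
            gain = total - sse(values[:k]) - sse(values[k:])
            if gain > best_gain:
                best_gain, best_split = gain, k
    if best_split is None or best_gain < min_gain:
        return [(offset, offset + n)]
    left = segment(values[:best_split], min_size, min_gain, offset)
    right = segment(values[best_split:], min_size, min_gain, offset + best_split)
    return left + right


def correct(values, genes, min_genes=3, **segment_options):
    """Shift to zero mean every segment that spans at least `min_genes`
    distinct genes (assumed here to mark a gene-independent effect)."""
    corrected = list(values)
    for start, end in segment(values, **segment_options):
        if len(set(genes[start:end])) >= min_genes:
            seg_mean = mean(values[start:end])
            for i in range(start, end):
                corrected[i] = values[i] - seg_mean
    return corrected


# Hypothetical log fold changes for 12 sgRNAs sorted by genomic position.
log_fc = [0.0, 0.1, -0.1, 0.0, -2.0, -2.1, -1.9, -2.0, 0.0, 0.1, -0.1, 0.0]
genes = ["A", "A", "B", "B", "C", "D", "E", "F", "G", "G", "H", "H"]

segments = segment(log_fc)
corrected = correct(log_fc, genes)
```

In this toy input, the middle run of depleted guides covers four different genes, so it is treated as a regional effect and re-centred, while the flanking segments, which cover only two genes each, are left unchanged. Note that no copy number information is used at any point.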

    Genomic evolution of breast cancer metastasis and relapse

    A.G.L. and J.H.R.F. were supported by a Cancer Research UK Program Grant to Simon Tavaré (C14303/A17197). Patterns of genomic evolution between primary and metastatic breast cancer have not been studied in large numbers, despite patients with metastatic breast cancer having dismal survival. We sequenced whole genomes or a panel of 365 genes on 299 samples from 170 patients with locally relapsed or metastatic breast cancer. Several lines of analysis indicate that clones seeding metastasis or relapse disseminate late from primary tumors, but continue to acquire mutations, mostly accessing the same mutational processes active in the primary tumor. Most distant metastases acquired driver mutations not seen in the primary tumor, drawing from a wider repertoire of cancer genes than early drivers. These include a number of clinically actionable alterations and mutations inactivating SWI-SNF and JAK2-STAT3 pathways.

    Curation of viral genomes: challenges, applications and the way forward

    BACKGROUND: Whole genome sequence data is a step towards generating the 'parts list' of life to understand the underlying principles of biocomplexity. Genome sequencing initiatives for human and model organisms are targeted efforts towards understanding principles of evolution, with an envisaged application in improving human health. These efforts culminated in the development of dedicated resources. By contrast, a large number of viral genomes have been sequenced by groups or individuals interested in studying antigenic variation amongst strains and species. These independent efforts enabled viruses to attain the status of 'best-represented taxa' with the highest number of genomes. However, due to a lack of concerted efforts, viral genomic sequences merely remained as entries in the public repositories until recently. RESULTS: VirGen is a curated resource of viral genomes and their analyses. Since its first release, it has grown both in coverage of viral families and in the development of new modules for annotation and analysis. The current release (2.0) includes data for twenty-five families with broad host range, as against eight in the first release. The taxonomic description of viruses in VirGen is in accordance with the ICTV nomenclature. A well-characterised strain is identified as a 'representative entry' for every viral species. This non-redundant dataset is used for subsequent annotation and analyses using sequence-based bioinformatics approaches. VirGen archives precomputed data on genome and proteome comparisons. A new data module that provides structures of viral proteins available in PDB has been incorporated recently. One of the unique features of VirGen is the prediction of conformational and sequential epitopes of known antigenic proteins using algorithms developed in-house, a step towards reverse vaccinology. CONCLUSION: Structured organization of genomic data facilitates the use of data mining tools, which provides opportunities for knowledge discovery. One of the approaches to achieve this goal is to carry out functional annotations using comparative genomics. VirGen, a comprehensive viral genome resource that serves as an annotation and analysis pipeline, has been developed for the curation of public domain viral genome data. Various steps in the curation and annotation of the genomic data, and applications of the value-added derived data, are substantiated with case studies.

    Characterizing Mutational Signatures in Human Cancer Cell Lines Reveals Episodic APOBEC Mutagenesis.

    Multiple signatures of somatic mutations have been identified in cancer genomes. Exome sequences of 1,001 human cancer cell lines and 577 xenografts revealed most common mutational signatures, indicating past activity of the underlying processes, usually in appropriate cancer types. To investigate ongoing patterns of mutational-signature generation, cell lines were cultured for extended periods and subsequently DNA sequenced. Signatures of discontinued exposures, including tobacco smoke and ultraviolet light, were not generated in vitro. Signatures of normal and defective DNA repair and replication continued to be generated at roughly stable mutation rates. Signatures of APOBEC cytidine deaminase DNA-editing exhibited substantial fluctuations in mutation rate over time with episodic bursts of mutations. The initiating factors for the bursts are unclear, although retrotransposon mobilization may contribute. The examined cell lines constitute a resource of live experimental models of mutational processes, which potentially retain patterns of activity and regulation operative in primary human cancers. This work was supported by Wellcome grants 098051 and 206194; Cancer Research UK Grand Challenge Award C98/A24032 to L.B.A. and B.O.; the Li Ka Shing Foundation and National Institute for Health Research Oxford Biomedical Research Centre to D.C.W.; ED481A-2016/151 from Xunta de Galicia to B.R.–M

    Retrospective evaluation of whole exome and genome mutation calls in 746 cancer samples

    Funder: NCI U24CA211006. Abstract: The Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC) curated consensus somatic mutation calls using whole exome sequencing (WES) and whole genome sequencing (WGS), respectively. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2,658 cancers across 38 tumour types, we compare WES and WGS side-by-side from 746 TCGA samples, finding that ~80% of mutations overlap in covered exonic regions. We estimate that low variant allele fraction (VAF < 15%) and clonal heterogeneity contribute up to 68% of private WGS mutations and 71% of private WES mutations. We observe that ~30% of private WGS mutations trace to mutations identified by a single variant caller in WES consensus efforts. WGS captures both ~50% more variation in exonic regions and un-observed mutations in loci with variable GC-content. Together, our analysis highlights technological divergences between two reproducible somatic variant detection efforts.
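
The kind of side-by-side comparison described in this abstract can be sketched with plain set operations. This is a minimal illustration, not the PCAWG analysis: the `Variant` key, the function names, the use of the union of both call sets as the overlap denominator, and the toy calls are all assumptions, and both inputs are assumed to be already restricted to exonic regions covered by both assays.

```python
from typing import Dict, NamedTuple


class Variant(NamedTuple):
    """Hypothetical key identifying a somatic mutation call."""
    chrom: str
    pos: int
    ref: str
    alt: str


def compare_calls(wes: Dict[Variant, float], wgs: Dict[Variant, float]) -> dict:
    """Split two call sets (variant -> VAF) into shared and private calls."""
    shared = wes.keys() & wgs.keys()
    union = wes.keys() | wgs.keys()
    return {
        "shared": shared,
        "private_wes": wes.keys() - wgs.keys(),
        "private_wgs": wgs.keys() - wes.keys(),
        # Assumed definition: shared calls as a fraction of all distinct calls.
        "overlap": len(shared) / len(union) if union else 0.0,
    }


def low_vaf_fraction(calls: Dict[Variant, float], private, threshold: float = 0.15) -> float:
    """Fraction of private calls whose VAF falls below the threshold."""
    if not private:
        return 0.0
    return sum(calls[v] < threshold for v in private) / len(private)


# Invented toy calls for a single sample.
wes_calls = {
    Variant("chr1", 100, "A", "T"): 0.40,
    Variant("chr1", 250, "G", "C"): 0.33,
    Variant("chr2", 75, "C", "T"): 0.08,
}
wgs_calls = {
    Variant("chr1", 100, "A", "T"): 0.38,
    Variant("chr1", 250, "G", "C"): 0.35,
    Variant("chr3", 900, "T", "G"): 0.10,
}

result = compare_calls(wes_calls, wgs_calls)
private_wgs_low_vaf = low_vaf_fraction(wgs_calls, result["private_wgs"])
```

The 0.15 default mirrors the VAF < 15% cut-off quoted in the abstract; everything else about how a private call is attributed to low VAF is a simplification.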

    VirGen: a comprehensive viral genome resource

    VirGen is a comprehensive viral genome resource that organizes the ‘sequence space’ of viral genomes in a structured fashion. It has been developed with the objective of serving as an annotated and curated database comprising complete genome sequences of viruses, value-added derived data and data mining tools. The current release (v1.1) contains 559 complete genomes in addition to 287 putative genomes of viruses belonging to eight viral families for which the host range includes animals and plants. Viral genomes in VirGen are annotated using sequence-based Bioinformatics approaches. The genomic data is also curated to identify ‘alternate names’ of viral proteins, where available. VirGen archives the results of comparisons of genomes, proteomes and individual proteins within and between viral species. It is the first resource to provide phylogenetic trees of viral species computed using whole-genome sequence data. The module of predicted B-cell antigenic determinants in VirGen is an attempt to link the genome to its vaccinome. Comparative genome analysis data facilitate the study of genome organization and evolution of viruses, which would have implications in applied research to identify candidates for the design of vaccines and antiviral drugs. VirGen is a relational database and is available at http://bioinfo.ernet.in/virgen/virgen.html

    Resequencing the susceptibility gene, ITGAM, identifies two functionally deleterious rare variants in systemic lupus erythematosus cases

    INTRODUCTION: The majority of the genetic variance of systemic lupus erythematosus (SLE) remains unexplained by the common disease-common variant hypothesis. Rare variants, which are not detectable by genome-wide association studies because of their low frequencies, are predicted to explain part of this "missing heritability." However, recent studies identifying rare variants within known disease-susceptibility loci have failed to show genetic associations because of their extremely low frequencies, leading to the questioning of the contribution of rare variants to disease susceptibility. A common (minor allele frequency = 17.4% in cases) nonsynonymous coding variant rs1143679 (R77H) in ITGAM (CD11b), which forms half of the heterodimeric integrin receptor, complement receptor 3 (CR3), is robustly associated with SLE and has been shown to impair CR3-mediated phagocytosis. METHODS: We resequenced ITGAM in 73 SLE cases and identified two previously unidentified, case-specific nonsynonymous variants, F941V and G1145S. Both variants were genotyped in 2,107 and 949 additional SLE cases, respectively, to estimate their frequencies in a disease population. An in vitro model was used to assess the impact of F941V and G1145S, together with two nonsynonymous ITGAM polymorphisms, A858V (rs1143683) and M441T (rs11861251), on CR3-mediated phagocytosis. A paired two-tailed t test was used to compare the phagocytic capabilities of each variant with that of wild-type CR3. RESULTS: Both rare variants, F941V and G1145S, significantly impair CR3-mediated phagocytosis in an in vitro model (61% reduction, P = 0.006; 26% reduction, P = 0.0232). However, neither of the common variants, M441T and A858V, had an effect on phagocytosis. Neither rare variant was observed again in the genotyping of additional SLE cases, suggesting that their frequencies are extremely low. CONCLUSIONS: Our results add further evidence to the functional importance of ITGAM in SLE pathogenesis through impaired phagocytosis. Additionally, this study provides a new example of the identification of rare variants in common-allele-associated loci, which, because of their extremely low frequencies, are not statistically associated. However, the demonstration of their functional effects adds support to their contribution to disease risk, and questions the current notion of dismissing the contribution of very rare variants on purely statistical analyses.
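
The paired comparison named in the METHODS of this abstract can be sketched as follows. This only shows how a paired t statistic is formed from matched measurements: the function name `paired_t` and the phagocytosis readings are invented for illustration and are not data from the study.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(wild_type, variant):
    """Return (t statistic, degrees of freedom) for matched measurements.

    Each position in the two lists is assumed to come from the same
    experimental replicate.
    """
    if len(wild_type) != len(variant) or len(wild_type) < 2:
        raise ValueError("need two equally long lists with at least 2 pairs")
    diffs = [w - v for w, v in zip(wild_type, variant)]
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    if sd == 0:
        raise ValueError("all paired differences are identical")
    t_stat = mean(diffs) / (sd / sqrt(len(diffs)))
    return t_stat, len(diffs) - 1


# Invented phagocytosis readings from four matched replicates.
wild_type_cr3 = [10.0, 12.0, 11.0, 13.0]
variant_cr3 = [6.0, 7.0, 7.0, 8.0]

t_stat, dof = paired_t(wild_type_cr3, variant_cr3)
```

A two-tailed P value would then be read from Student's t distribution with `dof` degrees of freedom, for example through a statistics library; that step is left out here to keep the sketch dependency-free.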