234 research outputs found

    African-specific molecular taxonomy of prostate cancer

    Prostate cancer is characterized by considerable geo-ethnic disparity. African ancestry is a significant risk factor, with mortality rates across sub-Saharan Africa 2.7-fold higher than global averages [1]. The contributing genetic and non-genetic factors, and associated mutational processes, are unknown [2,3]. Here, through whole-genome sequencing of treatment-naive prostate cancer samples from 183 ancestrally (African versus European) and globally distinct patients, we generate a large cancer genomics resource for sub-Saharan Africa, identifying around 2 million somatic variants. Significant African-ancestry-specific findings include an elevated tumour mutational burden, increased percentage of genome alteration, a greater number of predicted damaging mutations and a higher total of mutational signatures, and the driver genes NCOA2, STK19, DDX11L1, PCAT1 and SETBP1. Examining all somatic mutational types, we describe a molecular taxonomy for prostate cancer differentiated by ancestry and defined as global mutational subtypes (GMS). By further including Chinese Asian data, we confirm that GMS-B (copy-number gain) and GMS-D (mutationally noisy) are specific to African populations, GMS-A (mutationally quiet) is universal (all ethnicities) and the African–European-restricted subtype GMS-C (copy-number losses) predicts poor clinical outcomes. In addition to the clinical benefit of including individuals of African ancestry, our GMS subtypes reveal different evolutionary trajectories and mutational processes suggesting that both common genetic and environmental factors contribute to the disparity between ethnicities. Analogous to gene–environment interaction (defined here as a different effect of an environmental surrounding in people with different ancestries or vice versa), we anticipate that GMS subtypes act as a proxy for intrinsic and extrinsic mutational processes in cancers, promoting global inclusion in landmark studies.

    Addressing the contribution of previously described genetic and epidemiological risk factors associated with increased prostate cancer risk and aggressive disease within men from South Africa

    BACKGROUND: Although African ancestry represents a significant risk factor for prostate cancer, few studies have investigated the significance of prostate cancer and relevance of previously defined genetic and epidemiological prostate cancer risk factors within Africa. We recently established the Southern African Prostate Cancer Study (SAPCS), a resource for epidemiological and genetic analysis of prostate cancer risk and outcomes in Black men from South Africa. Biased towards highly aggressive prostate cancer disease, this is the first reported data analysis. METHODS: The SAPCS is an ongoing population-based study of Black men with or without prostate cancer. Pilot analysis was performed for the first 837 participants, 522 cases and 315 controls. We investigate 46 pre-defined prostate cancer risk alleles and up to 24 epidemiological measures including demographic, lifestyle and environmental factors, for power to predict disease status and to drive on-going SAPCS recruitment, sampling procedures and research direction. RESULTS: Preliminary results suggest that no previously defined risk alleles significantly predict prostate cancer occurrence within the SAPCS. Furthermore, genetic risk profiles did not enhance the predictive power of prostate specific antigen (PSA) testing. Our study supports several lifestyle/environmental factors contributing to prostate cancer risk including a family history of cancer, diabetes, current sexual activity and erectile dysfunction, balding pattern, frequent aspirin usage and high PSA levels. CONCLUSIONS: Despite a clear increased prostate cancer risk associated with an African ancestry, experimental data is lacking within Africa. This pilot study is therefore a significant contribution to the field. While genetic risk factors (largely European-defined) show no evidence for disease prediction in the SAPCS, several epidemiological factors were associated with prostate cancer status. We call for improved study power by building on the SAPCS resource, further validation of associated factors in independent African-based resources, and genome-wide approaches to define African-specific risk alleles.

    A review and comparative study of cancer detection using machine learning : SBERT and SimCSE application

    AVAILABILITY OF DATA AND MATERIALS: The data can be accessed at the host database (The European Genome-phenome Archive at the European Bioinformatics Institute, accession number: EGAD00001004582 Data access). BACKGROUND: Using visual, biological, and electronic health records data as the sole input source, pretrained convolutional neural networks and conventional machine learning methods have been heavily employed for the identification of various malignancies. Initially, a series of preprocessing steps and image segmentation steps are performed to extract region of interest features from noisy features. Then, the extracted features are applied to several machine learning and deep learning methods for the detection of cancer. METHODS: In this work, a review of all the methods that have been applied to develop machine learning algorithms that detect cancer is provided. With more than 100 types of cancer, this study only examines research on the four most common and prevalent cancers worldwide: lung, breast, prostate, and colorectal cancer. Next, by using state-of-the-art sentence transformers namely: SBERT (2019) and the unsupervised SimCSE (2021), this study proposes a new methodology for detecting cancer. This method requires raw DNA sequences of matched tumor/normal pair as the only input. The learnt DNA representations retrieved from SBERT and SimCSE will then be sent to machine learning algorithms (XGBoost, Random Forest, LightGBM, and CNNs) for classification. As far as we are aware, SBERT and SimCSE transformers have not been applied to represent DNA sequences in cancer detection settings. RESULTS: The XGBoost model, which had the highest overall accuracy of 73 ± 0.13 % using SBERT embeddings and 75 ± 0.12 % using SimCSE embeddings, was the best performing classifier. In light of these findings, it can be concluded that incorporating sentence representations from SimCSE's sentence transformer only marginally improved the performance of machine learning models. Funding: The South African Medical Research Council (SAMRC) through its Division of Research Capacity Development under the Internship Scholarship Program from funding received from the South African National Treasury. https://bmcbioinformatics.biomedcentral.com (2024). Computer Science; School of Health Systems and Public Health (SHSPH).

    Structure based inhibitor design targeting glycogen phosphorylase b. Virtual screening, synthesis, biochemical and biological assessment of novel N-acyl-β-d-glucopyranosylamines

    Glycogen phosphorylase (GP) is a validated target for the development of new type 2 diabetes treatments. Exploiting the Zinc docking database, we report the in silico screening of 1888 β-D-glucopyranose-NH-CO-R putative GP inhibitors differing only in their R groups. CombiGlide and GOLD docking programs with different scoring functions were employed with the best performing methods combined in a “consensus scoring” approach to ranking of ligand binding affinities for the active site. Six selected candidates from the screening were then synthesized and their inhibitory potency was assessed both in vitro and ex vivo. Their inhibition constants' values, in vitro, ranged from 5 to 377 µM while two of them were effective at causing inactivation of GP in rat hepatocytes at low µM concentrations. The crystal structures of GP in complex with the inhibitors were defined and provided the structural basis for their inhibitory potency and data for further structure based design of more potent inhibitors.

    Discriminatory Gleason grade group signatures of prostate cancer : an application of machine learning methods

    One of the most precise methods to detect prostate cancer is by evaluation of a stained biopsy by a pathologist under a microscope. Regions of the tissue are assessed and graded according to the observed histological pattern. However, this is not only laborious, but also relies on the experience of the pathologist and tends to suffer from the lack of reproducibility of biopsy outcomes across pathologists. As a result, computational approaches are being sought and machine learning has been gaining momentum in the prediction of the Gleason grade group. To date, machine learning literature has addressed this problem by using features from magnetic resonance imaging images, whole slide images, tissue microarrays, gene expression data, and clinical features. However, there is a gap with regards to predicting the Gleason grade group using DNA sequences as the only input source to the machine learning models. In this work, using whole genome sequence data from South African prostate cancer patients, machine learning and biological experiments were combined to understand the challenges that are associated with the prediction of the Gleason grade group. A series of machine learning binary classifiers (XGBoost, LSTM, GRU, LR, RF) were created relying only on DNA sequences as input features. None of the models was able to adequately discriminate between the DNA sequences of the studied Gleason grade groups (Gleason grade group 1 and 5). However, the models were further evaluated in the prediction of tumor DNA sequences from matched-normal DNA sequences, given DNA sequences as the only input source. In this new problem, the models performed acceptably better than before with the XGBoost model achieving the highest accuracy of 74 ± 01, F1 score of 79 ± 01, recall of 99 ± 0.0, and precision of 66 ± 0.1. Funding: The South African Medical Research Council (SAMRC) through its Division of Research Capacity Development under the Internship Scholarship Program from funding received from the South African National Treasury. http://www.plosone.org (2022). Computer Science; School of Health Systems and Public Health (SHSPH).
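The metrics reported in the abstract above can be cross-checked, since F1 is the harmonic mean of precision and recall. The sketch below is only an illustrative consistency check: it assumes the reported values are percentages, and the helper name `f1_score` is hypothetical rather than taken from the study.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values as reported in the abstract, assumed here to be percentages.
reported_precision = 66.0
reported_recall = 99.0

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.1f}")  # prints "F1 = 79.2"
```

The result rounds to the reported F1 score of 79. Near-perfect recall combined with modest precision suggests the classifier tends to over-predict the positive class.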

    African inclusion in prostate cancer genomic studies provides the first glimpses into addressing health disparities through tailored clinical care

    No abstract available. Funding for genomic support provided by the National Health and Medical Research Council (NHMRC) of Australia. http://wileyonlinelibrary.com/journal/ctm2 (2024). School of Health Systems and Public Health (SHSPH). SDG-03: Good health and well-being.

    First ancient mitochondrial human genome from a prepastoralist Southern African

    The oldest contemporary human mitochondrial lineages arose in Africa. The earliest divergent extant maternal offshoot, namely haplogroup L0d, is represented by click-speaking forager peoples of Southern Africa. Broadly defined as Khoesan, contemporary Khoesan are today largely restricted to the semi-desert regions of Namibia and Botswana, while archeological, historical and genetic evidence promotes a once broader southerly dispersal of click-speaking peoples including southward migrating pastoralists and indigenous marine-foragers. Today extinct, no genetic data has been recovered from the indigenous peoples that once sustained life along the southern coastal waters of Africa pre-pastoral arrival. In this study we generate a complete mitochondrial genome from a 2,330 year old male skeleton, confirmed via osteological and archeological analysis as practicing a marine-based forager existence. The ancient mtDNA represents a new L0d2c lineage (L0d2c1c) that is today, unlike its Khoe-language based sister-clades (L0d2c1a and L0d2c1b) most closely related to contemporary indigenous San-speakers (specifically Ju). Providing the first genomic evidence that pre-pastoral Southern African marine foragers carried the earliest diverged maternal modern human lineages, this study emphasizes the significance of Southern African archeological remains in defining early modern human origins. Funding: J. Craig Venter Family Foundation, La Jolla, CA, U.S.A. and the Max Planck Society (within the laboratory of Svante Pääbo). http://gbe.oxfordjournals.org

    Revised timeline and distribution of the earliest diverged human maternal lineages in southern Africa

    The oldest extant human maternal lineages include mitochondrial haplogroups L0d and L0k found in the southern African click-speaking forager peoples broadly classified as Khoesan. Profiling these early mitochondrial lineages allows for better understanding of modern human evolution. In this study, we profile 77 new early-diverged complete mitochondrial genomes and sub-classify another 105 L0d/L0k individuals from southern Africa. We use this data to refine basal phylogenetic divergence, coalescence times and Khoesan prehistory. Our results confirm L0d as the earliest diverged lineage (∼172 kya, 95%CI: 149-199 kya), followed by L0k (∼159 kya, 95%CI: 136-183 kya) and a new lineage we name L0g (∼94 kya, 95%CI: 72-116 kya). We identify two new L0d1 subclades we name L0d1d and L0d1c4/L0d1e, and estimate L0d2 and L0d1 divergence at ∼93 kya (95%CI: 76-112 kya). We concur the earliest emerging L0d1'2 sublineage L0d1b (∼49 kya, 95%CI: 37-58 kya) is widely distributed across southern Africa. Concomitantly, we find the most recent sublineage L0d2a (∼17 kya, 95%CI: 10-27 kya) to be equally common. While we agree that lineages L0d1c and L0k1a are restricted to contemporary inland Khoesan populations, our observed predominance of L0d2a and L0d1a in non-Khoesan populations suggests a once independent coastal Khoesan prehistory. The distribution of early-diverged human maternal lineages within contemporary southern Africans suggests a rich history of human existence prior to any archaeological evidence of migration into the region. For the first time, we provide genetic-based evidence for significant modern human evolution in southern Africa at the time of the Last Glacial Maximum at between ∼21-17 kya, coinciding with the emergence of major lineages L0d1a, L0d2b, L0d2d and L0d2a.

    ANO7 African-ancestral genomic diversity and advanced prostate cancer

    DATA AVAILABILITY: The data used in this study will be made available on request. BACKGROUND: Prostate cancer (PCa) is a significant health burden for African men, with mortality rates more than double global averages. The prostate specific Anoctamin 7 (ANO7) gene linked with poor patient outcomes has recently been identified as the target for an African-specific protein-truncating PCa-risk allele. METHODS: Here we determined the role of ANO7 in a study of 889 men from southern Africa, leveraging exomic genotyping array PCa case-control data (n = 780, 17 ANO7 alleles) and deep sequenced whole genome data for germline and tumour ANO7 interrogation (n = 109), while providing clinicopathologically matched European-derived sequence data comparative analyses (n = 57). Associated predicted deleterious variants (PDVs) were further assessed for impact using computational protein structure analysis. RESULTS: Notably rare in European patients, we found the common African PDV p.Ile740Leu (rs74804606) to be associated with PCa risk in our case-control analysis (Wilcoxon rank-sum test, false discovery rate/FDR = 0.03), while sequencing revealed co-occurrence with the recently reported African-specific deleterious risk variant p.Ser914* (rs60985508). Additional findings included a novel protein-truncating African-specific frameshift variant p.Asp789Leu, African-relevant PDVs associated with altered protein structure at Ca2+ binding sites, early-onset PCa associated with PDVs and germline structural variants in Africans (Linear regression models, −6.42 years, 95% CI = −10.68 to −2.16, P-value = 0.003) and ANO7 as an inter-chromosomal PCa-related gene fusion partner in African derived tumours. CONCLUSIONS: Here we provide not only validation for ANO7 as an African-relevant protein-altering PCa-risk locus, but additional evidence for a role of inherited and acquired ANO7 variance in the observed phenotypic heterogeneity and African-ancestral health disparity. Funding: The National Health and Medical Research Council (NHMRC) of Australia; Ideas Grant; USA Congressionally Directed Medical Research Programs (CDMRP) Prostate Cancer Research Program (PCRP) Idea Development Award; HEROIC Consortium Award and the Petre Foundation via the University of Sydney Foundation, Australia. Open Access funding enabled and organized by CAUL and its Member Institutions. http://www.nature.com/pcan/ (2024). School of Health Systems and Public Health (SHSPH). SDG-03: Good health and well-being.

    Next generation mapping reveals novel large genomic rearrangements in prostate cancer

    Complex genomic rearrangements are common molecular events driving prostate carcinogenesis. Clinical significance, however, has yet to be fully elucidated. Detecting the full range and subtypes of large structural variants (SVs), greater than one kilobase in length, is challenging using clinically feasible next generation sequencing (NGS) technologies. Next generation mapping (NGM) is a new technology that allows for the interrogation of megabase length DNA molecules outside the detection range of single-base resolution NGS. In this study, we sought to determine the feasibility of using the Irys (Bionano Genomics Inc.) nanochannel NGM technology to generate whole genome maps of a primary prostate tumor and matched blood from a Gleason score 7 (4 + 3), ETS-fusion negative prostate cancer patient. With an effective mapped coverage of 35X and sequence coverage of 60X, and an estimated 43% tumor purity, we identified 85 large somatic structural rearrangements and 6,172 smaller somatic variants, respectively. The vast majority of the large SVs (89%), of which 73% are insertions, were not detectable ab initio using high-coverage short-read NGS. However, guided manual inspection of single NGS reads and de novo assembled scaffolds of NGM-derived candidate regions allowed for confirmation of 94% of these large SVs, with over a third impacting genes with oncogenic potential. From this single-patient study, the first cancer study to integrate NGS and NGM data, we hypothesise that there exists a novel spectrum of large genomic rearrangements in prostate cancer, that these large genomic rearrangements are likely early events in tumorigenesis, and they have potential to enhance taxonomy. This work was supported by Movember Australia and the Prostate Cancer Foundation Australia (PCFA) as part of the Movember Revolutionary Team Award (MRTA) to the Garvan Institute of Medical Research program on prostate cancer bone metastasis (ProMis to P.I.C. and V.M.H.) dedicated to establishing NGM for clinically relevant prostate cancer, and the Australian Prostate Cancer Research Centre NSW (APCRC-NSW). Participant recruitment and sampling was supported by the Cancer Association of South Africa (CANSA to M.S.R.B and V.M.H.). W.J. is supported by APCRC-NSW, E.K.F.C. and D.C.P. are partly supported by ProMis, P.I.C. is supported by Mrs Janice Gibson and the Ernest Heine Family Foundation, Australia, and V.M.H. is supported by the University of Sydney Foundation and Petre Foundation, Australia. www.impactjournals.com/oncotarget (2018). School of Health Systems and Public Health (SHSPH).