234 research outputs found
African-specific molecular taxonomy of prostate cancer
Prostate cancer is characterized by considerable geo-ethnic disparity. African ancestry is a significant risk factor, with mortality rates across sub-Saharan Africa 2.7-fold higher than global averages [1]. The contributing genetic and non-genetic factors, and associated mutational processes, are unknown [2,3]. Here, through whole-genome sequencing of treatment-naive prostate cancer samples from 183 ancestrally (African versus European) and globally distinct patients, we generate a large cancer genomics resource for sub-Saharan Africa, identifying around 2 million somatic variants. Significant African-ancestry-specific findings include an elevated tumour mutational burden, increased percentage of genome alteration, a greater number of predicted damaging mutations and a higher total of mutational signatures, and the driver genes NCOA2, STK19, DDX11L1, PCAT1 and SETBP1. Examining all somatic mutational types, we describe a molecular taxonomy for prostate cancer differentiated by ancestry and defined as global mutational subtypes (GMS). By further including Chinese Asian data, we confirm that GMS-B (copy-number gain) and GMS-D (mutationally noisy) are specific to African populations, GMS-A (mutationally quiet) is universal (all ethnicities) and the African–European-restricted subtype GMS-C (copy-number losses) predicts poor clinical outcomes. In addition to the clinical benefit of including individuals of African ancestry, our GMS subtypes reveal different evolutionary trajectories and mutational processes suggesting that both common genetic and environmental factors contribute to the disparity between ethnicities. Analogous to gene–environment interaction (defined here as a different effect of an environmental surrounding in people with different ancestries or vice versa), we anticipate that GMS subtypes act as a proxy for intrinsic and extrinsic mutational processes in cancers, promoting global inclusion in landmark studies.
Addressing the contribution of previously described genetic and epidemiological risk factors associated with increased prostate cancer risk and aggressive disease within men from South Africa
BACKGROUND: Although African ancestry represents a significant risk factor for prostate cancer, few studies have investigated the significance of prostate cancer and relevance of previously defined genetic and epidemiological prostate cancer risk factors within Africa. We recently established the Southern African Prostate Cancer Study (SAPCS), a resource for epidemiological and genetic analysis of prostate cancer risk and outcomes in Black men from South Africa. Biased towards highly aggressive prostate cancer disease, this is the first reported data analysis. METHODS: The SAPCS is an ongoing population-based study of Black men with or without prostate cancer. Pilot analysis was performed for the first 837 participants, 522 cases and 315 controls. We investigate 46 pre-defined prostate cancer risk alleles and up to 24 epidemiological measures including demographic, lifestyle and environmental factors, for power to predict disease status and to drive on-going SAPCS recruitment, sampling procedures and research direction. RESULTS: Preliminary results suggest that no previously defined risk alleles significantly predict prostate cancer occurrence within the SAPCS. Furthermore, genetic risk profiles did not enhance the predictive power of prostate specific antigen (PSA) testing. Our study supports several lifestyle/environmental factors contributing to prostate cancer risk including a family history of cancer, diabetes, current sexual activity and erectile dysfunction, balding pattern, frequent aspirin usage and high PSA levels. CONCLUSIONS: Despite a clear increased prostate cancer risk associated with an African ancestry, experimental data is lacking within Africa. This pilot study is therefore a significant contribution to the field. While genetic risk factors (largely European-defined) show no evidence for disease prediction in the SAPCS, several epidemiological factors were associated with prostate cancer status. 
We call for improved study power by building on the SAPCS resource, further validation of associated factors in independent African-based resources, and genome-wide approaches to define African-specific risk alleles
A review and comparative study of cancer detection using machine learning : SBERT and SimCSE application
AVAILABILITY OF DATA AND MATERIALS : The data can be accessed at the host database (The European Genome-phenome Archive at the European Bioinformatics
Institute, accession number: EGAD00001004582).
BACKGROUND : Using visual, biological, and electronic health records data as the sole
input source, pretrained convolutional neural networks and conventional machine
learning methods have been heavily employed for the identification of various malignancies.
Initially, a series of preprocessing steps and image segmentation steps are
performed to extract region of interest features from noisy features. Then, the extracted
features are applied to several machine learning and deep learning methods for the
detection of cancer.
METHODS : In this work, a review of all the methods that have been applied to develop
machine learning algorithms that detect cancer is provided. With more than 100 types
of cancer, this study only examines research on the four most common and prevalent
cancers worldwide: lung, breast, prostate, and colorectal cancer. Next, by using
state-of-the-art sentence transformers namely: SBERT (2019) and the unsupervised
SimCSE (2021), this study proposes a new methodology for detecting cancer. This
method requires raw DNA sequences of matched tumor/normal pair as the only input.
The learnt DNA representations retrieved from SBERT and SimCSE will then be sent to
machine learning algorithms (XGBoost, Random Forest, LightGBM, and CNNs) for classification.
As far as we are aware, SBERT and SimCSE transformers have not been applied
to represent DNA sequences in cancer detection settings.
RESULTS : The XGBoost model, which had the highest overall accuracy of 73 ± 0.13 %
using SBERT embeddings and 75 ± 0.12 % using SimCSE embeddings, was the best
performing classifier. In light of these findings, it can be concluded that incorporating
sentence representations from SimCSE's sentence transformer only marginally
improved the performance of machine learning models.
The South African Medical Research Council (SAMRC) through its Division of Research Capacity Development under the Internship Scholarship Program from funding received from the South African National Treasury.
https://bmcbioinformatics.biomedcentral.com
Computer Science; School of Health Systems and Public Health (SHSPH)
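The abstract above does not say how raw DNA is turned into text that a sentence transformer can read. A minimal sketch, assuming overlapping k-mer tokenisation (a common choice, not confirmed as the authors' method), is shown below. The function name `kmer_sentence` and the parameters `k` and `stride` are hypothetical.

```python
def kmer_sentence(sequence: str, k: int = 6, stride: int = 1) -> str:
    """Turn a DNA string into a space-separated 'sentence' of k-mers.

    Assumption: k-mer tokenisation is used here only for illustration;
    the abstract does not specify the preprocessing.
    """
    sequence = sequence.upper()
    # Slide a window of width k along the sequence, moving by `stride`.
    kmers = [sequence[i:i + k] for i in range(0, len(sequence) - k + 1, stride)]
    return " ".join(kmers)


example = kmer_sentence("ACGTACGTAC", k=6)
print(example)  # ACGTAC CGTACG GTACGT TACGTA ACGTAC
```

Strings of this form could then be encoded by a sentence transformer, with the resulting vectors passed to a classifier such as XGBoost, which is the general pipeline the METHODS text describes.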
Structure based inhibitor design targeting glycogen phosphorylase b. Virtual screening, synthesis, biochemical and biological assessment of novel N-acyl-β-d-glucopyranosylamines
Glycogen phosphorylase (GP) is a validated target for the development of new type 2 diabetes treatments. Exploiting the Zinc docking database, we report the in silico screening of 1888 β-D-glucopyranose-NH-CO-R putative GP inhibitors differing only in their R groups. CombiGlide and GOLD docking programs with different scoring functions were employed with the best performing methods combined in a "consensus scoring" approach to ranking of ligand binding affinities for the active site. Six selected candidates from the screening were then synthesized and their inhibitory potency was assessed both in vitro and ex vivo. Their inhibition constants' values, in vitro, ranged from 5 to 377 µM while two of them were effective at causing inactivation of GP in rat hepatocytes at low µM concentrations. The crystal structures of GP in complex with the inhibitors were defined and provided the structural basis for their inhibitory potency and data for further structure based design of more potent inhibitors.
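The abstract names a consensus scoring approach but does not give the combination rule. One simple scheme, assumed here purely for illustration, is to rank the ligands within each docking program and average the ranks. The function `consensus_rank`, the program labels and the scores are all hypothetical, and the scores are assumed to be sign-normalised so that lower means better.

```python
def consensus_rank(scores_by_program: dict[str, dict[str, float]]) -> list[tuple[str, float]]:
    """Average each ligand's rank across programs (rank 1 = best).

    Assumptions: every program scored the same ligands, and scores were
    already normalised so that a lower value indicates stronger binding.
    """
    rank_sums: dict[str, float] = {}
    for scores in scores_by_program.values():
        ordered = sorted(scores, key=scores.get)  # best score first
        for rank, ligand in enumerate(ordered, start=1):
            rank_sums[ligand] = rank_sums.get(ligand, 0.0) + rank
    n_programs = len(scores_by_program)
    mean_ranks = [(ligand, total / n_programs) for ligand, total in rank_sums.items()]
    return sorted(mean_ranks, key=lambda item: item[1])


# Hypothetical scores for three ligands from two scoring functions.
hypothetical_scores = {
    "program_a": {"lig1": -8.2, "lig2": -6.5, "lig3": -7.1},
    "program_b": {"lig1": -7.9, "lig2": -7.0, "lig3": -7.5},
}
ranking = consensus_rank(hypothetical_scores)
print(ranking)  # [('lig1', 1.0), ('lig3', 2.0), ('lig2', 3.0)]
```

Rank averaging avoids mixing raw scores that sit on different scales, which is one reason it is often used when scoring functions differ.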
Discriminatory Gleason grade group signatures of prostate cancer : an application of machine learning methods
One of the most precise methods to detect prostate cancer is by evaluation of a stained
biopsy by a pathologist under a microscope. Regions of the tissue are assessed and graded
according to the observed histological pattern. However, this is not only laborious, but also
relies on the experience of the pathologist and tends to suffer from the lack of reproducibility
of biopsy outcomes across pathologists. As a result, computational approaches are being
sought and machine learning has been gaining momentum in the prediction of the Gleason
grade group. To date, machine learning literature has addressed this problem by using features from magnetic resonance imaging images, whole slide images, tissue microarrays,
gene expression data, and clinical features. However, there is a gap with regards to predicting the Gleason grade group using DNA sequences as the only input source to the machine
learning models. In this work, using whole genome sequence data from South African prostate cancer patients, an application of machine learning and biological experiments were
combined to understand the challenges that are associated with the prediction of the Gleason grade group. A series of machine learning binary classifiers (XGBoost, LSTM, GRU,
LR, RF) were created relying only on DNA sequence input features. None of the models
was able to adequately discriminate between the DNA sequences of the studied Gleason
grade groups (Gleason grade group 1 and 5). However, the models were further evaluated
in the prediction of tumor DNA sequences from matched-normal DNA sequences, given
DNA sequences as the only input source. In this new problem, the models performed
acceptably better than before with the XGBoost model achieving the highest accuracy of 74
± 01, F1 score of 79 ± 01, recall of 99 ± 0.0, and precision of 66 ± 0.1.
The South African Medical Research Council (SAMRC) through its Division of Research Capacity Development under the Internship Scholarship Program from funding received from the South African National Treasury.
http://www.plosone.org
Computer Science; School of Health Systems and Public Health (SHSPH)
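The reported F1 score can be checked against the reported precision and recall, since F1 is their harmonic mean. The short sketch below uses the abstract's values expressed as fractions; the function name `f1_score` is illustrative and no library is assumed.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision 66 % and recall 99 %, as reported for the XGBoost model.
reported = f1_score(precision=0.66, recall=0.99)
print(round(reported, 2))  # 0.79
```

The result agrees with the stated F1 of 79. A recall near 100 % combined with a precision of 66 % suggests that most of the model's errors are false positives rather than missed tumour sequences.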
African inclusion in prostate cancer genomic studies provides the first glimpses into addressing health disparities through tailored clinical care
No abstract available.
Funding for genomic support provided by the National Health and Medical Research Council (NHMRC) of Australia.
http://wileyonlinelibrary.com/journal/ctm2
School of Health Systems and Public Health (SHSPH); SDG-03: Good health and well-being
First ancient mitochondrial human genome from a prepastoralist Southern African
The oldest contemporary human mitochondrial lineages arose in Africa. The earliest divergent extant maternal offshoot, namely haplogroup L0d, is represented by click-speaking forager peoples of Southern Africa. Broadly defined as Khoesan, contemporary Khoesan are today largely restricted to the semi-desert regions of Namibia and Botswana, while archeological, historical and genetic evidence promotes a once broader southerly dispersal of click-speaking peoples including southward migrating pastoralists and indigenous marine-foragers. Today extinct, no genetic data has been recovered from the indigenous peoples that once sustained life along the southern coastal waters of Africa pre-pastoral arrival. In this study we generate a complete mitochondrial genome from a 2,330 year old male skeleton, confirmed via osteological and archeological analysis as practicing a marine-based forager existence. The ancient mtDNA represents a new L0d2c lineage (L0d2c1c) that is today, unlike its Khoe-language based sister-clades (L0d2c1a and L0d2c1b) most closely related to contemporary indigenous San-speakers (specifically Ju). Providing the first genomic evidence that pre-pastoral Southern African marine foragers carried the earliest diverged maternal modern human lineages, this study emphasizes the significance of Southern African archeological remains in defining early modern human origins.
J. Craig Venter Family Foundation, La Jolla, CA, U.S.A. and the Max Planck Society (within the laboratory of Svante Pääbo).
http://gbe.oxfordjournals.org
Revised timeline and distribution of the earliest diverged human maternal lineages in southern Africa
The oldest extant human maternal lineages include mitochondrial haplogroups L0d and L0k found in the southern African click-speaking forager peoples broadly classified as Khoesan. Profiling these early mitochondrial lineages allows for better understanding of modern human evolution. In this study, we profile 77 new early-diverged complete mitochondrial genomes and sub-classify another 105 L0d/L0k individuals from southern Africa. We use this data to refine basal phylogenetic divergence, coalescence times and Khoesan prehistory. Our results confirm L0d as the earliest diverged lineage (~172 kya, 95%CI: 149-199 kya), followed by L0k (~159 kya, 95%CI: 136-183 kya) and a new lineage we name L0g (~94 kya, 95%CI: 72-116 kya). We identify two new L0d1 subclades we name L0d1d and L0d1c4/L0d1e, and estimate L0d2 and L0d1 divergence at ~93 kya (95%CI: 76-112 kya). We concur the earliest emerging L0d1'2 sublineage L0d1b (~49 kya, 95%CI: 37-58 kya) is widely distributed across southern Africa. Concomitantly, we find the most recent sublineage L0d2a (~17 kya, 95%CI: 10-27 kya) to be equally common. While we agree that lineages L0d1c and L0k1a are restricted to contemporary inland Khoesan populations, our observed predominance of L0d2a and L0d1a in non-Khoesan populations suggests a once independent coastal Khoesan prehistory. The distribution of early-diverged human maternal lineages within contemporary southern Africans suggests a rich history of human existence prior to any archaeological evidence of migration into the region. For the first time, we provide genetic-based evidence for significant modern human evolution in southern Africa at the time of the Last Glacial Maximum at between ~21-17 kya, coinciding with the emergence of major lineages L0d1a, L0d2b, L0d2d and L0d2a.
ANO7 African-ancestral genomic diversity and advanced prostate cancer
DATA AVAILABILITY : The data used in this study will be made available on request.
BACKGROUND :
Prostate cancer (PCa) is a significant health burden for African men, with mortality rates more than double global averages. The prostate specific Anoctamin 7 (ANO7) gene linked with poor patient outcomes has recently been identified as the target for an African-specific protein-truncating PCa-risk allele.
METHODS :
Here we determined the role of ANO7 in a study of 889 men from southern Africa, leveraging exomic genotyping array PCa case-control data (n = 780, 17 ANO7 alleles) and deep sequenced whole genome data for germline and tumour ANO7 interrogation (n = 109), while providing clinicopathologically matched European-derived sequence data comparative analyses (n = 57). Associated predicted deleterious variants (PDVs) were further assessed for impact using computational protein structure analysis.
RESULTS :
Notably rare in European patients, we found the common African PDV p.Ile740Leu (rs74804606) to be associated with PCa risk in our case-control analysis (Wilcoxon rank-sum test, false discovery rate/FDR = 0.03), while sequencing revealed co-occurrence with the recently reported African-specific deleterious risk variant p.Ser914* (rs60985508). Additional findings included a novel protein-truncating African-specific frameshift variant p.Asp789Leu, African-relevant PDVs associated with altered protein structure at Ca2+ binding sites, early-onset PCa associated with PDVs and germline structural variants in Africans (Linear regression models, −6.42 years, 95% CI = −10.68 to −2.16, P-value = 0.003) and ANO7 as an inter-chromosomal PCa-related gene fusion partner in African derived tumours.
CONCLUSIONS :
Here we provide not only validation for ANO7 as an African-relevant protein-altering PCa-risk locus, but additional evidence for a role of inherited and acquired ANO7 variance in the observed phenotypic heterogeneity and African-ancestral health disparity.
The National Health and Medical Research Council (NHMRC) of Australia; Ideas Grant; USA Congressionally Directed Medical Research Programs (CDMRP) Prostate Cancer Research Program (PCRP) Idea Development Award; HEROIC Consortium Award and the Petre Foundation via the University of Sydney Foundation, Australia. Open Access funding enabled and organized by CAUL and its Member Institutions.
http://www.nature.com/pcan/
School of Health Systems and Public Health (SHSPH); SDG-03: Good health and well-being
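The RESULTS of this entry report a false discovery rate for the case-control association across multiple ANO7 alleles, but the abstract does not name the correction procedure. As a sketch only, assuming the widely used Benjamini-Hochberg adjustment, the calculation looks like this. The function name and the example p-values are hypothetical and are not data from the study.

```python
def benjamini_hochberg(pvalues: list[float]) -> list[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: BH is used here for illustration; the abstract reports an
    FDR but does not state which procedure produced it.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw p-values for four tested alleles.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

An allele would be reported as significant when its adjusted value falls below the chosen threshold, for example 0.05.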
Next generation mapping reveals novel large genomic rearrangements in prostate cancer
Complex genomic rearrangements are common molecular events driving prostate
carcinogenesis. Clinical significance, however, has yet to be fully elucidated. Detecting
the full range and subtypes of large structural variants (SVs), greater than one
kilobase in length, is challenging using clinically feasible next generation sequencing
(NGS) technologies. Next generation mapping (NGM) is a new technology that allows
for the interrogation of megabase length DNA molecules outside the detection range
of single-base resolution NGS. In this study, we sought to determine the feasibility
of using the Irys (Bionano Genomics Inc.) nanochannel NGM technology to generate
whole genome maps of a primary prostate tumor and matched blood from a Gleason
score 7 (4 + 3), ETS-fusion negative prostate cancer patient. With an effective mapped
coverage of 35X and sequence coverage of 60X, and an estimated 43% tumor purity,
we identified 85 large somatic structural rearrangements and 6,172 smaller somatic
variants, respectively. The vast majority of the large SVs (89%), of which 73%
are insertions, were not detectable ab initio using high-coverage short-read NGS.
However, guided manual inspection of single NGS reads and de novo assembled
scaffolds of NGM-derived candidate regions allowed for confirmation of 94% of
these large SVs, with over a third impacting genes with oncogenic potential. From
this single-patient study, the first cancer study to integrate NGS and NGM data, we
hypothesise that there exists a novel spectrum of large genomic rearrangements in
prostate cancer, that these large genomic rearrangements are likely early events in
tumorigenesis, and they have potential to enhance taxonomy.
This work was supported by Movember
Australia and the Prostate Cancer Foundation Australia
(PCFA) as part of the Movember Revolutionary Team
Award (MRTA) to the Garvan Institute of Medical
Research program on prostate cancer bone metastasis
(ProMis to P.I.C. and V.M.H.) dedicated to establishing
NGM for clinically relevant prostate cancer, and the
Australian Prostate Cancer Research Centre NSW
(APCRC-NSW). Participant recruitment and sampling
was supported by the Cancer Association of South Africa
(CANSA to M.S.R.B and V.M.H.). W.J. is supported by
APCRC-NSW, E.K.F.C. and D.C.P. are partly supported
by ProMis, P.I.C. is supported by Mrs Janice Gibson
and the Ernest Heine Family Foundation, Australia,
and V.M.H. is supported by the University of Sydney
Foundation and Petre Foundation, Australia.
www.impactjournals.com/oncotarget
School of Health Systems and Public Health (SHSPH)