Genome annotation of the 1.2MB Region on chromosome 8p22-p23.1 harbouring the gene for Keratolytic Winter Erythema (KWE)

Author: Aron Shaun Lyle
Publication date: 17/01/2012

Keratolytic winter erythema (KWE), or Oudtshoorn skin disease, is a rare autosomal dominant skin disorder for which the genetic cause remains unknown. The disorder manifests as erythema and hyperkeratosis of the palmar-plantar regions and has been linked to a 1.2 Mb region on chromosome 8p22-23.1 between markers D8S1759 and D8S552. A prevalence of 1/7200 has been observed in the South African Afrikaans-speaking white population, with a lower, unspecified prevalence in the coloured South African population. A number of positional candidate genes within the critical region have been assessed for pathogenic mutations; however, to date the causative gene has not been identified. The objective of the current study was to examine the KWE critical region for highly conserved coding and non-coding regions and copy number variants (CNVs), and to determine whether these regions may play a role in the molecular etiology of the disease. Highly conserved regions were identified based on sequence conservation across a range of evolutionarily diverse organisms. These regions were further analysed for possible protein-coding gene structure, regulatory motifs and RNA secondary structure. In addition, a custom CGH tiling array (384K Roche-Nimblegen) was used to identify CNVs across the extended KWE critical region in both affected and unaffected individuals. The multi-species sequence alignment revealed eight regions with conservation above a 70% threshold. Functional analysis of two of the conserved regions led to the identification of a novel protein-coding gene, deubiquitinating enzyme 3 (DUB3), within the critical region, which presented as a credible functional candidate for KWE. Two of the conserved regions were identified within an open reading frame, c8orf13, which has previously been examined and found to contain no pathogenic mutations that segregate with the KWE phenotype. The remaining four highly conserved regions were found within non-coding sequence, and computational analysis revealed putative regulatory motifs in the form of transcription factor binding sites. The copy number variation analysis did not show evidence of any large or small consistent CNV alleles likely to impact on any of the functional candidate genes in the KWE critical region. No CNV allele was both present in all of the KWE-affected individuals examined and absent in unaffected family members. A significant variation in copy number was, however, observed in affected individuals within a previously defined copy number variable beta-defensin gene cluster which has been associated with psoriasis. Although the exact copy number of the cluster could not be determined in the present study due to cross-hybridization between genes in the family, the CNV observed in affected individuals for the cluster suggests that it may be involved in modulating the clinical severity of KWE. The present study has led to the identification of a previously uncharacterised gene, DUB3, within the KWE critical region, which presented as a plausible functional candidate for the KWE phenotype. In addition, it has revealed that the molecular cause of KWE is unlikely to be exclusively due to copy number variation within the genes in the critical region. The current study has provided valuable insight into the KWE-linked critical region and revealed a number of potential regions of interest to be examined in further studies exploring the molecular cause of the disease.

Wits Institutional Repository on DSPACE

Raster based coastal marsh classification within the Galveston Bay ecosystem, Texas

Author: Edwards Aron Shaun
Publication date: 15/05/2009

A mapping study using the remote sensing software ENVI was conducted with four software algorithms to investigate whether these techniques could accurately classify habitat types and vegetation communities along West Bay of the Galveston Bay Ecosystem from color infrared (CIR) imagery. The algorithms were first used in a small-scale study to investigate which of these techniques could most accurately distinguish habitat types and vegetation communities from the imagery at a site-specific location. The most accurate algorithm of the four was then used in a large-scale classification study in which entire images were classified using the same data from the small-scale study. Regions of interest (ROIs) were used within ENVI to specify areas of interest within each image that was classified. The locations of ROIs were recorded using a GPS prior to classification, then each was added into ENVI as data points, and each ROI polygon was digitized according to its respective pixel color. Once all of the ROI polygons were completed, each software algorithm was employed. After classification, each habitat type and vegetation community was ground-truthed in order to verify the accuracy of the algorithms. The position points were added as ground truth points within ENVI and an accuracy matrix was assessed. The technique with the greatest averaged accuracy within the small-scale study was selected for the large-scale study. The ROIs and ground truth points used in the small-scale study were used again in the large-scale study. The small-scale study concluded that the Parallelepiped algorithm produced significantly less accurate classifications than the other three. Although the Mahalanobis algorithm was not significantly different from the other two algorithms, it yielded the highest overall average accuracy and was used in the large-scale study. In both the small-scale and large-scale studies there was no significant difference between the two years of aerial imagery, and there were no significant differences in accuracy between locations. None of the software algorithms were accurate at classifying habitat types and vegetation communities using the imagery. The accuracy for the Mahalanobis algorithm was less than 60%. Inaccuracies were largely due to overlapping spectral signatures among habitat types and vegetation communities.

OAKTrust Digital Repository (Texas A&M Univ)

Ten simple rules for developing bioinformatics capacity at an academic institution

Author: Aron Shaun
Jongeneel Victor
Kumuthini Judit
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2021

Bioinformatics is an applied interdisciplinary field whose primary purpose is to develop and deploy computational techniques to store, organize, and aid in the analysis and interpretation of large-scale data obtained from biological systems. While rooted in the analysis of nucleotide and protein sequences, it now encompasses techniques targeting multiple data acquisition modalities and seeks to comprehend the functioning of biological systems at many different levels. Bioinformaticians need to be cognizant of diverse scientific fields: basic and molecular biology, genetics, mathematics, statistics, and computer science at a minimum, thus requiring a thoroughly interdisciplinary set of skills to successfully carry out their duties. Due to the growing importance of bioinformatics in enabling modern biomedical research, programs and core facilities have been established in most academic institutions in the developed world over the last 30 years.

University of the Western Cape Research Repository

The elusive gene for keratolytic winter erythema

Author: Aron Shaun
Hobbs Angela
Hull Peter R
Ramsay Michèle
Publication venue: 'South African Medical Association NPC'
Publication date: 11/10/2013

Keratolytic winter erythema (KWE), also known as Oudtshoorn skin disease, is characterised by a cyclical disruption of normal epidermal keratinisation affecting primarily the palmoplantar skin with peeling of the palms and soles, which is worse in the winter. It is a rare monogenic, autosomal dominant condition of unknown cause. However, due to a founder effect, it occurs at a prevalence of 1/7 200 among South African Afrikaans-speakers. In the mid-1980s, samples were collected from affected families for a linkage study to pinpoint the location of the KWE gene. A genome-wide linkage analysis, using microsatellite markers, identified the KWE critical region on chromosome 8p23.1-p22. Subsequent genetic studies focused on screening candidate genes in this critical region; however, no pathogenic mutations that segregated exclusively with KWE were identified. The cathepsin B (CTSB) and farnesyl-diphosphate farnesyltransferase 1 (FDFT1) genes revealed no potentially pathogenic variants, nor did they show differential gene expression in affected skin. Mutation detection in additional candidate genes also failed to identify the KWE-associated variant, suggesting that the causal variant may be in an uncharacterised functional region. Bioinformatic analysis revealed highly conserved regions within the KWE critical region and a custom tiling array was designed to cover this region and to search for copy number variation. Although the study did not identify a variant that segregates exclusively with KWE, it provided valuable insight into the complex KWE-linked region. Next-generation sequencing approaches are being used to comb the region, but the causal variant for this interesting hyperkeratotic palmoplantar phenotype still remains elusive.

South African Medical Journal (SAMJ)

High-depth African genomes inform human migration and health

Author: Aron Shaun
Botha Gerrit
Botigué Laura R
Choudhury Ananyo
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2021

The African continent is regarded as the cradle of modern humans and African genomes contain more genetic variation than those from any other continent, yet only a fraction of the genetic diversity among African individuals has been surveyed. Here we performed whole-genome sequencing analyses of 426 individuals, comprising 50 ethnolinguistic groups and including previously unsampled populations, to explore the breadth of genomic diversity across Africa. We uncovered more than 3 million previously undescribed variants, most of which were found among individuals from newly sampled ethnolinguistic groups, as well as 62 previously unreported loci that are under strong selection, which were predominantly found in genes that are involved in viral immunity, DNA repair and metabolism. We observed complex patterns of ancestral admixture and putative-damaging and novel variation, both within and between populations, alongside evidence that Zambia was a likely intermediate site along the routes of expansion of Bantu-speaking populations. Pathogenic variants in genes that are currently characterized as medically relevant were uncommon, but in other genes, variants denoted as ‘likely pathogenic’ in the ClinVar database were commonly observed. Collectively, these findings refine our current understanding of continental migration, identify gene flow and the response to human disease as strong drivers of genome-level population variation, and underscore the scientific imperative for a broader characterization of the genomic diversity of African individuals to understand human ancestry and improve health.

University of the Western Cape Research Repository

Genetic-substructure and complex demographic history of South African Bantu speakers

Author: Alberts Marianne
Aron Shaun
Bostoen Koen
Casas F Gomez-Olive
Choudhury Ananyo
Chousou-Polydouri Natalia
Delius Peter
Fortes-Lima Cesar
Gunnink Hilde
Hazelhurst Scott
Mashinya Felistas
Norris Shane
Ramsay Michèle
Schlebusch Carina M
Sengupta Dhriti
Tollman Stephen
Whitelaw Gavin
Publication venue: 'Cold Spring Harbor Laboratory'
Publication date: 01/01/2020

South Eastern Bantu-speaking (SEB) groups constitute more than 80% of the population in South Africa. Despite clear linguistic and geographic diversity, the genetic differences between these groups have not been systematically investigated. Based on genome-wide data of over 5000 individuals, representing eight major SEB groups, we provide strong evidence for fine-scale population structure that broadly aligns with geographic distribution and is also congruent with linguistic phylogeny (separation of Nguni, Sotho-Tswana and Tsonga speakers). Although differential Khoe-San admixture plays a key role, the structure persists after Khoe-San ancestry-masking. The timing of admixture, levels of sex-biased gene flow and population size dynamics also highlight differences in the demographic histories of individual groups. The comparisons with five Iron Age farmer genomes further support genetic continuity over ~400 years in certain regions of the country. Simulated trait genome-wide association studies further show that the observed population structure could have major implications for biomedical genomics research in South Africa.

ZORA

Genetic substructure and complex demographic history of South African Bantu speakers

Author: Aron Shaun
Choudhury Ananyo
Fortes-Lima Cesar
Gunnink Hilde
Hazelhurst Scott
Mashinya Felistas
Ramsay Michèle
Schlebusch Carina M.
Sengupta Dhriti
Whitelaw Gavin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2021

South Eastern Bantu-speaking (SEB) groups constitute more than 80% of the population in South Africa. Despite clear linguistic and geographic diversity, the genetic differences between these groups have not been systematically investigated. Based on genome-wide data of over 5000 individuals, representing eight major SEB groups, we provide strong evidence for fine-scale population structure that broadly aligns with geographic distribution and is also congruent with linguistic phylogeny (separation of Nguni, Sotho-Tswana and Tsonga speakers). Although differential Khoe-San admixture plays a key role, the structure persists after Khoe-San ancestry-masking. The timing of admixture, levels of sex-biased gene flow and population size dynamics also highlight differences in the demographic histories of individual groups. The comparisons with five Iron Age farmer genomes further support genetic continuity over ~400 years in certain regions of the country. Simulated trait genome-wide association studies further show that the observed population structure could have major implications for biomedical genomics research in South Africa.

University of Johannesburg Institutional Repository

Population-specific common SNPs reflect demographic histories and highlight regions of genomic plasticity with functional relevance

Author: Achinike-Oduaran Ovokeraye
Aron Shaun
Choudhury Ananyo
Gamieldien Junaid
Hazelhurst Scott
Jalali Sefid Dashti Mahjoubeh
Meintjes Ayton
Mulder Nicola
Ramsay Michèle
Tiffin Nicki
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

Background: Population differentiation is the result of demographic and evolutionary forces. Whole genome datasets from the 1000 Genomes Project (October 2012) provide an unbiased view of genetic variation across populations from Europe, Asia, Africa and the Americas. Common population-specific SNPs (MAF > 0.05) reflect a deep history and may have important consequences for health and wellbeing. Their interpretation is contextualised by currently available genome data. Results: The identification of common population-specific (CPS) variants (SNPs and SSV) is influenced by admixture and the sample size under investigation. Nine of the populations in the 1000 Genomes Project (2 African, 2 Asian (including a merged Chinese group) and 5 European) revealed that the African populations (LWK and YRI), followed by the Japanese (JPT), have the highest number of CPS SNPs, in concordance with their histories and given the populations studied. Using two methods, sliding 50-SNP and 5-kb windows, the CPS SNPs showed distinct clustering across large genome segments and little overlap of clusters between populations. iHS enrichment score and the population branch statistic (PBS) analyses suggest that selective sweeps are unlikely to account for the clustering and population specificity. Of interest is the association of clusters close to recombination hotspots. Functional analysis of genes associated with the CPS SNPs revealed over-representation of genes in pathways associated with neuronal development, including axonal guidance signalling and CREB signalling in neurones. Conclusions: Common population-specific SNPs are non-randomly distributed throughout the genome and are significantly associated with recombination hotspots. Since the variant alleles of most CPS SNPs are the derived allele, they likely arose in the specific population after a split from a common ancestor. Their proximity to genes involved in specific pathways, including neuronal development, suggests evolutionary plasticity of selected genomic regions. Contrary to expectation, selective sweeps did not play a large role in the persistence of population-specific variation. This suggests a stochastic process towards population-specific variation which reflects demographic histories and may have some interesting implications for health and susceptibility to disease.

Cape Town University OpenUCT

Crossref

Springer - Publisher Connector

PubMed Central

Using a multiple-delivery-mode training approach to develop local capacity and infrastructure for advanced bioinformatics in Africa

Author: Christopher J Fields
Dane Kennedy
Gerrit Botha
Gloria Rendon
Imane Allali
Jessica R Holmes
Katie Lennard
Kilaza Samson Mwaikono
Nicola Mulder
Shantelle Claassen-Weitz
Shaun Aron
Sumir Panji
Verena Ras
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2021

With more microbiome studies being conducted by African-based research groups, there is an increasing demand for knowledge and skills in the design and analysis of microbiome studies and data. However, high-quality bioinformatics courses are often impeded by differences in computational environments, complicated software stacks, numerous dependencies, and versions of bioinformatics tools, along with a lack of local computational infrastructure and expertise. To address this, H3ABioNet developed a 16S rRNA Microbiome Intermediate Bioinformatics Training course, extending its remote classroom model. The course was developed alongside experienced microbiome researchers, bioinformaticians, and systems administrators, who identified key topics to address. Development of containerised workflows has previously been undertaken by H3ABioNet, and Singularity containers were used here to enable the deployment of a standard replicable software stack across different hosting sites. The pilot ran successfully in 2019 across 23 sites registered in 11 African countries, with more than 200 participants formally enrolled and 106 volunteer staff for onsite support.

Directory of Open Access Journals

University of the Western Cape Research Repository

Designing a course model for distance-based online bioinformatics training in Africa: the H3ABioNet experience

Author: A Via
Ahmed Mansour Alzohairy
Amel Ghouila
B Güzer
B Holmberg
Colleen Saunders
CP Zeki
David P. Judge
Deogratius Ssemwanga
Fatma Z. Guerfali
Francis Ouellette
J Bergmann
Jean-Baka Domelevo Entfellner
Jonathan Kayondo
Kim T. Gurwitz
L Welch
N Kemp
Nicola Mulder
NJ Mulder
O Tastan Bishop
P Pevzner
Pedro L. Fernandes
Rehab Ahmed
Ruben Cloete
Samson P. Salifu
Shaun Aron
Sumir Panji
Suresh Maslamoney
TK Attwood
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2017

Africa is not unique in its need for basic bioinformatics training for individuals from a diverse range of academic backgrounds. However, particular logistical challenges in Africa, most notably access to bioinformatics expertise and internet stability, must be addressed in order to meet this need on the continent. H3ABioNet (www.h3abionet.org), the Pan African Bioinformatics Network for H3Africa, has therefore developed an innovative, free-of-charge "Introduction to Bioinformatics" course, taking these challenges into account as part of its educational efforts to provide on-site training and develop local expertise inside its network. A multiple-delivery-mode learning model was selected for this 3-month course in order to increase access to (mostly) African, expert bioinformatics trainers. The content of the course was developed to include a range of fundamental bioinformatics topics at the introductory level. For the first iteration of the course (2016), classrooms with a total of 364 enrolled participants were hosted at 20 institutions across 10 African countries. To ensure that classroom success did not depend on stable internet, trainers pre-recorded their lectures, and classrooms downloaded and watched these locally during biweekly contact sessions. The trainers were available via video conferencing to take questions during contact sessions, as well as via online "question and discussion" forums outside of contact session time. This learning model, developed for a resource-limited setting, could easily be adapted to other settings.

Access to Research and Communications Annals

Crossref

Directory of Open Access Journals

University of the Western Cape Research Repository

FigShare