Network-based approach for post genome-wide association study analysis in admixed populations
In this project, we review some existing pathway-based approaches for GWA study analyses, by exploring different implemented methods for combining effects of multiple modest genetic variants at gene and pathway levels. We then propose a graph-based method, ancGWAS, that incorporates the signal from a GWA study and the locus-specific ancestry into the human protein-protein interaction (PPI) network to identify significant sub-networks or pathways associated with the trait of interest. This network-based method applies centrality measures within linkage disequilibrium (LD) on the network to search for pathways and applies a scoring summary statistic on the resulting pathways to identify the most enriched pathways associated with complex diseases.
Proposed minimum information guideline for kidney disease—research and clinical data reporting: a cross-sectional study
Objective This project aimed to develop and propose a standardised reporting guideline for kidney disease research and clinical data reporting, in order to improve kidney disease data quality and integrity, and to address the challenges associated with the management of ‘Big Data’.
Methods A list of recommendations was proposed for the reporting guideline based on the systematic review and consolidation of previously published data collection and reporting standards, including PhenX measures and Minimal Information about a Proteomics Experiment (MIAPE). Thereafter, these recommendations were reviewed by domain-specialists using an online survey, developed in Research Electronic Data Capture (REDCap). Following interpretation and consolidation of the survey results, the recommendations were mapped to existing ontologies using Zooma, Ontology Lookup Service and the Bioportal search engine. Additionally, an associated eXtensible Markup Language schema was created for the REDCap implementation to increase user friendliness and adoption.
Results The online survey was completed by 53 respondents; the majority of respondents were dual clinician-researchers (57%), based in Australia (35%), Africa (33%) and North America (22%). Data elements within the reporting standard were identified as participant-level, study-level and experiment-level information, further subdivided into essential or optional information.
Conclusion The reporting guideline is readily employable for kidney disease research projects, and also adaptable for clinical utility. The adoption of the reporting guideline in kidney disease research can increase data quality and the value for long-term preservation, ensuring researchers gain the maximum benefit from their collected and generated data.
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial
The H3ABioNet helpdesk: an online bioinformatics resource, enhancing Africa’s capacity for genomics research
Abstract
Background
Currently, formal mechanisms for bioinformatics support are limited. The H3Africa Bioinformatics Network has implemented a public and freely available Helpdesk (HD), which provides generic bioinformatics support to researchers through an online ticketing platform. The following article reports on the H3ABioNet HD (H3A-HD)’s development, outlining its design, management, usage and evaluation framework, as well as the lessons learned through implementation.
Results
The H3A-HD was evaluated using automatically generated usage logs, user feedback and qualitative ticket evaluation. Evaluation revealed that communication methods, ticketing strategies and the technical platforms used are some of the primary factors which may influence the effectiveness of the HD.
Conclusion
To continuously improve the H3A-HD services, the resource should be regularly monitored and evaluated. The H3A-HD design, implementation and evaluation framework could be easily adapted for use by interested stakeholders within the Bioinformatics community and beyond
The extent and impact of variation in ADME genes in sub-Saharan African populations
Investigating variation in genes involved in the absorption, distribution, metabolism, and excretion (ADME) of drugs is key to characterizing pharmacogenomic (PGx) relationships. ADME gene variation is relatively well characterized in European and Asian populations, but African populations are under-studied, which has implications for drug safety and effective use in Africa.
Development of Bioinformatics Infrastructure for Genomics Research:
Although pockets of bioinformatics excellence have developed in Africa, generally, large-scale genomic data analysis has been limited by the availability of expertise and infrastructure. H3ABioNet, a pan-African bioinformatics network, was established to build capacity specifically to enable H3Africa (Human Heredity and Health in Africa) researchers to analyze their data in Africa. Since the inception of the H3Africa initiative, H3ABioNet's role has evolved in response to changing needs from the consortium and the African bioinformatics community
High-depth African genomes inform human migration and health
The African continent is regarded as the cradle of modern humans and African genomes contain more genetic variation than those from any other continent, yet only a fraction of the genetic diversity among African individuals has been surveyed [1]. Here we performed whole-genome sequencing analyses of 426 individuals, comprising 50 ethnolinguistic groups and including previously unsampled populations, to explore the breadth of genomic diversity across Africa. We uncovered more than 3 million previously undescribed variants, most of which were found among individuals from newly sampled ethnolinguistic groups, as well as 62 previously unreported loci that are under strong selection, which were predominantly found in genes that are involved in viral immunity, DNA repair and metabolism. We observed complex patterns of ancestral admixture and putative-damaging and novel variation, both within and between populations, alongside evidence that Zambia was a likely intermediate site along the routes of expansion of Bantu-speaking populations. Pathogenic variants in genes that are currently characterized as medically relevant were uncommon, but in other genes, variants denoted as 'likely pathogenic' in the ClinVar database were commonly observed. Collectively, these findings refine our current understanding of continental migration, identify gene flow and the response to human disease as strong drivers of genome-level population variation, and underscore the scientific imperative for a broader characterization of the genomic diversity of African individuals to understand human ancestry and improve health.
Developing reproducible bioinformatics analysis workflows for heterogeneous computing environments to support African genomics
Background: The Pan-African bioinformatics network, H3ABioNet, comprises 27 research institutions in 17 African countries. H3ABioNet is part of the Human Heredity and Health in Africa program (H3Africa), an African-led research consortium funded by the US National Institutes of Health and the UK Wellcome Trust, aimed at using genomics to study and improve the health of Africans. A key role of H3ABioNet is to support H3Africa projects by building bioinformatics infrastructure such as portable and reproducible bioinformatics workflows for use on heterogeneous African computing environments. Processing and analysis of genomic data is an example of a big data application requiring complex interdependent data analysis workflows. Such bioinformatics workflows take the primary and secondary input data through several computationally-intensive processing steps using different software packages, where some of the outputs form inputs for other steps. Implementing scalable, reproducible, portable and easy-to-use workflows is particularly challenging.
Results: H3ABioNet has built four workflows to support (1) the calling of variants from high-throughput sequencing data; (2) the analysis of microbial populations from 16S rDNA sequence data; (3) genotyping and genome-wide association studies; and (4) single nucleotide polymorphism imputation. A week-long hackathon was organized in August 2016 with participants from six African bioinformatics groups, and US and European collaborators. Two of the workflows are built using the Common Workflow Language framework (CWL) and two using Nextflow. All the workflows are containerized for improved portability and reproducibility using Docker, and are publicly available for use by members of the H3Africa consortium and the international research community.
Conclusion: The H3ABioNet workflows have been implemented with a view to offering ease of use for the end user and high levels of reproducibility and portability, all while following modern state-of-the-art bioinformatics data processing protocols. The H3ABioNet workflows will service the H3Africa consortium projects and are currently in use. All four workflows are also publicly available for research scientists worldwide to use and adapt for their respective needs. The H3ABioNet workflows will help develop bioinformatics capacity and assist genomics research within Africa and serve to increase the scientific output of H3Africa and its Pan-African Bioinformatics Network.
A preliminary study of minimal-contention locks
As multicore CPUs become more common, scalable synchronization primitives have wider use and ideas previously used in large-scale computation are worth re-opening for wider use. In this paper I explore one approach to scalable synchronization, a minimal-contention lock (M-lock). The key idea is to avoid spinning on a global variable but instead for each blocked task (process or thread) to spin on a local lock representing the task that immediately preceded it in attempting to acquire the lock. This creates an ordering based on the order in which tasks attempt to acquire the lock, preventing starvation. The only globally shared variable is a pointer to the next local lock to be contended for. Each contending task swaps the value of this pointer for a pointer to its own variable. It spins on the variable previously pointed to by the global pointer. Each waiting task spins on a lock only seen by itself and the owner of that lock variable. While a task is spinning, the lock variable can be held in its local cache until invalidated by the lock owner when it unsets the lock. Consequently, the amount of bus traffic is considerably less than with a spinlock, which has the pernicious feature that the task releasing the lock is delayed by all the other bus traffic arising from contention for the lock. An MCS lock has similar properties but is more complicated and requires more memory contention-causing operations. This paper outlines the design of the M-lock and provides a preliminary performance analysis
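The queueing scheme described in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the names `QueueSpinLock`, `_Flag`, `_swap_tail` and `run_demo` are hypothetical, and the structure is only a reading of the abstract (it resembles what is often called a CLH-style queue lock). Python has no hardware atomic exchange, so the swap on the global pointer is emulated with a small internal mutex; that stand-in is an assumption for illustration, and the sketch shows the ordering logic only, not the cache or bus behaviour the abstract discusses.

```python
import threading
import time


class _Flag:
    """One task's local lock variable: True while its owner holds or awaits the lock."""
    __slots__ = ("locked",)

    def __init__(self, locked):
        self.locked = locked


class QueueSpinLock:
    """Sketch of a queue lock where each waiter spins on its predecessor's flag."""

    def __init__(self):
        # The only globally shared variable: the flag the next arrival must wait on.
        # It starts as an already-released dummy so the first task proceeds at once.
        self._tail = _Flag(False)
        # ASSUMPTION: stand-in for a hardware atomic exchange instruction.
        self._swap_guard = threading.Lock()

    def _swap_tail(self, new_flag):
        # Emulated atomic swap: install our flag, return the previous one.
        with self._swap_guard:
            old_flag, self._tail = self._tail, new_flag
        return old_flag

    def acquire(self):
        mine = _Flag(True)
        predecessor = self._swap_tail(mine)
        # Spin only on the predecessor's flag, never on the global pointer.
        while predecessor.locked:
            time.sleep(0)  # yield to other threads instead of burning the GIL
        return mine  # token needed to release

    def release(self, mine):
        # Unsetting our own flag lets exactly one successor (if any) proceed.
        mine.locked = False


def run_demo(n_threads, iterations):
    """Increment a shared counter under the lock; return the final count."""
    lock = QueueSpinLock()
    state = {"count": 0}

    def worker():
        for _ in range(iterations):
            token = lock.acquire()
            state["count"] += 1  # critical section
            lock.release(token)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["count"]


if __name__ == "__main__":
    print(run_demo(4, 200))
```

Because each task swaps itself onto the global pointer in arrival order and waits only on the task directly ahead of it, the lock is granted in first-come order, which is the starvation-freedom property the abstract points to. In a compiled language the emulated swap would be replaced by a single atomic exchange on the pointer.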