Allele-specific network reveals combinatorial interaction that transcends small effects in psoriasis GWAS
Hundreds of genetic markers have shown associations with various complex diseases, yet the "missing heritability" remains alarmingly elusive. Combinatorial interactions may account for a substantial portion of this missing heritability, but their discoveries have been impeded by computational complexity and genetic heterogeneity. We present BlocBuster, a novel systems-level approach that efficiently constructs genome-wide, allele-specific networks that accurately segregate homogeneous combinations of genetic factors, tests the associations of these combinations with the given phenotype, and rigorously validates the results using a series of unbiased validation methods. BlocBuster employs a correlation measure that is customized for single nucleotide polymorphisms and returns a multi-faceted collection of values that captures genetic heterogeneity. We applied BlocBuster to analyze psoriasis, discovering a combinatorial pattern with an odds ratio of 3.64 and Bonferroni-corrected p-value of 5.01 × 10⁻¹⁶. This pattern was replicated in independent data, reflecting robustness of the method. In addition to improving prediction of disease susceptibility and broadening our understanding of the pathogenesis underlying psoriasis, these results demonstrate BlocBuster's potential for discovering combinatorial genetic associations within heterogeneous genome-wide data, thereby transcending the limiting "small effects" produced by individual markers examined in isolation.
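The two statistics quoted in this abstract, an odds ratio and a Bonferroni-corrected p-value, can be illustrated with a minimal sketch. The counts, the raw p-value, and the number of tests below are made-up values chosen only for illustration; they are not data from the study, and the function names are hypothetical.

```python
def odds_ratio(case_with, case_without, ctrl_with, ctrl_without):
    """Odds of carrying a pattern among cases divided by the odds among controls."""
    return (case_with * ctrl_without) / (case_without * ctrl_with)


def bonferroni(p_value, n_tests):
    """Bonferroni correction: multiply by the number of tests, capped at 1."""
    return min(1.0, p_value * n_tests)


# Hypothetical 2x2 table: pattern carriers and non-carriers in cases and controls.
or_pattern = odds_ratio(case_with=120, case_without=880,
                        ctrl_with=40, ctrl_without=960)

# Hypothetical raw p-value corrected for 1,000 tested patterns.
p_corrected = bonferroni(1e-6, 1000)

print(f"odds ratio = {or_pattern:.2f}, corrected p = {p_corrected:.1e}")
```

An odds ratio above 1 means the pattern is more common in cases than in controls; the correction guards against false positives when many candidate patterns are tested.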
A Parallelized Implementation of Cut-and-Solve and a Streamlined Mixed-Integer Linear Programming Model for Finding Genetic Patterns Optimally Associated with Complex Diseases
With the advent of genetic sequencing, there was much hope of finding the inherited elements underlying complex diseases, such as late-onset Alzheimer's disease (AD), but it has been a challenge to fully uncover the necessary information hidden in the data. A likely contributor to this failure is the fact that the pathogenesis of most complex diseases does not involve single markers working alone, but patterns of genetic markers interacting additively or epistatically. But as we move beyond patterns of size two, it quickly becomes computationally infeasible to examine all combinations in the solution space. A common approach to this type of combinatorial optimization problem is to model it as a mixed-integer linear program (MIP) and solve it using the branch-and-cut algorithm, as implemented by a commercial solver. However, with the trend of using increasing numbers of computing cores to increase computational power, there is a need for a different approach to solving MIPs that can utilize parallel environments. Here we show how a parallelized implementation of an alternative algorithm, cut-and-solve, can be used to solve this genetics problem faster than CPLEX, one of the leading commercial MIP solvers.
NetMoST: A network-based machine learning approach for subtyping schizophrenia using polygenic SNP allele biomarkers
Subtyping neuropsychiatric disorders like schizophrenia is essential for
improving the diagnosis and treatment of complex diseases. Subtyping
schizophrenia is challenging because it is polygenic and genetically
heterogeneous, rendering the standard symptom-based diagnosis often unreliable
and unrepeatable. We developed a novel network-based machine-learning approach,
netMoST, for subtyping psychiatric disorders. NetMoST identifies polygenic risk
SNP-allele modules from genome-wide genotyping data as polygenic haplotype
biomarkers (PHBs) for disease subtyping. We applied netMoST to subtype a cohort
of schizophrenia subjects into three distinct biotypes with differentiable
genetic, neuroimaging and functional characteristics. The PHBs of the first
biotype (36.9% of all patients) were related to neurodevelopment and cognition,
the PHBs of the second biotype (28.4%) were enriched for neuroimmune functions,
and the PHBs of the third biotype (34.7%) were associated with the transport of
calcium ions and neurotransmitters. Neuroimaging patterns provided additional
support to the new biotypes, with unique regional homogeneity (ReHo) patterns
observed in the brains of each biotype compared with healthy controls. Our
findings demonstrated netMoST's capability for uncovering novel biotypes of
complex diseases such as schizophrenia. The results also showed the power of
exploring polygenic allelic patterns that transcend the conventional GWAS
approaches.
Pleiotropic and Epistatic Network-Based Discovery: Integrated Networks for Target Gene Discovery
Biological organisms are complex systems that are composed of functional networks of interacting molecules and macro-molecules. Complex phenotypes are the result of orchestrated, hierarchical, heterogeneous collections of expressed genomic variants. However, the effects of these variants are the result of historic selective pressure and current environmental and epigenetic signals, and, as such, their co-occurrence can be seen as genome-wide correlations in a number of different manners. Biomass recalcitrance (i.e., the resistance of plants to degradation or deconstruction, which ultimately enables access to a plant's sugars) is a complex polygenic phenotype of high importance to biofuels initiatives. This study makes use of data derived from the re-sequenced genomes from over 800 different Populus trichocarpa genotypes in combination with metabolomic and pyMBMS data across this population, as well as co-expression and co-methylation networks in order to better understand the molecular interactions involved in recalcitrance, and identify target genes involved in lignin biosynthesis/degradation. A Lines Of Evidence (LOE) scoring system is developed to integrate the information in the different layers and quantify the number of lines of evidence linking genes to target functions. This new scoring system was applied to quantify the lines of evidence linking genes to lignin-related genes and phenotypes across the network layers, and allowed for the generation of new hypotheses surrounding potential new candidate genes involved in lignin biosynthesis in P. trichocarpa, including various AGAMOUS-LIKE genes.
The resulting Genome Wide Association Study networks, integrated with Single Nucleotide Polymorphism (SNP) correlation, co-methylation, and co-expression networks through the LOE scores, are proving to be a powerful approach to determine the pleiotropic and epistatic relationships underlying cellular functions and, as such, the molecular basis for complex phenotypes, such as recalcitrance.
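The idea of counting lines of evidence across network layers can be sketched as follows. This is only an illustrative simplification, assuming that each layer is a set of undirected gene-gene edges and that a gene earns one line of evidence per layer in which it is linked to any target gene; the scoring used in the study may differ. The layer names, gene names, and the `loe_scores` function are hypothetical.

```python
def loe_scores(layers, targets):
    """Count, for each non-target gene, the layers linking it to any target gene."""
    scores = {}
    for edges in layers.values():
        linked = set()  # genes tied to a target within this layer
        for a, b in edges:
            if a in targets and b not in targets:
                linked.add(b)
            if b in targets and a not in targets:
                linked.add(a)
        for gene in linked:
            scores[gene] = scores.get(gene, 0) + 1  # one line of evidence per layer
    return scores


# Hypothetical network layers; "lignin1" and "lignin2" stand in for target genes.
layers = {
    "co_expression": {("geneA", "lignin1"), ("geneB", "lignin1")},
    "co_methylation": {("geneA", "lignin2")},
    "snp_correlation": {("lignin1", "geneA"), ("geneC", "geneB")},
}
targets = {"lignin1", "lignin2"}

scores = loe_scores(layers, targets)
for gene, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(gene, score)
```

Genes supported by more layers rank higher, which is the sense in which such a score can prioritize candidates for follow-up.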
Network and multi-scale signal analysis for the integration of large omic datasets: applications in Populus trichocarpa
Poplar species are promising sources of cellulosic biomass for biofuels because of their fast growth rate, high cellulose content and moderate lignin content. There is an increasing movement toward integrating multiple layers of 'omics data in a systems biology approach to understand gene-phenotype relationships and assist in plant breeding programs. This dissertation involves the use of network and signal processing techniques for the combined analysis of these various data types, for the goals of (1) increasing fundamental knowledge of P. trichocarpa and (2) facilitating the generation of hypotheses about target genes and phenotypes of interest. A data integration "Lines of Evidence" method is presented for the identification and prioritization of target genes involved in functions of interest. A new post-GWAS method, Pleiotropy Decomposition, is presented, which extracts pleiotropic relationships between genes and phenotypes from GWAS results, allowing for identification of genes with signatures favorable to genome editing. Continuous wavelet transform signal processing analysis is applied in the characterization of genome distributions of various features (including variant density, gene density, and methylation profiles) in order to identify chromosome structures such as the centromere. This resulted in the approximate centromere locations on all P. trichocarpa chromosomes, which had previously not been adequately reported in the scientific literature. Discrete wavelet transform signal processing followed by correlation analysis was applied to genomic features from various data types including transposable element density, methylation density, SNP density, gene density, centromere position and putative ancestral centromere position. Subsequent correlation analysis of the resulting wavelet coefficients identified scale-specific relationships between these genomic features, and provided insights into the evolution of the genome structure of P. trichocarpa.
These methods have provided strategies both to increase fundamental knowledge about the P. trichocarpa system and to identify new target genes related to biofuels targets. We intend that these approaches will ultimately be used in designing better plants for more efficient and sustainable production of bioenergy.
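The combination of a discrete wavelet transform with per-scale correlation mentioned in this abstract can be sketched in a few lines. This is a minimal illustration assuming a Haar wavelet and Pearson correlation, neither of which is confirmed by the abstract; the two density signals are invented, and the function names are hypothetical.

```python
import math


def haar_details(signal):
    """Haar detail coefficients per scale, finest first (length must be a power of two)."""
    details = []
    approx = list(signal)
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / math.sqrt(2) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return details


def pearson(x, y):
    """Pearson correlation, or None when it is undefined (too few points or zero variance)."""
    n = len(x)
    if n < 2:
        return None
    mx, my = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return None
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (sx * sy)


# Invented density signals along a chromosome, binned into 8 windows.
gene_density = [1, 3, 2, 6, 5, 9, 4, 8]
snp_density = [2 * v for v in gene_density]

# Correlate the detail coefficients of the two signals scale by scale.
scale_corr = [pearson(dx, dy)
              for dx, dy in zip(haar_details(gene_density), haar_details(snp_density))]
print(scale_corr)
```

Each entry of `scale_corr` describes how the two features co-vary at one scale; the coarsest scale has a single coefficient, so its correlation is left undefined.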
Wavelet-Based Genomic Signal Processing for Centromere Identification and Hypothesis Generation
Various 'omics data types have been generated for Populus trichocarpa, each providing a layer of information which can be represented as a density signal across a chromosome. We make use of genome sequence data, variant data across a population, and methylation data across 10 different tissues, combined with wavelet-based signal processing, to perform a comprehensive analysis of the signature of the centromere in these different data signals, and successfully identify putative centromeric regions in P. trichocarpa from these signals. Furthermore, using SNP (single nucleotide polymorphism) correlations across a natural population of P. trichocarpa, we find evidence for the co-evolution of the centromeric histone CENH3 with the sequence of the newly identified centromeric regions, and identify a new CENH3 candidate in P. trichocarpa.
Efficient Reduced BIAS Genetic Algorithm for Generic Community Detection Objectives
The problem of community structure identification has been an extensively investigated area for biology, physics, social sciences, and computer science in recent years for studying the properties of networks representing complex relationships. Most traditional methods, such as K-means and hierarchical clustering, are based on the assumption that communities have spherical configurations. Lately, Genetic Algorithms (GA) are being utilized for efficient community detection without imposing sphericity. GAs are machine learning methods which mimic natural selection and scale with the complexity of the network. However, traditional GA approaches employ a representation method that dramatically increases the solution space to be searched by introducing redundancies. They also utilize a crossover operator which imposes a linear ordering that is not suitable for community detection.
The algorithm presented here is a framework to detect communities for complex biological networks that removes both redundancies and linearity. We also introduce a novel operator, named Gene Repair. This algorithm is unique as it is a flexible community detection technique aimed at maximizing the value of any given mathematical objective for the network. We reduce the memory requirements by representing chromosomes as a 3-dimensional bit array. Furthermore, in order to increase diversity while retaining promising chromosomes, we use a natural selection process based on tournament selection with elitism. Additionally, our approach doesn't require prior information about the number of true communities in the network. We apply our novel algorithm to benchmark datasets and also to a network representing a large cohort of AD cases and controls.
By utilizing this efficient and flexible implementation that is cognizant of characteristics for networks representing complex disease genetics, we sift out communities representing patterns of interacting genetic variants that are associated with this enigmatic disease.
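Tournament selection with elitism, the selection scheme named in this abstract, can be sketched generically as follows. This is not the authors' implementation: the bit-tuple chromosomes, the toy fitness function, and the parameter names are assumptions made for illustration, and the Gene Repair operator and the 3-dimensional bit-array representation are not modeled.

```python
import random


def select_next_generation(population, fitness, n_elite=1, tournament_size=3, rng=None):
    """Keep the n_elite fittest chromosomes, then fill the rest by tournament selection."""
    if rng is None:
        rng = random.Random()
    # Elitism: the best chromosomes survive unchanged.
    next_gen = sorted(population, key=fitness, reverse=True)[:n_elite]
    # Tournaments: sample a few contenders at random and keep the fittest of them.
    while len(next_gen) < len(population):
        contenders = rng.sample(population, tournament_size)
        next_gen.append(max(contenders, key=fitness))
    return next_gen


# Toy chromosomes as bit tuples; the toy fitness is simply the number of set bits.
population = [
    (0, 0, 0, 1), (1, 0, 0, 1), (1, 1, 0, 1),
    (0, 1, 0, 0), (1, 1, 1, 1), (0, 0, 0, 0),
]
selected = select_next_generation(population, fitness=sum, rng=random.Random(0))
print(selected)
```

Elitism guarantees that the best solution found so far is never lost, while the random tournaments keep some weaker chromosomes in play and so preserve diversity.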
Oral Chinese herbal medicine for psoriasis vulgaris
Background: Psoriasis vulgaris is a common chronic immunological inflammatory skin disease without cure. Psoriasis is associated with increased risk of serious co-morbidities such as cardiovascular disease and diabetes. Current conventional therapies can be expensive and commonly have adverse effects. Hence further effective and safe therapies are needed. Chinese herbal medicine (CHM) has been utilised for centuries for the management of psoriasis. This thesis reviewed its current clinical and experimental psoriasis evidence and aimed to develop and test an evidence-based CHM formulation. Methods: Firstly, two systematic reviews and meta-analyses were conducted using the Cochrane Library Systematic Review Method to examine published CHM research for psoriasis: 1) randomised controlled trials (RCTs) of oral CHM for psoriasis vulgaris compared with placebo; 2) RCTs combining oral CHM with conventional therapy for psoriasis vulgaris. Secondly, in vitro and in vivo data for commonly used CHM phytochemicals were reviewed to evaluate their potential biological psoriatic mechanisms. Lastly, based on the review findings and available treatment guidelines, an optimised oral CHM formulation (PSORI-CM01) was developed for psoriasis vulgaris and then clinically evaluated via a rigorously designed pilot placebo-controlled double-blind RCT. Results: Literature review indicated mild–moderate psoriasis is undertreated, and topical treatments have limited efficacy. Systematic review found oral CHM has benefit compared with placebo; however, the effect size is relatively small. Further systematic review of CHM combined with conventional therapy indicated it increases effects and reduces adverse events. Subsequent in vitro and in vivo review recognised that constituents of Paeonia lactiflora Pallas and Paeonia veitchii Lynch, and those of other CHM, act on pathways similar to conventional psoriasis drugs.
The pilot RCT investigates PSORI-CM01 in a mild to moderate psoriasis vulgaris (psoriasis area severity index (PASI) 7–12) population. Thirty participants undergo a two-week run-in phase and then receive 12 weeks of PSORI-CM01 plus calcipotriol or placebo plus calcipotriol. The pilot study is to determine the feasibility of an adequately powered RCT. The primary outcome is PASI change (%), and secondary measures include PASI 75 rate, QoL change (dermatology life quality index (DLQI) and Skindex 29), acceptability of treatment, change in psoriasis-related cytokines (such as TNF-α and IL-23) and adverse events. Blood plasma specimens are collected at weeks –2, 12 and 24, and concentrations of inflammatory cytokines are then measured via a multi-assay technique (Bio-Plex® Multiplex System). The pilot RCT is ongoing; interim baseline analysis (n=11) indicated a mean PASI score of 9.0±2.4 and a DLQI score of 10±7.6. Conclusion: Oral CHM has promising efficacy for psoriasis and, combined with conventional treatments, enhances effects. Evidence suggests combined treatment is safe; however, long-term follow-up data are limited. Efficacy of CHM is related to the mechanistic actions of contained constituents, some of which coincide with conventional drug treatment targets. PSORI-CM01 has in vitro and in vivo evidence indicating its therapeutic benefit is via modulation of known psoriatic biological pathways. The current pilot will provide data on the feasibility of a larger-scale study and provide preliminary data for PSORI-CM01 efficacy and safety.