296 research outputs found

Artificial neural network approach for selection of susceptible single nucleotide polymorphisms and construction of prediction model on childhood allergic asthma

Author: Hasegawa Yuko
Honda Hiroyuki
Kobayashi Takeshi
Shirakawa Taro
Suzuki Yoichi
Tomida Shuta
Tomita Yasuyuki
Publication venue: BioMed Central
Publication date: 01/01/2004

BACKGROUND: Screening of various gene markers such as single nucleotide polymorphisms (SNPs) and the correlation between these markers and the development of multifactorial disease have previously been studied. Here, we propose a susceptible marker-selectable artificial neural network (ANN) for predicting development of allergic disease. RESULTS: To predict development of childhood allergic asthma (CAA) and select susceptible SNPs, we used an ANN with a parameter decreasing method (PDM) to analyze 25 SNPs of 17 genes in 344 Japanese people, and selected 10 susceptible SNPs of CAA. The accuracy of the ANN model with 10 SNPs was 97.7% for learning data and 74.4% for evaluation data. Important combinations were determined by the effective combination value (ECV) defined in the present paper. Effective 2-SNP or 3-SNP combinations were found to be concentrated among the 10 selected SNPs. CONCLUSION: ANN can reliably select SNP combinations that are associated with CAA. Thus, the ANN can be used to characterize development of complex diseases caused by multiple factors. This is the first report of automatic selection of SNPs related to development of multifactorial disease from SNP data of more than 300 patients.
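
The abstract above does not spell out how the parameter decreasing method works, so the following is only a minimal sketch of a generic backward-elimination loop, which is one plausible reading of "decreasing" the input set from 25 SNPs to 10. The function name `backward_elimination`, the `score_fn` callback and the toy weights are all hypothetical; in a real setting `score_fn` would train and evaluate a model on the candidate subset.

```python
def backward_elimination(features, score_fn, n_keep):
    """Greedily drop one feature at a time until n_keep remain.

    At each step, every single-feature removal is tried and the subset
    with the highest score is kept. This is a generic technique, not
    necessarily the PDM described in the paper.
    """
    selected = list(features)
    while len(selected) > n_keep:
        candidates = [[f for f in selected if f != drop] for drop in selected]
        selected = max(candidates, key=score_fn)
    return selected


# Toy scoring function (an assumption for illustration): sum of fixed weights.
toy_weights = {"snp1": 3, "snp2": 1, "snp3": 2}


def toy_score(subset):
    return sum(toy_weights[f] for f in subset)


kept = backward_elimination(["snp1", "snp2", "snp3"], toy_score, n_keep=2)
```

With the toy weights, the least useful feature is the one removed first; a model-based `score_fn` would replace the weight sum with validation accuracy.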

Springer - Publisher Connector

PubMed Central

COMPARISON OF ANN METHOD AND LOGISTIC REGRESSION METHOD ON SINGLE NUCLEOTIDE POLYMORPHISM GENETIC DATA

Author: Setiawan Adi
Wijaya Rachel Wulan Nirmalasari
Publication venue: 'Universitas Pattimura'
Publication date: 16/04/2023

This study aims to determine the goodness of classification using the ANN method on asthma genetic data from the R package SNPassoc. SNP genetic data were transformed using codominant genetic traits: the genotypes AA, AC and CC were given scores of 0, 0.5 and 1, respectively, while CC, CT and TT were scored 0, 0.5 and 1, respectively. Scores follow alphabetical order, with the alphabetically earliest genotype receiving the lowest score. The average accuracy, precision, recall and F1 score of the neural network method were determined under this coding, with the proportion of test data varied over 10%, 20%, 30% and 40% and each setting repeated B = 1000 times. The results obtained were compared with the logistic regression method. With 20% test data, the ANN method gives accuracy, precision, recall and F1 scores of 0.7756, 0.7844, 0.9844 and 0.8728, respectively. When all information from various countries is used in the asthma genetic data, the logistic regression method gives higher average accuracy, precision and F1 scores than the ANN method, but the average recall is the opposite. When a separate analysis is performed for each country, the logistic regression method gives higher accuracy, precision, recall and F1 scores in the ANN method compared to the logistic regression method
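
The codominant scoring and the four reported metrics can be illustrated with a short Python sketch. The study itself works in R with SNPassoc; the function names `codominant_scores` and `classification_metrics` below are hypothetical, and the metric formulas are the standard binary-classification definitions rather than anything taken from the paper's code.

```python
def codominant_scores(genotype_levels):
    """Map genotypes to evenly spaced scores in [0, 1] by alphabetical order.

    For three levels this yields 0, 0.5 and 1, matching the abstract.
    """
    ordered = sorted(genotype_levels)
    return {g: i / (len(ordered) - 1) for i, g in enumerate(ordered)}


def classification_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall, f1) for binary labels 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, precision, recall, f1


ac_scores = codominant_scores(["AA", "AC", "CC"])
ct_scores = codominant_scores(["TT", "CC", "CT"])
metrics = classification_metrics([1, 1, 1, 0, 0], [1, 1, 0, 1, 0])
```

Averaging these metrics over repeated random train/test splits, as the study does with B = 1000, would wrap `classification_metrics` in a loop over resampled splits.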

OJS UNPATTI Publication Center (Universitas Pattimura)

A multifactorial analysis of obesity as CVD risk factor: Use of neural network based methods in a nutrigenetics context

Author: A Boutayeb
A Bureau
AA Motsinger
AE Duncan
AG Heidema
AJ Frint
BM Popkin
BV North
CF Sing
D Goldberg
DL McGee
DS Moore
I Arkadianos
I Valavanis
Ioannis K Valavanis
J Arifovic
J Robitaille
J Stevens
J Wakefield
J Xu
JM Ordovas
Keith A Grimaldi
Konstantina S Nikita
L Briollais
LW Hahn
MD Ritchie
MD Ritchie
MR Chernick
N Karnehed
PH Liu
PWF Wilson
R Nakamichi
R Sodjinou
S Canizales-Cuinteros
S Haykin
S Tomida
SG Mougiakakou
SG Mougiakakou
SG Mougiakakou
SM Hermann
SM Williams
Stavroula G Mougiakakou
TA Pearson
W Yu
Y Tomita
YM Cho
Z Wei
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Obesity is a multifactorial trait that constitutes an independent risk factor for cardiovascular disease (CVD). The aim of the current work is to study the complex etiology beneath obesity and identify genetic variations and/or factors related to nutrition that contribute to its variability. To this end, a set of more than 2300 white subjects who participated in a nutrigenetics study was used. For each subject a total of 63 factors describing genetic variants related to CVD (24 in total), gender, and nutrition (38 in total), e.g. average daily intake in calories and cholesterol, were measured. Each subject was categorized according to body mass index (BMI) as normal (BMI ≤ 25) or overweight (BMI > 25). Two artificial neural network (ANN) based methods were designed and used to analyze the available data. These corresponded to i) a multi-layer feed-forward ANN combined with a parameter decreasing method (PDM-ANN), and ii) a multi-layer feed-forward ANN trained by a hybrid method (GA-ANN) which combines genetic algorithms and the popular back-propagation training algorithm. Results: PDM-ANN and GA-ANN were comparatively assessed in terms of their ability to identify the most important factors among the initial 63 variables describing genetic variations, nutrition and gender, able to classify a subject into one of the BMI-related classes: normal and overweight. The methods were designed and evaluated using appropriate training and testing sets provided by 3-fold Cross Validation (3-CV) resampling. Classification accuracy, sensitivity, specificity and area under the receiver operating characteristic curve were used to evaluate the resulting predictive ANN models. The most parsimonious set of factors was obtained by the GA-ANN method and included gender, six genetic variations and 18 nutrition-related variables. The corresponding predictive model was characterized by a mean accuracy of 61.46% in the 3-CV testing sets.
Conclusions: The ANN based methods revealed factors that interactively contribute to obesity trait and provided predictive models with a promising generalization ability. In general, results showed that ANNs and their hybrids can provide useful tools for the study of complex traits in the context of nutrigenetics.
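
As a rough illustration of the evaluation setup described in this abstract, the sketch below labels subjects by the stated BMI cut-off, builds 3-fold cross-validation splits, and computes sensitivity and specificity. All function names are hypothetical, the shuffling scheme is an assumption (the abstract does not say how folds were drawn), and no model training is shown.

```python
import random


def bmi_class(bmi):
    """Return 1 for overweight (BMI > 25) and 0 for normal (BMI <= 25)."""
    return 1 if bmi > 25 else 0


def k_fold_indices(n_samples, k=3, seed=0):
    """Yield (train, test) index lists for k roughly equal shuffled folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return sens, spec


splits = list(k_fold_indices(10, k=3))
```

A mean testing accuracy such as the reported 61.46% would be obtained by fitting a model on each `train` list, scoring it on the matching `test` list, and averaging the three results.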

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

DSpace at NTUA

Bern Open Repository and Information System (BORIS)

Machine learning in genome-wide association studies

Author: Amos
Amos
Arshadi
Breiman
Breiman
Breiman
Croiseau
Cupples
D'Angelo
Dasarathy
Dietterich
Díaz-Uriarte
Easton
Frazer
Freund
Friedman
Friedman
García-Magariños
González-Recio
Hastie
Heckerman
Hoerl
Kim
Kraja
Lettre
Malo
Marchini
Meier
Mohlke
Park
Pearl
Plenge
Ripley
Rumelhart
Samani
Schwarz
Schwarz
Sebastiani
Stassen
Strobl
Sun
Sun
Sun
Tang
Tibshirani
Tomita
Vapnik
Wan
Wang
Wu
Yang
Yuan
Ziegler
Publication venue: 'Wiley'
Publication date: 01/01/2009

Recently, genome-wide association studies have substantially expanded our knowledge about genetic variants that influence the susceptibility to complex diseases. Although standard statistical tests for each single-nucleotide polymorphism (SNP) separately are able to capture main genetic effects, different approaches are necessary to identify SNPs that influence disease risk jointly or in complex interactions. Experimental and simulated genome-wide SNP data provided by the Genetic Analysis Workshop 16 afforded an opportunity to analyze the applicability and benefit of several machine learning methods. Penalized regression, ensemble methods, and network analyses resulted in several new findings while known and simulated genetic risk variants were also identified. In conclusion, machine learning approaches are promising complements to standard single- and multi-SNP analysis methods for understanding the overall genetic architecture of complex human diseases. However, because they are not optimized for genome-wide SNP data, improved implementations and new variable selection procedures are required. Genet. Epidemiol. 33 (Suppl. 1):S51–S57, 2009. © 2009 Wiley-Liss, Inc. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/64533/1/20473_ftp.pd

Crossref

Deep Blue Documents at the University of Michigan

Neural networks for genetic epidemiology: past, present, and future

During the past two decades, the field of human genetics has experienced an information explosion. The completion of the human genome project and the development of high throughput SNP technologies have created a wealth of data; however, the analysis and interpretation of these data have created a research bottleneck. While technology facilitates the measurement of hundreds or thousands of genes, statistical and computational methodologies are lacking for the analysis of these data. New statistical methods and variable selection strategies must be explored for identifying disease susceptibility genes for common, complex diseases. Neural networks (NN) are a class of pattern recognition methods that have been successfully implemented for data mining and prediction in a variety of fields. The application of NN for statistical genetics studies is an active area of research. Neural networks have been applied in both linkage and association analysis for the identification of disease susceptibility genes

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

A genetic ensemble approach for gene-gene interaction identification

Author: A Beygelzimer
A Tsymbal
AA Motsinger
AG Heidema
Albert Y Zomaya
B McKinney
Bing B Zhou
C Greene
D Arking
D Nielsen
D Quigley
D Ruta
D Ruta
D Thomas
D Velez
E Rogaeva
G Bontempi
G Brown
H Cordell
H Cordell
H Zhang
J Cleary
J Hoh
J Kittler
J Moore
JC Barrett
JH Moore
JH Moore
JL Haines
Joshua WK Ho
L Breiman
L Briollais
L Lam
LE Mechanic
LI Kuncheva
LW Hahn
M Kudo
M Ritchie
MR Nelson
P Lucek
Pengyi Yang
R Duerr
R Klein
R Somorjai
S Cantor
S Fisher
S Schmidt
SK Musani
TG Dietterich
X Chen
Y Freund
Y Tomita
Z Zhang
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: It has now become clear that gene-gene interactions and gene-environment interactions are ubiquitous and fundamental mechanisms for the development of complex diseases. Though a considerable effort has been put into developing statistical models and algorithmic strategies for identifying such interactions, the accurate identification of those genetic interactions has proven very challenging. Methods: In this paper, we propose a new approach for identifying such gene-gene and gene-environment interactions underlying complex diseases. This is a hybrid algorithm that combines a genetic algorithm (GA) and an ensemble of classifiers (called genetic ensemble). Using this approach, the original problem of SNP interaction identification is converted into a data mining problem of combinatorial feature selection. By collecting various single nucleotide polymorphism (SNP) subsets as well as environmental factors generated in multiple GA runs, patterns of gene-gene and gene-environment interactions can be extracted using a simple combinatorial ranking method. Also considered in this study is the idea of combining identification results obtained from multiple algorithms. A novel formula based on pairwise double fault is designed to quantify the degree of complementarity. Conclusions: Our simulation study demonstrates that the proposed genetic ensemble algorithm has comparable identification power to Multifactor Dimensionality Reduction (MDR) and is slightly better than Polymorphism Interaction Analysis (PIA), which are the two most popular methods for gene-gene interaction identification. More importantly, the identification results generated by using our genetic ensemble algorithm are highly complementary to those obtained by PIA and MDR.
Experimental results from our simulation studies and real world data application also confirm the effectiveness of the proposed genetic ensemble algorithm, as well as the potential benefits of combining identification results from different algorithms.
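
The abstract mentions a novel formula based on the pairwise double fault but does not give it. As background only, here is a minimal sketch of the standard double-fault measure from the ensemble-diversity literature: the fraction of samples that both classifiers get wrong. The function name `double_fault` is hypothetical, and this is the underlying quantity, not the paper's complementarity formula.

```python
def double_fault(y_true, pred_a, pred_b):
    """Fraction of samples misclassified by both classifier A and classifier B.

    Lower values suggest the two classifiers fail on different samples,
    so their outputs are more likely to complement each other.
    """
    if not y_true:
        raise ValueError("y_true must not be empty")
    both_wrong = sum(
        1 for t, a, b in zip(y_true, pred_a, pred_b) if a != t and b != t
    )
    return both_wrong / len(y_true)


# Classifier A errs on samples 1 and 2; classifier B errs on samples 0 and 1.
df = double_fault([1, 0, 1, 0], [1, 1, 0, 0], [0, 1, 1, 0])
```

Here only sample 1 is missed by both classifiers, so one quarter of the samples count as a double fault.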

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

TRIO LOGIC REGRESSION - DETECTION OF SNP - SNP INTERACTIONS IN CASE-PARENT TRIOS

Author: Fallin M. Daniele
Li Qing
Louis Thomas A.
Ruczinski Ingo
Publication venue: Collection of Biostatistics Research Archive
Publication date: 01/07/2009

Statistical approaches to evaluate higher order SNP-SNP and SNP-environment interactions are critical in genetic association studies, as susceptibility to complex disease is likely to be related to the interaction of multiple SNPs and environmental factors. Logic regression (Kooperberg et al., 2001; Ruczinski et al., 2003) is one such approach, where interactions between SNPs and environmental variables are assessed in a regression framework, and interactions become part of the model search space. In this manuscript we extend the logic regression methodology, originally developed for cohort and case-control studies, for studies of trios with affected probands. Trio logic regression accounts for the linkage disequilibrium (LD) structure in the genotype data, and accommodates missing genotypes via haplotype-based imputation. We also derive an efficient algorithm to simulate case-parent trios where genetic risk is determined via epistatic interactions

Collection Of Biostatistics Research Archive

Large–scale data–driven network analysis of human–plasmodium falciparum interactome: extracting essential targets and processes for malaria drug discovery

Author: Agamah Francis Edem
Publication venue: Department of Pathology
Publication date: 09/09/2020

Background: Plasmodium falciparum malaria is an infectious disease considered to have great impact on public health due to its associated high mortality rates, especially in sub-Saharan Africa. Falciparum drug-resistant strains in Africa, notably those resistant to chloroquine and sulfadoxine-pyrimethamine, are traced mainly to Southeast Asia, where the artemisinin resistance rate is increasing. Although careful surveillance to monitor the emergence and spread of artemisinin-resistant parasite strains in Africa is on-going, research into new drugs, particularly for African populations, is critical since there is no replacement drug for artemisinin combination therapies (ACTs) yet. Objective: The overall objective of this study is to identify potential protein targets through host–pathogen protein–protein functional interaction network analysis to understand the underlying mechanisms of drug failure and identify those essential targets that can play their role in predicting potential drug candidates specific to the African populations through a protein-based approach of both host and Plasmodium falciparum genomic analysis. Methods: We leveraged malaria-specific genome wide association study summary statistics data obtained from Gambia, Kenya and Malawi populations, Plasmodium falciparum selective pressure variants and functional datasets (protein sequences, interologs, host-pathogen intra-organism and host-pathogen inter-organism protein-protein interactions (PPIs)) from various sources (STRING, Reactome, HPID, UniProt, IntAct and literature) to construct an overlapping functional network for both host and pathogen. Developed algorithms and a large-scale data-driven computational framework were used in this study to analyze the datasets and the constructed networks to identify densely connected subnetworks or hubs essential for network stability and integrity.
The host-pathogen network was analyzed to elucidate the influence of parasite candidate key proteins within the network and predict possible resistant pathways due to host-pathogen candidate key protein interactions. We performed biological and pathway enrichment analysis on critical proteins identified to elucidate their functions. In order to leverage disease-target-drug relationships to identify potential repurposable already approved drug candidates that could be used to treat malaria, pharmaceutical datasets from DrugBank were explored using a semantic similarity approach based on target–associated biological processes. Results: About 600,000 significant SNPs (p-value < 0.05) from the summary statistics data were mapped to their associated genes, and we identified 79 human-associated malaria genes. The assembled parasite network comprised 8 clusters containing 799 functional interactions between 155 reviewed proteins, of which 5 clusters contained 43 key proteins (selective variants) and 2 clusters contained 2 candidate key proteins (key proteins characterized by a high centrality measure), C6KTB7 and C6KTD2. The human network comprised 32 clusters containing 4,133,136 interactions between 20,329 unique reviewed proteins, of which 7 clusters contained 760 key proteins and 2 clusters contained 6 significant human malaria-associated candidate key proteins or genes: P22301 (IL10), P05362 (ICAM1), P01375 (TNF), P30480 (HLA-B), P16284 (PECAM1) and O00206 (TLR4). The generated host-pathogen network comprised 31,512 functional interactions between 8,023 host and pathogen proteins. We also explored the association of the pfk13 gene within the host-pathogen network. We observed that pfk13 clusters with host kelch–like proteins and other regulatory genes but has no direct association with our identified host candidate key malaria targets.
We implemented a semantic similarity-based approach complemented by Kappa and Jaccard statistical measures to identify 115 malaria–similar diseases and 26 potential repurposable drug hits that can be appropriated experimentally for malaria treatment. Conclusion: In this study, we reviewed existing antimalarial drugs and resistance–associated variants contributing to the diminished sensitivity of antimalarials, especially chloroquine, sulfadoxine-pyrimethamine and artemisinin combination therapy within the African population. We also described various computational techniques implemented in predicting drug targets and leads in drug research. In our data analysis, we showed that possible mechanisms of resistance to artemisinin in Africa may arise from the combinatorial effects of many resistant genes to chloroquine and sulfadoxine–pyrimethamine. We investigated the role of pfk13 within the host–pathogen network. We predicted key targets that have been proposed to be essential for malaria drug and vaccine development through structural and functional analysis of host and pathogen function networks. Based on our analysis, we propose these targets as essential co-targets for combinatorial malaria drug discovery
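
The abstract names Jaccard and Kappa measures without saying how they were applied. The sketch below shows the textbook definitions only: the Jaccard index on two sets, and Cohen's kappa on two binary annotation vectors. Treating diseases as sets of associated biological processes is an assumption made for illustration, and the function names and example process labels are hypothetical.

```python
def jaccard(set_a, set_b):
    """Jaccard index: size of the intersection divided by size of the union."""
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length binary (0/1) label vectors."""
    n = len(labels_a)
    observed = sum(1 for a, b in zip(labels_a, labels_b) if a == b) / n
    p_a1 = sum(labels_a) / n
    p_b1 = sum(labels_b) / n
    expected = p_a1 * p_b1 + (1 - p_a1) * (1 - p_b1)
    if expected == 1:
        return 1.0  # both vectors are constant and identical
    return (observed - expected) / (1 - expected)


# Hypothetical process annotations for two diseases.
disease_x = {"immune response", "cell adhesion", "inflammation"}
disease_y = {"cell adhesion", "inflammation", "apoptosis"}
j = jaccard(disease_x, disease_y)
k = cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0])
```

Jaccard ignores shared absences, while kappa corrects raw agreement for chance, which is one reason the two are sometimes reported together.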

Cape Town University OpenUCT

Discovering Higher-order SNP Interactions in High-dimensional Genomic Data

Author: Uppu Suneetha
Publication venue: Curtin University
Publication date: 01/01/2018

In this thesis, a multifactor dimensionality reduction based method on associative classification is employed to identify higher-order SNP interactions for enhancing the understanding of the genetic architecture of complex diseases. Further, this thesis explored the application of deep learning techniques by providing new clues into the interaction analysis. The performance of the deep learning method is maximized by unifying deep neural networks with a random forest for achieving reliable interactions in the presence of noise

espace@Curtin

Recommended from our members

In vitro expanded human CD4+CD25+ regulatory T cells suppress effector T cell proliferation.

Author: Bluestone JA
Bonyhadi ML
Earle KE
Liu W
Tang Q
Zhou X
Zhu S
Publication venue: eScholarship, University of California
Publication date: 01/04/2005

Regulatory T cells (Tregs) have been shown to be critical in the balance between autoimmunity and tolerance and have been implicated in several human autoimmune diseases. However, the small number of Tregs in peripheral blood limits their therapeutic potential. Therefore, we developed a protocol that would allow for the expansion of Tregs while retaining their suppressive activity. We isolated CD4+CD25 hi cells from human peripheral blood and expanded them in vitro in the presence of anti-CD3 and anti-CD28 magnetic Xcyte Dynabeads and high concentrations of exogenous Interleukin (IL)-2. Tregs were effectively expanded up to 200-fold while maintaining surface expression of CD25 and other markers of Tregs: CD62L, HLA-DR, CCR6, and FOXP3. The expanded Tregs suppressed proliferation and cytokine secretion of responder PBMCs in co-cultures stimulated with anti-CD3 or alloantigen. Treg expansion is a critical first step before consideration of Tregs as a therapeutic intervention in patients with autoimmune or graft-versus-host disease

eScholarship - University of California