Development of an epileptic seizure prediction algorithm using R–R intervals with self-attentive autoencoder
Epilepsy is a neurological disorder that may affect the autonomic nervous system (ANS) from 15 to 20 min before seizure onset, and disturbances of the ANS affect R–R intervals (RRI) on an electrocardiogram (ECG). This study aims to develop a machine learning algorithm for predicting focal epileptic seizures by monitoring RRI data in real time. The developed algorithm adopts a self-attentive autoencoder (SA-AE), which is a neural network for time-series data. The results of applying the developed seizure prediction algorithm to clinical data demonstrated that it functioned well in most patients; however, false positives (FPs) occurred in specific participants. In future work, we will investigate the causes of FPs and optimize the seizure prediction algorithm to further improve performance using newly added clinical data.
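For readers unfamiliar with the input signal, an RRI series is the sequence of time gaps between successive R peaks on the ECG. The following is a minimal Python sketch of that conversion, assuming R-peak timestamps in seconds are already available from some peak detector. The function name and the sample values are illustrative and are not taken from the study, which does not describe its preprocessing in this abstract.

```python
def rr_intervals_ms(r_peak_times_s):
    """Convert R-peak timestamps (seconds) into successive R-R intervals (milliseconds).

    Hypothetical helper for illustration only; not the authors' preprocessing code.
    """
    pairs = list(zip(r_peak_times_s, r_peak_times_s[1:]))
    # Each R peak must occur strictly after the previous one.
    if any(later <= earlier for earlier, later in pairs):
        raise ValueError("R-peak times must be strictly increasing")
    return [(later - earlier) * 1000.0 for earlier, later in pairs]


# Example with made-up peak times: three beats yield two intervals.
intervals = rr_intervals_ms([0.0, 1.0, 3.0])
```

A series like this, updated beat by beat, is the kind of input a real-time RRI monitor would consume.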
FexSplice: A LightGBM-Based Model for Predicting the Splicing Effect of a Single Nucleotide Variant Affecting the First Nucleotide G of an Exon
Single nucleotide variants (SNVs) affecting the first nucleotide G of an exon (Fex-SNVs) identified in various diseases are mostly recognized as missense or nonsense variants. Their effect on pre-mRNA splicing has seldom been analyzed, and no curated database is available. We previously reported that Fex-SNVs affect splicing when the polypyrimidine tract is short or degenerate. However, we cannot readily predict the splicing effects of Fex-SNVs. We here scrutinized the available literature and identified 106 splicing-affecting Fex-SNVs based on experimental evidence. We similarly identified 106 neutral Fex-SNVs in the dbSNP database with a global minor allele frequency (MAF) of more than 0.01 and less than 0.50. We extracted 115 features representing the strength of splicing cis-elements and developed machine-learning models with support vector machine, random forest, and gradient boosting to discriminate between splicing-affecting and neutral Fex-SNVs. Gradient boosting-based LightGBM outperformed the other two models, and the length and nucleotide compositions of the polypyrimidine tract played critical roles in the discrimination. Recursive feature elimination showed that the LightGBM model using 15 features achieved the best performance, with an accuracy of 0.80 ± 0.12 (mean and SD), a Matthews correlation coefficient (MCC) of 0.57 ± 0.15, an area under the receiver operating characteristic curve (AUROC) of 0.86 ± 0.08, and an area under the precision–recall curve (AUPRC) of 0.87 ± 0.09 in 10-fold cross-validation. We developed a web service program named FexSplice that accepts a genomic coordinate on either GRCh37/hg19 or GRCh38/hg38 and returns the predicted probability of aberrant splicing for the A, C, and T variants.
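As an aid to reading the reported metrics, the sketch below shows how an MCC can be computed from a binary confusion matrix and how per-fold scores can be summarized as mean and SD. It uses only the Python standard library. The function names are hypothetical, and the use of the sample standard deviation is an assumption: the abstract does not state which SD estimator was used, and this is not the authors' evaluation code.

```python
import math
from statistics import mean, stdev


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: report 0 when any marginal count is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom


def summarize(fold_scores):
    """Return (mean, sample SD) of per-fold scores.

    Assumption: sample SD (n - 1 denominator); the source does not specify.
    """
    return mean(fold_scores), stdev(fold_scores)
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), which makes it a useful complement to accuracy when judging a classifier across cross-validation folds.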