Nonparametric spectral analysis with applications to seizure characterization using EEG time series
Understanding the seizure initiation process and its propagation pattern(s)
is a critical task in epilepsy research. Characteristics of the pre-seizure
electroencephalograms (EEGs) such as oscillating powers and high-frequency
activities are believed to be indicative of the seizure onset and spread
patterns. In this article, we analyze epileptic EEG time series using
nonparametric spectral estimation methods to extract information on
seizure-specific power and characteristic frequency [or frequency band(s)].
Because the EEGs may become nonstationary before seizure events, we develop
methods for both stationary and locally stationary processes. Based on penalized
Whittle likelihood, we propose direct generalized maximum likelihood (GML)
and generalized approximate cross-validation (GACV) methods to estimate
smoothing parameters in both smoothing spline spectrum estimation of a
stationary process and smoothing spline ANOVA time-varying spectrum estimation
of a locally stationary process. We also propose permutation methods to test if
a locally stationary process is stationary. Extensive simulations indicate that
the proposed direct methods, especially the direct GML, are stable and perform
better than other existing methods. We apply the proposed methods to the
intracranial electroencephalograms (IEEGs) of an epileptic patient to gain
insights into the seizure generation process. Comment: Published in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/08-AOAS185
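As a rough illustration of the Whittle likelihood that the abstract above builds on, the following Python sketch computes a raw periodogram and evaluates the unpenalized Whittle negative log-likelihood for a candidate spectrum. The function names `periodogram` and `whittle_nll` are hypothetical, and the normalization is only one common convention. The smoothing-spline penalty and the GML/GACV smoothing-parameter selection proposed in the paper are not implemented here.

```python
import cmath
import math


def periodogram(x):
    """Raw periodogram at Fourier frequencies k/n, k = 1..floor((n-1)/2).

    Returns (freqs, power), with frequencies in cycles per sample.
    Normalization |DFT|^2 / n is an assumption; conventions differ.
    """
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    freqs, power = [], []
    for k in range(1, (n - 1) // 2 + 1):
        w = 2.0 * math.pi * k / n
        dft = sum(v * cmath.exp(-1j * w * t) for t, v in enumerate(centered))
        freqs.append(k / n)
        power.append(abs(dft) ** 2 / n)
    return freqs, power


def whittle_nll(power, spectrum):
    """Whittle negative log-likelihood (up to constants) of a candidate spectrum.

    power    : periodogram ordinates
    spectrum : candidate spectral density at the same frequencies (all > 0)
    """
    return sum(math.log(f) + p / f for p, f in zip(power, spectrum))
```

For a flat (white-noise) candidate spectrum, this criterion is minimized by the mean of the periodogram ordinates, which gives a quick sanity check. A penalized version would add a roughness penalty on the log-spectrum, weighted by the smoothing parameter that the abstract's methods aim to choose.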
Sequential Kalman filter for fast online changepoint detection in longitudinal health records
This article introduces the sequential Kalman filter, a computationally
scalable approach for online changepoint detection with temporally correlated
data. The temporal correlation was not considered in the Bayesian online
changepoint detection approach due to the large computational cost. Motivated
by detecting COVID-19 infections for dialysis patients from massive
longitudinal health records with a large number of covariates, we develop a
scalable approach to detect multiple changepoints from correlated data by
sequentially stitching Kalman filters of subsequences to compute the joint
distribution of the observations, which has linear computational complexity
with respect to the number of observations between the last detected
changepoint and the current observation at each time point, without
approximating the likelihood function. Compared to other online changepoint
detection methods, simulated experiments show that our approach is more precise
in detecting single or multiple changes in mean, variance, or correlation for
temporally correlated data. Furthermore, we propose a new way to integrate
classification and changepoint detection approaches that improves the detection
delay and accuracy for detecting COVID-19 infection compared to
alternatives
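To make the idea of scoring segments with Kalman filter likelihoods concrete, here is a minimal Python sketch. It is not the authors' sequential Kalman filter: it assumes a hypothetical AR(1)-plus-noise model with fixed parameters `rho`, `q`, and `r`, plugs in each segment's sample mean, and searches for a single split offline by brute force, so it is quadratic rather than linear in the number of observations. It also applies no penalty for the extra mean parameter. The function names are illustrative only.

```python
import math


def kalman_loglik(y, mean, rho=0.8, q=0.36, r=0.1):
    """Exact log-likelihood of y under y_t = mean + x_t + v_t,
    x_t = rho * x_{t-1} + w_t, with w ~ N(0, q) and v ~ N(0, r).
    All parameter values are assumptions for illustration."""
    state, var = 0.0, q / (1.0 - rho ** 2)  # stationary prior for x_1
    loglik = 0.0
    for obs in y:
        innov = obs - mean - state           # one-step prediction error
        s = var + r                          # innovation variance
        loglik -= 0.5 * (math.log(2.0 * math.pi * s) + innov ** 2 / s)
        gain = var / s
        state += gain * innov                # measurement update
        var *= 1.0 - gain
        state = rho * state                  # time update (prediction)
        var = rho ** 2 * var + q
    return loglik


def best_single_split(y, min_len=5):
    """Compare 'no change' with every single split point; segments are
    treated as independent, each with its own plug-in sample mean."""
    def segment_ll(part):
        return kalman_loglik(part, sum(part) / len(part))

    no_change = segment_ll(y)
    best_tau, best_ll = None, no_change
    for tau in range(min_len, len(y) - min_len + 1):
        ll = segment_ll(y[:tau]) + segment_ll(y[tau:])
        if ll > best_ll:
            best_tau, best_ll = tau, ll
    return best_tau, best_ll, no_change
```

Because the temporal correlation is handled inside the filter, a sustained mean shift shows up as a run of large innovations rather than being absorbed as ordinary correlated noise.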
ExonImpact: prioritizing pathogenic alternative splicing events
Alternative splicing (AS) is a closely regulated process that allows a single gene to encode multiple protein isoforms, thereby contributing to the diversity of the proteome. Dysregulation of the splicing process has been found to be associated with many inherited diseases. However, among the pathogenic AS events, there are numerous “passenger” events whose inclusion or exclusion does not lead to significant changes with respect to protein function. In this study, we evaluate the secondary and tertiary structural features of proteins associated with disease-causing and neutral AS events, and show that several structural features are strongly associated with the pathological impact of exon inclusion. We further develop a machine-learning-based computational model, ExonImpact, for prioritizing and evaluating the functional consequences of hitherto uncharacterized AS events. We evaluated our model using several strategies, including cross-validation and data from the Genotype-Tissue Expression (GTEx) and ClinVar databases. ExonImpact is freely available at http://watson.compbio.iupui.edu/ExonImpact
Galaxy Light profile neural Networks (GaLNets). II. Bulge-Disc decomposition in optical space-based observations
Bulge-disk (B-D) decomposition is an effective diagnostic to characterize the
galaxy morphology and understand its evolution across time. So far,
high-quality data have allowed detailed B-D decomposition to redshift below
0.5, with limited excursions over small volumes at higher redshifts.
Next-generation large sky space surveys in optical, e.g. from the China Space
Station Telescope (CSST), and near-infrared, e.g. from the space EUCLID
mission, will produce a gigantic leap in these studies as they will provide
deep, high-quality photometric images over more than 15000 deg² of the sky,
including billions of galaxies. Here, we extend the use of the Galaxy Light
profile neural Network (GaLNet) to predict 2-Sérsic model parameters,
specifically from CSST data. We simulate point-spread function (PSF) convolved
galaxies, with realistic B-D parameter distributions, on CSST mock observations
to train the new GaLNet and predict the structural parameters (e.g. magnitude,
effective radius, Sérsic index, axis ratio, etc.) of both bulge and disk
components. We find that the GaLNet can achieve very good accuracy for most of
the B-D parameters down to an -band magnitude of 23.5 and redshift 1.
The best accuracy is obtained for magnitudes, implying accurate bulge-to-total
(B/T) estimates. To further forecast the CSST performances, we also discuss the
results of the 1-Sérsic GaLNet and show that CSST half-depth data will allow
us to derive accurate 1-component models up to 24 and redshift
z1.7
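For readers unfamiliar with the quantities named in this abstract, the sketch below evaluates a Sérsic surface-brightness profile and converts a pair of bulge and disk magnitudes into a bulge-to-total ratio. It assumes the widely used approximation b_n ≈ 2n − 1/3 and the standard magnitude-to-flux relation; the function names are hypothetical and this is not the GaLNet code.

```python
import math


def sersic_intensity(r, i_e, r_e, n):
    """Sersic profile I(r) = I_e * exp(-b_n * ((r / r_e)**(1/n) - 1)).

    i_e : intensity at the effective radius r_e
    n   : Sersic index (n = 1 is an exponential disk, n = 4 a de Vaucouleurs bulge)
    Uses the approximation b_n ~ 2n - 1/3, which is an assumption that
    degrades for small n.
    """
    b_n = 2.0 * n - 1.0 / 3.0
    return i_e * math.exp(-b_n * ((r / r_e) ** (1.0 / n) - 1.0))


def bulge_to_total(mag_bulge, mag_disk):
    """B/T flux ratio from component magnitudes in the same band.

    Flux is proportional to 10**(-0.4 * magnitude), so the zero point cancels.
    """
    flux_bulge = 10.0 ** (-0.4 * mag_bulge)
    flux_disk = 10.0 ** (-0.4 * mag_disk)
    return flux_bulge / (flux_bulge + flux_disk)
```

Since B/T depends only on the difference between the two component magnitudes, accurate magnitude predictions translate directly into accurate B/T estimates, which is the link the abstract draws.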
regSNPs-splicing: a tool for prioritizing synonymous single-nucleotide substitution
While synonymous single-nucleotide variants (sSNVs) have largely been unstudied, since they do not alter protein sequence, mounting evidence suggests that they may affect RNA conformation, splicing, and the stability of nascent mRNAs to promote various diseases. Accurately prioritizing deleterious sSNVs from a pool of neutral ones can significantly improve our ability to select functional genetic variants identified from various genome-sequencing projects and, therefore, advance our understanding of disease etiology. In this study, we develop a computational algorithm to prioritize sSNVs based on their impact on mRNA splicing and protein function. In addition to genomic features that potentially affect splicing regulation, our proposed algorithm also includes dozens of structural features that characterize the effects of alternatively spliced exons on protein function. Our systematic evaluation of thousands of sSNVs suggests that several structural features, including intrinsic disorder protein scores, solvent accessible surface areas, protein secondary structures, and known and predicted protein family domains, show significant differences between disease-causing and neutral sSNVs. Our results suggest that protein structure features offer an added dimension of information for distinguishing disease-causing from neutral synonymous variants. The inclusion of structural features increases the predictive accuracy for functional sSNV prioritization
Genetic evidence for a causal relationship between type 2 diabetes and peripheral artery disease in both Europeans and East Asians
Abstract: Background: Observational studies have revealed that type 2 diabetes (T2D) is associated with an increased risk of peripheral artery disease (PAD). However, whether the two diseases share a genetic basis and whether the relationship is causal remain unclear. It is also unclear as to whether these relationships differ between ethnic groups. Methods: By leveraging large-scale genome-wide association study (GWAS) summary statistics of T2D (European-based: Ncase = 21,926, Ncontrol = 342,747; East Asian-based: Ncase = 36,614, Ncontrol = 155,150) and PAD (European-based: Ncase = 5673, Ncontrol = 359,551; East Asian-based: Ncase = 3593, Ncontrol = 208,860), we explored the genetic correlation and putative causal relationship between T2D and PAD in both Europeans and East Asians using linkage disequilibrium score regression and seven Mendelian randomization (MR) models. We also performed multi-trait analysis of GWAS and two gene-based analyses to reveal candidate variants and risk genes involved in the shared genetic basis between T2D and PAD. Results: We observed a strong genetic correlation (rg) between T2D and PAD in both Europeans (rg = 0.51; p-value = 9.34 × 10⁻¹⁵) and East Asians (rg = 0.46; p-value = 1.67 × 10⁻¹²). The MR analyses provided consistent evidence for a causal effect of T2D on PAD in both ethnicities (odds ratio [OR] = 1.05 to 1.28 for Europeans and 1.15 to 1.27 for East Asians) but not PAD on T2D. This putative causal effect was not influenced by total cholesterol, body mass index, systolic blood pressure, or smoking initiation according to multivariable MR analysis, and the genetic overlap between T2D and PAD was further explored employing an independent European sample through polygenic risk score regression. Multi-trait analysis of GWAS revealed two novel European-specific single nucleotide polymorphisms (rs927742 and rs1734409) associated with the shared genetic basis of T2D and PAD. 
Gene-based analyses consistently identified one gene ANKFY1 and gene-gene interactions (e.g., STARD10 [European-specific] to AP3S2 [East Asian-specific]; KCNJ11 [European-specific] to KCNQ1 [East Asian-specific]) associated with the trans-ethnic genetic overlap between T2D and PAD, reflecting a common genetic basis for the co-occurrence of T2D and PAD in both Europeans and East Asians. Conclusions: Our study provides the first evidence for a genetically causal effect of T2D on PAD in both Europeans and East Asians. Several candidate variants and risk genes were identified as being associated with this genetic overlap. Our findings emphasize the importance of monitoring PAD status in T2D patients and suggest new genetic biomarkers for screening PAD risk among patients with T2D
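The abstract does not list its seven MR models, so the following Python sketch shows only the textbook fixed-effect inverse-variance-weighted (IVW) estimator as an example of how a causal effect and odds ratio can be obtained from GWAS summary statistics. Whether IVW is among the models used in the study is an assumption, the function names are hypothetical, and any numbers passed to it in practice would need to come from real harmonized summary data.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW Mendelian randomization estimate.

    beta_exposure : per-variant effects on the exposure (e.g. T2D liability)
    beta_outcome  : per-variant effects on the outcome (e.g. log-odds of PAD)
    se_outcome    : standard errors of the outcome effects
    Returns (causal effect estimate, its standard error).
    """
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    numerator = sum(
        bx * by / se ** 2
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    total_weight = sum(weights)
    return numerator / total_weight, math.sqrt(1.0 / total_weight)


def odds_ratio(log_odds_effect):
    """Convert a log-odds causal estimate to an odds ratio."""
    return math.exp(log_odds_effect)
```

The estimator is a weighted average of the per-variant ratios beta_outcome / beta_exposure, so it assumes that every variant is a valid instrument. Methods that relax this assumption are the usual reason for running several MR models side by side.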
- …