12 research outputs found

    Statistical Methods for High Dimensional Networked Data Analysis.

    Networked data are frequently encountered in many scientific disciplines. One major challenge in the analysis of such data is their high dimensionality and complex dependence. My dissertation consists of three projects. The first project focuses on the development of a sparse multivariate factor analysis regression model to construct the underlying sparse association map between gene expressions and biomarkers. This is motivated by the fact that some associations may be obscured by unknown confounding factors that are not collected in the data. I have shown that accounting for such unobserved confounding factors can increase both sensitivity and specificity for detecting important gene-biomarker associations and thus lead to more interpretable association maps. The second project concerns the reconstruction of the underlying gene regulatory network using directed acyclic graphical models. My project aims to reduce false discoveries by identifying and removing edges resulting from shared confounding factors. I propose sparse structural factor equation models, in which structural equation models are used to capture directed graphs while factor analysis models are used to account for potential latent factors. I have shown that the proposed method enables me to obtain a simpler and more interpretable topology of a gene regulatory network. The third project is devoted to the development of a new regression analysis methodology to analyze electroencephalogram (EEG) neuroimaging data that are correlated among electrodes within an EEG-net. To address analytic challenges pertaining to the integration of network topology into the analysis, I propose hybrid quadratic inference functions that incorporate both prior and data-driven correlations among network nodes into statistical estimation and inference. The proposed method is conceptually simple and computationally fast and, more importantly, has appealing large-sample properties.
In a real EEG data analysis, I applied the proposed method to detect a significant association between iron deficiency and event-related potentials measured in two subregions, which was not found using the classical spatial ANOVA random-effects models.
PhD, Biostatistics, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/111595/1/zhouyan_1.pd

    Bayesian Modeling and Estimation Techniques for the Analysis of Neuroimaging Data

    Brain function is hallmarked by its adaptivity and robustness, arising from underlying neural activity that admits well-structured representations in the temporal, spatial, or spectral domains. While neuroimaging techniques such as electroencephalography (EEG) and magnetoencephalography (MEG) can record rapid neural dynamics at high temporal resolutions, they face several signal processing challenges that hinder their full utilization in capturing these characteristics of neural activity. The objective of this dissertation is to devise statistical modeling and estimation methodologies that account for the dynamic and structured representations of neural activity and to demonstrate their utility in application to experimentally recorded data. The first part of this dissertation concerns spectral analysis of neural data. In order to capture the non-stationarities involved in neural oscillations, we integrate multitaper spectral analysis and state-space modeling in a Bayesian estimation setting. We also present a multitaper spectral analysis method tailored for spike trains that captures the non-linearities involved in neuronal spiking. We apply our proposed algorithms to both EEG and spike recordings, which reveal significant gains in spectral resolution and noise reduction. In the second part, we investigate cortical encoding of speech as manifested in MEG responses. These responses are often modeled via a linear filter, referred to as the temporal response function (TRF). While the TRFs estimated from the sensor-level MEG data have been widely studied, their cortical origins are not fully understood. We define the new notion of Neuro-Current Response Functions (NCRFs) for simultaneously determining the TRFs and their cortical distribution. We develop an efficient algorithm for NCRF estimation and apply it to MEG data, which provides new insights into the cortical dynamics underlying speech processing.
Finally, in the third part, we consider the inference of Granger causal (GC) influences in high-dimensional time series models with sparse coupling. We consider a canonical sparse bivariate autoregressive model and define a new statistic for inferring GC influences, which we refer to as the LASSO-based Granger Causal (LGC) statistic. We establish non-asymptotic guarantees for robust identification of GC influences via the LGC statistic. Applications to simulated and real data demonstrate the utility of the LGC statistic in robust GC identification.
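
The abstract above infers Granger causal influences with a LASSO-based statistic whose definition is not given in this listing. As a rough point of reference only, the sketch below computes the classical least-squares Granger F-statistic for a bivariate autoregressive model; it is not the LGC statistic. The function names, the lag of 1, the AR coefficient 0.5, and the coupling strength 0.8 are all illustrative assumptions.

```python
import numpy as np

def simulate_bivariate_ar(n, coupling, seed=0):
    """Simulate x -> y coupling in a bivariate AR(1) model (hypothetical parameters)."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    y = np.zeros(n)
    for t in range(1, n):
        x[t] = 0.5 * x[t - 1] + rng.standard_normal()
        y[t] = 0.5 * y[t - 1] + coupling * x[t - 1] + rng.standard_normal()
    return x, y

def residual_sum_of_squares(design, response):
    """Ordinary least-squares fit; returns the residual sum of squares."""
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    resid = response - design @ coef
    return float(resid @ resid)

def granger_f_stat(target, source, lag=1):
    """Classical F-statistic for 'source Granger-causes target'.

    Compares a restricted model (target's own lags) with a full model
    (own lags plus lags of source).
    """
    n = len(target)
    response = target[lag:]
    own = np.column_stack([target[lag - k:n - k] for k in range(1, lag + 1)])
    other = np.column_stack([source[lag - k:n - k] for k in range(1, lag + 1)])
    ones = np.ones((n - lag, 1))
    restricted = np.hstack([ones, own])
    full = np.hstack([ones, own, other])
    rss_r = residual_sum_of_squares(restricted, response)
    rss_f = residual_sum_of_squares(full, response)
    dof = len(response) - full.shape[1]
    return ((rss_r - rss_f) / lag) / (rss_f / dof)

x, y = simulate_bivariate_ar(n=2000, coupling=0.8)
f_x_to_y = granger_f_stat(y, x)  # true influence is present
f_y_to_x = granger_f_stat(x, y)  # no influence in the simulation
print(f"F(x -> y) = {f_x_to_y:.1f}, F(y -> x) = {f_y_to_x:.1f}")
```

In this simulation the statistic for the true direction should be far larger than the one for the reverse direction. A sparse, high-dimensional setting like the one the abstract describes would replace the least-squares fits with penalized ones.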

    Ultra-high field MRI: parallel-transmit arrays and RF pulse design

    This paper reviews the field of multiple or parallel radiofrequency (RF) transmission for magnetic resonance imaging (MRI). Currently, the use of ultra-high field (UHF) MRI at 7 tesla and above is gaining popularity, yet faces challenges with non-uniformity of the RF field and higher RF power deposition. Since its introduction in the early 2000s, parallel transmission (pTx) has been recognized as a powerful tool for accelerating spatially selective RF pulses and combating the challenges associated with RF inhomogeneity at UHF. We provide a survey of the types of dedicated RF coils used commonly for pTx and the important modeling of the coil behavior by electromagnetic (EM) field simulations. We also discuss the additional safety considerations involved with pTx such as the specific absorption rate (SAR) and how to manage them. We then describe the application of pTx with RF pulse design, including a practical guide to popular methods. Finally, we conclude with a description of the current and future prospects for pTx, particularly its potential for routine clinical use.

    Modeling Multiple-Subject and Discrete-Valued High-Dimensional Time Series

    This thesis focuses on two separate topics in modeling of high-dimensional time series (HDTS) with several structures and their various applications. The first topic is on modeling HDTS from multiple subjects. Here, the structures of interest include model components that are shared by all subjects and that are individual to subjects or their groups. A running theme in this modeling is the heterogeneity of subjects. Dealing with heterogeneous data has been of particular interest recently in social, health, behavioral, and other sciences. The second topic is on modeling HDTS that are discrete-valued, including binary, categorical, and non-negative count observations. Compared with continuous time series modeling where autoregressive-type models dominate, there are no generally preferred models in the discrete setting. The models considered in this thesis are based on latent Gaussian processes, which drive the dynamics of the observed discrete-valued series. The models have the advantages of allowing negative autocorrelations, and flexible choices of marginal distributions of discrete observations. The thesis consists of four projects, with two on each topic. The first project proposes a stratified Lasso (multi-task learning) formulation for vector autoregressive (VAR) models from multiple subjects. The VAR transition matrices are decomposed additively into the common components shared across all subjects and individual components specific to each subject. An efficient estimation procedure combined with cross-validation for several tuning parameters is designed. The simulation study shows that the approach performs well in the presence of heterogeneity across individual dynamics for the different levels of sparsity. The model is applied to intensive longitudinal data of the emotional states to reveal common and individual temporal dependences of daily emotions across study participants. 
The proposed model enhances interpretability and forecasting performance, which are expected to be beneficial in assessing conflicting evidence from empirical studies and establishing universal explanations of the studied phenomenon. The second project develops integrative dynamic factor models (DFMs) for multiple subjects in several groups. The models have components that allow one to explore the inter-differences across subjects (and groups). At the same time, the intra-differences can be investigated by reconstructing the individual temporal dynamics of different subjects. A flexible identifiability condition on the factor covariance is adopted, which expands the scope of heterogeneity and contributes to better model interpretation and forecasting results. From a methodological standpoint, a novel algorithm that combines non-iterative block segmentation, efficient rank selection, and variants of PCA for multiple subjects, is suggested. Simulations under various scenarios and analysis of resting-state functional MRI data collected from multiple subjects are conducted. The third project concerns latent Gaussian DFMs for count HDTS. The proposed estimation procedure combines the classical PCA, Yule-Walker equations, and link functions, which are pairwise mappings of the second-order properties of the latent and observed time series. The forecasting is carried out through a particle-based sequential Monte Carlo method, which approximates predictions of counts, driven by the latent DFM generated through Kalman recursions. Simulation results reveal that the estimation approach performs similarly to the usual DFMs, and the model provides better forecasting results than the considered benchmarks. The model is applied to item response data from psychology, where the existence of latent factors has been verified but their temporal dependence has not been studied yet. 
The fourth project considers the analogous models for count HDTS but where the latent Gaussian time series follows a sparse VAR. A penalized estimation procedure based on Lasso and its adaptive form is explored for latent Gaussian VAR. An alternative proposed formulation leverages the second-order properties of the latent process directly. Along with the estimation of link functions, we suggest a data-splitting strategy, which can select tuning parameters for penalization. Simulations under various marginal count distributions and patterns of transition matrices are performed. A data example of major depressive disorder in psychiatry is considered to illustrate the modeling approach.
Doctor of Philosophy

    Generalized averaged Gaussian quadrature and applications

    A simple numerical method for constructing the optimal generalized averaged Gaussian quadrature formulas will be presented. These formulas exist in many cases in which real positive Gauss-Kronrod formulas do not exist, and can be used as an adequate alternative in order to estimate the error of a Gaussian rule. We also investigate the conditions under which the optimal averaged Gaussian quadrature formulas and their truncated variants are internal.
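
To illustrate how an averaged rule can estimate the error of a Gaussian rule, the sketch below builds Laurie's classical averaged Gauss rule for the Legendre weight on [-1, 1]. This is the simpler predecessor of the optimal generalized averaged formulas named in the abstract, not those formulas themselves. It assumes the standard monic Legendre recurrence coefficients and the Golub-Welsch eigenvalue approach; all function names are hypothetical.

```python
import numpy as np

def legendre_beta(k):
    """Recurrence coefficient beta_k (k >= 1) of the monic Legendre polynomials."""
    return k * k / (4.0 * k * k - 1.0)

def rule_from_offdiag(offdiag):
    """Golub-Welsch: nodes and weights from a symmetric Jacobi matrix.

    The diagonal is zero because the Legendre weight is symmetric.
    """
    J = np.diag(offdiag, 1) + np.diag(offdiag, -1)
    nodes, vecs = np.linalg.eigh(J)
    weights = 2.0 * vecs[0, :] ** 2  # beta_0 = integral of 1 over [-1, 1] = 2
    return nodes, weights

def gauss_rule(n):
    """n-point Gauss-Legendre rule (exact up to degree 2n - 1)."""
    off = np.sqrt([legendre_beta(k) for k in range(1, n)])
    return rule_from_offdiag(off)

def anti_gauss_rule(n):
    """(n + 1)-point anti-Gaussian rule: same recurrence, with beta_n doubled."""
    betas = [legendre_beta(k) for k in range(1, n + 1)]
    betas[-1] *= 2.0
    return rule_from_offdiag(np.sqrt(betas))

def averaged_estimate(f, n):
    """Return the Gauss value, the averaged value, and an error estimate for Gauss."""
    xg, wg = gauss_rule(n)
    xa, wa = anti_gauss_rule(n)
    gauss = float(np.dot(wg, f(xg)))
    anti = float(np.dot(wa, f(xa)))
    averaged = 0.5 * (gauss + anti)
    return gauss, averaged, averaged - gauss

# x**6 has degree 6: beyond the 3-point Gauss rule (degree 5),
# but within the averaged rule's degree of exactness (2n + 1 = 7).
gauss, averaged, err_est = averaged_estimate(lambda x: x**6, 3)
print(gauss, averaged, err_est)
```

For this integrand the exact integral is 2/7, which the averaged rule reproduces up to rounding, so the difference between the averaged and Gauss values equals the true Gauss error. For general integrands that difference is only an estimate.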

    MS FT-2-2 7 Orthogonal polynomials and quadrature: Theory, computation, and applications

    Quadrature rules find many applications in science and engineering. Their analysis is a classical area of applied mathematics and continues to attract considerable attention. This seminar brings together speakers with expertise in a large variety of quadrature rules. It is the aim of the seminar to provide an overview of recent developments in the analysis of quadrature rules. The computation of error estimates and novel applications are also described.