363 research outputs found

    Semiparametric and nonparametric methods in data mining and statistical learning with applications in public health surveillance and personalized medicine

    The field of statistical learning has grown rapidly over the past few decades, with a diverse range of applications. In this dissertation, we develop methodology, mainly using semiparametric and nonparametric statistical learning techniques, for public health surveillance and personalized medicine. Surveillance, which provides early warning of impending emergencies, is a key function of public health. In Chapter 2, we propose a semiparametric method to model spatiotemporal lattice data via local linear fitting combined with day-of-week effects, in which both spatial and temporal information are taken into account. Detection of abnormal events is carried out using an ARMA time series technique for the residuals, combined with a resampling approach to determine the threshold for significance. We conduct simulations to assess the performance of the proposed method. The method is also illustrated using data on daily asthma admissions to North Carolina emergency departments between 2006 and 2007. There is increasing interest in personalized medicine: the idea of tailoring treatment to each individual to optimize patient outcome. In Chapter 3, we focus on the single-decision setup. We show that estimating the optimal treatment rule is equivalent to a classification problem in which each subject is weighted in proportion to his or her clinical outcome, although the true class labels (the treatment group that is optimal for each patient) are unknown in the training set. We then propose a new approach based on the support vector machine framework from computer science. We show that the resulting estimator of the treatment rule is consistent, and further derive fairly accurate convergence rates for this estimator. The performance of the proposed approach is demonstrated via simulation studies and an analysis of chronic depression data.
It is not uncommon for the best clinical strategies to require adaptation over time. In Chapter 4, we therefore generalize the outcome weighted learning method to the multi-decision setup, with the aim of finding dynamic treatment regimes: customized sequential decision rules for individual patients that adapt over time to the evolving illness and maximize the long-term health outcome. Inspired by the idea underlying dynamic programming, we conduct outcome weighted learning for each stage backwards through time. We further introduce an iterative procedure that can improve the performance of the algorithm. The methods are evaluated by simulation studies and an analysis of a smoking cessation data set.
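The single-decision idea in the abstract above (treatment-rule estimation as classification weighted by clinical outcome) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the dissertation's implementation: the function names `fit_owl_linear`, `treatment_rule`, and `simulate` are hypothetical, the classifier is a linear rule fitted by plain subgradient descent on a weighted hinge loss rather than a full support vector machine solver, treatments are coded as -1/+1 and assigned with known probability 0.5, and outcomes are assumed nonnegative with larger values being better.

```python
import random


def fit_owl_linear(X, A, R, propensity, lam=0.01, lr=0.1, iters=300):
    """Fit a linear rule by minimising a weighted hinge loss.

    X: list of feature lists, A: treatments in {-1, +1},
    R: nonnegative outcomes, propensity: P(A = a | x), assumed constant.
    Each subject is weighted by R / propensity (illustrative assumption).
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    weights = [r / propensity for r in R]
    for _ in range(iters):
        grad_w = [lam * wj for wj in w]  # gradient of the ridge penalty
        grad_b = 0.0
        for x, a, c in zip(X, A, weights):
            score = sum(wj * xj for wj, xj in zip(w, x)) + b
            if a * score < 1.0:  # hinge loss is active for this subject
                for j in range(p):
                    grad_w[j] -= c * a * x[j] / n
                grad_b -= c * a / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def treatment_rule(w, b, x):
    """Recommend treatment +1 or -1 from the sign of the linear score."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


def simulate(n=400, seed=0):
    """Toy randomised trial: treatment +1 helps when x > 0, -1 when x < 0."""
    rng = random.Random(seed)
    X = [[rng.uniform(-1.0, 1.0)] for _ in range(n)]
    A = [rng.choice([-1, 1]) for _ in range(n)]
    R = [2.0 + a * x[0] for x, a in zip(X, A)]  # outcome stays in [1, 3]
    return X, A, R


X, A, R = simulate()
w, b = fit_owl_linear(X, A, R, propensity=0.5)
print(treatment_rule(w, b, [0.8]), treatment_rule(w, b, [-0.8]))
```

In this toy setup the observed treatment acts as the class label and the outcome acts as the weight, so subjects who did well under their assigned treatment pull the rule toward recommending that treatment for similar covariates. A multi-decision extension along the lines of Chapter 4 would repeat such a fit for each stage, working backwards through time.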

    Robust Modeling of Spatio-Temporal Dependencies and Hot Spots

    Development of a methodology to fill gaps in MODIS LST data for Antarctica

    Dissertation submitted in partial fulfilment of the requirements for the Degree of Master of Science in Geospatial Technologies. Land Surface Temperature (LST) is an essential parameter for analyzing many environmental questions. The lack of LST data with high spatio-temporal resolution in Antarctica limits the understanding of climatological and ecological processes. The MODIS LST product is a promising source that provides daily LST data at 1 km spatial resolution, but MODIS LST data have gaps due to cloud cover. This research developed a method to fill those gaps, with user-defined options to balance processing time and accuracy. The presented method combines temporal and spatial interpolation: the nearest MODIS Aqua/Terra scene is used for temporal interpolation, and a Generalized Additive Model (GAM) with a 3-dimensional spatial trend surface, elevation, and aspect as covariates is used for spatial interpolation. The moving window size controls the number of filled pixels and the prediction accuracy in the temporal interpolation. A large moving window filled more pixels with less accuracy but improved the overall accuracy of the method. The performance of the developed method was validated and compared with Local Weighted Regression (LWR) and Thin Plate Spline (TPS) interpolation, using 14 images, by filling artificial gaps of different sizes: 3%, 10%, and 25% of valid pixels. The developed method performed better at a low percentage of cloud cover, with RMSE ranging from 0.72 to 1.70, but tended to have a higher RMSE at a high percentage of cloud cover.

    Proceedings of the 35th International Workshop on Statistical Modelling : July 20- 24, 2020 Bilbao, Basque Country, Spain

    466 p. The International Workshop on Statistical Modelling (IWSM) is a reference workshop for promoting statistical modelling and applications of Statistics among researchers, academics and industrialists in a broad sense. Unfortunately, the global COVID-19 pandemic has not allowed the 35th edition of the IWSM to be held in Bilbao in July 2020. Despite the situation, and following the spirit of the Workshop and the Statistical Modelling Society, we are delighted to bring you the proceedings book of extended abstracts.

    A comparison of the CAR and DAGAR spatial random effects models with an application to diabetics rate estimation in Belgium

    When hierarchically modelling an epidemiological phenomenon on a finite collection of sites in space, one must always take a latent spatial effect into account in order to capture the correlation structure that links the phenomenon to the territory. In this work, we compare two autoregressive spatial models that can be used for this purpose: the classical CAR model and the more recent DAGAR model. Unlike the former, the latter has a desirable property: its ρ parameter can be naturally interpreted as the average neighbor pair correlation and, in addition, this parameter can be directly estimated when the effect is modelled using a DAGAR rather than a CAR structure. As an application, we model the diabetics rate in Belgium in 2014 and show the adequacy of these models in predicting the response variable when no covariates are available.

    A Statistical Approach to the Alignment of fMRI Data

    Multi-subject functional Magnetic Resonance Imaging (fMRI) studies are critical. Because anatomical and functional structure varies across subjects, image alignment is necessary. We define a probabilistic model to describe functional alignment. Imposing a prior distribution, such as the matrix Fisher-von Mises distribution, on the orthogonal transformation parameter embeds anatomical information in the estimation of the parameters, i.e., it penalizes the combination of spatially distant voxels. Real applications show an improvement in the classification and interpretability of the results compared with various functional alignment methods.
