89 research outputs found

    Statistical Inference and Computational Methods for Large High-Dimensional Data with Network Structure.

    New technological advancements have allowed the collection of datasets of large volume and varying levels of complexity. Many of these datasets have an underlying network structure. Networks can capture dependence relationships among a group of entities, and hence analyzing these datasets unearths the underlying structural dependence among the individuals. Examples include gene regulatory networks, stock markets, protein-protein interactions within the cell, and online social networks. The thesis addresses two important aspects of large high-dimensional data with network structure. The first focuses on high-dimensional data with a network structure that evolves over time. Examples of such data sets include time-course gene expression data and voting records of legislative bodies. The main task is to estimate the change-point as well as the network structures before and after it. The network structures are obtained by a penalized optimization method, and we establish a finite-sample estimation error bound for the change-point in the high-dimensional regime. The second aspect is parameter estimation in large heterogeneous data with network structure. Our primary goal is to develop efficient computational techniques based on random subsampling and parallelization to estimate the parameters. We provide an analysis of the rate of decay of the bias and variance of our parallel implementation with a single round of communication after every iteration. We further show two applications of our methodology, to the Gaussian Mixture Model (GMM) and the Stochastic Block Model (SBM). The emphasis is placed on developing new theoretical techniques and computational tools for network problems and applying the corresponding methodology in many fields, including biomedical and social science research, where network modeling and analysis play an exceedingly important role.
    PhD, Statistics, University of Michigan, Horace H. Rackham School of Graduate Studies
    http://deepblue.lib.umich.edu/bitstream/2027.42/113602/1/sandipan_1.pd

    Consistent Multiple Change-point Estimation with Fused Gaussian Graphical Models


    Likelihood Inference for Large Scale Stochastic Blockmodels with Covariates based on a Divide-and-Conquer Parallelizable Algorithm with Communication

    We consider a stochastic blockmodel equipped with node covariate information, which is helpful in analyzing social network data. The key objective is to obtain maximum likelihood estimates of the model parameters. For this task, we devise a fast, scalable Monte Carlo EM type algorithm based on a case-control approximation of the log-likelihood coupled with a subsampling approach. A key feature of the proposed algorithm is its parallelizability: it processes portions of the data on several cores while communicating key statistics across the cores during each iteration. The performance of the algorithm is evaluated on synthetic data sets and compared with competing methods for blockmodel parameter estimation. We also illustrate the model on data from a Facebook-derived social network enhanced with node covariate information.
    Comment: 28 pages, 4 figures

    Bayesian Inference in Nonparametric Dynamic State-Space Models

    We introduce state-space models where the functionals of the observational and the evolutionary equations are unknown, and treated as random functions evolving with time. Thus, our model is nonparametric and generalizes the traditional parametric state-space models. This random function approach also frees us from the restrictive assumption that the functional forms, although time-dependent, are fixed. The traditional approach of assuming known, parametric functional forms is questionable, particularly in state-space models, since validating the assumptions requires data on both the observed time series and the latent states; however, data on the latter are not available in state-space models. We specify Gaussian processes as priors of the random functions and exploit the "look-up table approach" of \ctn{Bhattacharya07} to efficiently handle the dynamic structure of the model. We consider both univariate and multivariate situations, using the Markov chain Monte Carlo (MCMC) approach for studying the posterior distributions of interest. In the case of challenging multivariate situations we demonstrate that the newly developed Transformation-based MCMC (TMCMC) of \ctn{Dutta11} provides interesting and efficient alternatives to the usual proposal distributions. We illustrate our methods with a challenging multivariate simulated data set, where the true observational and evolutionary equations are highly non-linear and treated as unknown. The results we obtain are quite encouraging. Moreover, using our Gaussian process approach we analysed a real data set, which has also been analysed by \ctn{Shumway82} and \ctn{Carlin92} using the linearity assumption. Our analyses show that towards the end of the time series, the linearity assumption of the previous authors breaks down.
    Comment: This version contains much greater clarification of the look-up table idea; a theorem regarding it is also proven and included in the supplement. Will appear in Statistical Methodology

    ERDOSTEINE: AN EFFECTIVE ANTIOXIDANT FOR PROTECTING COMPLETE FREUND’S ADJUVANT INDUCED ARTHRITIS IN RATS

    Objective: The objective of this study was to evaluate the protective effect of Erdosteine on complete Freund's adjuvant (CFA) induced arthritic rats. Methods: Wistar Albino rats of 100–250 g were divided into five groups (n=6) and, except for the negative control group, administered 0.1 ml of CFA subcutaneously into the left hind paw. The standard group received methotrexate (MTX) orally at 0.075 mg/kg body weight. The test groups received Erdosteine orally at doses of 10 mg/kg and 20 mg/kg body weight for 12 days. Changes in body weight, paw volume, hematological parameters, and radiographical and histological findings were the indicators used to evaluate the efficacy of the test product. Discussion: Significant changes in body weight, paw volume, and radiographical, hematological, and histological parameters were observed, which supports a remarkable reduction of arthritic development in the standard and test groups compared to the untreated group. However, the test group receiving Erdosteine at 20 mg/kg appears to be more potent in reducing the arthritic effect than both the test group receiving Erdosteine at 10 mg/kg and the standard group (MTX). Results: The test group with 20 mg/kg Erdosteine showed a significantly better outcome than the standard group (p<0.05). Therefore, Erdosteine, acting as an anti-inflammatory and antioxidant, is effective at a dose of 20 mg/kg in treating the progression of rheumatoid arthritis in rats.

    Change-Point Estimation in High-Dimensional Markov Random Field Models

    The paper investigates a change-point estimation problem in the context of high-dimensional Markov random field models. Change points represent a key feature in many dynamically evolving network structures. The change-point estimate is obtained by maximizing a profile penalized pseudolikelihood function under a sparsity assumption. We also derive a tight bound for the estimate, up to a logarithmic factor, even in settings where the number of possible edges in the network far exceeds the sample size. The performance of the proposed estimator is evaluated on synthetic data sets, and the estimator is also used to explore voting patterns in the US Senate in the 1979-2012 period.
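    The idea of profiling over candidate change points can be illustrated with a much simpler model than the one in the abstract. The sketch below is not the paper's estimator: it swaps the penalized pseudolikelihood of a Markov random field for an unpenalized univariate Gaussian mean-shift model, where maximizing the profile likelihood is equivalent to minimizing the within-segment squared error. The function names `segment_sse` and `estimate_change_point` and the `min_seg` parameter are assumptions made for illustration.

```python
def segment_sse(segment):
    """Sum of squared deviations of a segment from its own mean."""
    mean = sum(segment) / len(segment)
    return sum((x - mean) ** 2 for x in segment)


def estimate_change_point(xs, min_seg=2):
    """Return the split index tau that minimizes the total within-segment
    squared error, i.e. maximizes the profile Gaussian likelihood of a
    single mean shift between xs[:tau] and xs[tau:].

    Simplified stand-in for a profile (penalized) likelihood search:
    for every candidate tau, the segment parameters are re-estimated
    and the resulting fit is compared across candidates.
    """
    n = len(xs)
    if n < 2 * min_seg:
        raise ValueError("series too short for the requested min_seg")
    best_tau, best_cost = None, float("inf")
    for tau in range(min_seg, n - min_seg + 1):
        cost = segment_sse(xs[:tau]) + segment_sse(xs[tau:])
        if cost < best_cost:
            best_tau, best_cost = tau, cost
    return best_tau


# Toy series whose mean shifts from about 0 to about 5 after five points.
series = [0.1, -0.2, 0.0, 0.15, -0.05, 5.1, 4.9, 5.0, 5.2, 4.8]
tau_hat = estimate_change_point(series)
```

    In the high-dimensional setting described in the abstract, each candidate split would instead require fitting a sparse network model on both sides, which is where the penalty and the pseudolikelihood approximation come in.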

    Drug Burden Index is a Modifiable Predictor of 30-Day-Hospitalization in Community-Dwelling Older Adults with Complex Care Needs: Machine Learning Analysis of InterRAI Data

    BACKGROUND: Older adults (≥ 65 years) account for a disproportionately high proportion of hospitalization and in-hospital mortality, some of which may be avoidable. Although machine learning (ML) models have already been built and validated for predicting hospitalization and mortality, there remains a significant need to optimise ML models further. Accurately predicting hospitalization may tremendously impact the clinical care of older adults, as preventative measures can be implemented to improve clinical outcomes for the patient. METHODS: In this retrospective cohort study, a dataset of 14,198 community-dwelling older adults (≥ 65 years) with complex care needs from the Inter-Resident Assessment Instrument database was used to develop and optimise three ML models to predict 30-day-hospitalization. The models developed and optimised were Random Forest (RF), XGBoost (XGB), and Logistic Regression (LR). Variable importance plots were generated for all three models to identify key predictors of 30-day-hospitalization. RESULTS: The areas under the receiver operating characteristic curve for the RF, XGB and LR models were 0.97, 0.90 and 0.72, respectively. Variable importance plots identified the Drug Burden Index and alcohol consumption as important, immediately potentially modifiable variables in predicting 30-day-hospitalization. CONCLUSIONS: Identifying immediately potentially modifiable risk factors such as the Drug Burden Index and alcohol consumption is of high clinical relevance. If clinicians can influence these variables, they could proactively lower the risk of 30-day-hospitalization. ML holds promise to improve the clinical care of older adults. It is crucial that these models undergo extensive validation through large-scale clinical studies before being utilized in the clinical setting.
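    For readers unfamiliar with the metric reported in the results, the area under the receiver operating characteristic curve equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it that way on made-up toy data; it is not the study's code or data, and the function name `roc_auc` is an assumption for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 outcomes (1 = event, e.g. hospitalized).
    scores: iterable of model scores, higher meaning more likely positive.
    Counts the fraction of (positive, negative) pairs in which the
    positive case is scored higher; ties contribute one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: three of the four positive/negative pairs are ranked correctly.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
```

    The pairwise loop is quadratic in the number of cases, so a rank-based implementation would be preferable for a dataset the size of the one in the abstract.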

    IN-VITRO STUDY ON THE HEMOLYTIC ACTIVITY OF DIFFERENT EXTRACTS OF INDIAN MEDICINAL PLANT CROTON BONPLANDIANUM WITH PHYTOCHEMICAL ESTIMATION: A NEW ERA IN DRUG DEVELOPMENT

    In this study, different extracts of the leaves of Croton bonplandianum were screened for haemolytic activity towards human erythrocytes. The haemolytic activity was measured by a modified spectroscopic method at four different concentrations (300, 150, 75, 25 µg/ml). The haemolytic activity of the different extracts of Croton bonplandianum was found in the following order: Ethyl acetate extract > Chloroform extract > Benzene extract. However, all the extracts alone and in combination with each other exhibited very low haemolytic activity. E. ganitrus did not exhibit any haemolytic activity at any dilution. Hence, they can be considered safe to human erythrocytes. Keywords: Hemolytic activity, Croton bonplandianum, Erythrocyte