    A Comparison of Clustering and Missing Data Methods for Health Sciences

    In this paper, we compare and analyze clustering methods with missing data in health behavior research. In particular, we propose and analyze the use of compressive sensing's matrix completion along with spectral clustering to cluster health related data. The empirical tests and real data results show that these methods can outperform standard methods like LPA and FIML, in terms of lower misclassification rates in clustering and better matrix completion performance in missing data problems. According to our examination, a possible explanation of these improvements is that spectral clustering takes advantage of high data dimension and compressive sensing methods utilize the near-to-low-rank property of health data.
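
The fill-in step that the abstract above relies on can be illustrated with a minimal sketch. It is not the paper's method: it substitutes rank-1 alternating least squares for the compressive sensing matrix completion, omits the spectral clustering stage, and assumes the data are close to rank 1. The function name `complete_rank1`, the use of `None` for missing entries, and the toy matrix are all hypothetical.

```python
# Hypothetical illustration only: rank-1 matrix completion by alternating
# least squares. None marks a missing entry. This stands in for, and is much
# simpler than, the compressive sensing approach named in the abstract.

def complete_rank1(X, iters=100):
    """Fit X ~ u v^T on the observed entries and fill the missing ones."""
    n, m = len(X), len(X[0])
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # Fix v; solve a one-dimensional least squares problem per row.
        for i in range(n):
            obs = [j for j in range(m) if X[i][j] is not None]
            den = sum(v[j] ** 2 for j in obs)
            if den > 0:
                u[i] = sum(X[i][j] * v[j] for j in obs) / den
        # Fix u; solve a one-dimensional least squares problem per column.
        for j in range(m):
            obs = [i for i in range(n) if X[i][j] is not None]
            den = sum(u[i] ** 2 for i in obs)
            if den > 0:
                v[j] = sum(X[i][j] * u[i] for i in obs) / den
    # Keep observed values; use the fitted u[i] * v[j] only where data is missing.
    return [
        [X[i][j] if X[i][j] is not None else u[i] * v[j] for j in range(m)]
        for i in range(n)
    ]

# Toy data: the outer product of [1, 2, 3] and [1, 2, 4] with two entries hidden.
X = [
    [1.0, 2.0, None],
    [2.0, None, 8.0],
    [3.0, 6.0, 12.0],
]
filled = complete_rank1(X)
```

In this toy case both hidden entries equal 4 in the underlying matrix, so `filled[0][2]` and `filled[1][1]` should land close to that value. Real health data would be only approximately low rank and would call for a higher-rank or nuclear-norm formulation.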

    A survey of statistical network models

    Networks are ubiquitous in science and have become a focal point for discussion in everyday life. Formal statistical models for the analysis of network data have emerged as a major topic of interest in diverse areas of study, and most of these involve a form of graphical representation. Probability models on graphs date back to 1959. Along with empirical studies in social psychology and sociology from the 1960s, these early works generated an active network community and a substantial literature in the 1970s. This effort moved into the statistical literature in the late 1970s and 1980s, and the past decade has seen a burgeoning network literature in statistical physics and computer science. The growth of the World Wide Web and the emergence of online networking communities such as Facebook, MySpace, and LinkedIn, and a host of more specialized professional network communities has intensified interest in the study of networks and network data. Our goal in this review is to provide the reader with an entry point to this burgeoning literature. We begin with an overview of the historical development of statistical network modeling and then we introduce a number of examples that have been studied in the network literature. Our subsequent discussion focuses on a number of prominent static and dynamic network models and their interconnections. We emphasize formal model descriptions, and pay special attention to the interpretation of parameters and their estimation. We end with a description of some open problems and challenges for machine learning and statistics. Comment: 96 pages, 14 figures, 333 references

    Informing theoretical development of salutogenic, asset-based health improvement to reduce syndemics among gay, bisexual and other men who have sex with men: empirical evidence from secondary analysis of multi-national, online cross-sectional surveys

    Globally, gay, bisexual and other men who have sex with men (GBMSM) experience an increased burden of poor sexual, mental and physical health. Syndemics theory provides a framework to understand comorbidities and health among marginalised populations. Syndemics theory attempts to account for the social, environmental, and other structural contexts that are driving and/or sustaining simultaneous multiple negative health outcomes, but has been widely critiqued. In this paper, we conceptualise a new framework to counter syndemics by assessing the key theoretical mechanisms by which pathogenic social context variables relate to ill-health. Subsequently, we examine how salutogenic, assets-based approaches to health improvement could function among GBMSM across diverse national contexts. Comparative quantitative secondary analysis of data on syndemics and community assets is presented from two international, online, cross-sectional surveys of GBMSM (SMMASH2 in Scotland, Wales, Northern Ireland and the Republic of Ireland and Sex Now in Canada). Negative sexual, mental and physical health outcomes were clustered as hypothesised, providing evidence of the syndemic. We found that syndemic ill-health was associated with social isolation and the experience of stigma and discrimination, but this varied across national contexts. Moreover, while some of our measures of community assets appeared to have a protective effect on syndemic ill-health, others did not. These results present an important step forward in our understanding of syndemic ill-health and provide new insights into how to intervene to reduce it. They point to a theoretical mechanism through which salutogenic approaches to health improvement could function and provide new strategies for working with communities to understand the proposed processes of change that are required.
    To move forward, we suggest conceptualising syndemics within a complex adaptive systems model, which enables consideration of the development, sustainment and resilience to syndemics both within individuals and at the population-level.

    Modeling Individual Cyclic Variation in Human Behavior

    Cycles are fundamental to human health and behavior. However, modeling cycles in time series data is challenging because in most cases the cycles are not labeled or directly observed and need to be inferred from multidimensional measurements taken over time. Here, we present CyHMMs, a cyclic hidden Markov model method for detecting and modeling cycles in a collection of multidimensional heterogeneous time series data. In contrast to previous cycle modeling methods, CyHMMs deal with a number of challenges encountered in modeling real-world cycles: they can model multivariate data with discrete and continuous dimensions; they explicitly model and are robust to missing data; and they can share information across individuals to model variation both within and between individual time series. Experiments on synthetic and real-world health-tracking data demonstrate that CyHMMs infer cycle lengths more accurately than existing methods, with 58% lower error on simulated data and 63% lower error on real-world data compared to the best-performing baseline. CyHMMs can also perform functions which baselines cannot: they can model the progression of individual features/symptoms over the course of the cycle, identify the most variable features, and cluster individual time series into groups with distinct characteristics. Applying CyHMMs to two real-world health-tracking datasets -- of menstrual cycle symptoms and physical activity tracking data -- yields important insights including which symptoms to expect at each point during the cycle. We also find that people fall into several groups with distinct cycle patterns, and that these groups differ along dimensions not provided to the model. For example, by modeling missing data in the menstrual cycles dataset, we are able to discover a medically relevant group of birth control users even though information on birth control is not given to the model. Comment: Accepted at WWW 201
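
One common way to encode a cycle in a hidden Markov model is to let each hidden state either persist or advance to the next state in a ring. The sketch below shows that structure only. The abstract does not give the CyHMM parameterization, so the single shared `stay_prob`, the function names, and the example values are assumptions made for illustration.

```python
# Hypothetical illustration only: a ring-structured HMM transition matrix.
# Each state stays put with probability stay_prob or moves to the next state,
# wrapping from the last state back to the first. Assumes n_states >= 2.

def cyclic_transition_matrix(n_states, stay_prob):
    """Return an n_states x n_states row-stochastic matrix with ring structure."""
    T = [[0.0] * n_states for _ in range(n_states)]
    for k in range(n_states):
        T[k][k] = stay_prob
        T[k][(k + 1) % n_states] = 1.0 - stay_prob
    return T

def expected_cycle_length(n_states, stay_prob):
    """Expected steps per full cycle: each state lasts 1 / (1 - stay_prob) steps on average."""
    return n_states / (1.0 - stay_prob)

T = cyclic_transition_matrix(4, 0.75)
cycle_len = expected_cycle_length(4, 0.75)
```

Under these assumptions the cycle length is controlled by the number of states and the persistence probability. A fuller model would also need emission distributions for the discrete and continuous measurements and a treatment of missing observations, which this sketch leaves out.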