
    Modeling Individual Cyclic Variation in Human Behavior

    Cycles are fundamental to human health and behavior. However, modeling cycles in time series data is challenging because in most cases the cycles are not labeled or directly observed and need to be inferred from multidimensional measurements taken over time. Here, we present CyHMMs, a cyclic hidden Markov model method for detecting and modeling cycles in a collection of multidimensional heterogeneous time series data. In contrast to previous cycle modeling methods, CyHMMs deal with a number of challenges encountered in modeling real-world cycles: they can model multivariate data with discrete and continuous dimensions; they explicitly model and are robust to missing data; and they can share information across individuals to model variation both within and between individual time series. Experiments on synthetic and real-world health-tracking data demonstrate that CyHMMs infer cycle lengths more accurately than existing methods, with 58% lower error on simulated data and 63% lower error on real-world data compared to the best-performing baseline. CyHMMs can also perform functions which baselines cannot: they can model the progression of individual features/symptoms over the course of the cycle, identify the most variable features, and cluster individual time series into groups with distinct characteristics. Applying CyHMMs to two real-world health-tracking datasets -- of menstrual cycle symptoms and physical activity tracking data -- yields important insights including which symptoms to expect at each point during the cycle. We also find that people fall into several groups with distinct cycle patterns, and that these groups differ along dimensions not provided to the model. For example, by modeling missing data in the menstrual cycles dataset, we are able to discover a medically relevant group of birth control users even though information on birth control is not given to the model. Comment: Accepted at WWW 201

    Duration and Interval Hidden Markov Model for Sequential Data Analysis

    Analysis of sequential event data is recognized as an essential tool in data modeling and analysis. In this paper, after examining the technical requirements and issues involved in modeling complex but practical situations, we propose a new sequential data model, dubbed the Duration and Interval Hidden Markov Model (DI-HMM), that efficiently represents the "state duration" and "state interval" of data events. This capability plays an important role in representing practical time-series sequential data and, in turn, enables efficient and flexible sequential data retrieval. Numerical experiments on synthetic and real data demonstrate the efficiency and accuracy of the proposed DI-HMM.

    Measurement of Plasmodium falciparum transmission intensity using serological cohort data from Indonesian schoolchildren.

    BACKGROUND: As malaria transmission intensity approaches zero, measuring it becomes progressively more difficult and inefficient because parasite-positive individuals are hard to detect. This situation may arise shortly before achieving local elimination, or during surveillance post-elimination to prevent reintroduction. Antibody responses against the parasite last longer than the infections themselves. This "footprint" of infection may thus be used for assessing transmission intensity. A statistical approach is presented for measuring the seroconversion rate (SCR), a correlate of the force of infection, from individual-level longitudinal data on antibody titres in an area of low Plasmodium falciparum transmission. METHODS: Blood samples were collected from 160 Indonesian schoolchildren every month for six months. Titres of antibodies against AMA-1 and MSP-1(19) antigens of P. falciparum were measured using ELISA. The distribution of antibody titres among seronegative and seropositive individuals, respectively, was estimated by comparing the titres from the study data (a mixture of both seropositive and seronegative individuals) with titres from an (unexposed) negative control group of Indonesian individuals. Two Markov chain models for the transition of individuals between serological states were fitted to individual anti-PfAMA-1 or anti-PfMSP-1 titre time series using Bayesian Markov chain Monte Carlo (MCMC). This yielded estimates of SCR as well as of the duration of seropositivity. RESULTS: A posterior median SCR of 0.02 (PfAMA-1) and 0.09 (PfMSP-1) per person per year was estimated, with credible intervals ranging from 1E-4 to 0.2 per person per year. This level of transmission intensity is at the lower range of what can reliably be measured with the present study size. A Bayesian test for seroconversion of an individual between two observations is presented and used to identify the subjects who have most likely experienced an infection. Furthermore, the theoretical limits of measuring transmission intensity, and how these depend on the duration and size of a study as well as on transmission intensity itself, are illustrated. CONCLUSIONS: This analysis shows that it is possible to measure SCRs from individual-level longitudinal data on antibody titres. In addition, individual seroconversion events can be identified, which can be useful in assessing interruption of transmission. Analyses of further serological datasets using the present method are required to improve and validate it. This includes measurement of the duration of antibody responses, and how this depends on host age, cumulative exposure, or the particular antigen used.
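
    To give a rough sense of why seroconversion is hard to detect at this transmission level, the sketch below evaluates a generic reversible two-state continuous-time Markov chain (seronegative to seropositive at the SCR, and back at a reversion rate). The abstract does not specify the form of the two models that were fitted, so this is only an illustration under assumptions: the function name `seropositive_probability` and the reversion rate of 0.1 per year are hypothetical, and only the SCR of 0.02 per person per year comes from the abstract.

```python
import math


def seropositive_probability(scr, reversion_rate, years):
    """Probability of being seropositive after `years`, given seronegative
    at time 0, in a reversible two-state continuous-time Markov chain.

    scr            -- seroconversion rate (negative -> positive), per year
    reversion_rate -- seroreversion rate (positive -> negative), per year
    """
    total = scr + reversion_rate
    if total == 0:
        return 0.0  # no transitions can occur
    # Closed-form transition probability for a two-state chain.
    return scr / total * (1.0 - math.exp(-total * years))


SCR = 0.02        # posterior median for PfAMA-1 quoted in the abstract
REVERSION = 0.1   # hypothetical value, NOT taken from the abstract

for months in (1, 6, 12):
    p = seropositive_probability(SCR, REVERSION, months / 12)
    print(f"{months:2d} months: P(seropositive) = {p:.4f}")
```

    Under these assumed rates, the chance that any one seronegative child converts within the six-month follow-up is on the order of one percent, which is consistent with the abstract's remark that this transmission intensity sits at the lower range of what the study size can measure.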

    Multi-State Models for Panel Data: The msm Package for R

    Panel data are observations of a continuous-time process at arbitrary times, for example, visits to a hospital to diagnose disease status. Multi-state models for such data are generally based on the Markov assumption. This article reviews the range of Markov models and their extensions which can be fitted to panel-observed data, and their implementation in the msm package for R. Transition intensities may vary between individuals, or with piecewise-constant time-dependent covariates, giving an inhomogeneous Markov model. Hidden Markov models can be used for multi-state processes which are misclassified or observed only through a noisy marker. The package is intended to be straightforward to use, flexible and comprehensively documented. Worked examples are given of the use of msm to model chronic disease progression and screening. Assessment of model fit, and potential future developments of the software, are also discussed.