10 research outputs found

    Coda4microbiome: compositional data analysis for microbiome cross-sectional and longitudinal studies

    Background: One of the main challenges of microbiome analysis is its compositional nature, which, if ignored, can lead to spurious results. Addressing the compositional structure of microbiome data is particularly critical in longitudinal studies, where abundances measured at different times can correspond to different sub-compositions. Results: We developed coda4microbiome, a new R package for analyzing microbiome data within the Compositional Data Analysis (CoDA) framework in both cross-sectional and longitudinal studies. The aim of coda4microbiome is prediction; more specifically, the method is designed to identify a model (microbial signature) containing the minimum number of features with the maximum predictive power. The algorithm relies on the analysis of log-ratios between pairs of components, and variable selection is addressed through penalized regression on the “all-pairs log-ratio model”, the model containing all possible pairwise log-ratios. For longitudinal data, the algorithm infers dynamic microbial signatures by performing penalized regression over a summary of the log-ratio trajectories (the area under these trajectories). In both cross-sectional and longitudinal studies, the inferred microbial signature is expressed as the (weighted) balance between two groups of taxa: those that contribute positively to the microbial signature and those that contribute negatively. The package provides several graphical representations that facilitate the interpretation of the analysis and of the identified microbial signatures. We illustrate the new method with data from a Crohn's disease study (cross-sectional data) and with data on the developing microbiome of infants (longitudinal data). Conclusions: coda4microbiome is a new algorithm for the identification of microbial signatures in both cross-sectional and longitudinal studies. Peer reviewed. Postprint (published version).
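The log-ratio construction described in this abstract can be sketched in a few lines. The Python snippet below is only an illustration under stated assumptions, not the coda4microbiome implementation (which is an R package): the function names `all_pairs_log_ratios` and `taxon_weights`, the `pseudocount` parameter, the toy abundance table and the coefficient vector `beta` are all hypothetical. It builds every pairwise log-ratio and then shows why a sparse set of log-ratio coefficients can be read as a weighted balance between two groups of taxa: the implied per-taxon weights sum to zero, so taxa split into a positive and a negative group. The penalized regression step that would actually select `beta` is not shown.

```python
import itertools
import numpy as np

def all_pairs_log_ratios(abundances, pseudocount=0.0):
    """Return log(x_j / x_k) for every pair j < k, plus the list of pairs.

    `pseudocount` is an assumption of this sketch: with zero counts the
    logarithm is undefined, so some zero replacement is needed first.
    """
    x = np.asarray(abundances, dtype=float) + pseudocount
    logx = np.log(x)
    pairs = list(itertools.combinations(range(x.shape[1]), 2))
    ratios = np.column_stack([logx[:, j] - logx[:, k] for j, k in pairs])
    return ratios, pairs

def taxon_weights(beta, pairs, n_taxa):
    """Collapse log-ratio coefficients into one weight per taxon.

    sum_jk beta_jk * (log x_j - log x_k) = sum_j theta_j * log x_j,
    where the theta_j always sum to zero.
    """
    theta = np.zeros(n_taxa)
    for b, (j, k) in zip(beta, pairs):
        theta[j] += b  # taxon j is the numerator of this log-ratio
        theta[k] -= b  # taxon k is the denominator
    return theta

# Toy table: 3 samples (rows) by 4 taxa (columns), all strictly positive.
abundances = np.array([[10.0, 5.0, 2.0, 83.0],
                       [20.0, 20.0, 10.0, 50.0],
                       [1.0, 3.0, 6.0, 90.0]])
ratios, pairs = all_pairs_log_ratios(abundances)

# Hypothetical sparse coefficients, standing in for a penalized fit.
beta = np.array([0.5, 0.0, 0.0, 0.0, -1.2, 0.0])
theta = taxon_weights(beta, pairs, abundances.shape[1])
score = ratios @ beta  # signature score for each sample
```

Because only ratios enter the model, multiplying a sample by any positive constant leaves `ratios` unchanged, which is the sense in which the approach respects the compositional nature of the data.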

    The role of survival functions in competing risks

    Competing risks data usually arise in studies in which the failure of an individual may be classified into one of k (k > 1) mutually exclusive causes of failure. When competing risks are present, there are two main differences from classical survival analysis: (i) survival functions are not the main tool for describing cause-specific failures, and (ii) classical estimation techniques may provide biased results. The main goal of this paper is to review, clarify and present the formulation of a competing risks model and the basic nonparametric estimation methods. We show why the use of survival functions in the competing risks framework may mislead the user, and we illustrate the presented methodologies by developing two examples from real data. The methods presented here can be implemented with several statistical packages, including R, SPSS and SAS; we give some highlights on how to perform a competing risks analysis with these software packages.
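The bias mentioned in this abstract can be made concrete with a small numerical sketch. The Python code below is an illustration under assumptions, not the paper's own code: the function name `cumulative_incidence`, the coding of censoring as cause 0 and the toy data are hypothetical. It computes the standard nonparametric cumulative incidence for one cause and, alongside it, the complement of a Kaplan-Meier curve that treats failures from other causes as censored, which is taken here as one example of a classical technique applied naively.

```python
import numpy as np

def cumulative_incidence(time, cause, target):
    """Nonparametric cumulative incidence for `target` (cause 0 = censored).

    Returns tuples (t, cif, naive), where `naive` is 1 - Kaplan-Meier
    with failures from other causes treated as censored observations.
    """
    time = np.asarray(time, dtype=float)
    cause = np.asarray(cause)
    surv = 1.0        # probability of no failure of any cause before t
    naive_surv = 1.0  # Kaplan-Meier that ignores the competing causes
    cif = 0.0
    out = []
    for t in np.unique(time[cause != 0]):
        at_risk = np.sum(time >= t)
        d_target = np.sum((time == t) & (cause == target))
        d_any = np.sum((time == t) & (cause != 0))
        cif += surv * d_target / at_risk   # must be event-free up to t
        naive_surv *= 1.0 - d_target / at_risk
        surv *= 1.0 - d_any / at_risk
        out.append((float(t), float(cif), float(1.0 - naive_surv)))
    return out

# Toy data: six subjects, two causes of failure, one censored at t = 4.
time = [1, 2, 3, 4, 5, 6]
cause = [1, 2, 1, 0, 2, 1]

cif_1 = cumulative_incidence(time, cause, target=1)
cif_2 = cumulative_incidence(time, cause, target=2)
```

In this toy data set the cumulative incidences of the two causes add up to 1 at the last failure time (7/12 and 5/12), whereas the naive estimate for cause 1 alone already reaches 1, overstating the probability of failing from that cause.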

    Review of multivariate survival data

    Research document published by the UPC, Departament d'Estadística i Investigació Operativa. This paper reviews some of the main contributions in the area of multivariate survival data and proposes some possible extensions. In particular, we have concentrated our search and study on those papers that are relevant to the situation where two (or more) consecutive variables are followed until a common day of analysis and are subject to informative censoring. The paper reviews bivariate nonparametric approaches and extends some of them to the case of two nonconsecutive times. We introduce the notation and construct the likelihood for the general problem of more than two consecutive survival times. We formulate the time dependencies and trends via a Bayesian approach. Finally, three regression models for multivariate survival times are discussed, together with the differences among them, which will be useful when the main interest is in the effect of covariates on the risk of failure. Postprint (author’s final draft).

    Competing risks methods

    Competing risks data usually arise in studies in which the failure of an individual may be classified into one of k (k > 1) mutually exclusive causes of failure. When competing risks are present, classical survival analysis techniques may not be appropriate. The main goal of this paper is to review the specific methods for dealing with competing risks. To this end, we first focus on how to specify a competing risks model, what the structure of the observed data is in this framework, and how the components of the model are estimated from a given random sample. In addition, we discuss how to correctly interpret probabilities in the presence of competing risks, and we consider regression models in detail. To conclude, we illustrate the problem with data from a bladder cancer study.

    Semi-competing risks methods accounting for interval-censoring

    Semi-competing risks data arise when a subject is at risk of both a terminating and an intermediate event. In this situation we encounter a bivariate survival analysis problem with dependent censoring if these two events are related. In this work we extend the semi-competing risks data problem to the case where the intermediate event is interval-censored.




