2,742 research outputs found

    A Recurrent Neural Network Survival Model: Predicting Web User Return Time

    The size of a website's active user base directly affects its value. Thus, it is important to monitor and influence a user's likelihood of returning to a site. Essential to this is predicting when a user will return. Current state-of-the-art approaches to this problem come in two flavors: (1) Recurrent Neural Network (RNN) based solutions and (2) survival analysis methods. We observe that both techniques are severely limited when applied to this problem. Survival models can only incorporate aggregate representations of users instead of automatically learning a representation directly from a raw time series of user actions. RNNs can automatically learn features, but cannot be directly trained with examples of non-returning users, who have no target value for their return time. We develop a novel RNN survival model that removes the limitations of the state-of-the-art methods. We demonstrate that this model can successfully be applied to return time prediction on a large e-commerce dataset, and that it discriminates between returning and non-returning users better than either method applied in isolation. Comment: Accepted into ECML PKDD 2018; 8 figures and 1 table
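
The abstract does not state the model's loss function. As a minimal sketch of how a survival likelihood can use non-returning users, the example below assumes a constant hazard (an exponential return-time model, chosen here only for simplicity and not taken from the paper). A returning user contributes the log-density of the observed return time; a non-returning user contributes the log-probability of not having returned by the end of observation. The name `censored_nll` is hypothetical, and in an RNN survival model the rate would presumably be produced by the network rather than fixed.

```python
import math


def censored_nll(rate, time, returned):
    """Negative log-likelihood of one user under a constant hazard `rate`.

    time     -- return time if the user returned, otherwise the time
                observed so far without a return (censoring time)
    returned -- True if the return was observed, False if censored
    """
    if rate <= 0 or time < 0:
        raise ValueError("rate must be positive and time non-negative")
    log_survival = -rate * time  # log S(t) = -rate * t
    if returned:
        # log f(t) = log(rate) + log S(t)
        return -(math.log(rate) + log_survival)
    # Censored user: only the survival term is available.
    return -log_survival


# Illustrative values only: (rate, time, returned) for two users.
users = [(0.5, 2.0, True), (0.5, 2.0, False)]
total_loss = sum(censored_nll(*u) for u in users)
```

Both kinds of user yield a finite loss term, which is the property the abstract says plain RNN regression on return time lacks.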

    Semiparametric regression methods for temporal processes subject to multiple sources of censoring

    Peer Reviewed
    https://deepblue.lib.umich.edu/bitstream/2027.42/155547/1/cjs11528.pdf
    https://deepblue.lib.umich.edu/bitstream/2027.42/155547/2/cjs11528_am.pdf
    https://deepblue.lib.umich.edu/bitstream/2027.42/155547/3/cjs11528-sup-0002-SuppInfo2.pdf
    https://deepblue.lib.umich.edu/bitstream/2027.42/155547/4/cjs11528-sup-0001-SuppInfo1.pd

    Crude incidence in two-phase designs in the presence of competing risks.

    Background: In many studies, some information might not be available for the whole cohort: some covariates, or even the outcome, might be ascertained only in selected subsamples. These studies are part of a broad category termed two-phase studies. Common examples include the nested case-control and the case-cohort designs. For two-phase studies, appropriate weighted survival estimates have been derived; however, no estimator of cumulative incidence accounting for competing events has been proposed. This is relevant in the presence of multiple types of events, where estimation of event-type-specific quantities is needed for evaluating outcome. Methods: We develop a nonparametric estimator of the cumulative incidence function of events accounting for possible competing events. It handles a general sampling design by weights derived from the sampling probabilities. The variance is derived from the influence function of the subdistribution hazard. Results: The proposed method shows good performance in simulations. It is applied to estimate the crude incidence of relapse in childhood acute lymphoblastic leukemia in groups defined by a genotype not available for everyone in a cohort of nearly 2000 patients, where death due to toxicity acted as a competing event. In a second example the aim was to estimate engagement in care of a cohort of HIV patients in a resource-limited setting, where for some patients the outcome itself was missing due to loss to follow-up. A sampling-based approach was used to identify the outcome in a subsample of lost patients and to obtain a valid estimate of connection to care. Conclusions: A valid estimator for cumulative incidence of events accounting for competing risks under a general sampling design from an infinite target population is derived.
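
The abstract gives no formulas, so the following is only a sketch of the general idea: a standard nonparametric cumulative incidence calculation (of the Aalen-Johansen type) in which every count is replaced by a sum of sampling weights. It assumes each weight is the inverse of a subject's sampling probability; the function name and the coding `0 = censored` are hypothetical, and the influence-function variance described in the abstract is not attempted here.

```python
def weighted_cumulative_incidence(times, causes, weights, cause_of_interest):
    """Return [(t, CIF(t))] at each distinct event time.

    times   -- follow-up time per sampled subject
    causes  -- 0 for censored, otherwise an integer event type
    weights -- assumed inverse sampling probabilities (all positive)
    """
    if not (len(times) == len(causes) == len(weights)):
        raise ValueError("inputs must have equal length")
    event_times = sorted({t for t, c in zip(times, causes) if c != 0})
    survival = 1.0  # weighted all-cause survival just before the current time
    cif = 0.0
    curve = []
    for t in event_times:
        rows = list(zip(times, causes, weights))
        at_risk = sum(w for u, _, w in rows if u >= t)
        d_all = sum(w for u, c, w in rows if u == t and c != 0)
        d_k = sum(w for u, c, w in rows if u == t and c == cause_of_interest)
        # Probability of surviving all causes up to t, then failing from cause k at t.
        cif += survival * d_k / at_risk
        survival *= 1.0 - d_all / at_risk
        curve.append((t, cif))
    return curve


# Illustrative data only: cause 1 = relapse, cause 2 = competing event, 0 = censored.
example = weighted_cumulative_incidence(
    times=[1, 2, 3, 4], causes=[1, 2, 1, 0], weights=[1, 1, 1, 1],
    cause_of_interest=1,
)
```

With all weights equal the calculation reduces to the unweighted estimator, which is a quick sanity check on any implementation of this kind.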

    Cancer of the oral cavity, pharynx/larynx and lung in North Thailand: case-control study and analysis of cigar smoke.

    The unusually high relative frequency of cancer in the laryngeal region in males (18% of all histologically diagnosed cancers) and a sex ratio of unity for lung cancer in Northern Thailand were further explored in a hospital-based case-control study in Chiang Mai. This compared patients having cancers of the oral cavity (including oropharynx), larynx, hypopharynx and lung, with controls in relation to smoking and chewing habits. Statistical analysis indicated that chewing betel is strongly associated with the occurrence of oral cancer in both sexes, and with cancer of the laryngeal region in males. No factors were strongly linked to lung cancer in men, but, in women, urban residence and miang chewing were associated with lung cancer. Analysis of smoke from the two main types of cigars smoked in the region showed that both had high tar content, but there were marked differences in pH. Smoking cigars with alkaline smoke and high tar was associated with an increased risk of laryngeal cancer in males, whereas other cigars with acid smoke and high tar, together with manufactured cigarettes, were associated with increased risks of lung cancer. These increased risks were not, however, statistically significant.

    Secondary Sex Ratio among Women Exposed to Diethylstilbestrol in Utero

    BACKGROUND. Diethylstilbestrol (DES), a synthetic estrogen widely prescribed to pregnant women during the mid-1900s, is a potent endocrine disruptor. Previous studies have suggested an association between endocrine-disrupting compounds and secondary sex ratio. METHODS. Data were provided by women participating in the National Cancer Institute (NCI) DES Combined Cohort Study. We used generalized estimating equations to estimate odds ratios (ORs) and 95% confidence intervals (CIs) for the relation of in utero DES exposure to sex ratio (proportion of male births). Models were adjusted for maternal age, child's birth year, parity, and cohort, and accounted for clustering among women with multiple pregnancies. RESULTS. The OR for having a male birth comparing DES-exposed to unexposed women was 1.05 (95% CI, 0.95-1.17). For exposed women with complete data on cumulative DES dose and timing (33%), those first exposed to DES earlier in gestation and to higher doses had the highest odds of having a male birth. The ORs were 0.91 (95% CI, 0.65-1.27) for first exposure at ≥ 13 weeks gestation to < 5 g DES; 0.95 (95% CI, 0.71-1.27) for first exposure at ≥ 13 weeks to ≥ 5 g; 1.16 (95% CI, 0.96-1.41) for first exposure at < 13 weeks to < 5 g; and 1.24 (95% CI, 1.04-1.48) for first exposure at < 13 weeks to ≥ 5 g compared with no exposure. Results did not vary appreciably by maternal age, parity, cohort, or infertility history. CONCLUSIONS. Overall, no association was observed between in utero DES exposure and secondary sex ratio, but a significant increase in the proportion of male births was found among women first exposed to DES earlier in gestation and to a higher cumulative dose. National Cancer Institute (N01-CP-21168, N01-CP-51017, N01-CP-01289
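
The reported intervals can be read back onto the log-odds scale to see why only the last subgroup is described as significant. The sketch below assumes the published intervals are Wald-type (symmetric around the log odds ratio), which the abstract does not state, and it works from rounded published values, so the results are approximate. The helper name `wald_from_ci` is hypothetical.

```python
import math


def wald_from_ci(odds_ratio, lower, upper, z_crit=1.959964):
    """Recover (log OR, approximate standard error, z statistic) from a 95% CI.

    Assumes the interval was built as exp(log OR +/- z_crit * SE).
    """
    if not (0 < lower < upper):
        raise ValueError("need 0 < lower < upper")
    log_or = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    return log_or, se, log_or / se


# Values quoted in the abstract above.
_, _, z_overall = wald_from_ci(1.05, 0.95, 1.17)           # any in utero exposure
_, se_early_high, z_early_high = wald_from_ci(1.24, 1.04, 1.48)  # < 13 weeks, >= 5 g

overall_significant = abs(z_overall) > 1.959964
early_high_significant = abs(z_early_high) > 1.959964
```

Under that assumption the overall estimate falls short of the 5% threshold while the early, high-dose subgroup exceeds it, matching the abstract's conclusions.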

    Islet autoantibodies and residual beta cell function in type 1 diabetes children followed for 3-6 years

    Aims: To test if islet autoantibodies at diagnosis of type 1 diabetes (T1D) and after 3-6 years with T1D predict residual beta-cell function (RBF) after 3-6 years with T1D. Methods: T1D children (n = 260, median age at diagnosis 9.4, range 0.9-14.7 years) were tested for GAD65, IA-2, ZnT8R, ZnT8W and ZnT8Q autoantibodies (A) at diagnosis, and 3-6 years after diagnosis when also fasting and stimulated RBF were determined. Results: For every 1-year increase in age at diagnosis of T1D, the odds of detectable C-peptide increased 1.21 (1.09, 1.34) times for fasting C-peptide and 1.28 (1.15, 1.42) times for stimulated C-peptide. Based on a linear model for subjects with no change in IA-2A levels, the odds of detectable C-peptide were 35% higher than for subjects whose IA-2A levels decreased by half (OR = 1.35 (1.09, 1.67), p = 0.006); similarly for ZnT8WA (OR = 1.39 (1.09, 1.77), p = 0.008) and ZnT8QA (OR = 1.55 (1.06, 2.26), p = 0.024). No such relationship was detected for GADA or ZnT8RA. All ORs were adjusted for confounders. Conclusions: Age at diagnosis with T1D was the major predictor of detectable C-peptide 3-6 years post-diagnosis. Decreases in IA-2A, and possibly ZnT8A, levels between diagnosis and post-diagnosis were associated with a reduction in RBF post-diagnosis. (C) 2012 Elsevier Ireland Ltd. All rights reserved

    Pitfalls of using the risk ratio in meta‐analysis

    For meta-analysis of studies that report outcomes as binomial proportions, the most popular measure of effect is the odds ratio (OR), usually analyzed as log(OR). Many meta-analyses use the risk ratio (RR) and its logarithm, because of its simpler interpretation. Although log(OR) and log(RR) are both unbounded, use of log(RR) must ensure that estimates are compatible with study-level event rates in the interval (0, 1). These complications pose a particular challenge for random-effects models, both in applications and in generating data for simulations. As background we review the conventional random-effects model and then binomial generalized linear mixed models (GLMMs) with the logit link function, which do not have these complications. We then focus on log-binomial models and explore implications of using them; theoretical calculations and simulation show evidence of biases. The main competitors to the binomial GLMMs use the beta-binomial (BB) distribution, either in BB regression or by maximizing a BB likelihood; a simulation produces mixed results. Two examples and an examination of Cochrane meta-analyses that used RR suggest bias in the results from the conventional inverse-variance-weighted approach. Finally, we comment on other measures of effect that have range restrictions, including risk difference, and outline further research
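
The range restriction mentioned above can be illustrated with a small calculation. This is a sketch using hypothetical function names and illustrative numbers, not anything taken from the paper: given a control-arm risk, a log risk ratio maps to a treated-arm risk by simple multiplication, which can leave the interval (0, 1), whereas a log odds ratio always maps back to a valid probability.

```python
import math


def _check_risk(control_risk):
    if not 0.0 < control_risk < 1.0:
        raise ValueError("control_risk must lie strictly between 0 and 1")


def treated_risk_from_log_rr(control_risk, log_rr):
    """Treated-arm risk implied by a log risk ratio; may exceed 1."""
    _check_risk(control_risk)
    return control_risk * math.exp(log_rr)


def treated_risk_from_log_or(control_risk, log_or):
    """Treated-arm risk implied by a log odds ratio; always in (0, 1)."""
    _check_risk(control_risk)
    odds = control_risk / (1.0 - control_risk) * math.exp(log_or)
    return odds / (1.0 + odds)


def max_log_rr(control_risk):
    """Upper bound on log(RR): the implied treated risk stays below 1 only under it."""
    _check_risk(control_risk)
    return -math.log(control_risk)


# Illustrative values: control risk 0.6 and an effect of log(2) on each scale.
risk_rr = treated_risk_from_log_rr(0.6, math.log(2.0))  # not a valid probability
risk_or = treated_risk_from_log_or(0.6, math.log(2.0))  # valid probability
bound = max_log_rr(0.6)
```

Because the bound depends on each study's control risk, a random effect drawn on the log(RR) scale can produce an impossible event rate in some studies, which is one way to see the difficulty for random-effects models and simulations that the abstract describes.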