1,372 research outputs found

    Development and application of competing risks and multi-state models in cancer epidemiology

    Competing risks and multi-state models allow us to study complex disease settings and answer composite research questions and should be used more widely in epidemiology. This thesis aims to explore competing risks and multi-state modelling using flexible parametric survival models (FPSMs), studying several aspects, such as the choice of timescale, the choice of multi-state structure, sharing information across transitions by imposing restrictions in the estimation of the parameters, as well as communicating the results of such models to a wider audience and evaluating the use of recurrent multi-state structures in the area of recurrent events when a terminal event is present. In competing risks settings, a common timescale is normally used for all competing events. For example, in a setting where death due to colon cancer is the event of interest and death due to other causes serves as a competing event, time since diagnosis is frequently used as the timescale when modelling the hazard rates for both events. However, attained age has been proposed as a more natural timescale when modelling a mortality rate that is not associated with the event of interest (colon cancer). In Study I, the aim was to assess how the choice of timescale for other cause mortality (time since diagnosis versus attained age) influences the estimated cumulative incidence functions (CIFs) and how several factors contribute to that influence (sample size, non-proportional hazards, shape of baseline other cause mortality rate, variance in age at diagnosis) via a simulation analysis, assuming that the mortality rate is a function of attained age. I found that the bias of the CIF estimates for colon cancer mortality is negligible under all the different approaches and all factor levels.
The bias in the CIF estimates for other cause mortality is also low when using time since diagnosis as the timescale for both events, provided that we include age at diagnosis in the models with sufficient flexibility (splines). When a covariate has non-proportional hazards for other cause mortality on the attained age scale, using time since diagnosis as the timescale for other cause mortality may lead to a low but non-negligible bias, no matter how flexibly we model the hazard rate. The structural complexity of a multi-state structure and the variety of the predicted measures over time for individuals with different covariate patterns may render the communication of the results complicated and difficult. These issues motivated me to develop an interactive web tool in Study II that can be used by researchers to present their multi-state model results to audiences with a variety of interactive graphs that will render the results more communicable and intuitive. The name of the application is MSMplus and it was written using the package RShiny in R. Multi-state model results can easily be wrapped up and uploaded to the application using the multistate package in Stata and the MSMplus package in R. When studying a disease process, different research questions may require different multi-state structures in order to be addressed, each structure with different interpretations of the estimated measures, advantages compared to the other structures as well as limitations. There are also a number of modelling choices to consider, such as the timescale used for each transition and sharing information across transitions by imposing specific restrictions in the estimation process. In Study III, we explore different research questions via the use of a range of multi-state models of increasing complexity when dealing with registry-based repeated prescriptions of antidepressants, using the Breast Cancer Data Base Sweden 2.0 research database.
I derive probability estimates that address different research questions regarding antidepressant use patterns, beginning with a single-event survival model, moving to a competing risks and a 3-state Illness-Death model, then a 4-state unidirectional and bidirectional model with a post-medication state. Finally, I fit a multi-state structure with recurrent pairs of medication cycle/discontinuation period states, first with separately estimated transition intensity rates and then allowing sharing of information across transitions by imposing specific restrictions between the baseline transition intensity rates. When we are interested in studying a recurrent event process in the presence of a terminal event, there is a variety of different frameworks and approaches, joint frailty models being a framework that is frequently used. A multi-state model with recurrent event states and an absorbing state representing the terminal event can also be used in this context. In Study IV, I am interested in evaluating via simulation the use of a multi-state model with recurrent states and a competing terminal absorbing state, with and without restrictions among the baseline transition intensity rates, when the underlying data generating mechanism follows a joint frailty model. I focus on the probabilities of death and of a new recurrent event across follow-up time given zero, one, two or three previous recurrences up to the first year of the follow-up, probability measures that can be targeted by both a joint frailty and a multi-state model. Then the bias and relative precision of the different modelling approaches are evaluated. Finally, I engage in a discussion of the similarities, the different assumptions and the focus of each framework.
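
The cumulative incidence function (CIF) that this abstract evaluates can be illustrated with a minimal sketch. The thesis itself uses flexible parametric survival models; the code below instead uses the simpler non-parametric Aalen-Johansen estimator, purely to show what a CIF measures under competing risks. The function name, the cause coding (0 = censored) and the toy data are assumptions made for illustration and do not come from the thesis.

```python
from collections import Counter


def cumulative_incidence(times, causes, cause_of_interest):
    """Non-parametric (Aalen-Johansen) CIF estimate for one cause.

    times  : follow-up time for each subject
    causes : 0 = censored, 1, 2, ... = type of event (hypothetical coding)
    Returns a list of (time, CIF) pairs, one per distinct event time.
    """
    leaving = Counter()      # subjects leaving the risk set at each time
    events_all = Counter()   # events of any cause at each time
    events_k = Counter()     # events of the cause of interest at each time
    for t, c in zip(times, causes):
        leaving[t] += 1
        if c != 0:
            events_all[t] += 1
        if c == cause_of_interest:
            events_k[t] += 1

    n_at_risk = len(times)
    surv = 1.0   # probability of being free of any event just before t
    cif = 0.0
    out = []
    for t in sorted(leaving):
        if events_all[t] > 0:
            # Increment: P(event-free before t) * cause-specific hazard at t
            cif += surv * events_k[t] / n_at_risk
            surv *= 1 - events_all[t] / n_at_risk
            out.append((t, cif))
        n_at_risk -= leaving[t]
    return out


# Toy data: cause 1 could stand for cancer death, cause 2 for other causes.
times = [1, 2, 3, 4, 5]
causes = [1, 2, 0, 1, 2]
cif_cause1 = cumulative_incidence(times, causes, 1)
cif_cause2 = cumulative_incidence(times, causes, 2)
```

In this sketch the CIFs of the two causes add up to one minus the overall event-free survival, which is the property that distinguishes a CIF from a naive one-minus-Kaplan-Meier estimate computed separately for each cause.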

    Component Reliability Estimation From Partially Masked and Censored System Life Data Under Competing Risks.

    This research presents new approaches to the estimation of component reliability distribution parameters from partially masked and/or censored system life data. Such data are common in continuous production environments. The methods were tested on Monte Carlo simulated data and compared to the only alternative suggested in the literature. This alternative did not converge on many masked datasets. The new methods produce accurate parameter estimates, particularly at low masking levels. They show little bias. One method ignores masked data and treats them as censored observations. It works well if at least 2 known-cause failures of each component type have been observed and is particularly useful for the analysis of datasets of any size with a small fraction of masked observations. It provides quick and accurate estimates. A second method performs well when the number of masked observations is small but forms a significant portion of the dataset and/or when the assumption of independent masking does not hold. The third method provides accurate estimates when the dataset is small but contains a large fraction of masked observations and when independent masking is assumed. The latter two methods provide an indication of which component most likely caused each masked system failure, albeit at the price of much computation time. The methods were implemented in user-friendly software that can be used to apply the methods to simulated or real-life data. An application of the methods to real-life industrial data is presented. This research shows that masked system life data can be used effectively to estimate component life distribution parameters in a situation where such data form a large portion of the dataset and few known failures exist. It also demonstrates that a small fraction of masked data in a dataset can safely be treated as censored observations without much effect on the accuracy of the resulting estimates.
These results are important as masked system life data are becoming more prevalent in industrial production environments. The research results are gauged to be useful in continuous manufacturing environments, e.g. in the petrochemical industry. They will also likely interest the electronics and automotive industries, where masked observations are common.
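
The first method described in this abstract, treating masked failures as censored observations, can be sketched as follows. The abstract does not name the lifetime distributions used, so the sketch assumes a series system whose components have independent exponential lifetimes, in which case the maximum-likelihood rate for each component is its number of known-cause failures divided by the total system time on test. The function name, the record layout and the "masked" label are hypothetical.

```python
def exponential_rates_masked_as_censored(records, components):
    """Per-component exponential failure rates for a series system.

    records    : list of (system_time, cause) pairs, where cause is a
                 component name, "masked" (failure with unknown cause),
                 or None (system censored). Hypothetical layout.
    components : names of the component types in the system.

    Assumption: independent exponential component lifetimes. A masked
    failure is treated as a censored observation for every component,
    so it adds exposure time but no failure count.
    """
    total_time = sum(t for t, _ in records)
    failures = {c: 0 for c in components}
    for _, cause in records:
        if cause in failures:      # only known-cause failures are counted
            failures[cause] += 1
    return {c: failures[c] / total_time for c in components}


# Toy data: two known failures, one masked failure, one censored system.
records = [(10.0, "A"), (20.0, "B"), (30.0, "masked"), (40.0, None)]
rates = exponential_rates_masked_as_censored(records, ["A", "B"])
```

Because masked failures are dropped from the failure counts, the estimated rates shrink as the masked fraction grows, which is in line with the abstract's remark that this shortcut suits datasets with a small fraction of masked observations.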

    Statistical Models and Methods for Dependent Life History Processes

    This thesis deals with statistical issues in the analysis of complex life history processes which have characteristics of heterogeneity and dependence. We are motivated, in this thesis, by three specific types of processes: i) processes featuring recurrent episodic conditions, ii) multi-type recurrent events, and iii) clustered multi-state processes as arise in family studies. In chronic diseases featuring recurrent episodic conditions, symptom onset is followed by a period during which symptoms are present until recovery. In the analysis of data from such processes, analysis is often based only on the recurrent onset of disease, ignoring the duration of symptoms. This loss of information may lead to incorrect conclusions in the analysis of these data. In Chapter 2, we propose a novel model for an alternating two-state process, including a symptom-free state and a symptomatic state, to recognize the duration of symptoms. This approach reflects the dynamics of an individual's disease process and helps to understand the course of disease. Intensity-based models with multiplicative random effects are considered where the disease onset time is governed by a conditionally Markov intensity and the time of recovery is governed by a conditionally semi-Markov intensity. A bivariate random effect with one multiplicative component for each intensity is introduced to accommodate between-individual heterogeneity, and a dependence between the bivariate random effect variables offers a natural and more general framework for modeling the two-state process. A copula function is used for the joint distribution of random effects, which retains the marginal features and gives flexible choices of dependence structure. The proposed model is a semiparametric model for which estimation is carried out using an expectation-maximization algorithm. The aforementioned problem leads us to investigate the impact of ignoring symptom duration in a randomized trial setting.
In Chapter 3, we define two risk sets for recurrent event analyses: one includes individuals during their symptomatic periods, and the other excludes individuals from the risk set during symptomatic periods. In a clinical trial, the balance between treatment groups in unmeasured confounders present at the time of randomization can be lost following randomization as the risk set changes; thus, retaining individuals in the risk set is a common approach. Here we examine asymptotic and empirical biases of estimators from the rate-based models when two different risk sets are applied. We assume that the true underlying process is an alternating two-state process where the true risk set is the one that excludes individuals when they are experiencing an exacerbation. We consider two scenarios of the true model. In the first, there is no between-individual variation for each process and no dependence between the two processes. The second scenario uses the dependent alternating two-state model proposed in Chapter 2. Issues of model misspecification and causal inference are considered. When the focus is on clinical trials, power implications of risk set misspecification are of interest. In Chapter 4, attention is directed at multiple recurrent events where each endpoint is of interest. The use of a composite endpoint, which is the time of the first event of any type, is a simple way to analyse such data. However, when multiple events are of comparable importance, use of a composite endpoint analysis may not be suitable. We propose a copula-based model for multi-type recurrent events where each type of recurrent event process arises from a mixed-Poisson model, with random effects linking the events through a copula function. When more than two types of events are considered, composite likelihood is adopted to ease the computational burden, and simultaneous and two-stage estimation are explored.
An aim of family studies is typically to gain knowledge about factors governing the inheritance of diseases. One may be interested in examining the dependence of disease onset between family members, and in identifying genetic markers associated with heritable disease. A common procedure is to collect families through probands, in which affected individuals are selected from a disease registry and their family members (non-probands) are then recruited for examination. This approach to sampling families motivates us to consider the disease onset process along with survival, since the proband must be diseased and alive to be recruited, and family members may need to be alive. In Chapter 5, we propose a model for a clustered illness-death process for family studies which accounts for the semi-competing risks problem for disease onset as well as biased sampling. We model within-family association in the age of disease onset via a copula function applied to the possibly latent disease onset time and incorporate survival through a marginal illness-death model. The ascertainment condition is reflected in the likelihood or composite likelihood construction. Two study designs regarding the recruitment of family members are considered. One involves the collection of disease history from family members via the proband or medical records. The other requires family members to undergo a medical examination, in which case they must be alive at the time of the family study. Family data alone are insufficient to estimate all of the parameters of the illness-death processes. We therefore make use of auxiliary data, including population mortality data and additional registry data, to address the estimability issue. Another source of auxiliary data is a current status survey. The issue of missing genetic markers is also addressed in each study design.

    P.A.L.M. - Physical Asset Lifecycle Modelling in the Healthcare Sector

    A Private Finance Initiative (PFI) is a way of establishing Public-Private Partnerships (PPP) by funding public infrastructure projects with private capital investment. The election in 1979 of a Conservative government under Margaret Thatcher marked the start of a still-continuing shift of activities away from the UK public sector. PFI was implemented in the UK for the first time in 1992. HCP is an award-winning PFI asset-management company and, as part of the EngD course, the researcher has spent a large amount of time based at HCP. HCP stands for Healthcare Projects, and this thesis presents an alternative, combined-methods research approach to one of the most mechanically complex asset types under HCP’s management, in its largest healthcare facility. The research presents a risk-based approach to the operational lifecycle planning of 113 air-handling units at a central London hospital. The two components of the project are engineering risk (How likely is the asset to fail?) and contractual risk (What are the financial implications of such a failure?). Currently, these assets are modelled by HCP on a ‘strategic’ level, but using CIBSE-recommended guidance and part-failure data collected from six other UK-based hospitals, the Physical Asset Lifecycle Model (PALM) produces a funding profile for the replacement of the 1,247 internal components, as opposed to 113 bulk assets. The numerical model has also been visualised through the extraction of 3D BIM geometry into a geometrical-modelling tool (Rhino5) and computational plug-in (Grasshopper) to connect to the lifecycle model and visualise the replacement strategy proposed. The qualitative part of the combined-methods approach involved interviewing HCP Management board members as to their views on the models. The current profile adopted by HCP for the management of the air-handling units involves a £6.045m spend during the remaining 33-year concession period.
The main findings of the PALM lifecycle model are that, based on a component-level replacement approach, this figure can be reduced by more than £1m based on a recommended replacement profile (£4.709m). Such a reduction can be based on how HCP currently manages its assets, and the engineering survey conducted showed that three air-handling units currently being life-cycled by HCP either had no components or were decommissioned prior to construction. The main findings of the PALM geometrical model (based on thematic-interview analysis) are that such a tool has largely been unseen in the industry before and it displays major translatability to other complex mechanical assets with component parts. It can also be integrated into HCP business propositions for new and existing clients in the future because of its clarity and ability to produce transparent lifecycle modelling from a decision-maker’s point of view. The research concludes that while the PALM model provides a glimpse as to how lifecycle modelling may be conducted in the future, a number of barriers to its implementation remain (namely data availability in a competitive environment, the time versus income generated business-case paradigm and a generational ability to change and accept technological advancements amongst senior decision-makers).

    The diffusion of process innovation in the UK financial sector: an empirical analysis of automated teller machine (ATM) diffusion

    Recent policy initiatives have identified that the diffusion of innovation constitutes an important component in technical change and progress and is the impetus behind changes in firm productivity. To date, however, the main emphasis of economists has been on the diffusion of process innovations in the industrial sector, with diffusion in the financial sector either ignored or, at best, summarised by a number of stylised facts relating to the spread of information. The objective of this thesis is to explore the inter-firm determinants of ATM adoption and diffusion in the UK financial sector and identify firm-specific and market factors in the diffusion process. The empirical analysis draws on duration analysis, which represents the current state-of-the-art modelling approach to inter-firm diffusion. This approach conceptualises inter-firm diffusion as a cross-section of durations of non-adoption from which, most importantly, hypothesised factors (or 'covariates') can be examined by their significance or otherwise on the conditional probability of adoption. The main findings of this thesis support the stylised fact often made in the diffusion literature that the inter-firm diffusion curve is sigmoid and characterised by a non-monotonic hazard function. Furthermore, the empirical analysis supports the hypothesis that firm-specific characteristics and expectations have played a crucial role in the inter-firm diffusion of ATMs. In addition, the results indicate that the diffusion of ATMs in the UK has been characterised by the existence of positive network externalities. The results are also shown to be robust across a number of model specifications and assumptions concerning the time-path of covariates.

    Voyager Design Study. Vol. I- Design Summary

    Voyager spacecraft design study

    TURBULENCE IN INDUSTRIAL POPULATIONS: THE CASE OF THE ITALIAN GRAPHIC PAPER INDUSTRY

    Significant and persistent flows of entry and exit, that is, turbulence, are a common feature of most industries, across countries and over time. By means of a new database for the Italian graphic papermaking industry between 1964 and 2004, this research inquires into the extent and character of entry and exit in an industry where innovation in products and processes has been incremental and largely predictable and, therefore, unlikely to be the main driving force of firm turbulence. The first part of the thesis deals with methodological issues concerning compilation of the thesis’ database, which records, annually, plants’ and firms’ major demographic events, attributes and proprietary linkages, thus allowing comparison between the dynamics of plants and firms throughout the reference period. Special attention was given to avoiding measurement distortions that would have arisen from using business register administrative files. The second part focuses on which factors are more likely to affect the survival prospects of plants and firms. Using logistical analysis, econometric results confirm that plant exit has been determined by the efficiency of its equipment, the diversification strategy of the proprietary firm and, unexpectedly, its organizational history. Using survival analysis, econometric results reveal that the risk of exit for firms is lowered by the pursuit of external growth strategies (acquisition of plants), concentration of production into graphic paper and being equipped with modern machinery. The third part examines the effects of firm turbulence on the evolution of concentration in the industry. The data show that acquisitions have been an important source of turnover among the leading companies and that a significant portion of the leading companies has been relatively new. The analysis also indicates that at least some turbulence has led to instability of market shares among the leading firms.

    Statistical and image analysis methods and applications

    Investigating the Association between Youth Unemployment and Mental Health Later in Life

    Background: A small literature shows that youth unemployment is associated with poorer mental health later in life. Methods: Four empirical studies addressed gaps in the literature. Study 1 used Next Steps to estimate the association between youth unemployment and GHQ-12 scores at age 25. Specification curve analysis and a negative control outcome design were used to explore the robustness of the association to different modelling assumptions and to test whether the association could be easily explained by confounding. Study 2 used quantile and multivariate regression to explore heterogeneity in the association. Study 3 used data from the British Household Panel Survey and the United Kingdom Household Longitudinal Study to investigate differences in the association according to age at follow-up, year of birth, and macroeconomic conditions during early adulthood. Study 4 used the same datasets to explore the association between youth unemployment and later allostatic load, a potential mediator of the association between youth unemployment and mental health. Results: Youth unemployment was associated with worse GHQ-12 scores at age 25. The association was robust to defensible modelling assumptions. There was no association between youth unemployment and two placebo outcomes (Study 1). Quantile regression results suggested the association was driven by a minority of individuals with particularly poor GHQ-12 scores at age 25, but there were no clear differences in the association according to candidate moderators (Study 2). Youth unemployment was associated with poorer GHQ-12 regardless of age at follow-up, birth year, or unemployment rates during early adulthood (Study 3). Youth unemployment was related to higher allostatic load in females but not males. There was little evidence that allostatic load mediated associations with later mental health (Study 4). 
Conclusions: Research should attempt to identify individuals for whom youth unemployment is a stronger signal of future mental health problems and explore the factors which may mediate the association.

    Advanced Dependency Modeling in Credit Risk - Lessons for Loss Given Default, Lifetime Expected Loss and Bank Capital Requirements

    This cumulative thesis contributes to the literature on credit risk modeling and focuses on comovements of risk parameters that intensify losses during recessions. The models provide more precise estimates of credit risk and a better understanding of systematic risk. This can improve risk-based capital reserves and can help to avoid a severe underestimation of risk and capital shortfalls in economic downturn periods. Furthermore, the discussion of regulatory requirements and the supervision of internal risk models can benefit from empirical results. The first study extends the scope of loss given default (LGD) modeling by proposing quantile regression to separately regress each quantile of the distribution. This approach enables a new look at covariate and particularly downturn effects that vary over quantiles. The second study analyzes the length of workout processes by a Cox proportional hazards model. Systematic effects are examined by the inclusion of time-varying frailties. The third study presents a copula model for the lifetime expected loss that combines accelerated failure time models for the default time with a beta regression of the LGD. The use of copulas provides continuous-time LGD forecasts and flexible dependence structures between default risk and loss severity. The fourth study combines a Probit model for the probability of default and a fractional response model for the LGD to demonstrate the impact of revised loan loss provisioning on bank capital requirements. In addition, goodness-of-fit measures enable validation of these approaches. Simulation studies and analyses of representative portfolios provide implications and demonstrate the significance of empirical results.