4 research outputs found

    Quantile regression models for current status data

    Current status data arise frequently in demography, epidemiology, and econometrics, where the exact failure time cannot be determined but is only known to have occurred before or after a known observation time. We propose a quantile regression model to analyze current status data, because it does not require distributional assumptions and the coefficients can be interpreted as direct regression effects on the distribution of failure time in the original time scale. Our model assumes that the conditional quantile of failure time is a linear function of covariates. We assume conditional independence between the failure time and observation time. We develop an M-estimator for the parameters, compute it using the concave-convex procedure, and construct its confidence intervals using a subsampling method. Asymptotic properties of the estimator are derived and proven using modern empirical process theory. The small sample performance of the proposed method is demonstrated via simulation studies. Finally, we apply the proposed method to analyze data from the Mayo Clinic Study of Aging.
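The idea of recovering a quantile from current status data can be illustrated with a minimal sketch. It makes strong simplifying assumptions that are not taken from the abstract: there are no covariates (intercept only), the objective M(q) = sum over {i : C_i >= q} of (delta_i - tau) is a maximum-score-style criterion chosen for this illustration rather than the authors' M-estimator, and the simulation design (exponential failure times, uniform observation times) is hypothetical. The criterion works because E[delta - tau | C] = F(C) - tau, which is non-negative exactly when C lies at or above the tau-quantile.

```python
import math
import random


def current_status_quantile(obs_times, indicators, tau):
    """Intercept-only estimate of the tau-quantile of T.

    obs_times[i] is the observation time C_i and indicators[i] is
    delta_i = 1 if T_i <= C_i else 0. The objective (an assumption of
    this sketch) is M(q) = sum over {i : C_i >= q} of (delta_i - tau).
    """
    pairs = sorted(zip(obs_times, indicators))
    best_q, best_value, suffix = pairs[-1][0], float("-inf"), 0.0
    # Walk from the largest C_i down, so `suffix` equals M(C_i) at each step.
    for c, delta in reversed(pairs):
        suffix += delta - tau
        if suffix > best_value:
            best_value, best_q = suffix, c
    return best_q


def simulate_current_status(n, seed=0):
    """Hypothetical design: T ~ Exp(1), C ~ Uniform(0, 3), independent."""
    rng = random.Random(seed)
    obs_times, indicators = [], []
    for _ in range(n):
        t = rng.expovariate(1.0)   # latent failure time, never observed
        c = rng.uniform(0.0, 3.0)  # observation time
        obs_times.append(c)
        indicators.append(1 if t <= c else 0)
    return obs_times, indicators


obs_times, indicators = simulate_current_status(20000)
median_hat = current_status_quantile(obs_times, indicators, tau=0.5)
print(f"estimated median {median_hat:.3f}, true median {math.log(2):.3f}")
```

With covariates, the threshold q would be replaced by a linear predictor, and the resulting non-smooth objective is the kind of problem for which a procedure such as the concave-convex one mentioned in the abstract would be needed; that step is not attempted here.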

    On Rank-Based Inference for Quantile Regression

    Quantile regression is a useful tool for testing the possible effect of covariates, especially when the effect is heterogeneous. Classical methods designed to test the effect at one quantile level can be sensitive to the quantile level choice. In this dissertation, we propose a regional quantile regression rank test as a generalization of the rank test at an individual quantile level. The proposed test statistic allows us to detect the treatment effect for a prespecified quantile interval by integrating the regression rank scores over the quantile region of interest. A new model-based bootstrap method is constructed to estimate the null distribution of the test statistic. A simulation study is conducted to demonstrate the validity and usefulness of the proposed test. We also illustrate the power of the proposed test using sub-samples from the 2016 US birth weight data. We then generalize the regional quantile regression rank test to censored quantile regression settings. We propose a censored version of the regression rank score using the redistribution of the probability mass for each censored observation. The model-based bootstrap algorithm is also generalized to implement the test. We illustrate the advantage of the proposed method through simulation and apply our method to study how the early environment condition influences the survival time of the bighorn sheep. In a related study, we consider the genome-wide association study where the goal is to select genes that are associated with an outcome of interest. One major challenge for the genome-wide association study is how to handle the possible interactions between the genes and the environment. We propose a two-stage model, including one that relies on the conditional quantile levels of the outcome variables, to allow the genes to have comprehensive interactions with the environment. 
We use the two-stage model to study a lung cancer data set to identify new genes that can potentially influence lung cancer patients' survival time.
PhD, Statistics, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/163260/1/yuansun_1.pd
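To make "integrating the regression rank scores over the quantile region" concrete, here is a small sketch restricted to the simplest case with no covariates, where the rank score of an observation with ascending rank R among n is known in closed form: it equals 1 for tau <= (R-1)/n, R - n*tau for (R-1)/n < tau <= R/n, and 0 afterwards. The function names are hypothetical, and how the dissertation turns the integrated scores into a test statistic is not reproduced here.

```python
def integrated_rank_score(rank, n, tau_lo, tau_hi):
    """Integral of the location-model rank score over [tau_lo, tau_hi]."""

    def cumulative(t):
        # Integral of the score from 0 to t; the score is 1, then falls
        # linearly from 1 to 0 on ((rank - 1) / n, rank / n], then is 0.
        left, right = (rank - 1) / n, rank / n
        if t <= left:
            return t
        if t >= right:
            return (rank - 0.5) / n
        return left + rank * (t - left) - 0.5 * n * (t * t - left * left)

    return cumulative(tau_hi) - cumulative(tau_lo)


def regional_scores(values, tau_lo, tau_hi):
    """Integrated rank score for each observation (ties not handled)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for position, i in enumerate(order):
        scores[i] = integrated_rank_score(position + 1, n, tau_lo, tau_hi)
    return scores


values = [2.3, 0.1, 5.0, 1.7, 3.3]
scores = regional_scores(values, 0.25, 0.75)
print([round(s, 5) for s in scores])
```

Observations above the region receive the full weight tau_hi - tau_lo, observations below it receive zero, and only those whose rank falls inside the region receive intermediate values, which is why such scores focus a test on the chosen quantile interval.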

    Three essays in econometrics: heterogeneity across outcomes, time and physicians

    This thesis consists of three essays in econometrics. The first part of the thesis, consisting of chapters one and two, treats novel estimators aiming at distributional effects. In the first chapter, I introduce a methodology to estimate the joint distribution of multiple outcome variables. The second chapter incorporates censoring, a prevalent challenge when analyzing duration data, into distribution regression methods. Finally, the third chapter analyzes a reform in the health care sector. Chapter 1 introduces Multivariate Distribution Regression (MDR), a semi-parametric approach to model the joint distribution of several outcome variables. Typically, researchers are interested in the effects on multiple outcomes when the latter are correlated (see Patton, 2012, for an overview). For instance, this is the case for the impact of a crisis on asset prices. Asset prices co-move tightly as they depend on common factors such as market cycles. A shock to one price index may thus affect many other indices. In addition, the effect could vary across the distribution of the prices, a peculiarity that MDR accounts for. Essentially, MDR estimates the impact of interest at every point of the outcome’s distribution. MDR’s most obvious advantage is its flexibility. Existing methods, such as copula models, typically impose a parametric form on the dependence structure across outcomes (e.g., Klein et al., 2019). In contrast, MDR does not require equally restrictive parametric assumptions. Thus, the effects estimated using MDR describe the underlying mechanisms more accurately. Further, MDR generalizes two well-known estimators: (i) the multivariate empirical CDF by allowing for covariates and (ii) univariate Distribution Regression (DR) by considering multiple outcomes. Building on earlier work in the field (Chernozhukov et al., 2013), I establish that MDR consistently estimates the regression coefficient process. 
Further, I show that coefficients are well-behaved and converge to a Gaussian process, with the bootstrap being a consistent tool to assess the asymptotic distribution. To illustrate the usefulness of MDR, I estimate the effect of disability insurance benefits on labor supply responses among Swiss households. Generally, receiving these benefits is related to lower incentives to supply labor (e.g., Autor et al., 2016). Autor et al. (2019) find that spouses increase their labor supply once their partner is disabled. My results indicate that spouses of low-income partners do respond as suggested by Autor et al. (2019). Yet, among average to high-income households, the need to compensate for the financial loss appears less immediate. In Chapter 2, co-authored with Blaise Melly, we incorporate censoring into the univariate DR model. The resulting estimator, censored distribution regression (CDR), allows studying how the covariates’ effects vary over time. From a theoretical perspective, CDR represents a generalization of three existing estimators. In particular, CDR simplifies (i) to the Kaplan-Meier estimator in the absence of covariates (Kaplan and Meier, 1958), (ii) to distribution regression in the absence of censoring, and (iii) to Cox’s proportional hazards estimator in the absence of heterogeneity (Cox, 1972). As our main result, we establish weak convergence of the coefficient process to a Gaussian process. The standard tool to analyze duration data is Cox’s proportional hazards model, which assumes time-constant effects. On many occasions, this assumption seems too restrictive. For instance, job search behavior differs during unemployment. In this context, we apply the CDR estimator to estimate the effect of potential benefit duration (PBD) on unemployment spells. 
Search models suggest that faced with the upcoming exhaustion of benefits, individuals intensify their search efforts and lower their target wages (Krueger and Mueller, 2016; Marinescu and Skandalis, 2021). Our results indicate that PBD has a negligible effect for the short-term unemployed but a strong and significant effect for the long-term unemployed. This is in line with an increased likelihood of finding a job once the benefits are close to exhaustion. In Chapter 3, co-authored with Tamara Bischof, we address how physicians respond to changes in their financial incentives. We exploit plausibly exogenous changes in the fee structure for medical services in the outpatient sector. The tariff partners, the health care providers and insurers, failed to reach an agreement on how to reform the outdated tariff scheme TARMED. In response, the federal government set the new fees, causing a revenue loss of up to 40% for single physicians. Previous research suggests that physicians may respond in two different ways: faced with a revenue loss, physicians can (i) substitute from low-paying to more attractive services and (ii) increase their overall health care supply (e.g., Clemens and Gottlieb, 2014; McGuire and Pauly, 1991; Yip, 1998). Our main goal is to disentangle these two channels and to quantify their relative importance. Our results are threefold. (i) We find that providers raise (lower) the volume of services that have become relatively more (less) attractive. (ii) Physicians increase their overall volume of services and treat more patients when they lose a significant share of their revenue. (iii) Finally, a comparative exercise indicates that volume expansions are far more important than substitution responses. In particular, a revenue loss of 5% leads to an increase in the overall supply of roughly 3% whereas we do not observe a significant rise in substitution responses. 
Concerning policy implications, our results suggest (i) that gradual fee changes may prevent the strong and costly reactions triggered by larger revenue losses. (ii) Further, policy-makers could directly incentivize physicians to provide services that are of high value for consumers.
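Chapter 2 of the thesis above notes that CDR reduces to the Kaplan-Meier estimator when there are no covariates. As a reference point for that special case only, the following is a minimal sketch of the standard product-limit estimator; the function name and the toy data are hypothetical, and nothing here implements CDR itself.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times[i] is the observed duration and events[i] is 1 if the spell
    ended (failure observed) or 0 if it was right-censored.
    Returns a list of (t, S(t)) at each distinct failure time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        failures = 0
        leaving = 0
        # Group all observations tied at time t.
        while i < len(data) and data[i][0] == t:
            failures += data[i][1]
            leaving += 1
            i += 1
        if failures > 0:
            survival *= 1.0 - failures / at_risk
            curve.append((t, survival))
        # Both failures and censored spells leave the risk set after t.
        at_risk -= leaving
    return curve


toy_curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
for t, s in toy_curve:
    print(t, round(s, 3))
```

When every event indicator is 1, the curve coincides with one minus the empirical CDF, which mirrors the thesis's remark that distribution regression is recovered in the absence of censoring.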

    QUANTILE REGRESSION MODELS FOR INTERVAL-CENSORED FAILURE TIME DATA

    Quantile regression models the conditional quantile as a function of independent variables, providing a complete picture of the association between the response and predictors. Quantile regression can describe the association at different quantiles, yielding more information than the least squares method, which only detects associations with the conditional mean. Quantile regression models have gained popularity in many disciplines including medicine, finance, economics, and ecology as they can accommodate heteroscedasticity. Interval-censored data are a specific type of failure time data in which the failure time is only known to have occurred between certain observation times. Such data appear commonly in medical or longitudinal studies because disease onset is known to have occurred between scheduled visits but the exact time is unknown. Quantile regression has been extended to survival analysis with random censoring time. Most methods focus on survival analysis with right-censored data while a few were developed for data with other censoring mechanisms. Although censored quantile regression has developed rapidly, limited work has been done to handle interval-censored failure time data under the quantile regression framework. In this dissertation, we developed a new method to analyze interval-censored failure time data using conditional quantile regression models. Our method can handle both Case I and Case II interval-censored data and allows the censoring time to depend on covariates. We developed an estimation procedure that is computationally efficient and easy to implement, with inference performed using a subsampling method. The consistency and asymptotic distribution of the resulting estimators were established using modern empirical process theory. The developed method was extended as a computational tool to analyze interval-censored data for accelerated failure time models. 
The estimators from different quantiles were combined to increase the efficiency of the estimators. The small sample performances were demonstrated via simulation studies. The proposed methods were illustrated with current status datasets (data from the Voluntary HIV-1 Counseling and Testing Efficacy Study Group and a calcification study) and Case II interval-censored datasets (data from the Atherosclerosis Risk in Communities Study and breast cosmesis data). Doctor of Philosoph
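The difference between Case I (current status) and the more general interval-censored form can be shown with a short data-construction sketch. It assumes a hypothetical helper that takes a latent failure time and a list of inspection times and returns the half-open interval (left, right] known to contain the failure; a single inspection gives current status data, while several inspections give the bracketing interval usually associated with Case II. The helper name and the example values are illustrative, not from the dissertation.

```python
import math


def censoring_interval(failure_time, visit_times):
    """Return (left, right] bracketing failure_time given inspection times.

    left is the last visit strictly before the failure (0.0 if none) and
    right is the first visit at or after it (infinity if none).
    """
    left, right = 0.0, math.inf
    for visit in sorted(visit_times):
        if visit < failure_time:
            left = visit
        else:
            right = visit
            break
    return left, right


# Case I (current status): one inspection, so the interval is (0, C] or (C, inf).
print(censoring_interval(0.5, [2.0]))   # failure occurred before the visit
print(censoring_interval(2.5, [2.0]))   # failure had not occurred by the visit

# Several scheduled visits: failure is bracketed by consecutive visits.
print(censoring_interval(2.5, [1.0, 2.0, 3.0, 4.0]))
```

In either case the analyst sees only the interval, never the failure time itself, which is why estimation has to work through indicators such as whether the failure preceded each inspection.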