    Optimal-k difference sequence in nonparametric regression

    Difference-based methods have attracted increasing attention in nonparametric regression, in particular for estimating the residual variance. To implement the estimation, one needs to choose an appropriate difference sequence, mainly between the optimal difference sequence and the ordinary difference sequence. Difference sequence selection is a fundamental problem in nonparametric regression, and it has remained a controversial issue for over three decades. In this paper, we propose to tackle this challenging issue from a new perspective, namely by introducing a new difference sequence called the optimal-k difference sequence. The new difference sequence not only provides a better bias-variance trade-off, but also dramatically enlarges the existing family of difference sequences, which includes the optimal and ordinary difference sequences as two important special cases. We further demonstrate, by both theoretical and numerical studies, that the optimal-k difference sequence pushes the boundaries of our knowledge of difference-based methods in nonparametric regression, and that it always performs the best in practical situations.
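
    As a rough illustration of what a difference-based variance estimator looks like, the sketch below implements the classical form: apply a difference sequence (weights that sum to zero, with squares summing to one) along the ordered responses and average the squared results. It uses only the ordinary difference sequences of order 1 and 2. The optimal-k sequence itself is not specified in the abstract and is not reproduced here. The function and constant names are hypothetical, and an equally spaced design is assumed.

```python
import math
import random


def difference_variance(y, d):
    """Classical difference-based estimate of the residual variance.

    y: responses ordered by an equally spaced design (assumption).
    d: difference sequence with sum(d) == 0 and sum(d_j^2) == 1.
    """
    if abs(sum(d)) > 1e-9 or abs(sum(w * w for w in d) - 1.0) > 1e-9:
        raise ValueError("d must sum to 0 and have squared weights summing to 1")
    r = len(d) - 1  # order of the difference sequence
    n = len(y)
    if n <= r:
        raise ValueError("need more observations than the sequence order")
    total = 0.0
    for i in range(n - r):
        # Weighted difference of r + 1 neighbouring responses.
        diff = sum(d[j] * y[i + j] for j in range(r + 1))
        total += diff * diff
    return total / (n - r)


# Ordinary difference sequences: normalised binomial weights.
ORDINARY_1 = (1 / math.sqrt(2), -1 / math.sqrt(2))
ORDINARY_2 = tuple(w / math.sqrt(6) for w in (1, -2, 1))


def simulate(n=2000, sigma=0.5, seed=0):
    """Hypothetical test data: smooth mean function plus Gaussian noise."""
    rng = random.Random(seed)
    return [math.sin(2 * math.pi * i / n) + rng.gauss(0, sigma) for i in range(n)]


if __name__ == "__main__":
    y = simulate()
    print(difference_variance(y, ORDINARY_1))  # should be near sigma^2 = 0.25
    print(difference_variance(y, ORDINARY_2))
```

    Higher-order sequences remove more of the smooth mean function from each difference but average fewer independent terms, which is the bias-variance tension the abstract refers to.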

    Propensity score regression for causal inference with treatment heterogeneity

    Understanding how treatment effects vary with individual characteristics is critical in the contexts of personalized medicine, personalized advertising and policy design. When the characteristics of practical interest are only a subset of the full covariates, non-parametric estimation is often desirable, but few methods are available because of the computational difficulty. Existing non-parametric methods such as inverse probability weighting have limitations that hinder their use in many practical settings where the values of the propensity scores are close to 0 or 1. We propose propensity score regression (PSR), which allows non-parametric estimation of heterogeneous treatment effects in a wide range of contexts. PSR performs two non-parametric regressions in turn: it first regresses on the propensity scores together with the characteristics of interest to obtain an intermediate estimate, and then regresses the intermediate estimates on the characteristics of interest only. By including the propensity scores as regressors in a non-parametric manner, PSR substantially eases the computational difficulty while remaining (locally) insensitive to any value of the propensity scores. We present several appealing properties of PSR, including consistency and asymptotic normality, and in particular the existence of an explicit variance estimator, from which the analytical behaviour of PSR and its precision can be assessed. Simulation studies indicate that PSR outperforms existing methods in varying settings with extreme values of propensity scores. We apply our method to the national 2009 flu survey (NHFS) data to investigate the effects of seasonal influenza vaccination and having paid sick leave across different age groups.

    Globalization? Trade war? A counterbalance perspective

    The embrace of globalization and protectionism among economies has ebbed and flowed over the past few decades. These fluctuations call for quantitative analytics to help countries improve their trade policies. Changing attitudes about globalization also imply that the best trade policies may vary over time and be country-specific. We argue that the imports and exports of all economies constitute a counterbalanced network where conflict and cooperation are two sides of the same coin. Quantitative competitiveness is then formulated for each country using a network counterbalance equilibrium. A country could improve its relative strength in the network by embracing globalization, protectionism, trade collaboration, or conflict. This paper presents the necessary conditions for globalization and trade wars, evaluates their side effects, derives national bargaining powers, identifies appropriate targets for conflict or collaboration, and recommends fair resolutions for trade conflicts. Data and events from the past twenty years support these conditions.

    Machine Learning Approaches to Predict Risks of Diabetic Complications and Poor Glycemic Control in Nonadherent Type 2 Diabetes

    Purpose: The objective of this study was to evaluate the efficacy of machine learning algorithms in predicting risks of complications and poor glycemic control in nonadherent type 2 diabetes (T2D). Materials and Methods: This study was a real-world study of the complications and blood glucose prognosis of nonadherent T2D patients. Data of inpatients in Sichuan Provincial People’s Hospital from January 2010 to December 2015 were collected. The T2D patients who had neither been monitored for glycosylated hemoglobin A1c nor had changed their hyperglycemia treatment regimens within the last 12 months were the subjects of this study. Seven types of machine learning algorithms were used to develop 18 prediction models. The predictive performance was mainly assessed using the area under the curve of the testing set. Results: Of 800 T2D patients, 165 (20.6%) met the inclusion criteria, of which 129 (78.2%) had poor glycemic control (defined as glycosylated hemoglobin A1c ≥7%). The highest areas under the curve of the testing set for diabetic nephropathy, diabetic peripheral neuropathy, diabetic angiopathy, diabetic eye disease, and glycosylated hemoglobin A1c were 0.902 ± 0.040, 0.859 ± 0.050, 0.889 ± 0.059, 0.832 ± 0.086, and 0.825 ± 0.092, respectively. Conclusion: Both univariate analysis and machine learning methods reached the same conclusion. The duration of T2D and the duration of unadjusted hypoglycemic treatment were the key risk factors of diabetic complications, and the number of hypoglycemic drugs was the key risk factor of glycemic control of nonadherent T2D. This was the first study to use machine learning algorithms to explore the potential adverse outcomes of nonadherent T2D. The performances of the final prediction models we developed were acceptable; our models outperformed most previous studies on most evaluation measures. Those models have potential clinical applicability in improving T2D care.
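
    The performance measure named in this abstract, the area under the ROC curve on a held-out testing set, can be computed directly from labels and predicted scores. The sketch below uses the pairwise (Mann-Whitney) form of the AUC: the fraction of positive-negative pairs in which the positive case receives the higher score, with ties counted as one half. The function name and the toy data are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: iterable of 0/1 outcomes (1 = event, e.g. a complication).
    scores: predicted risks, higher meaning more likely to be 1.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # positive case ranked above negative case
            elif p == q:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy testing set: two negatives, two positives.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

    A value reported as a mean with a spread, as in the abstract, would come from repeating this calculation over several train/test splits; that resampling step is omitted here.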

    LLM-Mini-CEX: Automatic Evaluation of Large Language Model for Diagnostic Conversation

    There is increasing interest in developing LLMs for medical diagnosis to improve diagnostic efficiency. Despite their alluring technological potential, there is no unified and comprehensive evaluation criterion, which makes it impossible to evaluate the quality and potential risks of medical LLMs and further hinders the application of LLMs in medical treatment scenarios. In addition, current evaluations rely heavily on labor-intensive interactions with LLMs to obtain diagnostic dialogues and on human evaluation of the quality of those dialogues. To tackle the lack of a unified and comprehensive evaluation criterion, we first establish an evaluation criterion, termed LLM-specific Mini-CEX, to assess the diagnostic capabilities of LLMs effectively, based on the original Mini-CEX. To address the labor-intensive interaction problem, we develop a patient simulator to engage in automatic conversations with LLMs, and utilize ChatGPT to evaluate diagnostic dialogues automatically. Experimental results show that the LLM-specific Mini-CEX is adequate and necessary for evaluating medical diagnostic dialogue. Moreover, ChatGPT can replace manual evaluation on the metrics of humanistic qualities and provides reproducible and automated comparisons between different LLMs.

    Efficient estimation of nonparametric genetic risk function with censored data

    With an increasing number of causal genes discovered for complex human disorders, it is crucial to assess the genetic risk of disease onset for individuals who are carriers of these causal mutations and compare the distribution of age-at-onset with that in non-carriers. In many genetic epidemiological studies aiming at estimating causal gene effect on disease, the age-at-onset of disease is subject to censoring. In addition, some individuals’ mutation carrier or non-carrier status can be unknown due to the high cost of in-person ascertainment to collect DNA samples or death in older individuals. Instead, the probability of these individuals’ mutation status can be obtained from various sources. When mutation status is missing, the available data take the form of censored mixture data. Recently, various methods have been proposed for risk estimation from such data, but none is efficient for estimating a nonparametric distribution. We propose a fully efficient sieve maximum likelihood estimation method, in which we estimate the logarithm of the hazard ratio between genetic mutation groups using B-splines, while applying nonparametric maximum likelihood estimation for the reference baseline hazard function. Our estimator can be calculated via an expectation-maximization algorithm which is much faster than existing methods. We show that our estimator is consistent and semiparametrically efficient and establish its asymptotic distribution. Simulation studies demonstrate superior performance of the proposed method, which is applied to the estimation of the distribution of the age-at-onset of Parkinson's disease for carriers of mutations in the leucine-rich repeat kinase 2 gene.

    Tapered Optical Fiber Sensor for Label-Free Detection of Biomolecules

    This paper presents a fast, highly sensitive and low-cost tapered optical fiber biosensor that enables the label-free detection of biomolecules. The sensor takes advantage of the interference effect between the fiber’s first two propagation modes along the taper waist region. The biomolecules bonded on the taper surface were determined by demodulating the transmission spectrum phase shift. Because of the sharp spectrum fringe signals, as well as a relatively long biomolecule testing region, the sensor displayed a fast response and was highly sensitive. To better understand the influence of various biomolecules on the sensor, a numerical simulation that varied biolayer parameters such as thickness and refractive index was performed. The results showed that the spectrum fringe shift was large enough to be measured even when the biolayer was only nanometers thick. A microchannel chip was designed and fabricated for the protection of the sensor and for biotesting. Microelectromechanical systems (MEMS) fabrication techniques were used to precisely control the profile and depth of the microchannel on the silicon chip with an accuracy of 2 μm. A tapered optical fiber biosensor was fabricated and evaluated with an immunoglobulin G (IgG) antibody-antigen pair.

    Censored quantile regression via Box-Cox transformation under conditional independence

    We propose a new quantile regression model for data that are subject to censoring. Our model does not require any global linearity assumption, or independence of the covariates and the censoring time. We develop a class of power-transformed quantile regression models such that the transformed survival time can be better characterized by linear regression quantiles. Consistency and asymptotic normality of the resulting estimators are shown. A resampling-based approach is proposed for statistical inference. Empirically, the new estimator is shown to outperform its competitors under conditional independence, and to perform similarly under unconditional independence. The proposed method is illustrated with a data analysis.
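
    For orientation, the standard Box-Cox power transformation named in the title, and the kind of linear quantile model it leads to, can be written as below. The notation (survival time T, covariates X, quantile level tau, transformation parameter lambda) is assumed for illustration; the abstract does not state the exact model form used in the paper.

```latex
% Standard Box-Cox power transformation of a positive survival time t
h_\lambda(t) =
\begin{cases}
  \dfrac{t^{\lambda} - 1}{\lambda}, & \lambda \neq 0, \\[6pt]
  \log t, & \lambda = 0.
\end{cases}

% Assumed illustrative model: the tau-th conditional quantile of the
% transformed time is linear in the covariates
Q_\tau\{ h_\lambda(T) \mid X \} = X^{\top} \beta(\tau).

% Because h_lambda is increasing, quantiles are preserved under the
% transformation, so the quantile on the original scale is
Q_\tau( T \mid X ) = h_\lambda^{-1}\{ X^{\top} \beta(\tau) \}.
```

    The last line relies only on the equivariance of quantiles under monotone transformations, which is why a power transformation can restore linearity without changing which quantile is being modelled.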