
    Doubly-Robust Off-Policy Evaluation with Estimated Logging Policy

    We introduce a novel doubly-robust (DR) off-policy evaluation (OPE) estimator for Markov decision processes, DRUnknown, designed for situations where both the logging policy and the value function are unknown. The proposed estimator first estimates the logging policy and then estimates the value function model by minimizing the asymptotic variance of the estimator while accounting for the estimating effect of the logging policy. When the logging policy model is correctly specified, DRUnknown achieves the smallest asymptotic variance within the class containing existing OPE estimators. When the value function model is also correctly specified, DRUnknown is optimal, as its asymptotic variance reaches the semiparametric lower bound. We present experimental results conducted in contextual bandits and reinforcement learning to compare the performance of DRUnknown with that of existing methods.
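
    The core DR construction can be sketched for the contextual-bandit case: combine a fitted value model with an importance-weighted correction that uses an estimated (not known) logging policy. The synthetic data, the frequency-based propensity fit, and the plug-in value model below are illustrative assumptions, not the variance-minimizing choices of DRUnknown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic logged bandit data: 2 actions, binary context.
n = 5000
x = rng.integers(0, 2, size=n)           # contexts
b = np.where(x == 0, 0.7, 0.3)           # true logging P(a=1|x) (unknown to us)
a = rng.binomial(1, b)                   # logged actions
r = rng.normal(1.0 * a * x, 1.0)         # rewards, mean a*x

# Step 1: estimate the logging policy by per-context frequencies
# (a stand-in for the maximum-likelihood fit described in the abstract).
b_hat = np.array([a[x == c].mean() for c in (0, 1)])[x]

# Step 2: a simple plug-in value model q_hat(x, a) from sample means
# (the paper instead picks the value model to minimize asymptotic variance).
q_hat = np.zeros((2, 2))
for c in (0, 1):
    for act in (0, 1):
        m = (x == c) & (a == act)
        q_hat[c, act] = r[m].mean()

# Doubly-robust estimate of the value of the target policy "always play a=1":
# direct model term plus an importance-weighted residual correction.
direct = q_hat[x, 1]
w = (a == 1) / b_hat
dr = direct + w * (r - q_hat[x, a])
print(round(dr.mean(), 3))               # true value is 0.5 here
```

    When either the propensity model or the value model is roughly correct, the residual term cancels the bias of the other, which is the doubly-robust property the abstract builds on.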

    Squeeze All: Novel Estimator and Self-Normalized Bound for Linear Contextual Bandits

    We propose a linear contextual bandit algorithm with an O(\sqrt{dT\log T}) regret bound, where d is the dimension of contexts and T is the time horizon. Our proposed algorithm is equipped with a novel estimator in which exploration is embedded through explicit randomization. Depending on the randomization, our proposed estimator takes contributions either from the contexts of all arms or from selected contexts. We establish a self-normalized bound for our estimator, which allows a novel decomposition of the cumulative regret into additive dimension-dependent terms instead of multiplicative terms. We also prove a novel lower bound of \Omega(\sqrt{dT}) under our problem setting. Hence, the regret of our proposed algorithm matches the lower bound up to logarithmic factors. The numerical experiments support the theoretical guarantees and show that our proposed method outperforms existing linear bandit algorithms. Comment: Accepted in Artificial Intelligence and Statistics 202
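
    As a rough illustration of a linear contextual bandit loop with exploration embedded through explicit randomization, the toy below mixes greedy ridge-regression play with occasional uniform draws. The randomization scheme and estimator here are generic stand-ins, not the paper's estimator or its self-normalized bound.

```python
import numpy as np

rng = np.random.default_rng(1)

d, K, T = 5, 10, 2000
theta = rng.normal(size=d)
theta /= np.linalg.norm(theta)           # unknown true parameter

A = np.eye(d)                            # ridge-regularized Gram matrix
bvec = np.zeros(d)
regret = 0.0
for t in range(T):
    X = rng.normal(size=(K, d)) / np.sqrt(d)   # contexts for K arms
    theta_hat = np.linalg.solve(A, bvec)       # ridge estimate
    # Explicit randomization (illustrative, in place of the paper's scheme):
    # with small probability explore uniformly, otherwise act greedily.
    if rng.random() < 0.05:
        k = int(rng.integers(K))
    else:
        k = int(np.argmax(X @ theta_hat))
    x_t = X[k]
    r = x_t @ theta + 0.1 * rng.normal()
    A += np.outer(x_t, x_t)
    bvec += r * x_t
    regret += (X @ theta).max() - x_t @ theta
print(round(regret, 1))                  # grows sublinearly in T in practice
```

    The cumulative regret here stays a small fraction of T; the paper's contribution is an estimator and bound that make such behavior provable at the O(\sqrt{dT\log T}) rate.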

    Kernel-convoluted Deep Neural Networks with Data Augmentation

    The Mixup method (Zhang et al. 2018), which uses linearly interpolated data, has emerged as an effective data augmentation tool to improve generalization performance and robustness to adversarial examples. The motivation is to curtail undesirable oscillations through its implicit model constraint to behave linearly at in-between observed data points, promoting smoothness. In this work, we formally investigate this premise, propose a way to explicitly impose smoothness constraints, and extend it to incorporate implicit model constraints. First, we derive a new function class composed of kernel-convoluted models (KCM), where the smoothness constraint is directly imposed by locally averaging the original functions with a kernel function. Second, we propose to incorporate the Mixup method into KCM to expand the domains of smoothness. For both KCM and KCM combined with the Mixup, we provide risk analyses under some conditions on the kernels. We show that the upper bound of the excess risk is not slower than that of the original function class. The upper bound of the KCM with the Mixup remains dominated by that of the KCM if the perturbation of the Mixup vanishes faster than O(n^{-1/2}), where n is the sample size. Using the CIFAR-10 and CIFAR-100 datasets, our experiments demonstrate that the KCM with the Mixup outperforms the Mixup method in terms of generalization and robustness to adversarial examples.
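
    The kernel-convolution step can be sketched directly: replace a base model f with its local average under a kernel, here approximated by Monte Carlo with a Gaussian kernel. The base function, bandwidth, and sample count below are illustrative choices, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(2)

def f(x):
    # A wiggly base model standing in for a trained network.
    return np.sin(5 * x) + 0.5 * x

def kcm(x, h=0.2, m=500):
    # Kernel-convoluted model: locally average f with a Gaussian
    # kernel of bandwidth h (Monte Carlo approximation of the integral).
    noise = rng.normal(scale=h, size=(m,) + np.shape(x))
    return f(x + noise).mean(axis=0)

xs = np.linspace(-1, 1, 5)
print(np.round(f(xs), 3))
print(np.round(kcm(xs), 3))              # damped oscillations, same trend

# Mixup-style interpolated input for training the smoothed model (illustrative):
lam = rng.beta(1.0, 1.0)
x_mix = lam * xs[0] + (1 - lam) * xs[-1]
```

    Averaging with the kernel shrinks the oscillatory component of f while leaving the linear trend intact, which is the explicit smoothness constraint the abstract describes; Mixup then supplies interpolated points on which that smoothness is enforced.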

    A robust calibration-assisted method for linear mixed effects model under cluster-specific nonignorable missingness

    We propose a method for linear mixed effects models when the covariates are completely observed but the outcome of interest is subject to missingness under cluster-specific nonignorable (CSNI) missingness. Our strategy is to replace missing quantities in the full-data objective function with unbiased predictors derived from inverse probability weighting and a calibration technique. The proposed approach can be applied to estimating equations or likelihood functions with a modified E-step, and does not require numerical integration as do previous methods. Unlike usual inverse probability weighting, the proposed method does not require correct specification of the response model as long as the CSNI assumption is correct, and renders inference under CSNI without a full distributional assumption. Consistency and asymptotic normality are shown with a consistent variance estimator. Simulation results and a data example are presented.
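
    The inverse-probability-weighting ingredient can be illustrated on a toy missing-outcome problem: responders are reweighted by the inverse of their response probability so the weighted sample again represents the full population. The logistic response model and Hajek-type weighted mean below are generic stand-ins, not the paper's calibration-assisted predictors for mixed models.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy data: outcome y depends on an always-observed covariate x;
# the response indicator delta depends on x, so responders are not
# representative of the full sample.
n = 20000
x = rng.normal(size=n)
y = 2.0 + 1.5 * x + rng.normal(size=n)
p = 1 / (1 + np.exp(-(0.5 + x)))         # response propensity (known here)
delta = rng.binomial(1, p)

naive = y[delta == 1].mean()             # biased: responders have larger x
ipw = np.sum(delta * y / p) / np.sum(delta / p)   # Hajek-type IPW mean
print(round(naive, 3), round(ipw, 3))    # true mean is 2.0
```

    The calibration step in the paper further constrains such weights so that weighted covariate totals match the full sample, which is what removes the dependence on a correctly specified response model under CSNI.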

    Wasserstein Geodesic Generator for Conditional Distributions

    Generating samples given a specific label requires estimating conditional distributions. We derive a tractable upper bound of the Wasserstein distance between conditional distributions to lay the theoretical groundwork for learning conditional distributions. Based on this result, we propose a novel conditional generation algorithm where conditional distributions are fully characterized by a metric space defined by a statistical distance. We employ optimal transport theory to propose the Wasserstein geodesic generator, a new conditional generator that learns the Wasserstein geodesic. The proposed method learns both the conditional distributions for observed domains and the optimal transport maps between them. The conditional distributions given unobserved intermediate domains lie on the Wasserstein geodesic between the conditional distributions given two observed domain labels. Experiments on face images with lighting conditions as domain labels demonstrate the efficacy of the proposed method.
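
    For one-dimensional Gaussian conditionals the 2-Wasserstein geodesic is available in closed form, which makes the interpolation idea concrete: along the geodesic, means and standard deviations interpolate linearly. The endpoint distributions below are illustrative stand-ins for conditional distributions at two observed domain labels.

```python
# Closed-form W2 geodesic between two 1-D Gaussians: the interpolant
# at time t is again Gaussian, with linearly interpolated mean and std.
m0, s0 = 0.0, 1.0    # conditional distribution at domain label 0
m1, s1 = 4.0, 2.0    # conditional distribution at domain label 1

def geodesic(t):
    return (1 - t) * m0 + t * m1, (1 - t) * s0 + t * s1

def w2sq(mA, sA, mB, sB):
    # Squared 2-Wasserstein distance between 1-D Gaussians.
    return (mA - mB) ** 2 + (sA - sB) ** 2

m_half, s_half = geodesic(0.5)
print(m_half, s_half)
# Geodesic property: the midpoint is W2-equidistant from both endpoints,
# and the two half-distances add up to the endpoint-to-endpoint distance.
print(w2sq(m0, s0, m_half, s_half), w2sq(m_half, s_half, m1, s1))
```

    The "unobserved intermediate domain" of the abstract corresponds to the distribution at an intermediate t on this path; the paper learns the analogous geodesic between high-dimensional conditional distributions via optimal transport maps.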

    Lipschitz Continuous Autoencoders in Application to Anomaly Detection

    Anomaly detection is the task of finding abnormal data that are distinct from normal behavior. Current deep learning-based anomaly detection methods train neural networks with normal data alone and calculate anomaly scores based on the trained model. In this work, we formalize current practices, build a theoretical framework of anomaly detection algorithms equipped with an objective function and a hypothesis space, and establish a desirable property of the anomaly detection algorithm, namely, admissibility. Admissibility implies that optimal autoencoders for normal data yield a larger reconstruction error for anomalous data than for normal data on average. We then propose a class of admissible anomaly detection algorithms equipped with an integral probability metric-based objective function and a class of autoencoders, Lipschitz continuous autoencoders. The proposed algorithm for the Wasserstein distance is implemented by minimizing an approximated Wasserstein distance with a penalty to enforce Lipschitz continuity. Through ablation studies, we demonstrate the efficacy of enforcing Lipschitz continuity in the proposed method. The proposed method is shown to be more effective in detecting anomalies than existing methods via applications to network traffic and image datasets.
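
    The reconstruction-error scoring that the abstract formalizes can be sketched with a linear autoencoder fit on normal data only. The unit-norm linear encoder below is 1-Lipschitz by construction, but it is a toy stand-in for the paper's Lipschitz continuous autoencoders, not its Wasserstein-penalized training.

```python
import numpy as np

rng = np.random.default_rng(4)

# Normal data lives near a 1-D line in 2-D; anomalies sit off it.
n = 1000
t = rng.normal(size=n)
normal = np.c_[t, t] + 0.05 * rng.normal(size=(n, 2))
anomaly = rng.uniform(-3, 3, size=(200, 2))

# A linear autoencoder fit on normal data only (via SVD).
mu = normal.mean(0)
U, S, Vt = np.linalg.svd(normal - mu, full_matrices=False)
v = Vt[0]                                 # unit-norm latent direction

def recon_error(X):
    # Anomaly score = reconstruction error of the encode/decode round trip.
    Z = (X - mu) @ v                      # encode (1-Lipschitz: ||v|| = 1)
    Xhat = mu + np.outer(Z, v)            # decode
    return np.linalg.norm(X - Xhat, axis=1)

print(recon_error(normal).mean(), recon_error(anomaly).mean())
```

    An autoencoder trained on normal data alone reconstructs normal points well and anomalous points poorly; admissibility in the paper is the guarantee that optimal autoencoders in the proposed class exhibit exactly this gap on average.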

    Predictors and outcomes of unplanned readmission to a different hospital

    Objectives: To examine patient, hospital and market factors and outcomes associated with readmission to a different hospital compared with the same hospital. Design: A population-based, secondary analysis using multilevel causal modeling. Setting: Acute care hospitals in California in the USA. Participants: In total, 509 775 patients aged 50 or older who were discharged alive from acute care hospitals (index hospitalizations), and 59 566 who had a rehospitalization within 30 days following their index discharge. Intervention: No intervention. Main Outcome Measure(s): Thirty-day unplanned readmissions to a different hospital compared with the same hospital, and also the costs and health outcomes of the readmissions. Results: Twenty-one percent of patients with a rehospitalization had a different-hospital readmission. Compared with the same-hospital readmission group, the different-hospital readmission group was more likely to be younger, male and have a lower income. The index hospitals of the different-hospital readmission group were more likely to be smaller, for-profit hospitals, which were also more likely to be located in counties with higher competition. The different-hospital readmission group had higher odds of in-hospital death (8.1 vs. 6.7%; P < 0.0001) and greater readmission hospital costs ($15,671.8 vs. $14,286.4; P < 0.001) than the same-hospital readmission group. Conclusions: Patient, hospital and market characteristics predicted different-hospital readmissions compared with same-hospital readmissions. Mortality and cost outcomes were worse among patients with different-hospital readmissions. Strategies for better care coordination targeting people at risk for different-hospital readmissions are necessary.