
    Squeeze All: Novel Estimator and Self-Normalized Bound for Linear Contextual Bandits

    We propose a linear contextual bandit algorithm with an $O(\sqrt{dT\log T})$ regret bound, where $d$ is the dimension of contexts and $T$ is the time horizon. Our proposed algorithm is equipped with a novel estimator in which exploration is embedded through explicit randomization. Depending on the randomization, our proposed estimator takes contributions either from contexts of all arms or from selected contexts. We establish a self-normalized bound for our estimator, which allows a novel decomposition of the cumulative regret into additive dimension-dependent terms instead of multiplicative terms. We also prove a novel lower bound of $\Omega(\sqrt{dT})$ under our problem setting. Hence, the regret of our proposed algorithm matches the lower bound up to logarithmic factors. The numerical experiments support the theoretical guarantees and show that our proposed method outperforms the existing linear bandit algorithms. Comment: Accepted in Artificial Intelligence and Statistics 202
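
    The randomized-update idea described above lends itself to a short simulation. The sketch below is a hypothetical simplification, not the paper's Squeeze All estimator: the exploration probability p_explore, the Gaussian contexts, and the reward noise are all illustrative assumptions; only the randomized choice between absorbing all arms' contexts and only the selected context mirrors the description in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)
d, K, T = 5, 10, 2000                          # context dimension, arms per round, horizon
theta_star = rng.normal(size=d) / np.sqrt(d)   # unknown parameter (simulation only)

lam, p_explore = 1.0, 0.5        # ridge regularizer, randomization probability (assumed)
V = lam * np.eye(d)              # regularized Gram matrix
b = np.zeros(d)                  # reward-weighted context sum
regret = 0.0

for t in range(T):
    X = rng.normal(size=(K, d))                    # contexts of the K arms at round t
    theta_hat = np.linalg.solve(V, b)              # ridge estimate of theta_star
    a = int(np.argmax(X @ theta_hat))              # greedy arm under the current estimate
    r = X[a] @ theta_star + rng.normal(scale=0.1)  # noisy linear reward
    # Randomized update: with probability p_explore the Gram matrix takes
    # contributions from the contexts of *all* arms, otherwise only from
    # the selected context.
    if rng.random() < p_explore:
        V += X.T @ X / K
    else:
        V += np.outer(X[a], X[a])
    b += r * X[a]
    regret += (X @ theta_star).max() - X[a] @ theta_star

print(f"cumulative regret after {T} rounds: {regret:.1f}")
```

    With p_explore = 1 every round contributes all contexts (pure exploration of the design), and with p_explore = 0 the loop degenerates to greedy ridge regression; the interpolation between the two is the point of the randomization.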

    Kernel-convoluted Deep Neural Networks with Data Augmentation

    The Mixup method (Zhang et al. 2018), which uses linearly interpolated data, has emerged as an effective data augmentation tool to improve generalization performance and robustness to adversarial examples. The motivation is to curtail undesirable oscillations by implicitly constraining the model to behave linearly in between observed data points, thereby promoting smoothness. In this work, we formally investigate this premise, propose a way to explicitly impose smoothness constraints, and extend it to incorporate implicit model constraints. First, we derive a new function class composed of kernel-convoluted models (KCM), in which the smoothness constraint is directly imposed by locally averaging the original functions with a kernel function. Second, we propose to incorporate the Mixup method into KCM to expand the domains of smoothness. For both the KCM and the KCM adapted with Mixup, we provide a risk analysis under some conditions on the kernels. We show that the upper bound on the excess risk decays no more slowly than that of the original function class. The upper bound for the KCM with Mixup remains dominated by that of the KCM if the perturbation of Mixup vanishes faster than $O(n^{-1/2})$, where $n$ is the sample size. Using the CIFAR-10 and CIFAR-100 datasets, our experiments demonstrate that the KCM with Mixup outperforms the Mixup method in terms of generalization and robustness to adversarial examples.
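
    A minimal sketch of the two ingredients, under assumptions not taken from the paper: the kernel is Gaussian with bandwidth sigma, the convolution is approximated with m Monte Carlo samples, and kcm_forward, mixup, and all hyperparameter values are hypothetical.

```python
import torch

def kcm_forward(f, x, sigma=0.1, m=8):
    """Monte Carlo estimate of a kernel-convoluted model
    (f * k)(x) = E_{u ~ k}[f(x + u)], here with a Gaussian kernel."""
    noise = sigma * torch.randn((m, *x.shape))
    return torch.stack([f(x + n) for n in noise]).mean(dim=0)

def mixup(x, y, alpha=0.2):
    """Standard Mixup: convex combinations of inputs and (one-hot) labels."""
    lam = torch.distributions.Beta(alpha, alpha).sample()
    idx = torch.randperm(x.shape[0])
    return lam * x + (1 - lam) * x[idx], lam * y + (1 - lam) * y[idx]

# Toy usage: a small MLP trained on mixed-up data through the KCM smoothing.
f = torch.nn.Sequential(torch.nn.Linear(10, 64), torch.nn.ReLU(), torch.nn.Linear(64, 3))
opt = torch.optim.SGD(f.parameters(), lr=1e-2)
x = torch.randn(32, 10)
y = torch.nn.functional.one_hot(torch.randint(0, 3, (32,)), 3).float()
xm, ym = mixup(x, y)
loss = torch.nn.functional.cross_entropy(kcm_forward(f, xm), ym)
opt.zero_grad(); loss.backward(); opt.step()
```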

    Wasserstein Geodesic Generator for Conditional Distributions

    Generating samples given a specific label requires estimating conditional distributions. We derive a tractable upper bound on the Wasserstein distance between conditional distributions to lay the theoretical groundwork for learning conditional distributions. Based on this result, we propose a novel conditional generation algorithm in which conditional distributions are fully characterized by a metric space defined by a statistical distance. We employ optimal transport theory to propose the Wasserstein geodesic generator, a new conditional generator that learns the Wasserstein geodesic. The proposed method learns both the conditional distributions of observed domains and the optimal transport maps between them. The conditional distributions given unobserved intermediate domains lie on the Wasserstein geodesic between the conditional distributions given two observed domain labels. Experiments on face images with light conditions as domain labels demonstrate the efficacy of the proposed method.
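
    The geodesic idea is easiest to see in one dimension, where the optimal transport map has a closed form (the increasing rearrangement of quantiles) and McCann's displacement interpolation x_t = (1 - t) x + t T(x) traces the Wasserstein geodesic. The sketch below is a hand-computable analogue of what the proposed generator learns with neural networks; ot_map and the Gaussian endpoint distributions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
x0 = rng.normal(loc=-2.0, scale=0.5, size=5000)   # samples from mu_0
x1 = rng.normal(loc=3.0, scale=1.5, size=5000)    # samples from mu_1

def ot_map(x, src, tgt):
    """1-D optimal transport map T = F_1^{-1} o F_0 via empirical quantiles."""
    ranks = np.searchsorted(np.sort(src), x) / len(src)
    return np.quantile(tgt, np.clip(ranks, 0.0, 1.0))

Tx = ot_map(x0, x0, x1)
for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    xt = (1 - t) * x0 + t * Tx                    # geodesic point at time t
    print(f"t={t:.2f}: mean={xt.mean():+.2f}, std={xt.std():.2f}")
```

    For Gaussian endpoints the means and standard deviations interpolate linearly along the geodesic, which the printed summaries reflect.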

    A robust calibration-assisted method for linear mixed effects model under cluster-specific nonignorable missingness

    We propose a method for linear mixed effects models when the covariates are completely observed but the outcome of interest is subject to missingness under a cluster-specific nonignorable (CSNI) mechanism. Our strategy is to replace missing quantities in the full-data objective function with unbiased predictors derived from inverse probability weighting and a calibration technique. The proposed approach can be applied to estimating equations or to likelihood functions with a modified E-step, and does not require numerical integration as previous methods do. Unlike usual inverse probability weighting, the proposed method does not require correct specification of the response model as long as the CSNI assumption is correct, and it renders inference under CSNI without a full distributional assumption. Consistency and asymptotic normality are shown, together with a consistent variance estimator. Simulation results and a data example are presented.
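
    For orientation, the sketch below shows only the baseline inverse-probability-weighting idea for a linear model with outcome-dependent (nonignorable) missingness, with the response probabilities assumed known; the paper's contribution is precisely to avoid such a correctly specified response model via calibration under the CSNI assumption, and to handle random effects, neither of which is attempted here.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
x = rng.normal(size=(n, 2))
beta_true = np.array([1.0, -0.5])
y = x @ beta_true + rng.normal(size=n)

# Nonignorable missingness: the response probability depends on the outcome.
# Here pi is treated as known (an oracle assumption made for illustration).
pi = 1.0 / (1.0 + np.exp(-(0.5 + y)))
r = rng.random(n) < pi                        # response indicator

# Weighted normal equations: sum_i (r_i / pi_i) x_i (y_i - x_i' beta) = 0,
# unbiased for the full-data equations because E[r_i | data] = pi_i.
w = r / pi
beta_ipw = np.linalg.solve(x.T @ (w[:, None] * x), x.T @ (w * y))
beta_cc = np.linalg.solve(x[r].T @ x[r], x[r].T @ y[r])   # biased complete-case fit
print("IPW estimate:", beta_ipw, " complete-case estimate:", beta_cc)
```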

    Lipschitz Continuous Autoencoders in Application to Anomaly Detection

    Anomaly detection is the task of finding abnormal data that are distinct from normal behavior. Current deep learning-based anomaly detection methods train neural networks on normal data alone and calculate anomaly scores based on the trained model. In this work, we formalize current practices, build a theoretical framework for anomaly detection algorithms equipped with an objective function and a hypothesis space, and establish a desirable property of anomaly detection algorithms, namely, admissibility. Admissibility implies that optimal autoencoders for normal data yield a larger reconstruction error for anomalous data than for normal data on average. We then propose a class of admissible anomaly detection algorithms equipped with an integral probability metric-based objective function and a class of autoencoders, Lipschitz continuous autoencoders. The proposed algorithm for the Wasserstein distance is implemented by minimizing an approximated Wasserstein distance with a penalty that enforces Lipschitz continuity. Through ablation studies, we demonstrate the efficacy of enforcing Lipschitz continuity in the proposed method. The proposed method is shown to be more effective in detecting anomalies than existing methods via applications to network traffic and image datasets.
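
    A hedged sketch of the training recipe described above: an autoencoder fit on normal data with a gradient penalty as a stand-in for the Lipschitz-continuity constraint, and the reconstruction error used as the anomaly score. The architecture, penalty form, and weights are illustrative assumptions rather than the paper's exact formulation.

```python
import torch

enc = torch.nn.Sequential(torch.nn.Linear(20, 8), torch.nn.ReLU(), torch.nn.Linear(8, 4))
dec = torch.nn.Sequential(torch.nn.Linear(4, 8), torch.nn.ReLU(), torch.nn.Linear(8, 20))
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)

def lipschitz_penalty(x, lam=10.0, target=1.0):
    """Penalize input gradients above `target` as a rough proxy for a
    Lipschitz constraint on the autoencoder (assumed penalty form)."""
    x = x.clone().requires_grad_(True)
    out = dec(enc(x))
    g, = torch.autograd.grad(out.sum(), x, create_graph=True)
    return lam * (g.norm(dim=1) - target).clamp(min=0).pow(2).mean()

normal_data = torch.randn(256, 20)            # stand-in for a normal-only dataset
for _ in range(100):
    recon = dec(enc(normal_data))
    loss = (recon - normal_data).pow(2).mean() + lipschitz_penalty(normal_data)
    opt.zero_grad(); loss.backward(); opt.step()

# Anomaly score: per-sample reconstruction error of the trained autoencoder.
with torch.no_grad():
    score = (dec(enc(normal_data)) - normal_data).pow(2).mean(dim=1)
```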

    Predictors and outcomes of unplanned readmission to a different hospital

    Objectives: To examine patient, hospital and market factors and outcomes associated with readmission to a different hospital compared with the same hospital. Design: A population-based, secondary analysis using multilevel causal modeling. Setting: Acute care hospitals in California in the USA. Participants: In total, 509,775 patients aged 50 or older who were discharged alive from acute care hospitals (index hospitalizations), and 59,566 who had a rehospitalization within 30 days following their index discharge. Intervention: No intervention. Main Outcome Measure(s): Thirty-day unplanned readmissions to a different hospital compared with the same hospital, and also the costs and health outcomes of the readmissions. Results: Twenty-one percent of patients with a rehospitalization had a different-hospital readmission. Compared with the same-hospital readmission group, the different-hospital readmission group was more likely to be younger, male and have a lower income. The index hospitals of the different-hospital readmission group were more likely to be smaller, for-profit hospitals, which were also more likely to be located in counties with higher competition. The different-hospital readmission group had higher odds of in-hospital death (8.1 vs. 6.7%; P < 0.0001) and greater readmission hospital costs ($15,671.8 vs. $14,286.4; P < 0.001) than the same-hospital readmission group. Conclusions: Patient, hospital and market characteristics predicted different-hospital readmissions compared with same-hospital readmissions. Mortality and cost outcomes were worse among patients with different-hospital readmissions. Strategies for better care coordination targeting people at risk for different-hospital readmissions are necessary.

    Evaluation of a technology-enhanced integrated care model for frail older persons: protocol of the SPEC study, a stepped-wedge cluster randomized trial in nursing homes

    Background: Limited evidence exists on the effectiveness of the chronic care model for people with multimorbidity. This study aims to evaluate the effectiveness of an information and communication technology (ICT)-enhanced integrated care model, called Systems for Person-centered Elder Care (SPEC), for frail older adults at nursing homes. Methods/Design: SPEC is a prospective stepped-wedge cluster randomized trial conducted at 10 nursing homes in South Korea. Residents aged 65 or older meeting the inclusion/exclusion criteria in all the homes are eligible to participate. The multifaceted SPEC intervention, a geriatric care model guided by the chronic care model, consists of five components: comprehensive geriatric assessment for need/risk profiling, individual need-based care planning, interdisciplinary case conferences, person-centered care coordination, and a cloud-based ICT tool supporting the intervention process. The primary outcome is quality of care for older residents, measured using a composite of quality indicators from the interRAI LTCF assessment system. Outcome assessors and data analysts will be blinded to group assignment. Secondary outcomes include quality of life, healthcare utilization, and cost. A process evaluation will also be conducted. Discussion: This study is expected to provide important new evidence on the effectiveness, cost-effectiveness, and implementation process of an ICT-supported chronic care model for older persons with multiple chronic illnesses. The SPEC intervention is also unique as the first registered trial implementing an integrated care model using technology to promote person-centered care for frail older nursing home residents in South Korea, where formal long-term care was recently introduced.

    ISLES 2016 and 2017-Benchmarking ischemic stroke lesion outcome prediction based on multispectral MRI

    The performance of models depends heavily not only on the algorithm used but also on the data set it is applied to. This makes comparing newly developed tools with previously published approaches difficult: either researchers must first implement others' algorithms to establish an adequate benchmark on their data, or a direct comparison of new and old techniques is infeasible. The Ischemic Stroke Lesion Segmentation (ISLES) challenge, which has now run for three consecutive years, aims to address this problem of comparability. ISLES 2016 and 2017 focused on lesion outcome prediction after ischemic stroke: by providing a uniformly pre-processed data set, researchers from all over the world could apply their algorithms directly. A total of nine teams participated in ISLES 2016, and 15 teams participated in ISLES 2017. Their performance was evaluated in a fair and transparent way to identify the state of the art among all submissions. Top-ranked teams almost always employed deep learning tools, predominantly convolutional neural networks (CNNs). Despite these great efforts, lesion outcome prediction remains challenging. The annotated data set remains publicly available, and new approaches can be compared directly via the online evaluation system, serving as a continuing benchmark (www.isles-challenge.org).
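
    Submissions to segmentation challenges of this kind are commonly scored by the overlap between predicted and ground-truth lesion masks; the Dice coefficient below is one such standard metric, shown purely as an illustration (the exact ISLES ranking scheme is described on the challenge site).

```python
import numpy as np

def dice(pred, truth, eps=1e-8):
    """Dice overlap between two binary masks: 2|A n B| / (|A| + |B|)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    return 2.0 * np.logical_and(pred, truth).sum() / (pred.sum() + truth.sum() + eps)

# Toy example: a predicted lesion mask vs. a ground-truth mask.
truth = np.zeros((64, 64), dtype=bool); truth[20:40, 20:40] = True
pred = np.zeros((64, 64), dtype=bool);  pred[25:45, 25:45] = True
print(f"Dice = {dice(pred, truth):.3f}")
```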