
    Continual Invariant Risk Minimization

    Empirical risk minimization can lead to poor generalization on unseen environments if the learned model does not capture invariant feature representations. Invariant risk minimization (IRM) is a recent proposal for discovering environment-invariant representations, introduced by Arjovsky et al. (2019) and extended by Ahuja et al. (2020). IRM assumes that all environments are available to the learning system at the same time. In this work, we generalize the concept of IRM to scenarios where environments are observed sequentially. We show that existing approaches, including those designed for continual learning, fail to identify the invariant features and models across sequentially presented environments. We extend IRM under a variational Bayesian and bilevel framework, creating a general approach to continual invariant risk minimization. We also describe a strategy for solving the optimization problems using a variant of the alternating direction method of multipliers (ADMM). We show empirically, on multiple datasets with multiple sequential environments, that the proposed methods outperform or are competitive with prior approaches.
    Comment: A shorter version of this paper was presented at the RobustML workshop of ICLR 202
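    For context, this line of work builds on the IRMv1 objective of Arjovsky et al. (2019): the sum of per-environment risks plus a gradient penalty measuring how far a fixed "dummy" classifier is from being simultaneously optimal in every environment. Below is a minimal PyTorch sketch of that base objective, not the continual or ADMM-based variants proposed in the paper; the function names and the binary-classification setup are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def irm_penalty(logits, labels):
    # IRMv1 penalty (Arjovsky et al., 2019): squared gradient of the
    # per-environment risk w.r.t. a fixed dummy classifier scale w = 1.0.
    scale = torch.ones(1, requires_grad=True)
    risk = F.binary_cross_entropy_with_logits(logits * scale, labels)
    (grad,) = torch.autograd.grad(risk, scale, create_graph=True)
    return (grad ** 2).sum()

def irm_objective(model, envs, lam=1.0):
    # Sum over environments of (empirical risk + lam * invariance penalty).
    # envs: iterable of (inputs, float_labels) pairs, one per environment.
    total = torch.zeros(())
    for x, y in envs:
        logits = model(x).squeeze(-1)
        total = total + F.binary_cross_entropy_with_logits(logits, y)
        total = total + lam * irm_penalty(logits, y)
    return total
```

    In the standard IRM setting this objective is minimized jointly over all environments; the paper's continual setting differs precisely in that the terms of this sum arrive one environment at a time.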

    Conformal Inference for Invariant Risk Minimization

    The application of machine learning models can be significantly impeded by distributional shifts, since the assumption that training and testing samples come from the same population may not hold in practice. One way to tackle this problem is invariant learning, such as invariant risk minimization (IRM), which seeks an invariant representation that aids generalization under distributional shifts. This paper develops methods for obtaining distribution-free prediction regions that describe uncertainty estimates for invariant representations, accounting for distribution shifts in data from different environments. Our approach uses a weighted conformity score that adapts to the specific environment in which the test sample is situated. We construct an adaptive conformal interval from the weighted conformity score and prove its conditional average coverage under certain conditions. To demonstrate the effectiveness of our approach, we conduct several numerical experiments, including simulation studies and a practical example using real-world data.
    Comment: arXiv admin note: text overlap with arXiv:2209.1135
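    For intuition, here is a minimal NumPy sketch of a weighted split-conformal interval in the spirit of Tibshirani et al. (2019), where calibration residuals are reweighted toward the test point's environment. It is a generic construction under an assumed absolute-residual score, not the paper's exact weighted conformity score; all names are illustrative.

```python
import numpy as np

def weighted_conformal_interval(scores_cal, weights_cal, weight_test,
                                y_pred, alpha=0.1):
    # scores_cal: |y_i - yhat_i| on a held-out calibration set.
    # weights_cal / weight_test: likelihood-ratio-style weights shifting
    # calibration mass toward the test environment (assumed given).
    w = np.append(weights_cal, weight_test)
    p = w / w.sum()
    # Weighted (1 - alpha) quantile of calibration scores, with the test
    # point's own mass placed at +infinity (Tibshirani et al., 2019).
    order = np.argsort(scores_cal)
    cum = np.cumsum(p[:-1][order])
    idx = np.searchsorted(cum, 1 - alpha)
    if idx >= len(scores_cal):
        q = np.inf  # too little calibration mass: trivial interval
    else:
        q = scores_cal[order][idx]
    return y_pred - q, y_pred + q
```

    With uniform weights this reduces to ordinary split-conformal prediction; the environment-adaptive weights are what let the interval track the distribution shift described above.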