
    Does Invariant Graph Learning via Environment Augmentation Learn Invariance?

    Invariant graph representation learning aims to learn the invariance among data from different environments for out-of-distribution (OOD) generalization on graphs. As graph environment partitions are usually expensive to obtain, augmenting the environment information has become the de facto approach. However, the usefulness of the augmented environment information has never been verified. In this work, we find that it is fundamentally impossible to learn invariant graph representations via environment augmentation without additional assumptions. We therefore develop a set of minimal assumptions, including variation sufficiency and variation consistency, for feasible invariant graph learning, and propose a new framework, Graph invAriant Learning Assistant (GALA). GALA incorporates an assistant model that needs to be sensitive to graph environment changes or distribution shifts. The correctness of the assistant model's proxy predictions can hence differentiate the variations in spurious subgraphs. We show that extracting the subgraph maximally invariant to the proxy predictions provably identifies the underlying invariant subgraph for successful OOD generalization under the established minimal assumptions. Extensive experiments on datasets including DrugOOD with various graph distribution shifts confirm the effectiveness of GALA.
    Comment: NeurIPS 2023, 34 pages, 35 figures
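
    A rough way to picture the proxy-environment mechanism described above: train an ERM "assistant" that is deliberately sensitive to distribution shifts, split the training set by whether the assistant is correct, and penalize loss differences across that split. The sketch below is a hypothetical simplification (plain feature vectors stand in for graph embeddings, and the variance-style invariance penalty is a stand-in, not the paper's objective).

```python
# Hypothetical simplification of GALA's proxy-environment idea: an ERM
# assistant's per-sample correctness acts as a proxy environment label,
# and the main model is penalized for loss mismatch across the split.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(256, 8)                               # stand-ins for graph embeddings
y = (x[:, 0] + 0.5 * torch.randn(256) > 0).long()     # noisy binary labels

# 1) assistant: plain ERM, hence sensitive to variant/spurious patterns
assistant = nn.Linear(8, 2)
opt = torch.optim.Adam(assistant.parameters(), lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    F.cross_entropy(assistant(x), y).backward()
    opt.step()

# 2) proxy environments: split samples by the assistant's correctness
with torch.no_grad():
    correct = assistant(x).argmax(dim=1) == y

# 3) main model: ERM plus an invariance penalty across the proxy split
model = nn.Linear(8, 2)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    l1 = F.cross_entropy(model(x[correct]), y[correct])
    l2 = F.cross_entropy(model(x[~correct]), y[~correct])
    loss = l1 + l2 + 10.0 * (l1 - l2).pow(2)   # variance-style invariance term
    loss.backward()
    opt.step()
```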

    Positional Information Matters for Invariant In-Context Learning: A Case Study of Simple Function Classes

    In-context learning (ICL) refers to the ability of a model to condition on a few in-context demonstrations (input-output examples of the underlying task) to generate the answer for a new query input, without updating its parameters. Despite the impressive ICL ability of LLMs, ICL in LLMs has also been found to be sensitive to input demonstrations and limited to short context lengths. To understand the limitations of and principles behind successful ICL, we investigate ICL linear regression with transformers. We characterize several out-of-distribution (OOD) cases for ICL inspired by realistic LLM ICL failures and compare transformers with DeepSet, a simple yet powerful architecture for ICL. Surprisingly, DeepSet outperforms transformers across a variety of distribution shifts, implying that preserving permutation-invariance symmetry over input demonstrations is crucial for OOD ICL. This phenomenon specifies a fundamental requirement of ICL, which we term ICL invariance. However, the positional encodings in LLMs break ICL invariance. To this end, we further evaluate transformers with identical positional encodings and find that preserving ICL invariance in transformers achieves state-of-the-art performance across various ICL distribution shifts.
    Comment: Ongoing work; preliminary version
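
    To make the permutation-invariance point concrete, here is a sketch of ICL linear regression with a DeepSet-style predictor: each (x_i, y_i) demonstration is embedded, the embeddings are mean-pooled (so demonstration order cannot matter), and the pooled summary is combined with the query. The layer sizes and pooling choice are illustrative, not the paper's exact model.

```python
# DeepSet-style in-context regressor: mean pooling over encoded
# demonstrations makes the prediction order-invariant by construction.
import torch
import torch.nn as nn

d, n_demos, batch = 4, 16, 32

class DeepSetICL(nn.Module):
    def __init__(self, d, hidden=64):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(d + 1, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden))
        self.rho = nn.Sequential(nn.Linear(hidden + d, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1))

    def forward(self, xs, ys, x_query):
        # encode each (x_i, y_i) demo, then mean-pool: order cannot matter
        pairs = torch.cat([xs, ys.unsqueeze(-1)], dim=-1)   # (B, N, d+1)
        pooled = self.phi(pairs).mean(dim=1)                # (B, hidden)
        return self.rho(torch.cat([pooled, x_query], dim=-1)).squeeze(-1)

# each task draws a fresh weight vector w; demos are (x_i, <w, x_i>)
w = torch.randn(batch, d, 1)
xs = torch.randn(batch, n_demos, d)
ys = (xs @ w).squeeze(-1)
xq = torch.randn(batch, d)

model = DeepSetICL(d)
pred = model(xs, ys, xq)
perm = torch.randperm(n_demos)
pred_shuffled = model(xs[:, perm], ys[:, perm], xq)
assert torch.allclose(pred, pred_shuffled, atol=1e-5)   # ICL invariance holds
```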

    FOCUS: Fairness via Agent-Awareness for Federated Learning on Heterogeneous Data

    Federated learning (FL) allows agents to jointly train a global model without sharing their local data. However, due to the heterogeneous nature of local data, it is challenging to optimize or even define the fairness of the trained global model for the agents. For instance, existing work usually considers accuracy equity as fairness for different agents in FL. This is limited, especially under heterogeneous settings: it is intuitively "unfair" to require agents with high-quality data to achieve accuracy similar to those contributing low-quality data, which may discourage agents from participating in FL. In this work, we propose a formal FL fairness definition, fairness via agent-awareness (FAA), which takes the different contributions of heterogeneous agents into account. Under FAA, the performance of agents with high-quality data will not be sacrificed merely because of the presence of large numbers of agents with low-quality data. In addition, we propose a fair FL training algorithm based on agent clustering (FOCUS) to achieve fairness in FL as measured by FAA. Theoretically, we prove the convergence and optimality of FOCUS under mild conditions for linear and general convex loss functions with bounded smoothness. We also prove that FOCUS always achieves higher fairness in terms of FAA than standard FedAvg under both linear and general convex loss functions. Empirically, we show that on four FL datasets, including synthetic data, images, and text, FOCUS achieves significantly higher fairness in terms of FAA while maintaining competitive prediction accuracy compared with FedAvg and state-of-the-art fair FL algorithms.
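
    The clustering idea can be sketched as "group agents with similar data, then run FedAvg within each group." In the toy version below the cluster assignment is an oracle id; FOCUS itself infers clusters during training, and its FAA metric is not computed here, so this is only the rough shape of the algorithm.

```python
# Toy cluster-then-FedAvg: agents in the same latent cluster share a
# ground-truth linear model, and FedAvg is run per cluster.
import numpy as np

def local_step(w, X, y, lr=0.1):
    """One gradient step of linear least squares on an agent's local data."""
    grad = 2 * X.T @ (X @ w - y) / len(y)
    return w - lr * grad

rng = np.random.default_rng(0)
d, n = 5, 40
w_true = [rng.normal(size=d), rng.normal(size=d)]   # two latent clusters
agents = []
for k in range(6):
    c = k % 2
    X = rng.normal(size=(n, d))
    y = X @ w_true[c] + 0.1 * rng.normal(size=n)
    agents.append((c, X, y))

# cluster agents (here by oracle id) and run FedAvg within each cluster
for c in (0, 1):
    w = np.zeros(d)
    for _ in range(200):   # communication rounds
        locals_ = [local_step(w, X, y) for cc, X, y in agents if cc == c]
        w = np.mean(locals_, axis=0)   # FedAvg aggregation
    print(f"cluster {c}: ||w - w*|| = {np.linalg.norm(w - w_true[c]):.3f}")
```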

    Transportation dynamics on networks of mobile agents

    Most existing work on transportation dynamics focuses on networks with a fixed structure, but networks whose nodes are mobile, such as cell-phone networks, have become widespread. We introduce a model to explore the basic physics of transportation on mobile networks. Of particular interest is the dependence of the throughput on the speed of agent movement and on the communication range. Our computations reveal a hierarchical dependence for the former, while for the latter we find an algebraic power law between the throughput and the communication range, with an exponent determined by the speed. We develop a physical theory based on the Fokker-Planck equation to explain these phenomena. Our findings provide insights into the complex transportation dynamics arising commonly in natural and engineered systems.
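
    A minimal simulation makes the quantities concrete: agents perform a noisy ballistic walk in the unit square, and each packet is greedily forwarded to the in-range neighbor closest to its destination. The routing rule and the throughput measure (delivery fraction) are illustrative stand-ins for the paper's model; sweeping `radius` hints at the range dependence.

```python
# Toy packet transport on a network of mobile agents.
import numpy as np

def simulate(n_agents=50, speed=0.02, radius=0.15, steps=2000):
    rng = np.random.default_rng(0)
    pos = rng.random((n_agents, 2))
    angle = rng.random(n_agents) * 2 * np.pi
    dest = rng.permutation(n_agents)     # one packet per agent, random target
    holder = np.arange(n_agents)         # holder[p] = agent carrying packet p
    delivered = np.zeros(n_agents, bool)
    for _ in range(steps):
        # noisy ballistic motion, positions clamped at the walls
        angle += 0.2 * rng.normal(size=n_agents)
        pos += speed * np.c_[np.cos(angle), np.sin(angle)]
        pos = np.clip(pos, 0, 1)
        for p in np.where(~delivered)[0]:
            h = holder[p]
            d2 = ((pos - pos[h]) ** 2).sum(axis=1)
            near = np.where(d2 < radius ** 2)[0]   # agents within range of h
            if dest[p] in near:
                delivered[p] = True                # destination reachable
            else:
                # forward to the in-range neighbor closest to the destination
                best = near[np.argmin(((pos[near] - pos[dest[p]]) ** 2).sum(axis=1))]
                holder[p] = best
    return delivered.mean()

for r in (0.05, 0.1, 0.2, 0.4):
    print(f"radius={r:.2f}  delivery fraction={simulate(radius=r):.2f}")
```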

    Beyond Unfolding: Exact Recovery of Latent Convex Tensor Decomposition under Reshuffling

    Exact recovery is a desirable property of tensor decomposition (TD) methods in both unsupervised learning and scientific data analysis. The numerical defects of TD methods, however, limit their practical application to real-world data. As an alternative, convex tensor decomposition (CTD) was proposed to alleviate these problems, but its exact-recovery property has not been properly addressed so far. To this end, we focus on latent convex tensor decomposition (LCTD), a widely used CTD model in practice, and rigorously prove a sufficient condition for its exact-recovery property. Furthermore, we show that this property can also be achieved by a model more general than LCTD. In the new model, we generalize classic tensor (un)folding into a reshuffling operation, a more flexible mapping that relocates the entries of a matrix into a tensor. Armed with reshuffling operations and the exact-recovery property, we explore a novel application of (generalized) LCTD: image steganography. Experimental results on synthetic data validate our theory, and results on image steganography show that our method outperforms state-of-the-art methods.
    Comment: AAAI-202
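
    Two of the ingredients can be sketched directly: the classic mode-k unfolding (which a reshuffling generalizes to an arbitrary relocation of matrix entries into a tensor) and a latent-nuclear-norm-style objective summing nuclear norms of the latent components' unfoldings. This illustrates the objects involved under those assumptions, not the paper's recovery algorithm.

```python
import numpy as np

def unfold(T, mode):
    """Classic mode-k unfolding: one particular reshuffling of T's entries."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def reshuffle(M, perm, shape):
    """Relocate matrix entries into a tensor via a bijection over flat indices."""
    return M.ravel()[perm].reshape(shape)

rng = np.random.default_rng(0)
T = rng.normal(size=(4, 5, 6))

# latent-nuclear-norm-style objective, evaluated at the trivial split
# T = L0 + L1 + L2 with Lk = T/3 (the true norm is an infimum over splits)
L = [T / 3.0] * 3
obj = sum(np.linalg.norm(unfold(Lk, k), 'nuc') for k, Lk in enumerate(L))
print(f"objective at trivial split: {obj:.2f}")

# a reshuffling beyond (un)folding: an arbitrary relocation of entries
M = unfold(T, 0)                     # 4 x 30 matrix
perm = rng.permutation(M.size)
T2 = reshuffle(M, perm, T.shape)
assert np.allclose(np.sort(M.ravel()), np.sort(T2.ravel()))
```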

    Privileged Prior Information Distillation for Image Matting

    The performance of trimap-free image matting methods is limited when decoupling deterministic and undetermined regions, especially in scenes where foregrounds are semantically ambiguous, chromaless, or highly transmissive. In this paper, we propose a novel framework, Privileged Prior Information Distillation for Image Matting (PPID-IM), which effectively transfers privileged prior environment-aware information to improve students' performance on hard foregrounds. The trimap prior regulates only the teacher model during training and is not fed into the student network at inference. To achieve effective privileged cross-modality (i.e., trimap and RGB) information distillation, we introduce a Cross-Level Semantic Distillation (CLSD) module that reinforces trimap-free students with more knowledgeable semantic representations and environment-aware information. We also propose an Attention-Guided Local Distillation module that efficiently transfers privileged local attributes from the trimap-based teacher to trimap-free students to guide local-region optimization. Extensive experiments demonstrate the effectiveness and superiority of our PPID framework on image matting. In addition, our trimap-free IndexNet-PPID surpasses competing state-of-the-art methods by a large margin, especially in scenarios with chromaless, weakly textured, or irregular objects.
    Comment: 15 pages, 7 figures
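
    The distillation setup can be sketched as: a teacher encoder consumes RGB plus trimap, a student encoder consumes RGB alone, and a feature-alignment loss pushes the student toward the teacher's privileged representation. The plain MSE below stands in for the paper's CLSD and attention-guided modules; the architectures and loss weight are placeholder choices.

```python
# Privileged-information distillation skeleton: trimap reaches only the
# frozen teacher; the student learns from RGB plus a feature-matching loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

def encoder(in_ch):
    return nn.Sequential(nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(32, 32, 3, padding=1), nn.ReLU())

teacher = encoder(4)                  # privileged input: RGB (3) + trimap (1)
student = encoder(3)                  # RGB only, trimap-free at inference
student_head = nn.Conv2d(32, 1, 1)    # predicts the alpha matte

rgb = torch.rand(2, 3, 64, 64)
trimap = torch.rand(2, 1, 64, 64)
alpha_gt = torch.rand(2, 1, 64, 64)

for p in teacher.parameters():        # teacher assumed pre-trained; freeze it
    p.requires_grad_(False)

t_feat = teacher(torch.cat([rgb, trimap], dim=1))
s_feat = student(rgb)

task_loss = F.l1_loss(torch.sigmoid(student_head(s_feat)), alpha_gt)
distill_loss = F.mse_loss(s_feat, t_feat)      # cross-modality feature alignment
(task_loss + 0.5 * distill_loss).backward()    # gradients reach only the student
```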

    Multi-dark-state resonances in cold multi-Zeeman-sublevel atoms

    We present experimental and theoretical studies of multi-dark-state resonances (MDSRs) generated in a unique cold rubidium atomic system with only one coupling laser beam. Such MDSRs arise from the different transition strengths with which the strong coupling beam connects different Zeeman sublevels. Controlling the transparency windows in such an electromagnetically induced transparency (EIT) system can have potential applications in multi-wavelength optical communication and quantum information processing.
    Comment: 11 pages, 4 figures
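
    A textbook way to see how sublevel-dependent coupling strengths produce multiple dark states is to sum weak-probe Lambda-type EIT susceptibilities over the Zeeman channels m, each with its own coupling Rabi frequency Omega_{c,m} (scaled by the corresponding Clebsch-Gordan coefficient). Here Delta_p and Delta_c are the probe and coupling detunings, gamma_31 and gamma_21 the optical and ground-state coherence decay rates, and mu_m the probe dipole matrix element for channel m. This is the standard single-Lambda expression applied per channel, not the paper's full multilevel model:

```latex
\chi(\Delta_p) \;\propto\; \sum_{m}
\frac{|\mu_m|^2}
     {\gamma_{31} - i\Delta_p
      + \dfrac{|\Omega_{c,m}|^2/4}{\gamma_{21} - i(\Delta_p - \Delta_c)}}
```

    Each term becomes transparent near the two-photon resonance, but its window width scales with |Omega_{c,m}|^2, so the differing transition strengths across Zeeman sublevels yield a set of distinct dark-state resonances.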