11 research outputs found

    Causal Interpretation of Self-Attention in Pre-Trained Transformers

    We propose a causal interpretation of self-attention in the Transformer neural network architecture. We interpret self-attention as a mechanism that estimates a structural equation model for a given input sequence of symbols (tokens). The structural equation model can be interpreted, in turn, as a causal structure over the input symbols under the specific context of the input sequence. Importantly, this interpretation remains valid in the presence of latent confounders. Following this interpretation, we estimate conditional independence relations between input symbols by calculating partial correlations between their corresponding representations in the deepest attention layer. This enables learning the causal structure over an input sequence using existing constraint-based algorithms. In this sense, existing pre-trained Transformers can be utilized for zero-shot causal discovery. We demonstrate this method by providing causal explanations for the outcomes of Transformers in two tasks: sentiment classification (NLP) and recommendation. Comment: 37th Conference on Neural Information Processing Systems (NeurIPS 2023). arXiv admin note: text overlap with arXiv:2210.1062
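
The abstract above tests conditional independence through partial correlations. As a rough illustration only, the sketch below computes a first-order partial correlation between two scalar variables given a third. It is not the paper's method: the function names `pearson` and `partial_corr` and the toy data are assumptions made for this example, and the paper works with attention-layer representations rather than hand-built lists.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences of numbers."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def partial_corr(xs, ys, zs):
    """Correlation of xs and ys after removing the linear effect of zs.

    Uses the standard first-order formula
    (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)).
    """
    r_xy = pearson(xs, ys)
    r_xz = pearson(xs, zs)
    r_yz = pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical toy data: x and y share the common cause z and have
# mutually orthogonal noise terms, so they are correlated marginally
# but (linearly) independent once z is accounted for.
z = [-1.0, -1.0, 1.0, 1.0]
x = [zi + e for zi, e in zip(z, [-1.0, 1.0, -1.0, 1.0])]
y = [zi + e for zi, e in zip(z, [-1.0, 1.0, 1.0, -1.0])]

marginal = pearson(x, y)             # clearly non-zero
conditional = partial_corr(x, y, z)  # numerically close to zero
```

A constraint-based algorithm would treat a near-zero partial correlation like `conditional` as evidence for removing the edge between the two variables, usually after a statistical test rather than a raw threshold.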

    From Temporal to Contemporaneous Iterative Causal Discovery in the Presence of Latent Confounders

    We present a constraint-based algorithm for learning causal structures from observational time-series data, in the presence of latent confounders. We assume a discrete-time, stationary structural vector autoregressive process, with both temporal and contemporaneous causal relations. One may ask if temporal and contemporaneous relations should be treated differently. The presented algorithm gradually refines a causal graph by learning long-term temporal relations before short-term ones, where contemporaneous relations are learned last. This ordering of causal relations to be learned leads to a reduction in the required number of statistical tests. We validate this reduction empirically and demonstrate that it leads to higher accuracy for synthetic data and more plausible causal graphs for real-world data compared to state-of-the-art algorithms. Comment: Proceedings of the 40th International Conference on Machine Learning (ICML), 202

    Quantum Conserved Currents in Affine Toda Theories

    We study the renormalization and conservation at the quantum level of higher-spin currents in affine Toda theories with particular emphasis on the non-simply-laced cases. For specific examples, namely the spin-3 current for the $a_3^{(2)}$ and $c_2^{(1)}$ theories, we prove conservation to all-loop order, thus establishing the existence of factorized S-matrices. For these theories, as well as the simply-laced $a_2^{(1)}$ theory, we compute one-loop corrections to the corresponding higher-spin charges and study charge conservation for the three-particle vertex function. For the $a_3^{(2)}$ theory we show that although the current is conserved, anomalous threshold singularities spoil the conservation of the corresponding charge for the on-shell vertex function, implying a breakdown of some of the bootstrap procedures commonly used in determining the exact S-matrix. Comment: 19 pages

    Quantum Conserved Currents in Supersymmetric Toda Theories

    We consider $N=1$ supersymmetric Toda theories which admit a fermionic untwisted affine extension, i.e. the systems based on the $A(n,n)$, $D(n+1,n)$ and $B(n,n)$ superalgebras. We construct the superspace Miura transformations which allow us to determine the W-supercurrents of the conformal theories, and we compute their renormalized expressions. The analysis of the renormalization and conservation of higher-spin currents is then performed for the corresponding supersymmetric massive theories. We establish the quantum integrability of these models and show that although their Lagrangian is not hermitian, the masses of the fundamental particles are real, a property which is maintained by one-loop corrections. The spectrum is actually much richer, since the theories admit solitons. The existence of quantum conserved higher-spin charges implies that elastic, factorized S-matrices can be constructed. Comment: 35 pages, IFUM 426/F

    KINETIC IONIZATION WAVES IN A CHARGE IN NEON

    The paper considers the positive column of a glow discharge in inert gases, aiming at a combined theoretical and experimental investigation of the longitudinally homogeneous and the stratified column. The investigation used sound and optical methods for studying a low-temperature plasma, together with numerical methods for solving integro-differential equations. As a result, a methodology for the self-consistent calculation of the characteristics of a longitudinally homogeneous positive column has been developed and implemented. A kinetic theory for the formation of striations near the lower current boundary of their existence has been developed. The mechanisms of discharge stratification have been clarified. The theoretical and experimental results are in good agreement, which makes possible a qualitative description of various types of discharges at low pressure. Available from VNTIC - Scientific & Technical Information Centre of Russia. SIGLE, RU, Russian Federation

    Actinobacillus actinomycetemcomitans Serotype b Lipopolysaccharide Mediates Coaggregation with Fusobacterium nucleatum

    Purified Actinobacillus actinomycetemcomitans serotype b lipopolysaccharide (LPS) was found to bind Fusobacterium nucleatum cells and to inhibit binding of F. nucleatum to A. actinomycetemcomitans serotype b. Sugar binding studies showed that the requirements for binding of A. actinomycetemcomitans serotype b LPS to the F. nucleatum lectin are the presence of a divalent metal ion, an axial free hydroxyl group at position 4, and free equatorial hydroxyl groups at positions 3 and 6 of d-galactose, indicating that the β-N-acetyl-d-galactosamine in the serotype b LPS trisaccharide repeating unit is the monosaccharide residue recognized by the F. nucleatum lectin. These data strongly suggest that A. actinomycetemcomitans serotype b LPS is one of the receptors responsible for the lactose-inhibitable coaggregation of A. actinomycetemcomitans to fusobacteria.