Enabling Efficient Equivariant Operations in the Fourier Basis via Gaunt Tensor Products
Developing equivariant neural networks for the E(3) group plays an important
role in modeling 3D data across real-world applications. Enforcing this
equivariance primarily involves the tensor products of irreducible
representations (irreps). However, the computational complexity of such
operations increases significantly as higher-order tensors are used. In this
work, we propose a systematic approach to substantially accelerate the
computation of the tensor products of irreps. We mathematically connect the
commonly used Clebsch-Gordan coefficients to the Gaunt coefficients, which are
integrals of products of three spherical harmonics. Through Gaunt coefficients,
the tensor product of irreps becomes equivalent to the multiplication between
spherical functions represented by spherical harmonics. This perspective
further allows us to change the basis for the equivariant operations from
spherical harmonics to a 2D Fourier basis. Consequently, the multiplication
between spherical functions represented by a 2D Fourier basis can be
efficiently computed via the convolution theorem and Fast Fourier Transforms.
This transformation reduces the complexity of full tensor products of irreps
from O(L^6) to O(L^3), where L is the max degree of
irreps. Leveraging this approach, we introduce the Gaunt Tensor Product, which
serves as a new method to construct efficient equivariant operations across
different model architectures. Our experiments on the Open Catalyst Project and
3BPA datasets demonstrate both the increased efficiency and improved
performance of our approach.
Comment: 36 pages; ICLR 2024 (Spotlight Presentation); Code:
https://github.com/lsj2408/Gaunt-Tensor-Product
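The core identity exploited here can be checked directly in its 1D analogue: pointwise multiplication of two functions corresponds to circular convolution of their Fourier representations, which FFTs compute in O(N log N) instead of O(N^2). A minimal NumPy sketch (this is an illustration of the convolution theorem only, not the paper's 2D spherical implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 64
f = rng.standard_normal(N)
g = rng.standard_normal(N)

# Direct circular convolution: O(N^2)
direct = np.array([sum(f[j] * g[(k - j) % N] for j in range(N))
                   for k in range(N)])

# Convolution theorem: conv(f, g) = ifft(fft(f) * fft(g)), O(N log N)
fast = np.fft.ifft(np.fft.fft(f) * np.fft.fft(g)).real

assert np.allclose(direct, fast)
```

The same speedup, applied to the 2D Fourier coefficients of spherical functions, is what drops the tensor-product cost from O(L^6) to O(L^3).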
Rethinking the Expressive Power of GNNs via Graph Biconnectivity
Designing expressive Graph Neural Networks (GNNs) is a central topic in
learning graph-structured data. While numerous approaches have been proposed to
improve GNNs in terms of the Weisfeiler-Lehman (WL) test, generally there is
still a lack of deep understanding of what additional power they can
systematically and provably gain. In this paper, we take a fundamentally
different perspective to study the expressive power of GNNs beyond the WL test.
Specifically, we introduce a novel class of expressivity metrics via graph
biconnectivity and highlight their importance in both theory and practice. As
biconnectivity can be easily calculated using simple algorithms that have
linear computational costs, it is natural to expect that popular GNNs can learn
it easily as well. However, after a thorough review of prior GNN architectures,
we surprisingly find that most of them are not expressive for any of these
metrics. The only exception is the ESAN framework (Bevilacqua et al., 2022),
for which we give a theoretical justification of its power. We proceed to
introduce a principled and more efficient approach, called the Generalized
Distance Weisfeiler-Lehman (GD-WL), which is provably expressive for all
biconnectivity metrics. Practically, we show GD-WL can be implemented by a
Transformer-like architecture that preserves expressiveness and enjoys full
parallelizability. A set of experiments on both synthetic and real datasets
demonstrates that our approach can consistently outperform prior GNN
architectures.
Comment: ICLR 2023 notable top-5%; 58 pages, 11 figures
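The linear-time computability of biconnectivity mentioned above can be illustrated with the classical Hopcroft-Tarjan DFS for cut vertices (articulation points), one of the biconnectivity notions involved; a minimal sketch, independent of the paper's own code:

```python
from collections import defaultdict

def articulation_points(n, edges):
    """Cut vertices of an undirected graph in O(V + E) time
    via DFS discovery times and low-link values (Hopcroft-Tarjan)."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    disc = [0] * n   # discovery times (0 = unvisited)
    low = [0] * n    # lowest discovery time reachable through the subtree
    cuts = set()
    timer = [1]

    def dfs(u, parent):
        disc[u] = low[u] = timer[0]
        timer[0] += 1
        children = 0
        for v in adj[u]:
            if v == parent:
                continue
            if disc[v]:                      # back edge
                low[u] = min(low[u], disc[v])
            else:                            # tree edge
                children += 1
                dfs(v, u)
                low[u] = min(low[u], low[v])
                # No back edge from v's subtree climbs above u.
                if parent != -1 and low[v] >= disc[u]:
                    cuts.add(u)
        if parent == -1 and children > 1:    # root with >1 DFS subtree
            cuts.add(u)

    for s in range(n):
        if not disc[s]:
            dfs(s, -1)
    return cuts
```

For example, in the path 0-1-2 the middle vertex is a cut vertex, while a triangle has none; it is this cheaply computable structure that most WL-bounded GNNs nevertheless fail to capture.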
Your Transformer May Not be as Powerful as You Expect
Relative Positional Encoding (RPE), which encodes the relative distance
between any pair of tokens, is one of the most successful modifications to the
original Transformer. As far as we know, theoretical understanding of the
RPE-based Transformers is largely unexplored. In this work, we mathematically
analyze the power of RPE-based Transformers regarding whether the model is
capable of approximating any continuous sequence-to-sequence functions. One may
naturally assume the answer is in the affirmative -- RPE-based Transformers are
universal function approximators. However, we present a negative result by
showing there exist continuous sequence-to-sequence functions that RPE-based
Transformers cannot approximate no matter how deep and wide the neural network
is. One key reason is that most RPEs are placed inside the softmax attention,
which always produces a right stochastic matrix. This restricts the network
from fully exploiting the positional information in the RPEs and limits its capacity. To
overcome the problem and make the model more powerful, we first present
sufficient conditions for RPE-based Transformers to achieve universal function
approximation. With the theoretical guidance, we develop a novel attention
module, called Universal RPE-based (URPE) Attention, which satisfies the
conditions. Therefore, the corresponding URPE-based Transformers become
universal function approximators. Extensive experiments covering typical
architectures and tasks demonstrate that our model is parameter-efficient and
can achieve superior performance to strong baselines in a wide range of
applications. The code will be made publicly available at
https://github.com/lsj2408/URPE.
Comment: 22 pages; NeurIPS 2022, Camera Ready Version
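The limiting observation is easy to verify numerically: whatever bias an RPE adds to the attention logits, row-wise softmax always yields a right stochastic matrix, i.e. every row sums to 1. A small NumPy check (the bias matrix B stands in for a generic RPE; all names and shapes here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 8
Q = rng.standard_normal((n, d))   # queries
K = rng.standard_normal((n, d))   # keys
B = rng.standard_normal((n, n))   # RPE bias added to the logits

logits = Q @ K.T / np.sqrt(d) + B

# Row-wise softmax (numerically stabilized)
A = np.exp(logits - logits.max(axis=1, keepdims=True))
A /= A.sum(axis=1, keepdims=True)

# Regardless of B, every row of the attention matrix sums to 1:
# the normalization erases part of the information carried by B.
assert np.allclose(A.sum(axis=1), 1.0)
```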
One Transformer Can Understand Both 2D & 3D Molecular Data
Unlike vision and language data, which usually come in a single format, molecules
can naturally be characterized using different chemical formulations. One can
view a molecule as a 2D graph or define it as a collection of atoms located in
a 3D space. For molecular representation learning, most previous works designed
neural networks only for a particular data format, making the learned models
likely to fail for other data formats. We believe a general-purpose neural
network model for chemistry should be able to handle molecular tasks across
data modalities. To achieve this goal, in this work, we develop a novel
Transformer-based Molecular model called Transformer-M, which can take
molecular data of 2D or 3D formats as input and generate meaningful semantic
representations. Using the standard Transformer as the backbone architecture,
Transformer-M develops two separate channels to encode 2D and 3D structural
information and incorporate them with the atom features in the network modules.
When the input data is in a particular format, the corresponding channel will
be activated, and the other will be disabled. By training on 2D and 3D
molecular data with properly designed supervised signals, Transformer-M
automatically learns to leverage knowledge from different data modalities and
correctly capture the representations. We conducted extensive experiments for
Transformer-M. All empirical results show that Transformer-M can simultaneously
achieve strong performance on 2D and 3D tasks, suggesting its broad
applicability. The code and models will be made publicly available at
https://github.com/lsj2408/Transformer-M.
Comment: 20 pages; ICLR 2023, Camera Ready Version; Code:
https://github.com/lsj2408/Transformer-M
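The channel-gating idea can be sketched in a few lines. The function below is a hypothetical illustration only: the names `graph_bias` and `geometry_bias` are ours, not Transformer-M's actual interface, and each channel's bias would in practice be produced by learned encodings of shortest-path/edge structure (2D) or interatomic distances (3D).

```python
import numpy as np

def attention_bias(num_atoms, graph_bias=None, geometry_bias=None):
    """Hypothetical sketch of Transformer-M-style channel gating.

    Each modality channel contributes an (n, n) bias that is added to
    the attention logits of a shared Transformer backbone; a channel
    is simply skipped when its modality is absent from the input.
    """
    bias = np.zeros((num_atoms, num_atoms))
    if graph_bias is not None:       # 2D channel active
        bias += graph_bias
    if geometry_bias is not None:    # 3D channel active
        bias += geometry_bias
    return bias
```

With 2D-only input, just the graph channel fires; with both modalities, the biases combine, which is how one backbone can serve either format.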
Genome-wide eQTLs and heritability for gene expression traits in unrelated individuals
BACKGROUND: While the possible sources underlying the so-called ‘missing heritability’ evident in current genome-wide association studies (GWAS) of complex traits have been actively pursued in recent years, resolving this mystery remains a challenging task. Studying the heritability of genome-wide gene expression traits can shed light on the relationship between phenotype and genotype. Here we used microarray gene expression measurements of lymphoblastoid cell lines and genome-wide SNP genotype data from 210 HapMap individuals to examine the heritability of gene expression traits.
RESULTS: Heritability levels for the expression of 10,720 genes were estimated by applying variance component model analyses, and 1,043 expression quantitative trait loci (eQTLs) were detected. Our results indicate that gene expression traits display a bimodal distribution of heritability, with one peak close to 0% and the other approaching 100%. Such a pattern of within-population variability of gene expression heritability is common among different HapMap populations of unrelated individuals but differs from that obtained in the CEU and YRI trio samples. Higher heritability levels are shown by housekeeping genes and genes associated with cis eQTLs. Both cis and trans eQTLs make comparable cumulative contributions to heritability. Finally, we modelled gene-gene interactions (epistasis) for genes with multiple eQTLs and revealed that epistasis was not prevalent across all genes but made a substantial contribution to the total heritability of some of the genes analysed.
CONCLUSIONS: We utilised a mixed effect model analysis for estimating genetic components from population-based samples.
On the basis of analyses of genome-wide gene expression from four HapMap populations, we characterized in detail the distribution of genetic heritabilities for expression traits across populations, and highlighted the importance of studying interactions at the gene expression level as a source of variation underlying missing heritability.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/1471-2164-15-13) contains supplementary material, which is available to authorized users.
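The basic quantity being estimated here, narrow-sense heritability as the ratio of genetic to total trait variance, can be illustrated with a toy simulation of a single cis-eQTL. This is a deliberately simplified stand-in for the variance component analysis used in the study (all parameter values are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical single cis-eQTL: SNP genotypes coded as 0/1/2 copies
# of the minor allele (frequency 0.3) with additive effect beta.
maf, beta = 0.3, 0.5
snp = rng.binomial(2, maf, size=n)
genetic = beta * snp
expression = genetic + rng.standard_normal(n)  # environmental noise, var = 1

# Heritability: genetic variance over total (genetic + environmental).
h2_true = np.var(genetic) / (np.var(genetic) + 1.0)

# For a single additive locus, the squared genotype-trait correlation
# estimates the fraction of trait variance the locus explains.
h2_hat = np.corrcoef(snp, expression)[0, 1] ** 2
```

Traits whose expression is dominated by such cis effects sit near the high-heritability mode of the bimodal distribution described above; traits with no detectable genetic component sit near 0%.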
Retrospective evaluation of whole exome and genome mutation calls in 746 cancer samples
Funder: NCI U24CA211006
Abstract: The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC) curated consensus somatic mutation calls using whole exome sequencing (WES) and whole genome sequencing (WGS), respectively. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2,658 cancers across 38 tumour types, we compare WES and WGS side-by-side for 746 TCGA samples, finding that ~80% of mutations overlap in covered exonic regions. We estimate that low variant allele fraction (VAF < 15%) and clonal heterogeneity contribute up to 68% of private WGS mutations and 71% of private WES mutations. We observe that ~30% of private WGS mutations trace to mutations identified by a single variant caller in WES consensus efforts. WGS captures both ~50% more variation in exonic regions and unobserved mutations in loci with variable GC-content. Together, our analysis highlights technological divergences between two reproducible somatic variant detection efforts.