Differential geometric regularization for supervised learning of classifiers
We study the problem of supervised learning for both binary and multiclass classification from a unified geometric perspective. In particular, we propose a geometric regularization technique to find the submanifold corresponding to an estimator of the class probability P(y|\vec x). The regularization term measures the volume of this submanifold, based on the intuition that overfitting produces rapid local oscillations and hence a large volume of the estimator. This technique can be applied to regularize any classification function that satisfies two requirements: firstly, an estimator of the class probability can be obtained; secondly, first and second derivatives of the class probability estimator can be calculated. In experiments, we apply our regularization technique to standard loss functions for classification; our RBF-based implementation compares favorably to widely used regularization methods for both binary and multiclass classification. Published version: http://proceedings.mlr.press/v48/baia16.pdf
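As a rough illustration of the idea, the sketch below (PyTorch; the network, random data, and softmax parameterization are assumptions for illustration, not the paper's RBF-based implementation) penalizes the volume of the graph of the class-probability estimator, computed from the Jacobian of p(y|x) with respect to the input:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def volume_regularizer(model, x):
        """Approximate the volume of the graph of the class-probability
        estimator p(y|x) over a mini-batch.  For a map f: R^d -> R^k the
        induced volume element is sqrt(det(I + J^T J)), with J the Jacobian
        of f, so penalizing its batch average discourages the rapid local
        oscillations associated with overfitting."""
        x = x.clone().requires_grad_(True)
        probs = F.softmax(model(x), dim=-1)           # class-probability estimate
        batch, k = probs.shape
        d = x.shape[1]
        # Build the per-sample Jacobian dp/dx, one output class at a time.
        rows = []
        for c in range(k):
            grad_c, = torch.autograd.grad(probs[:, c].sum(), x, create_graph=True)
            rows.append(grad_c)                        # (batch, d)
        J = torch.stack(rows, dim=1)                   # (batch, k, d)
        G = torch.eye(d, device=x.device) + J.transpose(1, 2) @ J   # (batch, d, d)
        return torch.sqrt(torch.det(G).clamp_min(1e-12)).mean()

    # usage: total loss = classification loss + lambda * volume term
    model = nn.Sequential(nn.Linear(10, 64), nn.Tanh(), nn.Linear(64, 3))
    x, y = torch.randn(32, 10), torch.randint(0, 3, (32,))
    loss = F.cross_entropy(model(x), y) + 0.1 * volume_regularizer(model, x)
    loss.backward()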
A Hierarchical Training Paradigm for Antibody Structure-sequence Co-design
Therapeutic antibodies are an essential and rapidly expanding drug modality.
The binding specificity between antibodies and antigens is determined by
complementarity-determining regions (CDRs) at the tips of these Y-shaped
proteins. In this paper, we propose a hierarchical training paradigm (HTP) for
antibody sequence-structure co-design. HTP consists of four levels of
training stages, each corresponding to a specific protein modality within a
particular protein domain. Through carefully crafted tasks in different stages,
HTP seamlessly and effectively integrates geometric graph neural networks
(GNNs) with large-scale protein language models to extract, from both geometric
structures and vast antibody and non-antibody sequence databases, the
evolutionary information that determines ligand binding pose and strength.
Empirical experiments show that HTP sets a new state of the art in both the
co-design problem and fixed-backbone design. Our research offers a promising
path toward unleashing the potential of deep generative architectures and
illuminating the way forward for the antibody sequence-structure co-design
challenge.
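A minimal sketch of the staged-training idea follows (PyTorch). The four stage names, the toy model, and the random data are assumptions for illustration only; the paper's actual stages couple geometric GNNs with large-scale protein language models on real antibody and non-antibody data.

    import torch
    import torch.nn as nn

    # toy model and optimizer shared across all stages
    model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 16))
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)

    # hypothetical curriculum: general sequences -> antibody sequences ->
    # general structures -> antibody structure/sequence co-design
    stages = {
        "general_sequences":  [(torch.randn(8, 16), torch.randn(8, 16)) for _ in range(10)],
        "antibody_sequences": [(torch.randn(8, 16), torch.randn(8, 16)) for _ in range(10)],
        "general_structures": [(torch.randn(8, 16), torch.randn(8, 16)) for _ in range(10)],
        "antibody_co_design": [(torch.randn(8, 16), torch.randn(8, 16)) for _ in range(10)],
    }

    for stage_name, batches in stages.items():
        # each stage starts from the weights learned in the previous stage
        for inputs, targets in batches:
            opt.zero_grad()
            loss = nn.functional.mse_loss(model(inputs), targets)
            loss.backward()
            opt.step()
        print(f"finished stage: {stage_name}")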
Integration of Pre-trained Protein Language Models into Geometric Deep Learning Networks
Geometric deep learning has recently achieved great success in non-Euclidean
domains, and learning on 3D structures of large biomolecules is emerging as a
distinct research area. However, its efficacy is largely constrained due to the
limited quantity of structural data. Meanwhile, protein language models trained
on substantial 1D sequences have shown burgeoning capabilities with scale in a
broad range of applications. Several previous studies consider combining these
different protein modalities to promote the representation power of geometric
neural networks, but fail to present a comprehensive understanding of their
benefits. In this work, we integrate the knowledge learned by well-trained
protein language models into several state-of-the-art geometric networks and
evaluate them on a variety of protein representation learning benchmarks, including
protein-protein interface prediction, model quality assessment, protein-protein
rigid-body docking, and binding affinity prediction. Our findings show an
overall improvement of 20% over baselines. Strong evidence indicates that the
incorporation of protein language models' knowledge enhances geometric
networks' capacity by a significant margin and can be generalized to complex
tasks.
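The integration pattern under study can be sketched roughly as follows (PyTorch): per-residue embeddings from a pre-trained protein language model are concatenated with geometric node features before message passing. The distance-weighted toy layer and the random placeholder features stand in for a real PLM and a real geometric network; both are assumptions for illustration.

    import torch
    import torch.nn as nn

    class GeometricLayer(nn.Module):
        """Toy message-passing layer: neighbor messages weighted by inverse
        pairwise distance between residue coordinates."""
        def __init__(self, dim):
            super().__init__()
            self.update = nn.Linear(dim, dim)
        def forward(self, h, coords):
            w = 1.0 / (torch.cdist(coords, coords) + 1.0)    # (n, n) weights
            w = w / w.sum(dim=-1, keepdim=True)
            return torch.relu(self.update(w @ h))

    n_res, geo_dim, plm_dim, hidden = 50, 16, 32, 64
    coords = torch.randn(n_res, 3)            # residue coordinates
    geo_feats = torch.randn(n_res, geo_dim)   # geometric node features
    plm_feats = torch.randn(n_res, plm_dim)   # placeholder for frozen PLM embeddings

    fuse = nn.Linear(geo_dim + plm_dim, hidden)   # fuse the two modalities
    layer = GeometricLayer(hidden)

    h = fuse(torch.cat([geo_feats, plm_feats], dim=-1))
    h = layer(h, coords)                       # structure-aware, PLM-enriched features
    print(h.shape)                             # torch.Size([50, 64])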
Quantifying the Knowledge in GNNs for Reliable Distillation into MLPs
To bridge the gaps between topology-aware Graph Neural Networks (GNNs) and
inference-efficient Multi-Layer Perceptrons (MLPs), GLNN proposes to distill
knowledge from a well-trained teacher GNN into a student MLP. Despite their
great progress, comparatively little work has been done to explore the
reliability of different knowledge points (nodes) in GNNs, especially their
roles played during distillation. In this paper, we first quantify the
knowledge reliability in GNNs by measuring the invariance of their information
entropy to noise perturbations, from which we observe that different knowledge
points (1) show different distillation speeds (temporally); (2) are
differentially distributed in the graph (spatially). To achieve reliable
distillation, we propose an effective approach, namely Knowledge-inspired
Reliable Distillation (KRD), that models the probability of each node being an
informative and reliable knowledge point, based on which we sample a set of
additional reliable knowledge points as supervision for training student MLPs.
Extensive experiments show that KRD improves over the vanilla MLPs by 12.62%
and outperforms its corresponding teacher GNNs by 2.16% averaged over 7
datasets and 3 GNN architectures.
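A rough sketch of the reliability measure described above (PyTorch): a node counts as a reliable knowledge point if the entropy of the teacher GNN's prediction changes little under feature-noise perturbations. The one-layer propagation "teacher" and the exponential mapping from entropy drift to sampling probability are simplifying assumptions, not KRD's exact formulation.

    import torch
    import torch.nn.functional as F

    def entropy(p, eps=1e-12):
        return -(p * (p + eps).log()).sum(dim=-1)

    def reliability_scores(gnn_forward, x, adj, n_perturb=10, sigma=0.1):
        """Per-node score in [0, 1]: close to 1 when the prediction entropy
        is invariant to feature-noise perturbations."""
        with torch.no_grad():
            base = entropy(F.softmax(gnn_forward(x, adj), dim=-1))
            drift = torch.zeros_like(base)
            for _ in range(n_perturb):
                noisy = x + sigma * torch.randn_like(x)
                pert = entropy(F.softmax(gnn_forward(noisy, adj), dim=-1))
                drift += (pert - base).abs() / n_perturb
        return torch.exp(-drift)               # large drift -> low reliability

    # toy "teacher GNN": one normalized-adjacency propagation + linear classifier
    n, d, c = 100, 8, 4
    W = torch.randn(d, c)
    adj = (torch.rand(n, n) < 0.05).float()
    adj = ((adj + adj.t() + torch.eye(n)) > 0).float()   # symmetric, self-loops
    deg_inv = adj.sum(dim=-1, keepdim=True).reciprocal()

    def teacher(x, adj):
        return (deg_inv * adj) @ x @ W

    x = torch.randn(n, d)
    scores = reliability_scores(teacher, x, adj)
    # sample extra supervision nodes for the student MLP with these probabilities
    sampled = torch.bernoulli(scores)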
Unveiling the Power of Mixup for Stronger Classifiers
Mixup-based data augmentations have achieved great success as regularizers
for deep neural networks. However, existing methods rely on deliberately
handcrafted mixup policies, which ignore or oversell the semantic matching
between mixed samples and labels. Driven by their prior assumptions, early
methods attempt to smooth decision boundaries by random linear interpolation
while others focus on maximizing class-related information via offline saliency
optimization. As a result, the issue of label mismatch has not been well
addressed. Additionally, the optimization stability of mixup training is
persistently undermined by this label mismatch. To address these challenges, we
first reformulate mixup for supervised classification as two sub-tasks, mixup
sample generation and classification, then propose Automatic Mixup (AutoMix), a
revolutionary mixup framework. Specifically, a learnable lightweight Mix Block
(MB) with a cross-attention mechanism is proposed to generate a mixed sample by
modeling a fair relationship between the pair of samples under direct
supervision of the corresponding mixed label. Moreover, the proposed Momentum
Pipeline (MP) enhances training stability and accelerates convergence on top of
making the Mix Block fully trained end-to-end. Extensive experiments on five
popular classification benchmarks show that the proposed approach consistently
outperforms leading methods by a large margin. Comment: The second version of AutoMix. 12 pages, 7 figures.
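The two sub-tasks can be sketched roughly as follows (PyTorch): a learnable block produces a content-aware mixing mask under mixed-label supervision, and the classifier trains on the resulting mixed samples. The small convolutional mask generator below stands in for AutoMix's cross-attention Mix Block and momentum pipeline; it is an assumption for illustration.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ToyMixBlock(nn.Module):
        """Generates a pixel-wise mixing mask from the two inputs and the
        mixing ratio lambda (simplified stand-in for a cross-attention block)."""
        def __init__(self, channels=3):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(2 * channels + 1, 16, 3, padding=1), nn.ReLU(),
                nn.Conv2d(16, 1, 3, padding=1))
        def forward(self, x1, x2, lam):
            lam_map = lam.view(-1, 1, 1, 1).expand(-1, 1, *x1.shape[2:])
            mask = torch.sigmoid(self.net(torch.cat([x1, x2, lam_map], dim=1)))
            return mask * x1 + (1.0 - mask) * x2   # learned, content-aware mixing

    mix_block = ToyMixBlock()
    classifier = nn.Sequential(
        nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10))

    x, y = torch.randn(16, 3, 32, 32), torch.randint(0, 10, (16,))
    perm = torch.randperm(x.size(0))
    lam = torch.rand(x.size(0))

    mixed = mix_block(x, x[perm], lam)
    logits = classifier(mixed)
    # mixed-label supervision trains the classifier and the mix block jointly
    loss = (lam * F.cross_entropy(logits, y, reduction='none')
            + (1 - lam) * F.cross_entropy(logits, y[perm], reduction='none')).mean()
    loss.backward()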