14 research outputs found
SyNDock: N Rigid Protein Docking via Learnable Group Synchronization
The regulation of various cellular processes heavily relies on the protein
complexes within a living cell, necessitating a comprehensive understanding of
their three-dimensional structures to elucidate the underlying mechanisms.
While neural docking techniques have exhibited promising outcomes in binary
protein docking, the application of advanced neural architectures to multimeric
protein docking remains uncertain. This study introduces SyNDock, an automated
framework that swiftly assembles precise multimeric complexes within seconds,
showcasing performance that can potentially surpass or be on par with recent
advanced approaches. SyNDock possesses several appealing advantages not present
in previous approaches. Firstly, SyNDock formulates multimeric protein docking
as a problem of learning global transformations to holistically depict the
placement of chain units of a complex, enabling a learning-centric solution.
Secondly, SyNDock proposes a trainable two-step SE(3) algorithm, involving
initial pairwise transformation and confidence estimation, followed by global
transformation synchronization. This enables effective learning for assembling
the complex in a globally consistent manner. Lastly, extensive experiments
conducted on our proposed benchmark dataset demonstrate that SyNDock
outperforms existing docking software in crucial performance metrics, including
accuracy and runtime. For instance, it achieves a 4.5% improvement in
performance and a remarkable millionfold acceleration in speed.
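The second step described above, turning pairwise transformations into globally consistent ones, can be illustrated with a classical, non-learned spectral synchronization of rotations. This is a minimal sketch and not SyNDock's trainable algorithm: it ignores translations and confidence weights, and it assumes noiseless relative rotations for every pair of chains. The function names (`random_rotation`, `project_to_so3`, `synchronize_rotations`) and the convention `relative[(i, j)] = R_i @ R_j.T` are hypothetical choices made for this illustration.

```python
import numpy as np


def random_rotation(rng):
    """Draw a random 3x3 rotation matrix (orthogonal, determinant +1)."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]  # flip one axis to turn a reflection into a rotation
    return q


def project_to_so3(m):
    """Return the rotation closest to m in the Frobenius norm, via SVD."""
    u, _, vt = np.linalg.svd(m)
    d = np.sign(np.linalg.det(u @ vt))
    return u @ np.diag([1.0, 1.0, d]) @ vt


def synchronize_rotations(relative, n):
    """Recover n global rotations from pairwise ones.

    Assumed convention (hypothetical): relative[(i, j)] = R_i @ R_j.T.
    The result is expressed in the frame of chain 0, so the first
    returned rotation is the identity.
    """
    # Assemble the symmetric 3n x 3n block matrix of relative rotations.
    gram = np.zeros((3 * n, 3 * n))
    for i in range(n):
        gram[3 * i:3 * i + 3, 3 * i:3 * i + 3] = np.eye(3)
    for (i, j), r_ij in relative.items():
        gram[3 * i:3 * i + 3, 3 * j:3 * j + 3] = r_ij
        gram[3 * j:3 * j + 3, 3 * i:3 * i + 3] = r_ij.T
    # eigh returns eigenvalues in ascending order; keep the top three eigenvectors.
    _, vecs = np.linalg.eigh(gram)
    top = vecs[:, -3:]
    # Multiplying by the transposed first block removes the unknown global
    # orthogonal factor shared by all blocks.
    anchor = top[:3]
    return [project_to_so3(top[3 * i:3 * i + 3] @ anchor.T) for i in range(n)]


# Synthetic check: four chains with known rotations and all pairwise measurements.
rng = np.random.default_rng(0)
n_chains = 4
truth = [random_rotation(rng) for _ in range(n_chains)]
relative = {(i, j): truth[i] @ truth[j].T
            for i in range(n_chains) for j in range(i + 1, n_chains)}
estimate = synchronize_rotations(relative, n_chains)
```

In this noiseless setting the recovered rotations reproduce every pairwise measurement, which is the sense in which the assembly is globally consistent. With noisy or missing pairs the same procedure gives only an approximate solution, and a learned, confidence-weighted variant such as the one the abstract describes would weight the blocks accordingly.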
Construction of a cross-species cell landscape at single-cell level.
Individual cells are the basic units of life. Despite extensive efforts to characterize the cellular heterogeneity of different organisms, cross-species comparisons of landscape dynamics have not been achieved. Here, we applied single-cell RNA sequencing (scRNA-seq) to map organism-level cell landscapes at multiple life stages for mice, zebrafish and Drosophila. By integrating the comprehensive dataset of >2.6 million single cells, we constructed a cross-species cell landscape and identified signatures and common pathways that changed throughout the life span. We identified structural inflammation and mitochondrial dysfunction as the most common hallmarks of organism aging, and found that pharmacological activation of mitochondrial metabolism alleviated aging phenotypes in mice. The cross-species cell landscape, together with other published datasets, was stored in an integrated online portal, Cell Landscape. Our work provides a valuable resource for studying lineage development, maturation and aging.
What has been Enhanced in my Knowledge-Enhanced Language Model?
A number of knowledge integration (KI) methods have recently been proposed to incorporate external knowledge into pretrained language models (LMs). Even though knowledge-enhanced LMs (KELMs) outperform base LMs on knowledge-intensive tasks, the inner workings of these KI methods are not well understood. For instance, it is unclear which knowledge is effectively integrated into KELMs and which is not, and whether such integration leads to catastrophic forgetting of already learned knowledge. We show that existing model interpretation methods such as linear probes and prompts have some key limitations in answering these questions. Then, we revisit KI from an information-theoretic view and propose a new theoretically sound probe model called Graph Convolution Simulator (GCS) for KI interpretation. GCS turns out to be quite simple: it uses graph attention on the corresponding knowledge graph for interpretation. We conduct various experiments to verify that GCS provides reasonable interpretation results for two well-known KELMs: ERNIE and K-Adapter. Our experiments reveal that only a small amount of knowledge is successfully integrated in these models, and that simply increasing the size of the KI corpus may not lead to better KELMs.
Recent Advances in Reliable Deep Graph Learning: Adversarial Attack, Inherent Noise, and Distribution Shift
Deep graph learning (DGL) has achieved remarkable progress in both business
and scientific areas ranging from finance and e-commerce to drug and advanced
material discovery. Despite the progress, applying DGL to real-world
applications faces a series of reliability threats including adversarial
attacks, inherent noise, and distribution shift. This survey aims to provide a
comprehensive review of recent advances for improving the reliability of DGL
algorithms against the above threats. In contrast to prior related surveys
which mainly focus on adversarial attacks and defense, our survey covers more
reliability-related aspects of DGL, i.e., inherent noise and distribution
shift. Additionally, we discuss the relationships among the above aspects and
highlight some important issues to be explored in future research.
A Survey of Trustworthy Graph Learning: Reliability, Explainability, and Privacy Protection
Deep graph learning has achieved remarkable progress in both business and
scientific areas ranging from finance and e-commerce to drug and advanced
material discovery. Despite this progress, ensuring that deep graph learning
algorithms behave in a socially responsible manner and meet regulatory
compliance requirements has become an emerging problem, especially in
risk-sensitive domains. Trustworthy graph learning (TwGL) aims to solve the
above problems from a technical viewpoint. In contrast to conventional graph
learning research which mainly cares about model performance, TwGL considers
various reliability and safety aspects of the graph learning framework
including but not limited to robustness, explainability, and privacy. In this
survey, we provide a comprehensive review of recent leading approaches in the
TwGL field from three dimensions, namely, reliability, explainability, and
privacy protection. We give a general categorization for existing work and
review typical work for each category. To give further insights for TwGL
research, we provide a unified view to inspect previous works and build the
connection between them. We also point out some important open problems
remaining to be solved in the future developments of TwGL.
Comment: Preprint; Work in progress. arXiv admin note: substantial text overlap with arXiv:2202.0711
Construction of the axolotl cell landscape using combinatorial hybridization sequencing at single-cell resolution
The Mexican axolotl is a well-established tetrapod model for regeneration and development. Here the authors report a scRNA-seq method to profile neotenic, metamorphic and limb development stages, highlighting unique perturbation patterns of cell type-related gene expression throughout metamorphosis.