311 research outputs found

Quantification of Drive-Response Relationships Between Residues During Protein Folding

Authors: Im Wonpil, Qi Yifei
Publication venue: American Chemical Society (ACS)
Publication date: 13/08/2013

Mutual correlation and cooperativity are commonly used to describe residue-residue interactions in protein folding/function. However, these metrics do not provide any information on the causal relationships between residues. Such drive-response relationships are poorly studied in protein folding/function and difficult to measure experimentally due to technical limitations. In this study, using the information-theoretic transfer entropy (TE), which provides a direct measurement of causality between two time series, we have quantified the drive-response relationships between residues in the folding/unfolding processes of four small proteins generated by molecular dynamics simulations. Instead of using a single time-averaged TE value, the time-dependent TE is measured with the Q-scores based on residue-residue contacts and with statistical significance analysis along the folding/unfolding processes. The TE analysis is able to identify driving and responding residues that differ from the highly correlated residues revealed by mutual information analysis. In general, the driving residues have more regular secondary structures, are more buried, and show greater effects on protein stability as well as on folding and unfolding rates. In addition, the dominant driving and responding residues from the TE analysis on the whole trajectory agree with those from a single folding event, demonstrating that the drive-response relationships are preserved in the non-equilibrium process. Our study provides detailed insights into the protein folding process and has potential applications in protein engineering and in the interpretation of time-dependent residue-based experimental observables for protein function.
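As a rough illustration of the quantity named in the abstract above (and not the authors' time-dependent, Q-score-based analysis), the sketch below estimates transfer entropy between two discrete time series with a simple plug-in estimator. The history length of one step, the binary toy data, and the function name `transfer_entropy` are all assumptions made for this example.

```python
from collections import Counter
from math import log2
import random


def transfer_entropy(source, target):
    """Plug-in estimate of TE(source -> target) in bits, history length 1.

    TE = sum p(x1, x0, y0) * log2( p(x1 | x0, y0) / p(x1 | x0) ),
    where x is the target series and y is the source series.
    """
    n = len(target) - 1  # number of (t, t+1) transitions
    triples = Counter()   # counts of (target[t+1], target[t], source[t])
    pairs_ts = Counter()  # counts of (target[t], source[t])
    pairs_tt = Counter()  # counts of (target[t+1], target[t])
    singles = Counter()   # counts of target[t]
    for t in range(n):
        triples[(target[t + 1], target[t], source[t])] += 1
        pairs_ts[(target[t], source[t])] += 1
        pairs_tt[(target[t + 1], target[t])] += 1
        singles[target[t]] += 1

    te = 0.0
    for (x1, x0, y0), count in triples.items():
        p_joint = count / n
        p_given_both = count / pairs_ts[(x0, y0)]
        p_given_self = pairs_tt[(x1, x0)] / singles[x0]
        te += p_joint * log2(p_given_both / p_given_self)
    return te


# Toy data (hypothetical): y is a random binary series and x copies y
# with a one-step delay, so y "drives" x and not the other way around.
rng = random.Random(0)
y = [rng.randint(0, 1) for _ in range(2000)]
x = [0] + y[:-1]

te_y_to_x = transfer_entropy(y, x)  # expected to be large (close to 1 bit)
te_x_to_y = transfer_entropy(x, y)  # expected to be close to 0
print(f"TE(y -> x) = {te_y_to_x:.3f} bits, TE(x -> y) = {te_x_to_y:.3f} bits")
```

The asymmetry between the two directions is what distinguishes TE from a symmetric measure such as mutual information.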

KU ScholarWorks

PubMed Central

FigShare

Roles of PLODs in Collagen Synthesis and Cancer Progression

Authors: Qi Yifei, Xu Ren
Publication venue: UKnowledge
Publication date: 28/06/2018

Collagen is the major component of the extracellular matrix. Collagen cross-linking and deposition depend on lysyl hydroxylation, which is catalyzed by procollagen-lysine, 2-oxoglutarate 5-dioxygenase (PLOD). Aberrant lysyl hydroxylation and collagen cross-linking contribute to the progression of many collagen-related diseases, such as fibrosis and cancer. Three lysyl hydroxylases (LH1, LH2, and LH3) have been identified, encoded by the PLOD1, PLOD2, and PLOD3 genes. Expression of PLODs is regulated by multiple cytokines, transcription factors, and microRNAs. Dysregulation of PLODs promotes cancer progression and metastasis, suggesting that targeting PLODs is a potential strategy for cancer treatment. Here, we summarize recent progress in the investigation of the function and regulation of PLODs in normal tissue development and disease progression, especially in cancer.

University of Kentucky

Identifiable Contrastive Learning with Automatic Feature Importance Discovery

Authors: Wang Yifei, Wang Yisen, Zhang Qi
Publication date: 29/10/2023

Existing contrastive learning methods rely on the pairwise sample contrast $z_x^\top z_{x'}$ to learn data representations, but the learned features often lack clear interpretability from a human perspective. Theoretically, they also lack feature identifiability, so different initializations may lead to totally different features. In this paper, we study a new method named tri-factor contrastive learning (triCL) that involves a 3-factor contrast in the form of $z_x^\top S z_{x'}$, where $S=\text{diag}(s_1,\dots,s_k)$ is a learnable diagonal matrix that automatically captures the importance of each feature. We show that by this simple extension, triCL can not only obtain identifiable features that eliminate randomness but also obtain more interpretable features that are ordered according to the importance matrix $S$. We show that features with high importance have nice interpretability by capturing common classwise features, and obtain superior performance when evaluated for image retrieval using a few features. The proposed triCL objective is general and can be applied to different contrastive learning methods like SimCLR and CLIP. We believe that it is a better alternative to existing 2-factor contrastive learning by improving its identifiability and interpretability with minimal overhead. Code is available at https://github.com/PKU-ML/Tri-factor-Contrastive-Learning
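To make the 3-factor contrast concrete, here is a minimal sketch of the similarity score with a diagonal importance matrix, written in plain Python. It is not taken from the linked repository. The function names and the toy vectors are hypothetical, and ranking features by the size of each diagonal entry is an assumption about how "ordered according to the importance matrix" might be read.

```python
def pairwise_similarity(z_x, z_xp):
    """Standard 2-factor contrast: z_x^T z_x'."""
    return sum(a * b for a, b in zip(z_x, z_xp))


def tri_factor_similarity(z_x, z_xp, s):
    """3-factor contrast z_x^T S z_x' with S = diag(s_1, ..., s_k).

    Because S is diagonal, the product reduces to a weighted dot product.
    """
    return sum(s_i * a * b for s_i, a, b in zip(s, z_x, z_xp))


def rank_features(s):
    """Feature indices sorted from largest to smallest importance entry.

    Sorting by the raw diagonal value is an assumption for illustration.
    """
    return sorted(range(len(s)), key=lambda i: s[i], reverse=True)


# Toy example (hypothetical numbers).
z_x = [1.0, 2.0, 3.0]
z_xp = [4.0, 5.0, 6.0]
s = [2.0, 0.0, 1.0]

print(pairwise_similarity(z_x, z_xp))       # 4 + 10 + 18
print(tri_factor_similarity(z_x, z_xp, s))  # 2*4 + 0*10 + 1*18
print(rank_features(s))                     # most important feature first
```

With all diagonal entries equal to one, the 3-factor score reduces to the usual pairwise contrast.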

arXiv.org e-Print Archive

Segmentation of Synapses in Fluorescent Images using U-Net++ and Gabor-based Anisotropic Diffusion

Authors: Qiu Zhen, Yan Yifei, Zhang Qi
Publication venue: IEEE
Publication date: 01/11/2021

Objective: Large-scale and automated detection of synapses in fluorescent microscopic images is essential for understanding brain function and disorders at the molecular level. However, the quantification of synapses from fluorescent images is challenging due to low signal-to-noise ratio (SNR) and non-synaptic background artefacts. This calls for new tools for automatic, high-throughput, and robust synapse image segmentation. Methods: We proposed an automatic synapse segmentation framework using a deep learning method based on a modified U-Net++ and Gabor-based anisotropic diffusion (GAD). The modified U-Net++ was used to segment the non-synaptic regions, while the multiplicative Poisson noise was suppressed and the edges of the synapses were enhanced by the GAD filter. Thereafter, the synapses were segmented by a thresholding method. Results: The non-synaptic regions were segmented precisely, with a Dice coefficient of 0.833 and a Jaccard similarity of 0.719. Our model for synapse segmentation reduced the interference from non-synaptic tissues and Poisson noise and yielded automatic and accurate segmentation of synapses. Conclusion: We have proposed an automatic segmentation framework that can accurately segment non-synaptic and synaptic tissues, which may have the potential to automate the quantitative analysis of synapses.
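For readers unfamiliar with the two overlap metrics reported above, the following sketch computes the Dice coefficient and Jaccard similarity for a pair of binary masks. It uses the standard textbook definitions and is not the authors' evaluation code. The function name, the flat-list mask representation, and the convention of returning 1.0 for two empty masks are assumptions for this example.

```python
def dice_and_jaccard(pred, truth):
    """Return (Dice, Jaccard) for two equal-length binary masks.

    Dice    = 2 * |A and B| / (|A| + |B|)
    Jaccard = |A and B| / |A or B|
    """
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    n_pred = sum(1 for p in pred if p)
    n_truth = sum(1 for t in truth if t)
    union = n_pred + n_truth - overlap
    if union == 0:
        # Both masks are empty; treating this as perfect agreement
        # is a convention chosen for this sketch.
        return 1.0, 1.0
    dice = 2 * overlap / (n_pred + n_truth)
    jaccard = overlap / union
    return dice, jaccard


# Toy masks (hypothetical), flattened to 1-D lists of 0/1 labels.
pred = [1, 1, 0, 0]
truth = [1, 0, 1, 0]
dice, jaccard = dice_and_jaccard(pred, truth)
print(f"Dice = {dice:.3f}, Jaccard = {jaccard:.3f}")
```

For a single pair of masks the two scores are linked by Jaccard = Dice / (2 - Dice). Scores averaged over many images need not satisfy that relation exactly.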

University of Dundee Online Publications

Well-posedness of the discrete nonlinear Schrödinger equations and the Klein-Gordon equations

Authors: Wu Yifei, Yang Zhibo, Zhou Qi
Publication date: 31/10/2023

The primary objective of this paper is to investigate the well-posedness theories associated with the discrete nonlinear Schrödinger equation and the Klein-Gordon equation. These theories encompass both local and global well-posedness, as well as the existence of blow-up solutions for large and irregular initial data. The main results of this paper can be summarized as follows: 1. Discrete Nonlinear Schrödinger Equation: We establish global well-posedness in $l^p_h$ spaces for all $1\leq p\leq \infty$, in both the defocusing and focusing cases. 2. Discrete Klein-Gordon Equation (including Wave Equation): We demonstrate local well-posedness in $l^p_h$ spaces for all $1\leq p\leq \infty$. Furthermore, in the defocusing case, we establish global well-posedness in $l^p_h$ spaces for any $2\leq p\leq 2\sigma+2$. In contrast, in the focusing case, we show that solutions with negative energy blow up within a finite time.

arXiv.org e-Print Archive

How Mask Matters: Towards Theoretical Understandings of Masked Autoencoders

Authors: Wang Yifei, Wang Yisen, Zhang Qi
Publication date: 26/03/2023

Masked Autoencoders (MAE) based on a reconstruction task have risen to be a promising paradigm for self-supervised learning (SSL) and achieve state-of-the-art performance across different benchmark datasets. However, despite their impressive empirical success, there is still limited theoretical understanding of them. In this paper, we propose a theoretical understanding of how masking matters for MAE to learn meaningful features. We establish a close connection between MAE and contrastive learning, which shows that MAE implicitly aligns the mask-induced positive pairs. Building upon this connection, we develop the first downstream guarantees for MAE methods and analyze the effect of the mask ratio. In addition, as a result of the implicit alignment, we point out the dimensional collapse issue of MAE and propose a Uniformity-enhanced MAE (U-MAE) loss that can effectively address this issue and bring significant improvements on real-world datasets, including CIFAR-10, ImageNet-100, and ImageNet-1K. Code is available at https://github.com/zhangq327/U-MAE

arXiv.org e-Print Archive

Effects of N-glycosylation on protein conformation and dynamics: Protein Data Bank analysis and molecular dynamics simulation study

Authors: Im Wonpil, Lee Hui Sun, Qi Yifei
Publication venue: Springer Science and Business Media LLC
Publication date: 09/03/2015

N-linked glycosylation is one of the most important, chemically complex, and ubiquitous post-translational modifications in all eukaryotes. The N-glycans that are covalently linked to proteins are involved in numerous biological processes. There is considerable interest in the development of general approaches to predict the structural consequences of site-specific glycosylation and to understand how these effects can be exploited to design proteins with advantageous properties. In this study, the impacts of N-glycans on protein structure and dynamics are systematically investigated using an integrated computational approach that combines Protein Data Bank structure analysis with atomistic molecular dynamics simulations of glycosylated and deglycosylated proteins. Our study reveals that N-glycosylation does not induce significant changes in protein structure, but decreases protein dynamics, likely leading to an increase in protein stability. Overall, these results suggest not only a common role of glycosylation in proteins, but also a need for certain proteins to be properly glycosylated to gain their intrinsic dynamic properties. This work was supported by NIH U54GM087519 and XSEDE MCB070009. We gratefully acknowledge Sunhwan Jo for helping us to use Glycan Reader. Anton computer time was provided by the National Center for Multiscale Modeling of Biological Systems (MMBioS) through Grant P41GM103712-S1 from the National Institutes of Health and the Pittsburgh Supercomputing Center (PSC). The Anton machine at PSC was generously made available by D.E. Shaw Research.

KU ScholarWorks

PubMed Central