Network analysis of synonymous codon usage
Most amino acids are encoded by multiple synonymous codons. For an amino
acid, some of its synonymous codons are used much more rarely than others.
Analyses of positions of such rare codons in protein sequences revealed that
rare codons can impact co-translational protein folding and that positions of
some rare codons are evolutionarily conserved. Analyses of positions of rare
codons in proteins' 3-dimensional structures, which are richer in biochemical
information than sequences alone, might further explain the role of rare codons
in protein folding. We analyze a protein set recently annotated with codon
usage information, considering non-redundant proteins with sufficient
structural information. We model the proteins' structures as networks and study
potential differences between network positions of amino acids encoded by
evolutionarily conserved rare, evolutionarily non-conserved rare, and commonly used
codons. In 84% of the proteins, at least one of the three codon categories
occupies significantly more or less network-central positions than the other
codon categories. Different protein groups showing different codon centrality
trends (i.e., different types of relationships between network positions of the
three codon categories) are enriched in different biological functions,
implying the existence of a link between codon usage, protein folding, and
protein function.
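
The network modeling step described in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' method: one representative 3D coordinate per residue, an 8 angstrom contact cutoff, degree centrality as the centrality measure, and the function and category names. The sketch only compares mean centrality per codon category; it performs no significance test.

```python
import math
from collections import defaultdict


def build_contact_network(coords, cutoff=8.0):
    """Link residues i and j when their representative points lie within
    `cutoff` angstroms (the cutoff value is an assumption of this sketch)."""
    adj = {i: set() for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def degree_centrality(adj):
    """Fraction of the other residues that each residue is in contact with."""
    denom = max(len(adj) - 1, 1)
    return {i: len(nbrs) / denom for i, nbrs in adj.items()}


def mean_centrality_by_category(centrality, categories):
    """Average centrality over the residues in each codon category."""
    sums, counts = defaultdict(float), defaultdict(int)
    for i, cat in enumerate(categories):
        sums[cat] += centrality[i]
        counts[cat] += 1
    return {cat: sums[cat] / counts[cat] for cat in sums}


# Toy data: five residues on a line, 5 angstroms apart (not real protein data).
coords = [(0, 0, 0), (5, 0, 0), (10, 0, 0), (15, 0, 0), (20, 0, 0)]
# Hypothetical per-residue codon category labels.
categories = ["common", "conserved_rare", "common", "nonconserved_rare", "common"]

adj = build_contact_network(coords)
centrality = degree_centrality(adj)
means = mean_centrality_by_category(centrality, categories)
print(means)
```

In this toy chain only consecutive residues are in contact, so the two end residues are the least central. A real analysis would replace the mean comparison with a statistical test and could use other centrality measures.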
Network-based protein structural classification
Experimental determination of protein function is resource-consuming. As an
alternative, computational prediction of protein function has received
attention. In this context, protein structural classification (PSC) can help:
it assigns currently unclassified proteins to structural classes based on their
features, after which function can be inferred on the premise that proteins
with similar structures have similar functions. Existing PSC approaches rely on
sequence-based or direct 3-dimensional (3D) structure-based protein features.
In contrast, we first model 3D structures of proteins as protein structure
networks (PSNs). Then, we use network-based features for PSC. We propose the
use of graphlets, state-of-the-art features in many research areas of network
science, in the task of PSC. Moreover, because graphlets can deal only with
unweighted PSNs, and because accounting for edge weights when constructing PSNs
could improve PSC accuracy, we also propose a deep learning framework that
automatically learns network features from weighted PSNs. When evaluated on a
large set of ~9,400 CATH and ~12,800 SCOP protein domains (spanning 36 PSN
sets), our proposed approaches are superior to existing PSC approaches in terms
of accuracy, with comparable running time.
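
To make the idea of graphlet-based features concrete, here is a minimal sketch that counts the two 3-node graphlets (induced paths and triangles) in an unweighted network and returns them as a small feature vector. Graphlets are small connected induced subgraphs; restricting to 3-node graphlets, the function names, and the toy edge list are assumptions of this sketch, and the abstract does not state which graphlet sizes or counting method the authors use.

```python
from itertools import combinations


def adjacency_from_edges(edges):
    """Build an undirected, unweighted adjacency map from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def three_node_graphlet_counts(adj):
    """Count induced 3-node paths and triangles.

    For each node, every pair of its neighbors forms either a triangle
    (if the pair is itself linked) or an induced path centered on the node.
    Each triangle is seen once from each of its three corners.
    """
    paths = 0
    triangle_corners = 0
    for v, nbrs in adj.items():
        for a, b in combinations(sorted(nbrs), 2):
            if b in adj[a]:
                triangle_corners += 1
            else:
                paths += 1
    return {"path": paths, "triangle": triangle_corners // 3}


# Toy network: a triangle (0, 1, 2) with one extra residue 3 attached to 2.
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
psn = adjacency_from_edges(edges)
features = three_node_graphlet_counts(psn)
print(features)  # {'path': 2, 'triangle': 1}
```

Counts like these, computed per network, could serve as input features to a standard classifier. Because the counts ignore edge weights, the sketch also shows why weighted PSNs call for a different feature-learning approach.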