Integration of protein binding interfaces and abundance data reveals evolutionary pressures in protein networks

Abstract

Networks of protein-protein interactions have received considerable interest in the past two decades for their insights into protein function and evolution. Traditionally, these networks only map the functional partners of proteins; they lack further levels of data such as binding affinity, allosteric regulation, competitive vs noncompetitive binding, and protein abundance. Recent experiments have made such data available on a network-wide scale, and in this thesis I integrate two extra layers of data in particular: the binding sites that proteins use to interact with their partners, and the abundance or “copy numbers” of the proteins. By analyzing the networks for the clathrin-mediated endocytosis (CME) system in yeast and the ErbB signaling pathway in humans, I find that these extra data reveal new insights about the evolution of protein networks. The structure of the binding site or interface interaction network (IIN) is optimized to allow higher binding specificity; that is, a large gap in strength between functional binding and nonfunctional mis-binding. This strongly implies that mis-binding is an evolutionary error-load constraint shaping protein network structure. Another method to limit mis-binding is to balance protein copy numbers so that there are no “leftover” proteins available for mis-binding. By developing a new method to quantify balance in IINs, I show that the CME network is significantly balanced when compared to randomly sampled sets of copy numbers. Furthermore, compared to random networks, IINs with a biologically realistic structure produce less mis-binding under balanced concentrations but more mis-binding under unbalanced concentrations. This implies strong pressure for copy number balance and that any imbalance should occur for functional reasons. I thus explore some functional consequences of imbalance by constructing dynamic models of two poorly balanced subnetworks of the larger CME network.
In general, I find that balanced copy numbers provide higher protein complex yield (number of complete complexes), but imbalance may allow cells to “bottleneck” a functional process, effectively turning complex formation on or off via spatial localization of subunits. Finally, I find that strongly binding proteins are more likely to be balanced, as these “sticky” proteins would otherwise be more likely to engage in mis-binding.
