Information is often encoded as an aperiodic chain of building blocks. Modern
digital computers use bits as the building blocks, but in general the choice of
building blocks depends on the nature of the information to be encoded. What
are the optimal building blocks to encode structural information? This question
can be analysed by replacing the addition and multiplication operations of
conventional arithmetic with translation and rotation. It is argued that at the
molecular level, the best component for encoding discretised structural
information is carbon. Living organisms discovered this billions of years ago,
and used carbon as the backbone for constructing proteins that function
according to their structure. Structural analysis of polypeptide chains shows
that an efficient and versatile structural language of 20 building blocks is
needed to implement all the tasks carried out by proteins. Properties of amino
acids indicate that the present triplet genetic code was preceded by a more
primitive one, coding for 10 amino acids using two nucleotide bases.

Comment: (v1) 9 pages, revtex. (v2) 10 pages. Several arguments expanded to
make the article self-contained and to increase clarity. Applications pointed
out. (v3) 11 pages. Published version. Well-known properties of proteins
shifted to an appendix. Reformatted according to journal style.