Most Fundamental Sets Of Building Blocks In Proteins

An international and interdisciplinary team working at the Earth-Life Science Institute (ELSI) at the Tokyo Institute of Technology has modeled the evolution of one of biology’s most fundamental sets of building blocks and found that it may have special properties that helped it bootstrap itself into its modern form. All life on Earth uses an almost universal set of 20 coded amino acids (CAAs) to construct proteins.

This set was likely “canonicalized” or standardized during early evolution; before this, smaller amino acid sets were gradually expanded as organisms developed new synthetic, proofreading and coding abilities. The new study, led by Melissa Ilardo, now at the University of Utah, explored how this set evolution might have occurred.

Most fundamental sets

There are millions of possible types of amino acids that could be found on Earth or elsewhere in the universe, each with its own distinctive chemical properties. Indeed, scientists have found that these distinctive chemical properties are what give biological proteins, the large molecules that do much of life’s catalysis, their own unique capabilities.

The team had previously measured how the CAA set compares to random sets of amino acids and found that only about 1 in a billion random sets had chemical properties as unusually distributed as those of the CAAs. The team thus set out to ask what earlier, smaller coded sets might have been like in terms of their chemical properties. There are many possible subsets of the modern CAAs, or of other presently uncoded amino acids, that could have made up the earlier sets.

Chemical properties

The team calculated the possible ways of making a set of 3 to 20 amino acids, using a special library of 1913 structurally diverse “virtual” amino acids they computed, and found there are about 10^48 ways of making sets of 20 amino acids. In contrast, there are only about 10^19 grains of sand on Earth, and only about 10^24 stars in the entire universe. There are so many possible amino acids, and so many ways to combine them, that a computational approach was the only comprehensive way to address this question.
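To get a feel for the scale of that search space, the count can be reproduced with a few lines of Python. This is only a sketch under one assumption: that a "set" means an unordered choice of distinct amino acids from the 1913-member library, so the count is a plain binomial coefficient. The function names are illustrative, and the team's own counting convention may differ in detail.

```python
import math


def count_sets(library_size: int, set_size: int) -> int:
    """Number of unordered sets of `set_size` distinct items from the library."""
    return math.comb(library_size, set_size)


def digit_count(n: int) -> int:
    """Number of decimal digits in a positive integer."""
    return len(str(n))


# Assumption: sets are unordered and contain no repeated amino acids.
sets_of_20 = count_sets(1913, 20)
total_3_to_20 = sum(count_sets(1913, k) for k in range(3, 21))

print("digits in the count of 20-member sets:", digit_count(sets_of_20))
print("exceeds 10^24 (stars in the universe):", sets_of_20 > 10**24)
print("digits in the count of all sets of size 3 to 20:", digit_count(total_3_to_20))
```

Under this assumption the count of 20-member sets is a 48-digit number, far beyond the sand-grain and star comparisons, and the 20-member sets dominate the total over all sizes from 3 to 20.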

“Efficient implementations of algorithms based on appropriate mathematical models allow us to handle even astronomically huge combinatorial spaces,” adds co-author Markus Meringer of the German Aerospace Center. Because this number is so large, the team used statistical methods to compare the adaptive value of the combined physicochemical properties of the modern CAA set with those of billions of random sets of three to 20 amino acids.
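The general idea of comparing one set against many random sets can be illustrated with a small Monte Carlo sketch. Everything specific here is an assumption made for illustration, not the team's method: the library is a list of synthetic numbers standing in for a single chemical property, and `spread_score` is a made-up toy measure that rewards covering a wide range without large holes.

```python
import random


def spread_score(values):
    """Toy coverage score (hypothetical): range spanned minus the largest gap."""
    ordered = sorted(values)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    return (ordered[-1] - ordered[0]) - max(gaps)


def fraction_matching_reference(library, reference_indices, n_samples, seed=0):
    """Estimate the share of random same-size sets scoring at least as well
    as the reference set, by sampling instead of enumerating every set."""
    rng = random.Random(seed)  # seeded so the estimate is reproducible
    k = len(reference_indices)
    reference = spread_score([library[i] for i in reference_indices])
    hits = 0
    for _ in range(n_samples):
        sample = rng.sample(range(len(library)), k)
        if spread_score([library[i] for i in sample]) >= reference:
            hits += 1
    return hits / n_samples


# Synthetic stand-in for one property of a 100-member library (not real data).
library = [float(v) for v in range(100)]
even_set = [0, 33, 66, 99]  # evenly covers the whole range

print(fraction_matching_reference(library, even_set, n_samples=10_000))
```

A small fraction means that few random sets cover the property range as well as the reference set. Sampling gives only an estimate, but it scales to spaces far too large to enumerate.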