Posted on behalf of Retread
Back in the 80s when artificial intelligence (AI) was going to make humans obsolete, LISP was the programming language of choice for AI. As a neurologist I was interested in intelligence in any form (machine or otherwise) so I tried to learn it. Most programs looked like gibberish. There was a great quote in the book “Let’s Talk LISP”, after a particularly convoluted piece of code: “Relax, you never understand anything, you just get used to it”.
I think the same thing has happened with our understanding of biologically relevant proteins. We’ve just become used to the fact that biological proteins have a dominant shape. However, we also know that other polymers don’t. DNA and RNA certainly don’t have a single shape.
So why do biologically meaningful proteins have one? Consider enzymes. The amino acid side chains comprising the active site are found all over the protein rather than next to each other in the sequence. Chymotrypsin, one of the best studied enzymes, has a catalytic triad made from histidine #57, aspartic acid #102 and serine #195. To function, they must be brought near to each other and held there fixed (and in the proper orientation to boot). The same holds for structural proteins that make up muscle and the cytoskeleton.
Yet only 10 kcal/mole (2 hydrogen bonds) is enough to denature them. Not much of an activation energy, and not even close to a covalent bond. Anfinsen showed that ribonuclease, once denatured, found its way back to the original shape, implying that there were no other conformations of similarly low energy available to it.
It is remarkable that we only have 20,000 or so protein coding genes when you consider just how large possible protein space is. In this regard, proteins are like English words. There are very few of them when you calculate how many there could be. Sonnet #18 (“Shall I compare thee to a summer’s day?”) contains 114 words, of which 17 are 7 or more letters long. The Oxford English Dictionary contains 600,000 or so words of all lengths. There are about 8 × 10^9 (26^7) possible strings of 7 letters. Few of them have meaning.
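The counting is easy to check. Here is a minimal sketch in Python, assuming a plain 26-letter alphabet and treating the 600,000 dictionary figure as a generous upper bound on meaningful 7-letter words (most entries are some other length, so the true fraction is smaller still). The names are just illustrative.

```python
# Assumptions: 26-letter alphabet, case ignored; the dictionary size is the
# rough figure quoted in the post and counts words of ALL lengths.
ALPHABET_SIZE = 26
WORD_LENGTH = 7
DICTIONARY_WORDS = 600_000


def count_strings(alphabet_size: int, length: int) -> int:
    """Number of distinct strings of a given length over an alphabet."""
    return alphabet_size ** length


seven_letter_strings = count_strings(ALPHABET_SIZE, WORD_LENGTH)

# Upper bound on the fraction of 7-letter strings that could be real words,
# pretending every dictionary entry were 7 letters long.
upper_bound_fraction = DICTIONARY_WORDS / seven_letter_strings

print(f"{seven_letter_strings:,}")    # 8,031,810,176
print(f"{upper_bound_fraction:.1e}")  # 7.5e-05
```

Even with that generous bound, fewer than one 7-letter string in ten thousand could be a word.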
Words are a lot shorter than proteins. There are 8 times as many strings of 4 amino acids (20^4 = 160,000) as we have proteins. My guess is that this isn’t an accident, because I doubt that most strings of amino acids have a dominant shape (i.e., biological meaning), and even if they did, they couldn’t find it quickly enough (the Levinthal paradox again).
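The same arithmetic for amino acids, again as a rough sketch rather than anything rigorous. The 20,000 gene count is the round figure used above, and the 100-residue length is purely an assumption picked for illustration (real proteins vary widely in length).

```python
# Assumptions: 20 standard amino acids, ~20,000 protein coding genes (the
# post's round figure), and a hypothetical 100-residue protein.
AMINO_ACIDS = 20
PROTEIN_CODING_GENES = 20_000
ASSUMED_PROTEIN_LENGTH = 100  # illustrative only


def sequence_space(length: int, alphabet_size: int = AMINO_ACIDS) -> int:
    """Number of distinct amino acid sequences of a given length."""
    return alphabet_size ** length


tetrapeptides = sequence_space(4)
ratio = tetrapeptides / PROTEIN_CODING_GENES

# Shortest chain length whose sequence space already exceeds the gene count.
shortest = next(
    n for n in range(1, 50) if sequence_space(n) > PROTEIN_CODING_GENES
)

full_space = sequence_space(ASSUMED_PROTEIN_LENGTH)

print(tetrapeptides, ratio, shortest)  # 160000 8.0 4
print(f"{full_space:.2e}")             # 1.27e+130
```

So chains of just 4 residues already outnumber our genes, and at the assumed length of 100 the space is a 131-digit number.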
How would you prove me wrong? Is the question even meaningful scientifically? I (of course) think it is quite meaningful in a philosophic sense, since it bears on just how probable or improbable life is. The next post will discuss some gedanken experiments which could settle the question (or show that it is unanswerable).