Astounding Science Fiction/Volume 54/Number 06/Hemoglobin and the Universe
Hemoglobin and the Universe
By Isaac Asimov
It's been said that in an infinite universe, in infinite time anything can happen—and anything that ever has happened would be repeated. So? Well, how long would you have to wait for some specific event, say a molecule of a common protein, to show up . . . ?
(Special note: For those readers who associate me with that amazing substance, thiotimoline, it is necessary for me to state categorically that the following article is not a hoax, gag, or comic piece. It is perfectly serious and legitimate. Cross my heart.)
Even the purest and most high-minded scientist finds it expedient sometimes to assault the fortress of truth with the blunt weapon of trial and error. Sometimes it works beautifully. As evidence and as a case in point, let us bring to the center of the stage the hemoglobin molecule.
Hemoglobin is the chief protein component of the red blood cells. It has the faculty of loosely combining with molecular oxygen to form oxyhemoglobin. That combination takes place in the small blood vessels of the lungs. The oxyhemoglobin there formed is carried by the blood stream to all the cells of the body; it gives up its oxygen to said cells and becomes hemoglobin once more. It is then ready to make its way to the lungs for another load.
Because of hemoglobin's vital function in life and because of its ready availability in fairly pure form, the protein has been favored with the closest scrutiny on the part of chemists. It was found, for instance, that the hemoglobin molecule is approximately a parallelepiped in shape, with dimensions of 6.4 by 4.8 by 3.6 millimicrons. (A millimicron is one-billionth of a meter; a meter is about forty inches.) The bulk of this molecule is "globin" which, by itself, is an unstable protein. It makes up ninety-seven per cent of the whole. Attached to the globin, and rendering the whole more stable, are four iron-bearing groups called "heme" (see Figure 1).
Figure 1. Schematic representation of hemoglobin molecule.
Hemoglobin can be split apart into a heme fraction and a globin fraction without very much difficulty, and the two can be studied separately. Heme, being simpler in construction and quite stable in addition, was naturally the more intensively investigated of the two.
The heme molecule is flat and approximately circular in shape. In the very center of heme is an iron atom. Surrounding that iron atom are twenty carbon atoms and four nitrogen atoms—plus some hydrogens—arranged in four small rings that are themselves connected into one big ring. This wheels within wheels arrangement occurs in numerous compounds other than heme—notably in chlorophyll—and is called the "porphyrin ring." Establishing the structure of the porphyrin ring itself took some fancy footwork, but was a relatively straightforward matter.
Now, however, there enters an additional refinement. There are eight points in the porphyrin ring where groups of atoms called "side-chains" can be, and are, attached. In the heme molecule, the eight side-chains are of three varieties: four of one kind, two of another, and two of a third. Porphyrin rings to which are attached that particular combination of side-chains are called "protoporphyrins."
Figure 2. Schematic representation of heme molecule.
(Note: The positions available for side-chain attachment are numbered 1 to 8. The small rings which are themselves combined to form the porphyrin ring are numbered I to IV. The symbol Fe stands for the iron atom.)
Now this is the ticklish point. Which side-chains are attached to which positions in the porphyrin ring? To illustrate the difficulty, let's draw some pictures. Since this article concerns itself not with chemistry, despite appearances so far, but merely with some simple arithmetic, there is no need to make an accurate representation of the porphyrin ring. It will be sufficient to draw a ticktacktoe design (Figure 2). Topologically, we have achieved all that is necessary. The two ends of each of the four lines represent the eight positions to which side-chains can be attached.
Figures 3a and 3b. Two possible arrangements of protoporphyrin side-chains.
(Note: The reader may think he can draw more arrangements than the fifteen stated in the article to be the number that can exist. So he can! However, the porphyrin ring possesses four-fold radial symmetry and front-back bilateral symmetry which reduces the number of different arrangements eightfold. Furthermore, certain arrangements could be ruled out for various chemical reasons. There remained, as stated, fifteen arrangements in all which could not be ruled out either by symmetry or by chemical reasoning.)
If we symbolize the side-chains as a, b, and c (four a's, two b's, and two c's), several arrangements can be represented. Two of these are shown in Figures 3a and 3b. Altogether fifteen different and distinct arrangements can exist. Each arrangement represents a molecule with properties that are in some respects different from those of the molecules represented by every other arrangement. Only one of the fifteen is the arrangement found in heme.
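The count of fifteen can be checked by brute force. The sketch below is an illustration only, not the chemists' own procedure. It lists every way of placing four a's, two b's, and two c's on the eight positions, treats any two arrangements related by a quarter turn or by flipping the ring over as the same molecule, and then applies one chemical rule. That rule, that each of the four small rings carries exactly one "a" side-chain, is an assumption supplied here for illustration; the article says only that some arrangements were ruled out "for various chemical reasons." The function names are likewise invented for this example.

```python
from itertools import permutations

SIDE_CHAINS = "aaaabbcc"  # four a's, two b's, two c's on positions 1..8


def symmetry_images(arrangement):
    """Return the 8 images of an arrangement under the ring's symmetries."""
    images = []
    # The ring as drawn, and the ring flipped over (positions read backwards).
    for version in (arrangement, arrangement[::-1]):
        # A quarter turn moves every side-chain along by two positions.
        for shift in (0, 2, 4, 6):
            images.append(version[shift:] + version[:shift])
    return images


def one_a_per_small_ring(arrangement):
    """Assumed rule: each small ring (positions 1-2, 3-4, 5-6, 7-8) has exactly one 'a'."""
    return all(
        (arrangement[i] == "a") != (arrangement[i + 1] == "a")
        for i in (0, 2, 4, 6)
    )


def count_distinct(arrangements):
    """Count arrangements, treating symmetric images as the same molecule."""
    return len({min(symmetry_images(a)) for a in arrangements})


all_arrangements = set(permutations(SIDE_CHAINS))
allowed = [a for a in all_arrangements if one_a_per_small_ring(a)]

print(len(all_arrangements))             # raw arrangements, before any reduction
print(count_distinct(all_arrangements))  # after symmetry alone
print(count_distinct(allowed))           # after symmetry and the assumed rule
```

Under these assumptions the last figure printed is the fifteen of the text. Symmetry alone does not get there; the chemical rule does the rest of the pruning.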
Which one?
A German chemist called Fischer was faced with that problem and he solved it in the most straightforward possible manner. He wrote down the fifteen possible arrangements on pieces of paper, numbering them arbitrarily from one to fifteen. He then, in effect, called out his sixty graduate students, marshaled them into platoons of four apiece, and gave each platoon one of the arrangements. Instructions were for each to synthesize the protoporphyrin with the particular arrangement pictured.
Figure 4. Side-chain arrangement in protoporphyrin IX.
The students got to work. As each protoporphyrin was formed, its properties were compared with those of the natural protoporphyrin obtained from hemoglobin. It turned out that only one of the synthetic protoporphyrins matched the natural product. It was the one that Fischer had happened to assign the number 9, and it has the side-chain arrangement shown in Figure 4. Since then, generations of medical students and biochemists have memorized the formula of the natural product and learned to call it "Protoporphyrin IX." (It is my personal experience that few students show any curiosity at all as to why the IX.)
Score a tremendous victory for pure trial and error!
Now let's tackle the globin portion of the hemoglobin molecule. Globin is, as has been said, protein in nature, and proteins are by far the most important chemicals in living tissue. There is no question but that most or all of the secrets of life lie hidden in the details of protein structure. A biochemist who could learn the exact structure of some protein would be an awfully happy biochemist. So let's get some notion as to what it would take to achieve that desirable end.
All protein molecules are made up of relatively small compounds called "amino acids," which are strung together in the molecule like beads on a string. There are about twenty different amino acids occurring in proteins and the structure of each one of them is exactly known. Furthermore, the exact manner in which amino acids are hooked together in a chain to form a protein molecule is also known. Finally, in the case of many proteins, including hemoglobin, we know exactly how many of each amino acid the molecule contains. Most of the problem seems to be licked. The only thing left is to figure out the exact order in which the different amino acids occur along the protein chain.
To show what we mean, let's suppose we have a very small protein molecule made up of four different amino acids: a, b, c, and d. These four amino acids can be arranged in twenty-four different ways, as shown in Figure 5. Each arrangement represents a molecule with distinct properties of its own. The situation is then similar to that in the case of heme. Each of the twenty-four possible molecules can be synthesized and its properties compared with the natural product. One of the twenty-four must be the right one.
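a-b-c-d   b-a-c-d   c-b-a-d   d-b-c-a
a-b-d-c   b-a-d-c   c-b-d-a   d-b-a-c
a-c-b-d   b-c-a-d   c-a-b-d   d-c-b-a
a-c-d-b   b-c-d-a   c-a-d-b   d-c-a-b
a-d-b-c   b-d-a-c   c-d-a-b   d-a-b-c
a-d-c-b   b-d-c-a   c-d-b-a   d-a-c-b
Figure 5. The different arrangements of four amino acids in a protein chain.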
To be sure, hemoglobin has somewhat more than four amino acids in its molecule so the number of possible arrangements is to be expected to be somewhat more than twenty-four. Still, proteins are so important that biochemists would be willing to go to an unusual amount of effort to solve the problem of their structure and the mere presence of additional arrangements might not discourage them. Trial and error might be a little more tedious than in the case of heme, but, given time enough, it ought to be as sure as death and taxes.
Or should it?
To begin with, hemoglobin is a protein of only average size. Its molecule is made up of five hundred thirty-nine amino acids of exactly twenty different varieties and the number of each amino acid present is known. There is no need to name each amino acid. We can accomplish all that is necessary for our purposes by lettering them from a to t inclusive. There are seventy-five amino acids of type a present in the molecule, fifty-four of type b, fifty of type c and so on. One possible arrangement of the five hundred thirty-nine amino acids is shown in Figure 6.
Obviously the letters in Figure 6 can be written down in quite a few different arrangements and the reader may well shiver a bit at the thought of trying to write down all possible combinations and then counting them. Fortunately, we don't have to do that. The number of combinations can be calculated indirectly from the data we already have.
a-b-c-d-e-f-g-h-i-j-k-l-m-n-o-p-q-r-s-t-a-b-c-d-e-f-g-h-i-j-k-l-m-n-o-p-q-r-s-a-b-c-d-e-f-g-h-i-j-k-l-m-n-o-p-q-r-s-a-b-c-d-e-f-g-h-i-j-k-l-m-n-o-p-q-r-a-b-c-d-e-f-g-h-i-j-k-l-m-n-o-p-q-a-b-c-d-e-f-g-h-i-j-k-l-m-n-o-p-a-b-c-d-e-f-g-h-i-j-k-l-m-n-o-p-a-b-c-d-e-f-g-h-i-j-k-l-m-n-o-p-a-b-c-d-e-f-g-h-i-j-k-l-m-n-o-p-a-b-c-d-e-f-g-h-i-j-k-l-m-n-o-p-a-b-c-d-e-f-g-h-i-j-k-l-m-n-o-p-a-b-c-d-e-f-g-h-i-j-k-l-m-n-o-a-b-c-d-e-f-g-h-i-j-k-l-m-n-o-a-b-c-d-e-f-g-h-i-j-k-l-m-n-o-a-b-c-d-e-f-g-h-i-j-k-l-m-n-a-b-c-d-e-f-g-h-i-j-k-l-m-n-a-b-c-d-e-f-g-h-i-j-k-l-m-n-a-b-c-d-e-f-g-h-i-j-k-l-m-n-a-b-c-d-e-f-g-h-i-j-k-l-a-b-c-d-e-f-g-h-i-j-k-l-a-b-c-d-e-f-g-h-i-j-k-a-b-c-d-e-f-g-h-i-j-k-a-b-c-d-e-f-g-h-i-j-a-b-c-d-e-f-g-h-i-j-a-b-c-d-e-f-g-h-i-a-b-c-d-e-f-g-h-i-a-b-c-d-e-f-g-h-i-a-b-c-d-e-f-g-h-i-a-b-c-d-e-f-g-h-i-a-b-c-d-e-f-g-h-i-a-b-c-d-e-f-g-h-a-b-c-d-e-f-g-h-a-b-c-d-e-f-g-h-a-b-c-d-e-f-g-a-b-c-d-e-f-g-a-b-c-d-e-f-a-b-c-d-e-a-b-c-d-e-a-b-c-d-a-b-c-d-a-b-c-d-a-b-c-d-a-b-c-d-a-b-c-d-a-b-c-d-a-b-c-d-a-b-c-d-a-b-c-d-a-b-c-a-b-c-a-b-a-b-a-b-a-b-a-a-a-a-a-a-a-a-a-a-a-a-a-a-a-a-a-a-a-a-a
Figure 6. One possible arrangement of the amino acids in the hemoglobin molecule.
Thus, if we have n different objects, then the number of ways in which they can be arranged in a line is equal to the product of all the integers from n down to 1. The number of combinations of four objects, for instance, is: 4 x 3 x 2 x 1, or 24. This is the number we found by actually writing out all the different combinations (see Figure 5). The product of all the integers from n to 1 is called "factorial n" and is symbolized as n!
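The rule is easy to confirm for a small case. The following is a minimal sketch that writes out every ordering of four different amino acids and compares the tally with factorial 4; the variable names are illustrative only.

```python
from itertools import permutations
from math import factorial

amino_acids = ["a", "b", "c", "d"]

# Every ordering of the four different amino acids, written as a chain.
chains = ["-".join(order) for order in permutations(amino_acids)]

print(len(chains))                     # orderings found by listing them
print(factorial(len(amino_acids)))     # 4 x 3 x 2 x 1
```

Listing works for four objects; it is the factorial rule that lets us skip the listing when the objects number in the hundreds.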
Figure 7a. The total arrangements of four amino acids, two of one kind and two of another.
a-a-b-b   a-b-a-b   a-b-b-a   b-a-a-b   b-a-b-a   b-b-a-a
Figure 7b. The different arrangements of four amino acids, two of one kind and two of another.
If the n objects are not all different, an additional complication is introduced. Suppose that our very small four-amino-acid protein is made up of two amino acids of one kind and two of another. Let's symbolize the amino acids as a, a*, b and b*. The twenty-four theoretical combinations are presented in Figure 7a. But if a and a* are indistinguishable, and b and b* likewise, then the combination ab* is identical, for all practical purposes, with a*b, a*b*, and ab. The combination aba*b* is identical with a*bab*, ab*a*b and so on. The total number of different combinations among those found in Figure 7a is shown in Figure 7b, in which asterisks are eliminated. You will note that the number of different combinations is 6.
The formula for obtaining the number of different combinations of n objects of which the number p are of one kind, q of another, r of another, and so on, involves a division of factorials, thus:
$$\frac{n!}{p! \times q! \times r! \cdots}$$
In the case we have just cited—that is, the four-amino-acid protein with two amino acids of one type and two of another—the formula is:
$$\frac{4!}{2! \times 2!} = \frac{4 \times 3 \times 2 \times 1}{2 \times 1 \times 2 \times 1} = 6$$
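The division of factorials can be checked against a direct count. In this sketch the function name distinct_arrangements is made up for the illustration; it applies the formula to a list of how many objects there are of each kind, and the result is compared with a brute-force listing of the a-a-b-b case.

```python
from itertools import permutations
from math import factorial


def distinct_arrangements(counts):
    """n! divided by the factorial of each group size, where n is the total."""
    total = factorial(sum(counts))
    for count in counts:
        total //= factorial(count)  # each division is exact
    return total


# Two amino acids of one kind and two of another.
by_formula = distinct_arrangements([2, 2])
by_listing = len(set(permutations("aabb")))

print(by_formula, by_listing)
```

When every object is different, each count is 1 and the formula reduces to plain factorial n.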
Of course, the factorials involved in calculating the number of amino acid combinations in hemoglobin are larger. We must start with factorial 539—the total number of amino acids in hemoglobin—and divide that by the product of factorial 75, factorial 54, factorial 50 and so on—the number of each amino acid present.
1!  =  1  =  1
2!  =  2 x 1  =  2
3!  =  3 x 2 x 1  =  6
4!  =  4 x 3 x 2 x 1  =  24
5!  =  5 x 4 x 3 x 2 x 1  =  120
6!  =  6 x 5 x 4 x 3 x 2 x 1  =  720
7!  =  7 x 6 x 5 x 4 x 3 x 2 x 1  =  5040
8!  =  8 x 7 x 6 x 5 x 4 x 3 x 2 x 1  =  40320
Figure 8. The factorials of the first few integers.
The factorials of the lower integers are easy enough to calculate (see Figure 8). Unfortunately they build up rather rapidly. Would you make a quick guess at the value of factorial 20? You're probably wrong. The answer is approximately twenty-four hundred quadrillion, which, written in figures, is 2,400,000,000,000,000,000. And factorial values continue mounting at an ever-increasing rate.
In handling large numbers of this sort, recourse is had to exponentials of the form $10^{n}$. $10^{n}$ is a short way of representing a numeral consisting of 1 followed by n zeros. 1,000 would be $10^{3}$ and 1,000,000,000,000 would be $10^{12}$ and so on. A number like 2,500, which is in between 1,000 (that is, $10^{3}$) and 10,000 (that is, $10^{4}$), could be expressed as 10 to a fractional exponent somewhere in between 3 and 4. More often, it is written simply as $2.5 \times 10^{3}$ (that is, 2.5 x 1,000, which, obviously, works out to 2,500).
Written exponentially, then, factorial 20 is about $2.4 \times 10^{18}$.
For the purposes of this article, there are several things that must be kept in mind with regard to exponential numbers:
1) In multiplying two exponential numbers, the exponents are added. Thus, the product of $2 \times 10^{4}$ and $3 \times 10^{5}$ equals $6 \times 10^{9}$. If you translate the first two numbers to 20,000 and 300,000, you will see that the product is indeed 6,000,000,000.
2) A number like 2,560,000 can be expressed as $256 \times 10^{4}$, or $25.6 \times 10^{5}$, or $2.56 \times 10^{6}$, or $0.256 \times 10^{7}$. All are the same number, as you can see if you multiply 256 by 10,000; 25.6 by 100,000; 2.56 by 1,000,000; or 0.256 by 10,000,000. Which one of these exponential numbers is it best to use? It is customary to use the one in which the nonexponential portion of the number is between 1 and 10. In the case of 2,560,000, the usual exponential figure is $2.56 \times 10^{6}$. For this reason, in multiplying $2 \times 10^{4}$ by $6 \times 10^{5}$, we present the answer not as $12 \times 10^{9}$, but as $1.2 \times 10^{10}$. (Where the number $10^{10}$ is presented by itself, it is the same as writing $1 \times 10^{10}$.)
3) The appearance of exponential numbers may be deceiving. $10^{3}$ is ten times greater than $10^{2}$. Similarly, $10^{69}$ is ten times greater than $10^{68}$, despite the fact that intuitively they look about the same. Again, it must be remembered that $10^{12}$, for instance, is not twice as great as $10^{6}$, but a million times as great.
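The first two rules can be put into a few lines of code. This is only a sketch: it represents an exponential number as a pair (nonexponential part, exponent), and the names normalize and multiply are invented for the example.

```python
def normalize(mantissa, exponent):
    """Shift the decimal point until the nonexponential part is between 1 and 10."""
    while abs(mantissa) >= 10:
        mantissa /= 10
        exponent += 1
    while 0 < abs(mantissa) < 1:
        mantissa *= 10
        exponent -= 1
    return mantissa, exponent


def multiply(x, y):
    """Multiply the nonexponential parts, add the exponents, then tidy up."""
    return normalize(x[0] * y[0], x[1] + y[1])


print(multiply((2, 4), (3, 5)))   # 20,000 times 300,000
print(multiply((2, 4), (6, 5)))   # 20,000 times 600,000
print(normalize(256, 4))          # 2,560,000 in the customary form
```

Keeping the exponent as a separate whole number is what makes figures with hundreds of digits manageable later on; the nonexponential part never grows past 10.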
And now we are ready to return to our factorials. If factorial 20 is $2.4 \times 10^{18}$, you may well hesitate to try to calculate the value of such numbers as factorial 50, factorial 54, factorial 75 and, above all, factorial 539. Fortunately, there exist tables of the lower factorials (say, to factorial 100) and equations whereby the higher factorials can be approximately determined.
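One such equation is Stirling's approximation. The article does not say which equation was used, so naming Stirling here is an assumption. The sketch below works with the exponent of 10 rather than with the factorial itself, and sets the approximation beside the value obtained from Python's log-gamma function; the function names are illustrative.

```python
import math


def log10_factorial(n):
    """Exponent of 10 for n!, from the log-gamma function."""
    return math.lgamma(n + 1) / math.log(10)


def log10_factorial_stirling(n):
    """The same exponent from Stirling's approximation to ln n!."""
    ln_value = n * math.log(n) - n + 0.5 * math.log(2 * math.pi * n)
    return ln_value / math.log(10)


for n in (20, 50, 54, 75, 539):
    print(n, round(log10_factorial(n), 3), round(log10_factorial_stirling(n), 3))
```

The two columns draw closer as n grows, which is why an approximate equation is good enough for the largest factorials.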
Using both tables and equations, the number of combinations possible in hemoglobin can be computed. The answer turns out to be $4 \times 10^{619}$. If you want to see what that number looks like written out in full, see Figure 9. Let's agree to call $4 \times 10^{619}$ the "hemoglobin number." Those of you, by the way, who have read Kasner and Newman's "Mathematics and the Imagination" will see that the hemoglobin number is larger than a googol ($10^{100}$) but smaller than a googolplex ($10^{\text{googol}}$).
40, | 000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000. |
Figure 9. The hemoglobin number.
Of all the hemoglobin number of combinations, only one combination has the precise properties of the hemoglobin molecule found in the human being. To test that number of combinations one after the other to find the one would, as you probably rightly suspect, take time. But given enough time, enough scientists, enough generations of scientists, surely trial and error would come through with the answer, inevitably, at long, long last. But exactly how much space and time would be required?
In order to answer that question we must first get an idea of the size of the hemoglobin number. It seems awfully big, so we'll begin by taking something grandiose as a comparison. For instance, how does the hemoglobin number compare with the total number of molecules of hemoglobin on Earth? That's a fair beginning.
The human population of the Earth is 2,500,000,000 or, exponentially, $2.5 \times 10^{9}$. The average human being, including men, women and children, weighs, let us say, one hundred twenty pounds, which is equal to $5.5 \times 10^{4}$ grams. (There are 454 grams in a pound.) The total mass of living human flesh, blood, and bone on Earth is, therefore, about $1.4 \times 10^{14}$ grams.
Seven per cent of the human body is blood, so that the total amount of blood on Earth is $9.0 \times 10^{9}$ liters. (Since a liter is equal to about 1.06 quarts, that figure comes to nine and a half billion quarts.) Every liter of blood contains five trillion ($5 \times 10^{12}$) red cells, so the total number of human red cells on Earth is $4.5 \times 10^{22}$.
Although the red cell is microscopic in size, there is still enough room in each red cell for nearly three hundred million hemoglobin molecules ($2.7 \times 10^{8}$, to be more precise). There are thus, on all the Earth, about $10^{31}$ human hemoglobin molecules.
But those are the hemoglobin molecules belonging to human beings only. Other vertebrates, from whales to shrews, also possess hemoglobin in their blood, as do some lower forms of life. Let's be generous and assume that for every human hemoglobin molecule on Earth there are one billion ($10^{9}$) nonhuman hemoglobin molecules. In that case, the total number of hemoglobin molecules on Earth, human and nonhuman, is $10^{40}$.
Even this number, unfortunately, is nowhere near the hemoglobin number and so it will not serve as a comparison.
Let us bring in the element of time and see if that helps us out. The average red blood cell has a life expectancy of about one third of a year. After that it is broken up and a new red blood cell takes its place. Let us suppose that every time a new red blood cell is formed, it contains a completely new set of hemoglobin molecules. In one year, then, a total of $3 \times 10^{40}$ hemoglobin molecules will have existed. But the Earth has existed in solid state for something like three and a third billion years ($3.3 \times 10^{9}$). Suppose that in all that time, Earth has been just as rich in hemoglobin as it is now. If that were true, the total number of hemoglobin molecules ever to have existed on Earth would be $10^{50}$. This is still nowhere near the hemoglobin number.
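The chain of multiplications for Earth can be retraced in a few lines. This is a rough sketch using the figures quoted in the text, with one addition: to turn grams of blood into liters it assumes blood weighs about 1,060 grams per liter, a value that is not given in the article. The variable names are illustrative.

```python
from math import log10

population = 2.5e9
grams_per_person = 120 * 454          # 120 pounds at 454 grams to the pound
blood_fraction = 0.07
grams_per_liter_of_blood = 1060       # assumption, not stated in the article
red_cells_per_liter = 5e12
molecules_per_red_cell = 2.7e8
nonhuman_per_human_molecule = 1e9     # the article's generous allowance
red_cell_turnovers_per_year = 3       # a red cell lasts a third of a year
years_of_solid_earth = 3.3e9

human_grams = population * grams_per_person
blood_liters = human_grams * blood_fraction / grams_per_liter_of_blood
red_cells = blood_liters * red_cells_per_liter
human_molecules = red_cells * molecules_per_red_cell
all_molecules = human_molecules * nonhuman_per_human_molecule
molecules_ever = all_molecules * red_cell_turnovers_per_year * years_of_solid_earth

for label, value in [
    ("human hemoglobin molecules", human_molecules),
    ("all hemoglobin molecules now", all_molecules),
    ("all hemoglobin molecules ever", molecules_ever),
]:
    print(label, "about 10 to the power", round(log10(value)))
```

Only the power of ten matters for the comparison, so each result is rounded to the nearest whole exponent.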
Well, then, let us stop fooling around with one dinky little planet and its history. We have all of space and time at our disposal and as science-fiction enthusiasts we ought to have no qualms about using it. It is estimated that there are one hundred million stars in the galaxy and at least that many galaxies in the universe. Let's be generous. Let's never stint in our generosity. Let's suppose that there are a billion stars in the galaxy, rather than merely a hundred million. Let us suppose there are a billion galaxies in the universe. The total number of stars in the universe would then be $10^{9} \times 10^{9}$ or $10^{18}$.
Suppose now that every star, every single star, possessed in its gravitational field no less than ten planets, each one of which was capable of holding as much life as Earth can and that each one was as rich in hemoglobin. There would then be $10^{19}$ such planets in existence and in one year, the number of hemoglobin molecules that would have existed on all those planets (assuming always a life expectancy of a third of a year for each molecule) would be $3 \times 10^{59}$.
Now let us suppose that each of these planets remained that rich in hemoglobin for, from first to last, three hundred billion years ($3 \times 10^{11}$). This is a very generous figure, really, since the sun's life expectancy is only about ten to twenty billion years, during only a portion of which time will life on Earth be possible. And this life expectancy is rather longer than average for other stars, too.
Still, with all the generous assumptions we have been making, all the hemoglobin molecules that could possibly exist in all the space and time we have any knowledge of, and more, come out to $10^{71}$. This number is still virtually zero compared to the hemoglobin number.
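How far short this falls is easiest to see by comparing exponents. The sketch below uses the generous figures from the text and measures the gap in powers of ten; the names are illustrative.

```python
from math import log10

stars = 1e9 * 1e9                      # a billion galaxies of a billion stars
planets = 10 * stars                   # ten Earth-like planets per star
molecules_per_planet_per_year = 3e40   # Earth's yearly figure from the text
years = 3e11                           # three hundred billion years

total_molecules = planets * molecules_per_planet_per_year * years

log10_hemoglobin_number = log10(4) + 619   # exponent of 4 x 10^619
shortfall = log10_hemoglobin_number - log10(total_molecules)

print("molecules in all space and time: about 10 to the power", round(log10(total_molecules)))
print("powers of ten still missing:", round(shortfall))
```

The hemoglobin number is handled through its exponent alone, since the comparison needs nothing more.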
Let's try a different tack altogether. Let's build a computing machine—a big computing machine. The whole known universe is estimated to be a billion light-years in diameter, so let us imagine a computing machine in the form of a cube ten billion light-years on each edge. If such a machine were hollow, there would be room in it for one thousand universes such as ours, including all the stars and galaxies and all the space between the various stars and galaxies as well.
Now let us suppose that computing machine was completely filled from edge to edge and from top to bottom with tiny computing units, each one of which could test different combinations of hemoglobin amino acids in order to see whether it was the hemoglobin combination or not. In order to make sure that the computing units are as numerous as possible, let's suppose that each one is no larger than the least voluminous object known, the single neutron.
How many computing units would the machine contain?
A neutron is only one-ten-trillionth of a centimeter in diameter. One cubic centimeter (which is equal to only one-sixteenth of a cubic inch) will, therefore, contain $10^{13} \times 10^{13} \times 10^{13}$ or $10^{39}$ neutrons, if these were packed in as tightly as possible. (We assume the neutrons to be tiny cubes rather than tiny spheres, for simplicity's sake.)
Now light travels at the rate of $3 \times 10^{10}$ centimeters per second. There are about $3.16 \times 10^{7}$ seconds in a year. A light-year is the distance traversed by light in one year, and is, therefore, $3 \times 10^{10} \times 3.16 \times 10^{7}$ or about $10^{18}$ centimeters in length. Our computing machine, which is ten billion ($10^{10}$) light-years along each edge, is, therefore, $10^{28}$ centimeters long each way and its volume is $10^{28} \times 10^{28} \times 10^{28}$ or $10^{84}$ cubic centimeters all told. Since each cubic centimeter can contain $10^{39}$ neutrons, the total number of neutrons that can be packed into a cube a thousand times the volume of the known universe is $10^{84} \times 10^{39}$ or $10^{123}$.
But these "neutrons" are computing units, remember. Let us suppose that each computing unit is a really super-mechanical job, capable of testing a billion different amino-acid combinations every second, and let us suppose that each unit keeps up this mad pace, unrelentingly, for three hundred billion years.
Three hundred billion years is about $10^{19}$ seconds, so the number of different combinations tested in all that time would be about $10^{123} \times 10^{9} \times 10^{19}$, or $10^{151}$.
This number is still approximately zero as compared with the hemoglobin number. In fact, the chance that the right combination would have been found in all that time would be only 1 in $4 \times 10^{468}$.
But, you may say, suppose there is more than only one possible hemoglobin combination. It is true, after all, that the hemoglobins of various species of animals are distinct in their properties from one another. Well, let's be unfailingly generous. Let's suppose that every hemoglobin molecule that ever possibly existed on Earth is just a little different from every other. It would then be only necessary for our giant computing machine to find any one of $10^{50}$ possibilities. The chance of finding any one of those in three hundred billion years with $10^{123}$ units each turning out a billion answers a second is still only 1 out of $4 \times 10^{418}$.
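The computing-machine estimate can be retraced the same way, working in exponents of 10 throughout. This is a sketch under stated inputs: the speed of light, the size of the machine, the neutron size, the testing rate, and the running time are those of the text, and a year is taken as about $3.16 \times 10^{7}$ seconds (365.25 days of 86,400 seconds each). The names are illustrative.

```python
from math import log10

CM_PER_SECOND = 3e10                 # speed of light
SECONDS_PER_YEAR = 3.16e7            # about 365.25 days of 86,400 seconds
NEUTRONS_PER_CM_EDGE = 1e13          # neutron taken as a cube 10^-13 cm on a side
EDGE_LIGHT_YEARS = 1e10              # machine is ten billion light-years on each edge
TESTS_PER_SECOND = 1e9               # each unit tests a billion combinations a second
RUNNING_YEARS = 3e11                 # three hundred billion years
LOG10_HEMOGLOBIN_NUMBER = log10(4) + 619
LOG10_ACCEPTABLE_ANSWERS = 50        # every hemoglobin molecule ever on Earth counts

edge_cm = EDGE_LIGHT_YEARS * CM_PER_SECOND * SECONDS_PER_YEAR
log10_units = 3 * log10(edge_cm) + 3 * log10(NEUTRONS_PER_CM_EDGE)
log10_seconds = log10(RUNNING_YEARS * SECONDS_PER_YEAR)
log10_tests = log10_units + log10(TESTS_PER_SECOND) + log10_seconds

# Odds are "1 in 10 to this power".
log10_odds_one_answer = LOG10_HEMOGLOBIN_NUMBER - log10_tests
log10_odds_any_answer = log10_odds_one_answer - LOG10_ACCEPTABLE_ANSWERS

print("computing units: about 10 to the power", round(log10_units))
print("combinations tested: about 10 to the power", round(log10_tests))
print("odds against one right answer: 10 to the power", round(log10_odds_one_answer, 1))
print("odds against any acceptable answer: 10 to the power", round(log10_odds_any_answer, 1))
```

Because the inputs are not rounded at each step, the fractional parts of the exponents may differ slightly from a hand calculation that rounds as it goes. The whole-number exponents are what count.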
It would seem then that if ever a problem were absolutely incapable of solution, it is the problem of trying to pick out the exact arrangement of amino acids in a protein molecule out of all the different arrangements that are possible.
And yet, in the last ten years, biochemists have been making excellent progress in solving just that sort of problem. The amino acid arrangement in the protein insulin (lack of which brings on diabetes) was completely worked out in 1953. To be sure, insulin is only one fifth the size of hemoglobin, but there are still just about $8 \times 10^{113}$ possible arrangements of its amino acids, and that is a most respectable quantity.
How did the biochemists do it?
The fact is that straight trial-and-error technique would have been an unbearable trial and a colossal error. So they used other methods. There are other methods, you know.
What, for instance?
Well, that's another story for another article at another time. What we have now is enough for one sitting.
The End