# Lightning in a Bottle/Chapter 2

**Chapter Two**

**What's the Significance of Complexity?**

**2.0 Introduction and Overview**

In **Chapter One**, I presented a general theory about the nature of the scientific project, and argued that this general theory suggests a natural way of thinking about the relationship between (and underlying unity of) the different branches of science. This way of looking at science is instructive, but (as I said) doing abstract philosophy of science is not really my goal here. Eventually, we will need to turn to consider climate science specifically and examine the special problems faced by those studying the Earth's climate system. Before we get down into the nitty-gritty concrete details, though, we'll need a few more theoretical tools. Here's how this chapter will go.

In **2.1** I will introduce a distinction between "complex systems" sciences and "simple systems" sciences, and show how that distinction very naturally falls out of the account of science offered in **Chapter One**. I will draw a distinction between "complex" and "complicated," and explore what it is that makes a particular system complex or simple. We'll think about why the distinction between complex and simple systems is a useful one, and discuss some attempts by others to make the notion of complexity precise. In **2.2**, we will attempt to construct our own definition using the framework from the last chapter. Finally, in **2.3**, I’ll set up the discussion to come in **Chapter Three**, and suggest that climate science is a paradigmatic complex systems science, and that recognizing that fact is essential if we’re to make progress as rapidly as we need to. More specifically, I’ll argue that the parallels between climate science and other complex systems sciences—particularly economics—have been largely overlooked, and that this oversight is primarily a result of the tradition of dividing the sciences into physical and social sciences. This division, while useful, has limitations, and (at least in this case) can obfuscate important parallels between different branches of the scientific project. The complex/simple systems distinction cuts across the physical/social science distinction, and serves to highlight some important lessons that climate science could learn from the successes (and failures) of other complex systems sciences. This is the second (and last) chapter that will be primarily philosophical in character; with the last of our conceptual tool-kit assembled here, we’ll be ready to move on to a far more concrete discussion in **Chapter Three** and beyond.

**2.1 What is “Complexity?”**

Before we can actually engage with complex systems theories (and bring those theories to bear in exploring the foundations of climate science), we’ll need to articulate what exactly makes a system complex, and examine the structure of complex systems theories generally. Just as in **Chapter One**, my focus here will be primarily on exploring the actual *practice* of contemporary, working science: I’m interested in what climate scientists, economists, and statistical physicists (as well as others working in the branches of science primarily concerned with predicting the behavior of complex systems) can learn from one another, rather than in giving *a priori* pronouncements on the structure of these branches of science. With that goal in mind, we will anchor our discussion with examples drawn from contemporary scientific theories whenever possible, though a certain amount of purely abstract theorizing is unavoidable. Let's get that over with as quickly as possible.

It is important, first of all, to forestall the conflation of “complex/simple” and “complicated/simplistic.” All science is (to put the point mildly) *difficult*, and no branch of contemporary science is simplistic in the sense of being facile, superficial, or *easy*. In opposing complex systems to simple systems, then, I am not claiming that some branches of science are “hard” and some are “soft” in virtue of being more or less rigorous—indeed, the hard/soft science distinction (which roughly parallels the physical/social science distinction, at least most of the time) is precisely the conceptual carving that I’m suggesting we ought to move beyond. There are no simplistic sciences: all science is complicated in the sense of being difficult, multi-faceted, and messy. Similarly, there are no simplistic systems in nature; no matter how we choose to carve up the world, the result is a set of systems that are decidedly complicated (and thank goodness for this: the world would be incredibly boring otherwise!). This point should be clear from our discussion in **Chapter One**.

If all systems are complicated, then, what makes one system a *complex* system, and another a *simple* system? This is not an easy question to answer, and an entirely new academic field—complex systems theory—has grown up around attempts to do so. Despite the centrality of the concept, there’s no agreed-upon definition of complexity in the complex systems theory literature. We'll look at a few different suggestions that seem natural (and suggest why they might not be entirely satisfactory) before building our own, but let’s start by trying to get an intuitive grasp on the concept. As before, we’ll tighten up that intuitive account as we go along; if all goes well, we’ll construct a natural definition of complexity piece by piece.

Rather than trying to go for a solid demarcation between complex and simple systems immediately, it might be easier to start by *comparing* systems. Here are some comparisons that seem intuitively true^{[1]}: a dog’s brain is more complex than an ant’s brain, and a human’s brain is more complex still. The Earth’s ecosystem is complex, and rapidly became significantly *more* complex during and after the Cambrian explosion 550 million years ago. The Internet as it exists today is more complex than ARPANET—the Internet’s progenitor—was when it was first constructed. A Mozart violin concerto is more complex than a folk tune like “Twinkle, Twinkle, Little Star.” The shape of Ireland’s coastline is more complex than the shape described by the equation x^{2} + y^{2} = 1. The economy of the United States in 2011 is more complex than the economy of pre-Industrial Europe. All these cases are (hopefully) relatively uncontroversial. What quantity is actually being tracked here, though? Is it the *same* quantity in all these cases? That is, is the sense in which a human brain is more complex than an ant brain the *same* sense in which a Mozart concerto is more complex than a folk tune? One way or another, what’s the *significance* of the answer to that question—if there’s an analogous sense of complexity behind all these cases (and I shall argue that there is, at least in most cases), what does that mean for the practice of science? What can we learn by looking at disparate examples of complex systems? Let’s look at a few different ways that we might try to make this notion more precise. We’ll start with the most naïve and intuitive paths, and work our way up from there^{[2]}. Once we have a few proposals on the table, we’ll see if there’s a way to synthesize them such that we preserve the strengths of each attempt while avoiding as many of their weaknesses as possible.

**2.1.1 Complexity as Mereological Size**

One simple measure tends to occur to almost everyone when confronted with this problem for the first time: perhaps complexity is a measure of the number of independent *parts* that a system has—a value that we might call “mereological size.” This accords rather well with complexity in the ordinary sense of the word: an intricate piece of clockwork is complex largely in virtue of having a massive number of interlocking parts—gears, cogs, wheels, springs, and so on—that account for its functioning. Similarly, we might think that humans are complex in virtue of having a very large number of “interlocking parts” that are responsible for our functioning in the way we do^{[3]}—we have a lot more genes than (say) the yeast microorganism^{[4]}. Something like this definition is explicitly embraced by, for example, Michael Strevens: “A complex system, then, is a system of many somewhat autonomous, but strongly interacting parts^{[5]}.” Similarly, Lynn Kiesling says, “Technically speaking, what is a complex system? It’s a system or arrangement of many component parts, and those parts interact. These interactions generate outcomes that you could not necessarily have predicted in advance.^{[6]}”

There are a few reasons to be suspicious of this proposal, though. Perhaps primarily, it will turn the question "how complex is this system?" into a question that's only answerable by making reference to what the system is *made out of*. This might not be a fatal issue *per se*, but it suggests that measuring complexity is an insurmountably *relativist* project—after all, how are we to know exactly *which* parts we ought to count to define the complexity of a system? Why, that is, did we choose to measure the complexity of the human organism by the number of genes we have? Why not cells (in which case the blue whale would beat us handily), or even *atoms* (in which case even the smallest star would be orders of magnitude more complex than even the most corpulent human)? Relatedly, how are we to make comparisons across what (intuitively) seem like different *kinds* of systems? If we've identified the gene as the relevant unit for living things, for instance, how can we say something like "humans are more complex than cast-iron skillets, but less complex than global economies^{[7]}?"

Even if we waive that problem, though, the situation doesn't look too good for the mereological size measure. While it's certainly true that a human being has more nucleotide base pairs in his DNA than a yeast microbe, it's also true that we have far *fewer* base pairs than most amphibians, and fewer still than many members of the plant kingdom (which tend to have strikingly long genomes)^{[8]}. That's a *big* problem, assuming we want to count ourselves as more complex than frogs and ferns. This isn't going to do it, then: while size certainly matters *somewhat*, the mereological size measure fails to capture the sense in which it matters. Bigger doesn't always mean more complex, even if we can solve the all-important problem of defining what "bigger" even means.

In the case of Strevens’ proposal, we might well be suspicious of what Wikipedia editors would recognize as “weasel words” in the definition: a complex system is one that is made up of *many* parts that are *somewhat* independent of one another, and yet interact *strongly*. It’s difficult to extract anything very precise from this definition: if we didn’t already have an intuitive grasp of what ‘complex’ meant, a definition like this one wouldn’t go terribly far toward helping us get a grasp of the concept. *How* many parts do we need? *How* strongly must they interact? *How* autonomous can they be? Without a clear and precise answer to these questions, it’s hard to see how a definition like this can help us understand the general nature of complexity. In Strevens’ defense, this is not in the least fatal to his project, since his goal is not to give a complete analysis of complexity (but rather just to analyze the role that probability plays in the emergence of simple behavior from the chaotic interaction of many parts). Still, it won’t do for what we’re after here (and Kiesling can claim no such refuge, though her definition does come from an introductory-level talk). We’ll need to find something more precise.

**2.1.2 Complexity as Hierarchical Position**

First, let's try a refinement of the mereological size measure. The language of science (and, to an even greater degree, the language of *philosophy* of science) is rife with talk of levels. It's natural to think of many natural systems as showing a kind of hierarchical organization: lakes are made out of water molecules, which are made out of atoms, which are made out of quarks; computers are made out of circuit boards, which are made out of transistors and capacitors, which are made out of molecules; economies are made out of firms and households, which are made out of agents, which are made out of tissues, which are made out of cells, and so on. This view is so attractive, in fact, that a number of philosophers have tried to turn it into a full-fledged metaphysical theory^{[9]}. Again, I want to try to avoid becoming deeply embroiled in the metaphysical debate here, so let's try to skirt those problems as much as possible. Still, might it not be the case that something like *degree of hierarchy* is a good measure for complexity? After all, it does seem (at first glance) to track our intuitions: more complex systems are those which are "nested" more deeply in this hierarchy. It seems like this might succeed in capturing what it was about the mereological size measure that felt right: things higher up on the hierarchy seem to have (as a general rule) *more parts* than things lower down on the hierarchy. Moreover, this measure might let us make sense of the most nagging question that made us suspicious of the mereological size measure: how to figure out which parts we ought to count when we're trying to tabulate complexity.

As attractive as this position looks at first, it's difficult to see how it can be made precise enough to serve the purpose to which we want to put it here. Hierarchy as a measure of complexity was first proposed by Herbert Simon back before the field of complex systems theory diverged from the much more interestingly named field of “cybernetics.” It might be useful to actually look at how Simon proposed to recruit hierarchy to explain complexity; the difficulties, I think, are already incipient in his original proposal:

Roughly, by a complex system I mean one made up of a large number of parts that interact in a non-simple way. In such systems, the whole is more than the sum of the parts, not in an ultimate, metaphysical sense, but in the important pragmatic sense that, given the properties of the parts and the laws of their interaction, it is not a trivial matter to infer the properties of the whole. In the face of complexity, an in-principle reductionist may be at the same time a pragmatic holist...^{[10]}

This sounds very much like the Strevens/Kiesling proposal that we looked at in **2.1.1**, and suffers from at least some of the same problems (as well as a few of its own). Aside from what I flagged above as Wikipedian “weasel words,” the hierarchical proposal suffers from some of the same subjectivity issues that plagued the mereological proposal: when Simon says (for instance) that one of the key features of the right sort of hierarchical composition is “near-decomposability,” exactly *what* is it that’s supposed to be decomposable? Again, the hierarchical position seems to be tracking something interesting here—Simon is right to note that it seems that many complex systems have the interesting feature of being decomposable into many (somewhat less) complex *subsystems*, and that the interactions within each subsystem are often stronger than interactions between subsystems. This structure, Simon contends, remains strongly in view even as the subsystems themselves are decomposed into sub-subsystems. There is certainly something to this point. Interactions between (say) my liver and my heart are relatively “weak” compared to interactions that the cells of my heart (or liver) have with each other. Similarly, the interactions between the mitochondria and the Golgi body of an *individual* cell in my heart are stronger than the interactions between the individual cells. Or, to move up in the hierarchy, the interactions between my organs seem stronger than the interactions between my body as a whole and other individual people I encounter on my daily commute to Columbia’s campus.

Still, a problem remains. What’s the sense of “stronger” here? Just as before, it seems like this proposal is tracking *something*, but it isn’t easy to say precisely what. We could say that it is easier for the equilibrium of my body to be disturbed by the right (or, rather, *wrong*) sort of interaction between my liver and heart than it is for that same equilibrium to be disturbed by the right kind of interaction between me and a stranger on the subway, but this still isn’t quite correct. It might be true that the processes that go on between my organs are more *fragile*—in the sense of being more easily perturbed out of a state where they’re functioning normally—than the processes that go on between me and the strangers standing around me on the subway as I write this, but without a precise account of the source and nature of this fragility, we haven’t moved too far beyond the intuitive first-pass account of complexity offered at the outset of **Section 2.1**. Just as with mereological size, there seems to be a nugget of truth embedded in the hierarchical account of complexity, but it will take some work to extract it from the surrounding difficulties.

**2.1.3 Complexity as Shannon Entropy**

Here’s a still more serious proposal. Given the discussion in **Chapter One**, there’s another approach that might occur to us: perhaps complexity is a measure of *information content* or *degree of surprise* in a system. We can recruit some of the machinery from the last chapter to help make this notion precise. We can think of “information content” as being a fact about how much structure (or lack thereof) exists in a particular system—how much of a pattern there is to be found in the way a system is put together. More formally, we might think of complexity as being a fact about the *Shannon entropy*^{[11]} in a system. Let’s take a moment to remind ourselves of what exactly that means, and see if it succeeds in capturing our intuitive picture of complexity.

“Amount of surprise” is a good first approximation for the quantity that I have in mind here, so let’s start by thinking through a simple analogy. I converse with both my roommate and my Siamese cat on a fairly regular basis. In both cases, the conversation consists in my making particular sounds and my interlocutor responding by making different sounds. Likewise, in both cases there is a certain amount of *information* exchanged between my interlocutor and me. In the case of my roommate, the nature of this information might vary wildly from conversation to conversation: sometimes we will talk about philosophy, sometimes about a television show, and sometimes what to have for dinner. Moreover, he’s a rather unusual fellow—I’m never quite sure what he’s going to say, or how he’ll respond to a particular topic of conversation. Our exchanges are frequently *surprising* in a very intuitive sense: I never know what’s going to come out of his mouth, or what information he’ll convey. My Siamese cat, on the other hand, is far less surprising. While I can’t predict *precisely* what’s going to come out of her mouth (or when), I have a pretty general sense: most of the time, it’s a sound that’s in the vicinity of “meow,” and there are very specific situations in which I can expect particular noises. She’s quite grandiloquent for a cat (that’s a Siamese breed trait), and the sight of the can opener (or, in the evening, just someone going *near* the can opener) will often elicit a torrent of very high-pitched vocalizations. I’m not surprised to hear these noises, and can predict when I'll hear them with a very high degree of accuracy.

The difference between conversing with these two creatures should be fairly clear. While my cat is not like a *recording*—that is, while I’m not sure *precisely* what she’s going to say (in the way that, for instance, I’m *precisely* sure what Han Solo will say in his negotiations with Jabba the Hutt), there’s far less variation in her vocalizations than there is in my roommate’s. She can convey urgent hunger (and often does), a desire for attention, a sense of contentment, and a few other basic pieces of information, but even that variation is expressed by only a very narrow range of vocalizations. My roommate, on the other hand, often surprises me, both with what kind of information he conveys and *how* he conveys it. Intuitively, my roommate’s vocalizations are the more *complex*.

We can also think of “surprise” as tracking something about how much I *miss* if I fail to hear part of a message. In messages that are more surprising (in this sense), missing just a small amount of data can make the message very difficult to interpret, as anyone who has ever expressed incredulity with “What?!” can attest; when a message is received and interpreted as being highly surprising, we understand that just having misheard a word or two could have given us the wrong impression, and request verification. Missing just two or three words in a sentence uttered by my roommate, for instance, can render the sentence unintelligible, and the margin for error becomes more and more narrow as the information he’s conveying becomes less familiar. If he’s telling me about some complicated piece of scholarly work, I can afford to miss very little information without risking failing to understand the message entirely. On the other hand, if he’s asking me what I’d like to order for dinner and then listing a few options, I can miss quite a bit and still be confident that I’ve understood the overall gist of the message. My cat’s communications, which are less surprising even than the most banal conversation I can have with my roommate, are very easily recoverable from even high degrees of data loss; if I fail to hear the first four “meows,” there’s likely to be a fifth and sixth, just to make sure I got the point. Surprising messages are thus harder to *compress* in the sense described in **Chapter One**, as the recovery of a missing bit requires a more *complex* pattern to be reliable.
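The link between surprise and compressibility can be made concrete with a rough sketch. The example below is only an illustration under stated assumptions: `cat_message` and `noise_message` are hypothetical stand-ins (a repeated "meow" and a run of pseudo-random bytes), and a general-purpose compressor from Python's standard library plays the role of the pattern-finder. The random bytes represent the limiting case of an unpredictable message; ordinary speech would fall somewhere between the two.

```python
import random
import zlib

random.seed(0)  # fixed seed so the illustration is repeatable

# Hypothetical stand-ins for a predictable and an unpredictable message source.
cat_message = b"meow " * 200                                        # 1000 highly repetitive bytes
noise_message = bytes(random.getrandbits(8) for _ in range(1000))   # 1000 patternless bytes


def compressed_size(message: bytes) -> int:
    """Return the number of bytes left after DEFLATE compression at the highest level."""
    return len(zlib.compress(message, 9))


# Print original and compressed sizes for each message.
print(len(cat_message), compressed_size(cat_message))
print(len(noise_message), compressed_size(noise_message))
```

The repetitive message shrinks to a small fraction of its original length, while the patternless one does not shrink at all, which is the sense in which a less surprising message is easier to recover from partial data.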

Shannon entropy formalizes this notion. In Shannon’s original formulation, the *entropy* (*H*) of a particular message source (my roommate’s speech, my cat’s vocalizations, Han Solo’s prevarications) is given by an equation,^{[12]} the precise details of which are not essential for our purposes here, that specifies how *unlikely* a *particular* message is, given specifications about the algorithm encoding the message. A particular string of noises coming out of my cat is (in general) far more *likely* than any particular string of noises that comes out of my roommate; my roommate’s speech shows a good deal more variation between messages, and between *pieces* of a given message. A sentence uttered by him has far higher Shannon entropy than a series of meows from my cat. So far, then, this seems like a pretty good candidate for what our intuitive sense of complexity might be tracking: information about complex systems has far more Shannon entropy than information about simple systems. Have we found our answer? Is complexity just Shannon entropy? Alas, things are not quite that easy. Let’s look at a few problem cases.
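For readers who would like to see the quantity written out, the standard textbook form of Shannon's equation is given here for reference. The notation is an assumption of this sketch rather than the author's own: the source is taken to emit messages x_i, each with probability p(x_i), and the logarithm is taken to base 2 so that *H* is measured in bits.

```latex
H = -\sum_{i} p(x_i) \log_2 p(x_i)
```

Each term weights the "surprise" of a message, -log_2 p(x_i), by how often that message occurs, so *H* is the average surprise of the source. A source that always emits the same message has *H* = 0, and a source whose messages are all equally likely has the largest *H* possible for that number of messages.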

First, consider again the "toy science" from **Section 1.3**. We know that for each bit in a given string, there are two possibilities: the bit could be either a ‘1’ or a ‘0.’ In a truly random string in this language, knowing the state of a particular bit doesn’t tell us anything about the state of any other bits: there’s no pattern in the string, and the state of each bit is informationally independent of each of the others. What’s the entropy of a string like that—what’s the entropy of a “message” that contains nothing but randomly generated characters? If we think of the message in terms of how “surprising” it is, the answer is obvious: a randomly-generated string has *maximally high Shannon entropy*. That’s a problem if we’re to appeal to Shannon entropy to characterize complexity: we don’t want it to turn out that purely random messages are rated as even more complex than messages with dense, novel information-content, but that’s precisely what straight appeal to Shannon entropy would entail.
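A short sketch can make the point about random strings concrete. It estimates entropy from symbol frequencies alone, which assumes that each bit is independent of the others (exactly the situation described for the random string); the function name `shannon_entropy` and the three example strings are illustrative choices, not anything taken from the text.

```python
import math
import random
from collections import Counter


def shannon_entropy(message: str) -> float:
    """Empirical entropy in bits per symbol, estimated from symbol frequencies only."""
    counts = Counter(message)
    total = len(message)
    # Sum p * log2(1/p) over the symbols that actually occur in the message.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


random.seed(0)  # fixed seed so the illustration is repeatable

constant_bits = "0" * 1000                                       # no variation at all
skewed_bits = "0" * 900 + "1" * 100                              # mostly predictable
random_bits = "".join(random.choice("01") for _ in range(1000))  # each bit a coin flip

for name, bits in [("constant", constant_bits),
                   ("skewed", skewed_bits),
                   ("random", random_bits)]:
    print(name, round(shannon_entropy(bits), 3))
```

The constant string scores zero, the skewed string scores a little under half a bit per symbol, and the coin-flip string comes out very close to the maximum of one bit per symbol. Because this estimate looks only at how often each symbol appears, it cannot see ordering patterns (a strictly alternating string also scores one bit), so it should be read as a rough illustration of the ranking rather than as a full measure of a string's structure.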

Why not? What’s the problem with calling a purely random message more complex? To see this point, let’s consider a more real-world example. If we want Shannon entropy to work as a straight-forward measure for complexity, it needs to be the case that there's a tight correlation between an increase (or decrease) in Shannon entropy and an increase (or decrease) in complexity. That is: we need it to be the case that complexity is *proportional* to Shannon entropy: call this the *correlation condition*. I don't think this condition is actually satisfied, though: think (to begin) of the difference between my brain at some time *t*, and my brain at some later time *t_{1}*. Even supposing that we can easily (and uncontroversially) find a way to represent the physical state of my brain as something like a message,^{[13]} it seems clear that we can construct a case where measuring Shannon entropy *isn't* going to give us a reliable guide to complexity. Here is such a case.

Suppose that at *t*, my brain is more-or-less as it is now—(mostly) functional, alive, and doing its job of regulating the rest of the systems in my body. Now, suppose that in the time between *t* and *t_{1}*, someone swings a baseball bat at my head. What happens when it impacts? If there's enough force behind the swing, I'll *die*. Why is that? Well, when the bat hits my skull, it transfers a significant amount of kinetic energy through my skull and into my brain, which (among other things) *randomizes*^{[14]} large swaths of my neural network, destroying the correlations that were previously in place, and making it impossible for the network to perform the kind of computation that it must perform to support the rest of my body. This is (I take it) relatively uncontroversial. However, it seems like we also want to say that my brain was *more complex* when it was capable of supporting both life and significant information processing than it was after it was randomized—we want to say that normal living human systems are *more complex* than corpses. But now we've got a problem: in randomizing the state of my brain, we've *increased* the Shannon entropy of the associated message encoding its state. A decrease in complexity here is associated with an increase in Shannon entropy. That looks like trouble, unless a system with minimal Shannon entropy is a system with maximal complexity (that is, unless the strict inverse correlation between entropy and complexity holds). But that's absurd: a system represented by a string of identical characters is certainly not going to be more complex than a system represented by a string of characters in which multiple nuanced patterns are manifest^{[15]}. The correlation condition between entropy and complexity fails.

Shannon entropy, then, can’t be quite what we’re looking for, but neither does it seem to miss the mark entirely. On the face of it, there’s *some* relationship between Shannon entropy and complexity, but the relationship must be more nuanced than simple identity, or even proportionality. Complex systems might well be those with a particular entropic profile, but if that’s the case, then the profile is something more subtle than just “high entropy” or “low entropy.” Indeed, if anything, it seems that there’s a kind of “sweet spot” between maximal and minimal Shannon entropy—systems represented by messages with too much Shannon entropy tend not to be complex (since they’re randomly organized), and systems represented by messages with too little Shannon entropy tend not to be complex, since they’re totally homogeneous. This is a tantalizing observation: there’s a kind of Goldilocks zone here. Why? What’s the significance of that sweet spot? We will return to this question in **Section 2.1.5**. For now, consider one last candidate account of complexity from the existing literature.

**2.1.4 Complexity as Fractal Dimension**

The last candidate definition for complexity that we’ll examine here is also probably the least intuitive. The notion of a fractal was originally introduced as a purely geometric concept by French mathematician Benoit Mandelbrot^{[16]}, but there have been a number of attempts to connect the abstract mathematical character of the fractal to the ostensibly “fractal-like” structure of certain natural systems. Many parts of nature are fractal-like in the sense of displaying a certain degree of what’s sometimes called “statistical self-similarity.” Since we’re primarily interested in real physical systems here (rather than mathematical models), it makes sense to start with that sense of fractal dimension before considering the formal structure of mathematical fractals. Let’s begin, then, by getting a handle on what counts as statistical self-similarity in nature.

Consider a stalk of broccoli or cauliflower that we might find in the produce section of a supermarket. A medium-sized stalk of broccoli is composed of a long smooth stem (which may be truncated by the grocery store, but is usually still visible) and a number of lobes covered in what look like small green bristles. If we look closer, though, we’ll see that we can separate those lobes from one another and remove them. When we do, we’re left with several things that look very much like our original piece of broccoli, only miniaturized: each has a long smooth stem, and a number of smaller lobes that look like bristles. Breaking off one of these smaller lobes reveals another piece that looks much the same. Depending on the size and composition of the original stalk, this process can be iterated several times, until at last you’re removing an individual bristle from the end of a small stalk. Even here, though, the structure looks remarkably similar to that of the original piece: a single green lobe at the end of a long smooth stem.

This is a clear case of the kind of structure that generally gets called “fractal-like.” It’s worth highlighting two relevant features that the broccoli case illustrates nicely. First, fractal-like physical systems have interesting detail at many levels of magnification: as you methodically remove pieces from your broccoli stem, you continue to get pieces with detail that isn’t homogeneous. Contrast this with what it looks like when you perform a similar dissection of (say) a carrot. After separating the leafy bit from the taproot, further divisions produce (no pun intended) pieces that are significantly less interesting: each piece ends up looking more-or-less the same as the last one—smooth, orange, and fibrous. That’s one feature that makes fractal-like parts of the world interesting, but it’s not the only one. After all, it’s certainly the case that there are many *other* systems which, on dissection, can be split into pieces with interesting detail many times over—any sufficiently inhomogeneous mixture will have this feature. What else, then, is the case of fractals tracking? What’s the difference between broccoli and (say) a very inhomogeneous mixture of cake ingredients?

The fact that (to put it one more way) a stalk of broccoli continues to evince interesting details at several levels of magnification cannot be all that makes it fractal-like, so what’s the second feature? Recall that the kind of detail that our repeated broccoli division produced was of a very particular kind—one that kept more-or-less the same structure with every division. Each time we zoomed in on a smaller piece of our original stalk, we found a piece with a long smooth stem and a round green bristle on the end. That is, each division (and magnification) yielded a structure that not only resembled the structure which resulted from the *previous* division, but also the structure that we *started* with. The interesting detail at each level was structurally similar to the interesting detail at the level above and below it. This is what separates fractal-like systems from merely inhomogeneous mixtures—not only is interesting detail present with each division, but it *looks the same*. Fractal-like systems (or, at least the fractal-like systems we’re interested in here) show *interesting details* at multiple levels of magnification, and the interesting details present at each level are *self-similar*.

With this intuitive picture on the table, let’s spend a moment looking at the more formal definition of fractals given in mathematics. Notice that we’ve been calling physical systems “fractal-*like*” all along here; that’s because nothing in nature is *actually* a fractal, in just the same sense that nothing in nature is *actually* a circle. In the case of circles, we know exactly what it means to say that there are no circles in nature: no natural systems exist which are precisely isomorphic to the equation that describes a geometric circle. Things (e.g. basketball hoops) are *circular*, but on close enough examination they turn out to be rough and bumpy in a way that a mathematical circle is not. The same is true of fractals; if we continue to subdivide the broccoli stalk discussed above, eventually we’ll reach a point where the self-similarity breaks down: we can’t carry on getting smaller and smaller smooth green stems and round green bristles forever. Moreover, the kind of similarity that we see at each level of magnification is only *approximate*: each of the lobes looks *a lot* like the original piece of broccoli, but the resemblance isn’t perfect (it’s just pretty close). That’s the sense in which fractal-like physical systems are only *statistically* self-similar: at each level of magnification, you’re *likely* to end up with a piece that looks more-or-less the same as the original one, but the similarity isn’t perfect. The tiny bristle isn’t just a broccoli stalk that’s been shrunk to a tiny size, but it’s *almost* that. This isn’t the case for mathematical fractals: a true fractal has the two features outlined above at *every* level of magnification. There’s always more interesting detail to see, and the interesting details are always *perfectly* self-similar miniature copies of the original.

Here’s an example of an algorithm that will produce a true fractal:

1. Draw a square.
2. Draw a 45-45-90 triangle on top of the square, so that the top edge of the square and the base of the triangle are the same line. Put the 90 degree angle at the vertex of the triangle, opposite the base.
3. Use each of the other two sides of the triangle as sides for two new (smaller) squares.
4. Repeat steps 2-4 for each of the new squares you’ve drawn.
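The recursion in the steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used to draw the figure: the function name `pythagoras_tree`, the choice of a unit starting square, and the use of complex numbers to represent points are all assumptions made here for convenience.

```python
import math


def pythagoras_tree(depth):
    """Return a list of levels; level n holds the squares drawn at iteration n.

    Each square is stored as a (corner, edge) pair of complex numbers:
    `corner` is one end of the square's base and `edge` is the vector along
    the base. The square extends to the left of `edge` (the direction 1j * edge).
    """
    levels = [[(0 + 0j, 1 + 0j)]]  # the first square, with side length 1
    for _ in range(depth):
        next_level = []
        for corner, edge in levels[-1]:
            top_left = corner + 1j * edge            # one end of the square's top edge
            apex = top_left + edge * (1 + 1j) / 2    # right-angle vertex of the triangle
            # The two short sides of the triangle become bases of new squares.
            next_level.append((top_left, edge * (1 + 1j) / 2))
            next_level.append((apex, edge * (1 - 1j) / 2))
        levels.append(next_level)
    return levels


levels = pythagoras_tree(4)
for n, level in enumerate(levels):
    side = abs(level[0][1])
    print(f"iteration {n}: {len(level)} squares, side length {side:.4f}")
```

Each pass doubles the number of squares and shrinks their sides by a factor of √2/2, which is the pattern the figure displays.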

Here’s what this algorithm produces after just a dozen iterations:

Look familiar? This shape^{[17]} is starting to look suspiciously like our stalk of broccoli: there’s a main “stem” formed by the first few shapes (and the negative space of later shapes), “lobes” branching off from the main stem with stems of their own, and so on. If you could iterate this procedure an infinite number of times, in fact, you’d produce a *perfect* fractal: you could zoom in on almost any region of the shape and find *perfect* miniaturized copies of what you started with. Zooming in again on any region of one of those copies would yield even more copies, *ad infinitum*.

This is a neat mathematical trick, but (you might wonder) what’s the point of this discussion? How does this bear on complexity? Stay with me just a bit longer here; we’re almost there. To explain the supposed connection between fractal-like systems and complexity, we have to look a bit more closely at some of the mathematics behind geometric fractals; in particular, we’ll have to introduce a concept called *fractal dimension*. All the details certainly aren’t necessary for what we’re doing here, but a rough grasp of the concepts will be helpful for what follows. Consider, to begin with, the intuitive notion of “dimension” that’s taught in high school math classes: the dimensionality of a space is just a specification of how many numbers need to be given in order to uniquely identify a point in that space. This definition is sufficient for most familiar spaces (such as all subsets of Euclidean spaces), but breaks down in the case of some more interesting figures^{[18]}. One of the cases in which this definition becomes fuzzy is the case of the Pythagoras Tree described above: because of the way the figure is structured, it behaves in some formal ways as a two-dimensional figure, and in other ways as though it were not two-dimensional.

The notion of *topological dimensionality* refines the intuitive concept of dimensionality. A full discussion of topological dimension is beyond the scope of this chapter, but the basics of the idea are easy enough to grasp. Topological dimensionality is also sometimes called “covering dimensionality,” since it is (among other things) a fact about how difficult it is to *cover* the figure in question with other overlapping figures, and how that covering can be done most efficiently. Consider the case of the following curve^{[19]}:

Suppose we want to cover this curve with a series of open (in the sense of not including their own boundary points) disks. There are many different ways we could do it, three of which are shown in the figure above. In the case on the bottom left, several points are contained in the intersection of four disks; in the case in the middle, no point is contained in the intersection of more than three disks; finally, the case on the right leaves no point contained in the intersection of more than two disks. It’s easy to see that this is the furthest we could possibly push this covering: it wouldn’t be possible to arrange open disks of any size into any configuration where the curve was both completely covered and no disks overlapped^{[20]}. We can use this to define topological dimensionality in general: for a given figure *F*, the topological dimension is defined to be the minimum value of *n*, such that every finite open cover of *F* has a finite open refinement in which no point is included in more than *n+1* elements. In plain English, that just means that the topological dimension of a figure is one less than the largest number of intersecting covers (disks, in our example) in the most efficient scheme to cover the whole figure. Since the most efficient refinement of the cover for the curve above is one where there is a maximum of *two* disks intersecting on a given point, this definition tells us that the figure is *1-dimensional*. So far so good: it’s a line, and so in this case topological dimensionality concurs with intuitive dimensionality^{[21]}.
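The covering argument can be made concrete with a one-dimensional stand-in. In the sketch below, the curve is replaced by the segment [0, 1] and each open disk by the open interval it cuts out along the curve; this simplification, the function names `max_overlap` and `covers_segment`, and the three example covers are all assumptions chosen for illustration, not taken from the figure.

```python
def max_overlap(intervals):
    """Largest number of open intervals (a, b) that share a common point."""
    events = []
    for a, b in intervals:
        events.append((a, 1))    # an interval opens
        events.append((b, -1))   # an interval closes
    # At equal coordinates, closes sort before opens: open intervals that
    # merely touch at an endpoint have no point in common.
    events.sort()
    depth = best = 0
    for _, delta in events:
        depth += delta
        best = max(best, depth)
    return best


def covers_segment(intervals, lo=0.0, hi=1.0):
    """True if every point of the closed segment [lo, hi] lies in some open interval."""
    point = lo  # leftmost point not yet known to be covered
    while True:
        ends = [b for a, b in intervals if a < point < b]
        if not ends:
            return False         # `point` itself is left uncovered
        point = max(ends)
        if point > hi:
            return True


covers = {
    "sloppy": [(-0.1, 0.5), (0.1, 0.6), (0.2, 0.7), (0.3, 1.1)],
    "efficient": [(-0.1, 0.4), (0.3, 0.7), (0.6, 1.1)],
    "non-overlapping": [(-0.1, 0.5), (0.5, 1.1)],
}
for name, cover in covers.items():
    print(name, "covers:", covers_segment(cover), "max overlap:", max_overlap(cover))
```

The efficient cover reaches a maximum overlap of two, and the non-overlapping attempt fails to cover the point where its two intervals meet, which is why the minimum achievable overlap is two and the topological dimension is 2 - 1 = 1.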

There’s one more mathematical notion that we need to examine before we can get to the punch-line of this discussion: fractal dimensionality. Again, a simple example^{[22]} can illustrate this point rather clearly. Consider a Euclidean line segment. Bisecting that line produces two line segments, each with half the length of the original segment. Bisecting the segments again produces four segments, each with one-quarter the length of the original segment. Next, consider a square on a Euclidean plane. Bisecting each side of the square results in four copies, each one-quarter the size of the original square. Bisecting each side of the new squares will result in 16 squares, each a quarter the size of the squares in the second step. Finally, consider a cube. Bisecting each face of the cube will yield eight one-eighth sized copies of the original cube.

These cases provide an illustration of the general idea behind fractal dimension. Very roughly, fractal dimension is a measure of the relationship between how many *copies* of a figure are present at different levels of magnification and how much the *size* of those copies changes between levels of magnification^{[23]}. In fact, we can think of it as a *ratio* between these two quantities. The fractal dimension *d* of an object is equal to log(*a*)/log(*b*), where *a* is the number of new copies present at each level, and *b* is the factor by which each piece must be magnified in order to have the same size as the original. This definition tells us that a line is one-dimensional: it can be broken into *n* pieces, each of which is *n*-times smaller than the original. If we let *n* = 2, as in our bisection case, then we can see easily that log(2)/log(2) = 1. Likewise, it tells us that a square is two-dimensional: a square can be broken into *n*^{2} pieces, each of which must be magnified by a factor of *n* to recover the size of the original figure; again, let *n* = 2 as in our bisection case, so that the bisected square contains 2^{2} = 4 copies of the original figure, each of which must be doubled in size to recover the area of the original figure. Since log(4)/log(2) = 2, the square is two-dimensional. So far so good. It’s worth pointing out that in these more familiar cases intuitive dimension = topological dimension = fractal dimension. That is not the case for all figures, though.
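As a quick check on the formula log(*a*)/log(*b*), here is a minimal Python sketch that applies it to the bisected line, square, and cube. The function name `similarity_dimension` is a label chosen here for illustration; the formula itself is the one given in the text.

```python
import math


def similarity_dimension(copies, magnification):
    """d = log(a) / log(b): a copies, each magnified by b to match the original."""
    return math.log(copies) / math.log(magnification)


# Bisection (b = 2) of a line, a square, and a cube yields 2, 4, and 8 copies.
for name, copies in [("line segment", 2), ("square", 4), ("cube", 8)]:
    d = similarity_dimension(copies, 2)
    print(f"{name}: {copies} copies, dimension ~ {d:.3f}")
```

The result does not depend on the choice of bisection: cutting each side into three instead gives 3, 9, and 27 copies with *b* = 3, and the same three dimensions.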

Finally, consider our broccoli-like fractal: the Pythagoras Tree. The Pythagoras Tree, as you can easily confirm, has a fractal dimension of 2: at each step *n* in the generation, there are 2^{n} copies of the figure present: 1 on the zeroth iteration, 2 after a single iteration, 4 after two iterations, 8 after three, 16 after four, and so on. Additionally, each iteration produces figures that are smaller by a factor of √2/2, so each new copy must be magnified by a factor of 2/√2 = √2 to recover the size of its parent. Following our formula from above with *a* = 2 and *b* = √2, we can calculate log(2)/log(√2), which is equal to 2. This accords with our intuitive ascription of dimensionality (the Pythagoras Tree looks like a plane figure) but, more interestingly, it *fails* to accord with the topological dimension of the figure. Perhaps surprisingly, the Pythagoras Tree’s topological dimension is not 2 but *1*: like a simple curve, it can be covered by disks such that no point is in the intersection of more than two disks^{[24]}. Topologically, the Pythagoras Tree behaves like a simple one-dimensional line, while in other ways it behaves more like a higher dimensional figure. Fractal dimension lets us quantify the amount by which these behaviors diverge, and a divergence of this kind is common to many (but not all) fractals. In addition to the two-pronged “fine detail and self-similarity” definition given above, Mandelbrot, in his original discussion of fractals, offers an alternative definition: a fractal is a figure where the fractal dimension is greater than the topological dimension^{[25]}.
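The arithmetic for the Pythagoras Tree can be checked with a short Python sketch. It assumes a starting square of side 1, and the names `SHRINK`, `MAGNIFY`, and `tree_level` are illustrative labels only. Note that *b* in the formula is the magnification factor √2 (the reciprocal of the shrink factor √2/2); plugging the shrink factor in directly would flip the sign of the result.

```python
import math

SHRINK = math.sqrt(2) / 2   # side of each new square relative to its parent
MAGNIFY = 1 / SHRINK        # factor needed to blow a copy back up: sqrt(2)


def tree_level(n):
    """Count, side length, and combined area of the squares added at iteration n."""
    count = 2 ** n
    side = SHRINK ** n
    return count, side, count * side ** 2


fractal_dimension = math.log(2) / math.log(MAGNIFY)
print(f"fractal dimension ~ {fractal_dimension:.3f}")

for n in range(5):
    count, side, area = tree_level(n)
    print(f"iteration {n}: {count} squares, side {side:.4f}, combined area {area:.4f}")
```

One way to read the result: the squares added at each iteration have the same combined area as the original square, so the count of copies grows exactly as fast as their individual areas shrink, which is the behavior the formula associates with a two-dimensional figure.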

At last, we’re in a position, then, to say what it is about fractals that’s supposed to capture our notion of complexity. Since fractal dimension quantifies the relationship between the proliferation of detail and the change in magnification scale, an object with a higher fractal dimension will show *more* interesting detail than an object with a lower fractal dimension, given the same amount of magnification. In the case of objects that are appropriately called “fractal-like” (e.g. our stalk of broccoli), this cascade of detail is more significant than you’d expect it to be for an object with the sort of abstract (i.e. topological) structure it has. That’s what it means to say that fractal dimension exceeds topological dimension for most fractals (and fractal-like objects): the buildup of interesting details in a sense “outruns” the buildup of other geometric characteristics. Objects with higher fractal dimension are, in a sense, richer and more rewarding: it takes less magnification to see more detail, and the detail you can see is more intricately structured.

So is this measure sufficient, then? You can probably guess by now that the answer is ‘no, not entirely.’ There are certainly cases where fractal dimension accords very nicely with what we mean by ‘complex:’ it excels, for instance, at tracking the rugged complexity of coastlines. Coasts—which were among Mandelbrot’s original paradigm cases of fractal-like objects—are statistically self-similar in much the same way that broccoli is. Viewed from high above, coastlines look jagged and irregular. As you zoom in on a particular section of the coast, this kind of jaggedness persists: a small segment of shore along a coast that is very rugged *in general* is likely to be very rugged itself. Just as with the broccoli, this self-similarity is (of course) not perfect: the San Francisco bay is not a perfect miniaturization of California’s coastline overall, but they look similar in many respects. Moreover, it turns out that the more rugged a coastline is, the higher fractal dimension it has: coasts with outlines that are very *complex* have higher fractal dimension than coasts that are relatively *simple* and smooth.

The most serious problem with using fractal dimension as a general measure of complexity is that it seems to chiefly be quantifying a fact about how complex an object’s *spatial configuration* is: the statistical self-similarity that both broccoli and coastlines show is a self-similarity of *shape*. This is just fine when what we’re interested in is the structure or composition of an object, but it isn’t at all clear how this notion might be expanded. After all, at least some of our judgments of complexity seem (at least at first glance) to have very little to do with shape: when I say (for instance) that the global economy is more complex today than it was 300 years ago, it doesn’t look like I’m making a claim about the shape of any particular object. Similarly, when I say that a human is more complex than a fern, I don’t seem to be claiming that the shape of the human body has a greater fractal dimension than the shape of a fern. In many (perhaps most) cases, we’re interested not in the *shape* of an object, but in how the object *behaves* over time; we’re concerned not only with relatively static properties like fractal dimension, but with dynamical ones too. Just as with Shannon entropy, there seems to be a grain of truth buried in the fractal dimension measure, but it will take some work to articulate what it is; also like Shannon entropy, it seems as though fractal dimension by itself will not be sufficient.

**2.2 Moving Forward**

We have spent the majority of this chapter introducing some of the concepts behind contemporary complexity theory, and examining various existing attempts to define ‘complexity.’ I have argued (convincingly, I hope) that none of these attempts really captures all the interesting facets of what we’re talking about when we talk about complex physical systems (like the Earth’s climate). I have not yet offered a positive view, though—I have not yet told you what I would propose to use in place of the concepts surveyed here. In **Chapter Three**, I shall take up that project, and present a novel account of what it means for a physical system to be complex in the relevant sense. This concept, which I will call *dynamical complexity*, is presented as a physical interpretation of some very recent mathematical advancements in the field of information theory. The central problem that shall occupy us in the next chapter, then, is how to transform a discussion of complexity that seems to work very well for things like *messages* into an account that works well for things like climate systems. My hope is that dynamical complexity offers this bridge. Once this final conceptual tool is on the table, we can start applying all of this to the problem of understanding the Earth’s climate.

- ↑ I'm going to rely quite heavily on our intuitive judgments of complexity in this chapter; in particular, I'll argue that some of the definitions we consider later on are insufficient because they fail to accord with our intuitive judgments about what counts as a complex system. Since constructing a more rigorous definition is precisely what we're trying to do here, this doesn't seem like much of a problem. We've got to start somewhere.
- ↑ For an even more exhaustive survey of different attempts to quantify “complexity” in the existing literature, see Chapter 7 of Mitchell (2009). We will not survey every such proposal here, but rather will focus our attention on a few of the leading contenders—both the most intuitive proposals and the proposals that seem to have gotten the most mileage—before offering a novel account of complexity that attempts to synthesize these contenders.
- ↑ It’s interesting to point out that this is precisely the intuition that many proponents of the “intelligent design” explanation for biological complexity want to press on. See, canonically, Paley (1802).
- ↑ Even still, the amount of information encoded in the human genome is shockingly small by today’s storage standards: the Human Genome Project has found that there are about 2.9 billion base-pairs in the human genome. If every base-pair can be coded with two bits, this corresponds to about 691.4 megabytes of data. Moreover, Christley et al. (2009) point out that since individual genomes vary by less than 1% from each other, they can be losslessly compressed to roughly 4 megabytes. To put that in perspective, even a relatively cheap modern smartphone has about 16 gigabytes of memory, enough to store roughly 4,000 complete human genomes compressed in this way.
- ↑ Strevens (2003), p. 7
- ↑ Kiesling (2011)
- ↑ Whether or not these comparisons are *accurate* is another matter entirely. That is, whether or not you think it's actually *true* to say that humans are less complex than the 21^{st} century global economy, it seems clear that the comparison is at least *sensible*. Or, at least, it seems clear that it *ought* to be sensible if we're to succeed in our goal of finding a notion of "complexity" that is widely-applicable enough to be useful. I'll argue in **2.2** that there *is* sense to the comparison and (moreover) that the global economy *is* more complex than an individual human. For now, though, it's enough to point out that even having that discussion presupposes a wide notion of complexity that renders the mereological size measure suspect.
- ↑ Most amphibians have between 10^{9} and 10^{11} base-pairs. *Psilotum nudum*, a member of the fern family, has even more: something on the order of 2.5 x 10^{11} base-pairs. The latter case is perhaps the most striking comparison, since *P. nudum* is quite primitive, even compared to other ferns (which are among the oldest plants still around): it lacks leaves, flowers, and fruit. It closely resembles plants from the Silurian period (~443 million years ago – 416 million years ago), which are among the oldest vascular plants we've found in the fossil record.
- ↑ See, e.g., Morgan (1923), Oppenheim & Putnam (1958), and (to a lesser extent) Kim (2002)
- ↑ Simon (1962)
- ↑ See Shannon (1948) and Shannon & Weaver (1949)
- ↑ *H* = ∑_{i} *P*_{i}*H*_{i}. This equation expresses the entropy in terms of a sum of probabilities *p*_{i}(*j*) for producing various symbols *j* such that the message in question is structured the way it is. Thus, the more variation you can expect in each *bit* of the message, the higher the entropy of the total message. For a more detailed discussion of the process by which this equation can be derived, see Shannon (1948) and Shannon & Weaver (1964).
- ↑ Mitchell (op. cit.) points out that if we’re to use any measure of this sort to define complexity, anything we wish to appropriately call “complex” must be put into a form for which Shannon entropy can be calculated; that is, it has to be put into the form of a *message*. This works just fine for speech, but it isn’t immediately obvious how we might go about re-describing (say) the brain of a human and the brain of an ant as messages such that we can calculate their Shannon entropy. This problem may not be insurmountable (I’ll argue in **2.2** that it can indeed be surmounted), but it is worth noting still.
- ↑ The sense of “randomizes” here is a thermodynamic one. By introducing a large amount of kinetic energy into my brain, my assailant (among other things) makes it the case that the volume of the region of configuration space associated with my brain is *wildly* increased. That is, the state “Jon is conscious and trying to dodge that baseball bat” is compatible with far fewer microstates of my brain than is the state “Jon has been knocked out by a baseball bat to the face.” The bat’s impacting with my skull, then, results in a large amount of information loss about the system: the number of possible *encodings* for the new state is larger than the number of possible encodings for the old state. The Shannon entropy has thus increased.
- ↑ To see this point, think of two pieces of DNA: one which codes for a normal organism (say, a human being) and one of similar length, but which consists only in cytosine-guanine pairs. Each DNA string can be encoded as a message consisting entirely of the letters A, C, G, and T. The piece of DNA that codes for a functional organism will be associated with a message with *far* higher Shannon entropy than the piece of DNA associated with a message that consists entirely of the string ‘CG’ repeated many times. Surely DNA that codes for a functional organism, though, is more complex than a non-coding DNA molecule. Again, the correlation condition fails.
- ↑ Mandelbrot (1986)
- ↑ The shape generated by this procedure is called the Pythagoras Tree.
- ↑ Additionally, it’s difficult to make this definition of dimensionality more precise than the very vague phrasing we’ve given it here. Consider a curve embedded in a two-dimensional Euclidean plane, something like a squiggly line drawn on a chalkboard. What’s the dimensionality of that figure? Our intuitions come into conflict here: for each point on the curve, we have to specify two numbers (the Cartesian coordinates) in order to uniquely pick it out. On the other hand, this seems to just be a consequence of the fact that the curve is embedded in a two-dimensional space, not a fact about the curve *itself*; since it’s just a line, it seems like it ought to just be *one*-dimensional. The intuitive account of dimensionality has no way to resolve this conflict of reasoning.
- ↑ This figure is adapted from one in Kraft (1995)
- ↑ Why not? Remember that the disks are *open*, so points just at the “boundary” are not contained in the disks. Thus, a series of very small disks that were very near each other without intersecting would necessarily leave at least some points uncovered: those in the tiny region between two open disks. The only way to cover the whole figure is to allow the disks to overlap slightly.
- ↑ This also lets us move beyond our problem case from above: we can say why it is that a curve on a plane can be one-dimensional even though it is embedded in a two-dimensional space.
- ↑ This exceedingly clear way of illustrating the point is due to Mitchell (op. cit.), though our discussion here is somewhat more technically precise than the discussion there; Mitchell hides the mathematics behind the discussion, and fails to make the connection between fractal dimension and topological dimension explicit, resulting in a somewhat confusing discussion as she equivocates between the two senses of "dimension." For a more formal definition of fractal dimensionality (especially in the case of Pythagoras Tree-like figures), see Lofstedt (2008).
- ↑ In the illustration here, we had to build in the presence of “copies” by hand, since a featureless line (or square or cube) has no self-similarity at all. That’s OK: the action of bisecting the figure is, in a sense, a purely abstract operation: we’re not changing anything about the topology of the figures in question by supposing that they’re being altered in this way. In figures with *actual* self-similarity (like fractals), we won’t have to appeal to this somewhat arbitrary-seeming procedure.
- ↑ The math behind this assertion is, again, beyond the scope of what we’re concerned with here. For a detailed discussion of why the topological dimension of fractal canopies (the class of figures to which the Pythagoras Tree belongs) is 1, see Mandelbrot (1986), Chapter 16.
- ↑ Mandelbrot offered these two definitions as equivalent. It has since been discovered, though, that there are a number of fractals (in the first sense) for which the latter definition does not hold. See Kraft (1995) for more on this.