The Chemist Measuring Complexity: Lee Cronin and the Revolution in Origin-of-Life Science

From simple building blocks to complex structures: the mathematics of construction.

Series: Assembly Theory | Part: 1 of 9


In a Glasgow laboratory, a chemist is attempting something that sounds impossible: measuring what it means for something to be complex. Not complicated, not intricate, not merely hard to understand—but genuinely, measurably complex in a way that distinguishes a living cell from a crystal, a poem from a password, selection from randomness.

Lee Cronin doesn't look like someone trying to overturn our understanding of what life is. He's a professor of chemistry at the University of Glasgow, his office cluttered with molecular models and mass spectrometry printouts. But over the past decade, Cronin and his collaborators have been developing what might be the most significant new framework for understanding biological complexity since Shannon gave us information theory: Assembly Theory.

The central insight is almost childlike in its simplicity. If you find a complex molecule—say, a protein with hundreds of amino acids arranged in a precise sequence—you can ask: How many steps would it take to build this from scratch? Not how many steps did evolution actually take, but the minimum number of construction operations required. This number, the assembly index, turns out to reveal something profound about the object's origin.

And here's the revolutionary claim: if the assembly index is high enough, the object couldn't have formed by chance. It must be the product of selection, of some process that accumulated and reused previous constructions. In other words, high assembly is a signature of life—or at least, of life-like processes. This isn't speculation. Cronin's team has published data showing they can reliably distinguish biological from non-biological molecules using nothing but assembly index measurements.

For those following the AToM framework (M = C/T), this should immediately raise flags. Assembly Theory is asking about the construction history embedded in an object's structure—which sounds remarkably like asking about the coherence trajectory that brought a system into its current state. If meaning equals coherence over time, and assembly captures the temporal depth of selection processes, then we might be looking at two different formalisms converging on the same underlying phenomenon.

But we're getting ahead of ourselves. This is the introduction to a series that will take Assembly Theory seriously—not as a curiosity, not as a competitor to AToM, but as a complementary formalism that illuminates how complexity emerges through time under constraint. We'll trace its foundations, understand its mathematics, explore its implications for biology and astrobiology, and ultimately ask whether assembly concepts can scale beyond molecules to meanings, ideas, and cultural forms.

This first article lays the groundwork. Who is Lee Cronin? What problem was he trying to solve? Why do we need a new theory of complexity when we already have thermodynamics, information theory, and complexity science? And what does it actually mean to measure how hard something is to make?


The Chemist Who Wouldn't Accept "That's Just How It Is"

Lee Cronin came to the origin-of-life question from an unusual direction: synthetic chemistry. Most origin-of-life researchers are biologists trying to work backward, tracing metabolic pathways and genetic mechanisms to their primordial roots. Cronin was building complex molecules in the lab and kept running into a puzzle.

When you synthesize a complex natural product—something like a drug molecule or a biological cofactor—you follow a recipe. Step 1: react compound A with compound B. Step 2: purify the intermediate. Step 3: add reagent C under specific conditions. The number of steps matters immensely. A 5-step synthesis is publishable. A 20-step synthesis is heroic. A 50-step synthesis borders on the impossible.

Now consider: living cells synthesize molecules with assembly complexities that would require hundreds or thousands of synthetic steps if you tried to build them from scratch in a flask. They do this routinely, constantly, with remarkable reliability. How?

The standard answer is "evolution." Natural selection accumulated useful constructions over billions of years, building complex molecular machinery through gradual optimization. But this answer, while true, didn't satisfy Cronin. It explains why cells can do it (they had time to evolve the capacity) but not what physical principle allows complexity to accumulate in the first place.

Here's the deeper puzzle: thermodynamically, complex structures shouldn't spontaneously form. The Second Law favors disorder. Complex molecules are highly ordered, low-entropy configurations that entropy should rapidly dismantle. Life doesn't violate thermodynamics—it's an open system that exports entropy—but this still doesn't explain how the information gets encoded into the molecular structures themselves.

Information theory gives us bits, but bits alone don't capture what makes a protein different from a random polymer of the same length. A shuffled genome contains the same amount of Shannon information as a functional one, but only one of them codes for anything useful. Something is missing from our conceptual toolkit.

Cronin's insight was to focus on construction history. A complex molecule isn't just a particular arrangement of atoms—it's a structure that required a sequence of operations to assemble. The number of operations, and particularly the reuse of earlier constructions, tells you something fundamental about the molecule's origin.

This is where Assembly Theory begins: with the radical claim that the complexity of an object is not about its structure or its information content, but about the shortest path required to construct it from elementary building blocks.


Assembly Index: Complexity as Construction Cost

The assembly index of an object is the minimum number of recursive joining operations needed to construct it from its basic parts.

Let's make this concrete. Consider a simple string of letters: "ABCABC". How many steps to build it?

  You have the basic parts: A, B, C.
  1. Join A and B → "AB"
  2. Join "AB" and C → "ABC"
  3. Join "ABC" and "ABC" → "ABCABC"

Assembly index = 3.

Now consider "ABCDEFG":

  1. Join A and B → "AB"
  2. Join "AB" and C → "ABC"
  3. Join "ABC" and D → "ABCD"
  4. Continue through all letters...

Assembly index = 6.

Notice something crucial: "ABCABC" has a lower assembly index than "ABCDEF" despite being the same length. Why? Because "ABCABC" reuses a construction. Once you've made "ABC", you can copy it. Reuse is what drives down assembly cost, and reuse is precisely what signals that something other than pure randomness is at work.
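
To make the counting procedure concrete, here is a minimal Python sketch. It is my own toy illustration on strings, not Cronin's published algorithm (which works on molecular bond graphs and mass-spectrometry data): it brute-forces the minimum number of joining operations for a short string, allowing any previously built fragment to be reused. The search is exponential, so it is only usable for toy examples like the ones above.

  from itertools import product

  def assembly_index(target: str) -> int:
      """Minimum number of pairwise joins needed to build `target` from its
      individual characters, with reuse of already-built fragments allowed.
      Brute-force breadth-first search: only feasible for very short strings."""
      if len(target) <= 1:
          return 0
      alphabet = set(target)
      # Only substrings of the target can be useful intermediate fragments.
      useful = {target[i:j]
                for i in range(len(target))
                for j in range(i + 2, len(target) + 1)}
      frontier = {frozenset()}            # each state = set of fragments built so far
      seen = set(frontier)
      joins = 0
      while frontier:
          joins += 1
          next_frontier = set()
          for built in frontier:
              available = alphabet | set(built)
              for a, b in product(available, repeat=2):
                  candidate = a + b
                  if candidate == target:
                      return joins
                  if candidate not in useful:
                      continue
                  state = frozenset(built | {candidate})
                  if state not in seen:
                      seen.add(state)
                      next_frontier.add(state)
          frontier = next_frontier
      raise ValueError("unreachable for non-empty targets")

  print(assembly_index("ABCABC"))   # 3: build "AB", then "ABC", then "ABC" + "ABC"
  print(assembly_index("ABCDEF"))   # 5: nothing can be reused

Real molecules are graphs rather than strings, and the operations are bond-forming steps rather than concatenations, but the logic of minimal construction with reuse is the same.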

This principle scales to molecules. A long-chain polymer made of repeating units (like many plastics) has relatively low assembly—you make the monomer, then repeat the polymerization step. But a protein with 300 amino acids in a specific, non-repeating sequence has extremely high assembly. Every new residue in a unique position adds to the construction cost.

Cronin's team has measured assembly indices for thousands of molecules using mass spectrometry. The method is clever: fragment the molecule in all possible ways, identify the largest fragment that appears multiple times, and use the fragment pattern to compute the minimum construction tree. The result is an empirical assembly index you can extract from real molecular data.
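
As a toy analogue of that reuse-spotting step, the sketch below (again on strings, and purely illustrative of the idea rather than the published mass-spectrometry method) finds the longest fragment that occurs at least twice, builds it once, and reuses it. That gives a quick upper bound on the assembly index without the exhaustive search shown earlier.

  def assembly_upper_bound(target: str) -> int:
      """Quick upper bound on a string's assembly index: find the longest
      fragment that occurs at least twice without overlapping, build it once,
      reuse every copy, and chain-join whatever is left over."""
      n = len(target)
      if n <= 1:
          return 0
      best = ""
      for length in range(n // 2, 1, -1):          # try the longest fragments first
          for i in range(n - length + 1):
              fragment = target[i:i + length]
              if target.find(fragment, i + length) != -1:
                  best = fragment
                  break
          if best:
              break
      if not best:
          return n - 1                              # no reuse possible: chain the characters one join at a time
      leftover = target.replace(best, "#")          # each "#" stands for one reused copy
      return assembly_upper_bound(best) + (len(leftover) - 1)

  print(assembly_upper_bound("ABCABC"))     # 3
  print(assembly_upper_bound("ABCDEF"))     # 5
  print(assembly_upper_bound("ABABABAB"))   # 3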

Here's what they found: there's a sharp threshold around assembly index 15. Below this, you find molecules in meteorites, in primordial soup simulations, in any environment where chemistry happens. Above this threshold, you find only biological molecules—products of living systems or of human chemists (who are themselves biological).

The threshold isn't arbitrary. It corresponds to the point where random assembly becomes astronomically improbable. Below 15, you might get lucky. Above 15, you need selection—a process that preserves useful intermediates and builds on them recursively.
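
A back-of-envelope model shows why the probability collapses (this is my own cartoon, with arbitrary numbers, not a calculation from the Assembly Theory papers): suppose undirected chemistry, at every step, joins a uniformly random ordered pair drawn from everything available so far. The chance of tracing any one specific pathway shrinks combinatorially with path length.

  def chance_of_specific_pathway(n_blocks: int, n_steps: int) -> float:
      """Toy estimate of the probability that a blind process, which at every
      step joins a uniformly random ordered pair of the objects available so
      far, happens to follow one particular assembly pathway of n_steps joins."""
      probability = 1.0
      pool = n_blocks                    # start with just the basic building blocks
      for _ in range(n_steps):
          probability /= pool * pool     # one specific ordered join out of pool**2 options
          pool += 1                      # the newly built fragment joins the pool
      return probability

  for steps in (5, 10, 15, 20):
      print(f"{steps:2d} joins: {chance_of_specific_pathway(4, steps):.1e}")

Even in this cartoon, with only four building blocks, the chance of one specific 15-step construction is smaller than one in 10^30; real chemistry offers vastly more competing reactions, which only sharpens the point.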


Why This Isn't Information Theory (and Why That Matters)

Claude Shannon's information theory, developed in 1948, gave us a rigorous way to quantify information. Shannon entropy measures the uncertainty or "surprise" in a message. A random string has maximum entropy; a highly patterned string has low entropy.

But Shannon information is about statistical patterns, not construction. A genome and its shuffled version have identical Shannon entropy. A functional protein and a random amino acid chain of the same composition have similar information content. Yet one is biologically meaningful and the other is junk.
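
You can see this directly in a few lines of Python (a toy demonstration using per-character frequency entropy; real genomic entropy estimates are more involved): shuffling a sequence leaves its Shannon entropy untouched, because entropy only sees the frequencies, not the order or the construction history.

  import math
  import random

  def shannon_entropy(sequence: str) -> float:
      """Per-symbol Shannon entropy (in bits) computed from character frequencies."""
      n = len(sequence)
      counts = (sequence.count(ch) for ch in set(sequence))
      return -sum((c / n) * math.log2(c / n) for c in counts)

  original = "ABCABCABCABC"
  shuffled = "".join(random.sample(original, len(original)))

  print(shannon_entropy(original))   # same character frequencies...
  print(shannon_entropy(shuffled))   # ...so exactly the same entropy, different order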

Assembly Theory offers something different: a measure of temporal depth. The assembly index doesn't care about statistical patterns. It cares about how many distinct operations you had to perform, and crucially, whether you reused earlier constructions.

This is fundamentally about history. An object with high assembly is one that couldn't have popped into existence in a single moment. It requires a past—a sequence of events that built up the structure incrementally, preserving useful substructures along the way.

In AToM terms, this is strikingly resonant with the concept of coherence trajectories. A coherent system isn't just one with low free energy in the present; it's one that has been shaped by its history of interactions, constrained by past couplings that determined which pathways were viable. Assembly captures, in molecular terms, the depth of that history.

Where Shannon measures what—the pattern of bits—Assembly measures how it got there—the construction path. For understanding life, this distinction is everything.


The Threshold of Aliveness

If assembly index reliably separates biological from non-biological molecules, then it's doing something profound: it's giving us an operational definition of life that doesn't depend on any specific biochemistry.

Consider the usual definitions of life. NASA's working definition: "a self-sustaining chemical system capable of Darwinian evolution." This is useful but circular—it defines life in terms of evolution, which is itself the thing we're trying to explain.

Or consider metabolism-based definitions: life is what maintains itself far from equilibrium, exporting entropy. True, but not distinctive—hurricanes and flames do this too.

Assembly Theory cuts through this by focusing on the products. Life is whatever produces objects with assembly indices above the threshold where random processes become implausible. This isn't circular. It's measurable. And crucially, it's substrate-independent.

You don't need to know if the molecules are carbon-based or silicon-based, use DNA or some alien genetic system, metabolize glucose or exotic Titan hydrocarbons. You just need to measure: what's the assembly index of the molecules you're finding? If you're consistently seeing assembly indices above 15, you've found something that accumulates complex constructions. In other words, you've found life—or at least, something life-like.

This has enormous implications for astrobiology. Current biosignature searches look for specific molecules (oxygen, methane, phosphine) or specific metabolic byproducts. But these depend on assumptions about what alien biochemistry might look like. Assembly Theory offers a more general approach: look for any molecules with high assembly. If you find them, something is selecting and preserving complex constructions.

Cronin has proposed missions that would carry mass spectrometers capable of measuring assembly indices on Mars, Enceladus, or Europa. The idea is beautifully simple: sample the environment, fragment the molecules, compute assembly indices. If the distribution shows a tail extending above the threshold, you've detected life—even if you have no idea what specific molecules you're looking at.


Selection as the Driver of Assembly

But what actually creates high assembly? What physical process enables the accumulation of complex constructions?

The answer is selection—but understood in a specific, physically grounded way. Selection isn't a mysterious force. It's a process where certain molecular structures are preferentially preserved and replicated because they perform some function that stabilizes them or speeds their formation.

Consider an autocatalytic set: a collection of molecules that catalyze each other's formation. Once you have such a set, you've created a selection environment where molecules that participate in the catalytic network persist, and molecules that don't, degrade. The autocatalytic set becomes a constructor—a physical system that assembles specific molecular structures repeatedly.

Each cycle of the autocatalytic network is an assembly step. The more cycles you go through, the higher the assembly index of the products. And crucially, the network reuses earlier constructions—molecules made in cycle 1 become inputs to cycle 2, which makes products that feed back into cycle 1.

This is the deep connection between Assembly Theory and Constructor Theory, the framework developed by David Deutsch and Chiara Marletto. In Constructor Theory, the fundamental entities are tasks—transformations that are possible or impossible—and constructors—physical systems that reliably perform tasks without being consumed.

A cell is a constructor. So is a ribosome. So is an autocatalytic chemical network. Each one enables assembly operations that wouldn't happen (or would happen astronomically slowly) without them. And the existence of constructors is what allows assembly to accumulate beyond the threshold of randomness.

Selection, in this view, is the emergence of constructors that enable further construction. Life is what happens when you get a recursive constructor—a system that constructs new constructors, enabling assembly to compound.


Coherence, Assembly, and the Deep Structure of Persistence

Now we return to AToM. If meaning equals coherence over time (M = C/T), and coherence is the integral of dynamical stability under constraint, then what is the relationship between coherence and assembly?

Here's a provisional mapping:

Assembly captures the temporal depth of selection. An object with high assembly is one that required many iterations of construction and preservation. It's a memory of the selection process—a physical encoding of the fact that certain pathways were traversed, certain structures were stabilized, certain intermediates were reused.

Coherence captures the dynamical stability of the current system. A coherent system is one that maintains its organization, that has low curvature in its state space, that resists perturbation. Coherence is about present structure.

But coherence over time—the T in M = C/T—is about how stability was maintained through a history of perturbations. And that history of maintaining structure under constraint is precisely what assembly measures, from a different angle.

Put differently: Assembly is the record of coherence construction. Every increment in assembly index represents a step where some structure was stabilized (achieved coherence) and then used as a platform for further construction. The assembly index is, in effect, the integral of coherence-preserving operations over the construction history.

This isn't metaphor. If we formalize assembly as a path through construction space, and formalize coherence as curvature in state space, we can ask: what's the relationship between the length of the construction path (assembly) and the integral of curvature along the trajectory (coherence accumulation)?
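
One way to write that question down, purely as notation for this series (neither Assembly Theory nor AToM defines these symbols in exactly this form): let Γ(x) be the set of valid construction paths ending at an object x, and let κ be a curvature-like measure of local stability along a trajectory through state space. Then

  \[
    a(x) \;=\; \min_{\gamma \in \Gamma(x)} \lvert \gamma \rvert ,
    \qquad
    C(\gamma) \;=\; \int_{\gamma} \kappa(s)\, \mathrm{d}s ,
  \]

and the open question is whether, for construction paths actually realized by selection, C(γ) grows roughly in step with |γ|: whether the assembly index is a faithful record of accumulated coherence.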

Cronin hasn't done this mapping explicitly—Assembly Theory developed independently of AToM—but the formal parallels are striking. Both frameworks are asking: what does it take to persist as something complex? Assembly answers in terms of construction steps. AToM answers in terms of dynamical geometry. The convergence suggests we're circling the same underlying phenomenon from different angles.


What This Series Will Explore

Over the next eight articles, we'll trace Assembly Theory through its foundations, applications, and speculative extensions:

Article 2 dives deep into the assembly index itself—the mathematical formalism, the measurement techniques, and why the threshold around 15 is so significant.

Article 3 explores what assembly reveals about the special nature of biological chemistry—why life's molecules look the way they do, and what this tells us about the constraints evolution operated under.

Article 4 distinguishes assembly from Shannon information and other complexity measures, clarifying why construction history is a fundamentally different kind of measurement than statistical pattern.

Article 5 connects Assembly Theory to Constructor Theory, showing how selection processes are physical constructors that enable recursion in assembly space.

Article 6 asks the speculative question: can assembly concepts scale beyond molecules? Could we define an assembly index for ideas, meanings, or cultural forms? What would high-assembly cognition look like?

Article 7 examines the astrobiology implications—how Assembly Theory might enable universal biosignature detection, and what this means for our search for life beyond Earth.

Article 8 builds the bridge to the Free Energy Principle, showing how Cronin's insights about molecular construction align with Friston's framework for understanding what it takes to persist as a far-from-equilibrium system.

Article 9 synthesizes everything, asking: what does Assembly Theory teach us about the construction of meaning? If molecules with high assembly are physically encoded memories of selection, are meanings with high coherence cognitive encodings of the same process at a different scale?

This isn't just a science series. It's an investigation into what complexity is, where it comes from, and how it relates to the central questions of AToM: what is meaning, how does it emerge, and what geometry does it live in?

Cronin is measuring molecules. But in doing so, he's given us a formalism that might illuminate how selection—at any scale—constructs the complex, improbable, deeply historical structures we call meaningful.


Further Reading

  • Cronin, L., & Walker, S. I. (2016). "Beyond prebiotic chemistry." Science, 352(6290), 1174-1175.
  • Marshall, S. M., et al. (2021). "Identifying molecules as biosignatures with assembly theory and mass spectrometry." Nature Communications, 12, 3033.
  • Sharma, A., et al. (2023). "Assembly theory explains and quantifies selection and evolution." Nature, 622, 321-328.
  • Deutsch, D., & Marletto, C. (2015). "Constructor theory of information." Proceedings of the Royal Society A, 471(2174), 20140540.

This is Part 1 of the Assembly Theory series, exploring Lee Cronin's framework for measuring complexity through the lens of AToM coherence geometry.

Next: Assembly Index: A New Way to Measure How Hard Something Is to Make