**Abstract**: For the scientific community intelligent design
represents creationism's latest grasp at scientific legitimacy.
Accordingly, intelligent design is viewed as yet another ill-conceived
attempt by creationists to straitjacket science within a religious
ideology. But in fact intelligent design can be formulated as
a scientific theory having empirical consequences and devoid of
religious commitments. Intelligent design can be unpacked as a
theory of information. Within such a theory, information becomes
a reliable indicator of design as well as a proper object for
scientific investigation. In my paper I shall (1) show how information
can be reliably detected and measured, and (2) formulate
a conservation law that governs the origin and flow of information.
My broad conclusion is that information is not reducible to natural
causes, and that the origin of information is best sought in intelligent
causes. Intelligent design thereby becomes a theory for detecting
and measuring information, explaining its origin, and tracing
its flow.

In *Steps Towards Life* Manfred Eigen (1992, p. 12) identifies
what he regards as the central problem facing origins-of-life
research: "Our task is to find an algorithm, a natural law
that leads to the origin of information." Eigen is only half
right. To determine how life began, it is indeed necessary to
understand the origin of information. Even so, neither algorithms
nor natural laws are capable of producing information. The great
myth of modern evolutionary biology is that information can be
gotten on the cheap without recourse to intelligence. It is this
myth I seek to dispel, but to do so I shall need to give an account
of information. No one disputes that there is such a thing as
information. As Keith Devlin (1991, p. 1) remarks, "Our very
lives depend upon it, upon its gathering, storage, manipulation,
transmission, security, and so on. Huge amounts of money change
hands in exchange for information. People talk about it all the
time. Lives are lost in its pursuit. Vast commercial empires are
created in order to manufacture equipment to handle it."
But what exactly is information? The burden of this paper is to
answer this question, presenting an account of information that
is relevant to biology.

What then is information? The fundamental intuition underlying information is not, as is sometimes thought, the transmission of signals across a communication channel, but rather, the actualization of one possibility to the exclusion of others. As Fred Dretske (1981, p. 4) puts it, "Information theory identifies the amount of information associated with, or generated by, the occurrence of an event (or the realization of a state of affairs) with the reduction in uncertainty, the elimination of possibilities, represented by that event or state of affairs." To be sure, whenever signals are transmitted across a communication channel, one possibility is actualized to the exclusion of others, namely, the signal that was transmitted to the exclusion of those that weren't. But this is only a special case. Information in the first instance presupposes not some medium of communication, but contingency. Robert Stalnaker (1984, p. 85) makes this point clearly: "Content requires contingency. To learn something, to acquire information, is to rule out possibilities. To understand the information conveyed in a communication is to know what possibilities would be excluded by its truth." For there to be information, there must be a multiplicity of distinct possibilities any one of which might happen. When one of these possibilities does happen and the others are ruled out, information becomes actualized. Indeed, information in its most general sense can be defined as the actualization of one possibility to the exclusion of others (observe that this definition encompasses both syntactic and semantic information).

This way of defining information may seem counterintuitive since we often speak of the information inherent in possibilities that are never actualized. Thus we may speak of the information inherent in flipping one hundred heads in a row with a fair coin even if this event never happens. There is no difficulty here. In counterfactual situations the definition of information needs to be applied counterfactually. Thus to consider the information inherent in flipping one hundred heads in a row with a fair coin, we treat this event/possibility as though it were actualized. Information needs to be referenced not just to the actual world, but also cross-referenced with all possible worlds.

How does our definition of information apply to biology, and
to science more generally? To render information a useful concept
for science we need to do two things: first, show how to measure
information; second, introduce a crucial distinction: the distinction
between *specified* and *unspecified* information. First,
let us show how to measure information. In measuring information
it is not enough to count the number of possibilities that were
excluded, and offer this number as the relevant measure of information.
The problem is that a simple enumeration of excluded possibilities
tells us nothing about how those possibilities were individuated
in the first place. Consider, for instance, the following individuation
of poker hands:

(i) A royal flush.

(ii) Everything else.

To learn that something other than a royal flush was dealt (i.e., possibility (ii)) is clearly to acquire less information than to learn that a royal flush was dealt (i.e., possibility (i)). Yet if our measure of information is simply an enumeration of excluded possibilities, the same numerical value must be assigned in both instances since in both instances a single possibility is excluded.

It follows, therefore, that how we measure information needs to be independent of whatever procedure we use to individuate the possibilities under consideration. And the way to do this is not simply to count possibilities, but to assign probabilities to these possibilities. For a thoroughly shuffled deck of cards, the probability of being dealt a royal flush (i.e., possibility (i)) is approximately .000002 whereas the probability of being dealt anything other than a royal flush (i.e., possibility (ii)) is approximately .999998. Probabilities by themselves, however, are not information measures. Although probabilities properly distinguish possibilities according to the information they contain, nonetheless probabilities remain an inconvenient way of measuring information. There are two reasons for this. First, the scaling and directionality of the numbers assigned by probabilities need to be recalibrated. We are clearly acquiring more information when we learn someone was dealt a royal flush than when we learn someone wasn't dealt a royal flush. And yet the probability of being dealt a royal flush (i.e., .000002) is minuscule compared to the probability of being dealt something other than a royal flush (i.e., .999998). Smaller probabilities signify more information, not less.
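The probabilities quoted above can be checked directly. Here is a minimal sketch in Python (mine, not the paper's; it uses the standard 52-card poker combinatorics, of which the text quotes only the rounded results):

```python
# Probabilities for the two-way individuation of poker hands above.
from math import comb

total_hands = comb(52, 5)    # 2,598,960 possible five-card hands
royal_flushes = 4            # one royal flush per suit

p_royal = royal_flushes / total_hands   # possibility (i)
p_other = 1 - p_royal                   # possibility (ii)

print(f"P(royal flush)   = {p_royal:.6f}")   # ~0.000002
print(f"P(anything else) = {p_other:.6f}")   # ~0.999998
```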

The second reason probabilities are inconvenient for measuring information is that they are multiplicative rather than additive. If I learn that Alice was dealt a royal flush playing poker at Caesar's Palace and that Bob was dealt a royal flush playing poker at the Mirage, the probability that both Alice and Bob were dealt royal flushes is the product of the individual probabilities. Nonetheless, it is convenient for information to be measured additively so that the measure of information assigned to Alice and Bob jointly being dealt royal flushes equals the measure of information assigned to Alice being dealt a royal flush plus the measure of information assigned to Bob being dealt a royal flush.

Now there is an obvious way to transform probabilities which
circumvents both these difficulties, and that is to apply a negative
logarithm to the probabilities. Applying a negative logarithm
assigns more information to less probable events and, because
the logarithm of a product is the sum of the logarithms, transforms
multiplicative probability measures into additive information
measures. What's more, in deference to communication theorists,
it is customary to use the logarithm to the base 2. The rationale
for this choice of logarithmic base is as follows. The most convenient
way for communication theorists to measure information is in bits.
Any message sent across a communication channel can be viewed
as a string of 0's and 1's. For instance, the ASCII code uses
strings of eight 0's and 1's to represent the characters on a
typewriter, with whole words and sentences in turn represented
as strings of such character strings. In like manner all communication
may be reduced to the transmission of sequences of 0's and 1's.
Given this reduction, the obvious way for communication theorists
to measure information is in number of bits transmitted across
a communication channel. And since the negative logarithm to the
base 2 of a probability corresponds to the average number of bits
needed to identify an event of that probability, the logarithm
to the base 2 is the canonical logarithm for communication theorists.
Thus we define the measure of information in an event of probability
*p* as -log₂*p* (see Shannon and Weaver, 1949, p. 32;
Hamming, 1986; or indeed any mathematical introduction to information
theory).

What about the additivity of this information measure? Recall
the example of Alice being dealt a royal flush playing poker at
Caesar's Palace and Bob being dealt a royal flush playing
poker at the Mirage. Let's call the first event A and the second
B. Since randomly dealt poker hands are probabilistically independent,
the probability of A and B taken jointly equals the product of
the probabilities of A and B taken individually. Symbolically,
**P**(A&B) = **P**(A)×**P**(B).
Given our logarithmic definition of information we therefore define
the amount of information in an event E as **I**(E) =def -log₂**P**(E).
It then follows that **P**(A&B) = **P**(A)×**P**(B)
if and only if **I**(A&B) = **I**(A)+**I**(B). Since
in the example of Alice and Bob **P**(A) = **P**(B) = .000002,
**I**(A) = **I**(B) = 19, and **I**(A&B) = **I**(A)+**I**(B)
= 19 + 19 = 38. Thus the amount of information inherent in Alice
and Bob jointly obtaining royal flushes is 38 bits.
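To make the arithmetic concrete, here is a small sketch (again mine, not the paper's) that applies the measure **I**(E) = -log₂**P**(E) to the Alice-and-Bob example; the text's figures of 19 and 38 bits are these values rounded to whole bits:

```python
from math import comb, log2

def info_bits(p: float) -> float:
    """The information measure defined above: I = -log2(p), in bits."""
    return -log2(p)

p_royal = 4 / comb(52, 5)     # ~0.000002, as in the text

i_a = info_bits(p_royal)              # Alice's royal flush
i_b = info_bits(p_royal)              # Bob's royal flush
i_ab = info_bits(p_royal * p_royal)   # joint event: probabilities multiply

print(f"I(A) = {i_a:.2f} bits")            # ~19.31, rounded to 19 in the text
print(f"I(A) + I(B) = {i_a + i_b:.2f}")    # ~38.63
print(f"I(A&B)      = {i_ab:.2f}")         # the same value: information adds
```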

Since lots of events are probabilistically independent, information
measures exhibit lots of additivity. But since lots of events
are also correlated, information measures exhibit lots of non-additivity
as well. In the case of Alice and Bob, Alice being dealt a royal
flush is probabilistically independent of Bob being dealt a royal
flush, and so the amount of information in Alice and Bob both
being dealt royal flushes equals the sum of the individual amounts
of information. But consider now a different example. Alice and
Bob together toss a coin five times. Alice observes the first
four tosses but is distracted, and so misses the fifth toss. On
the other hand, Bob misses the first toss, but observes the last
four tosses. Let's say the actual sequence of tosses is 11001
(1 = heads, 0 = tails). Thus Alice observes 1100* and Bob observes
*1001. Let A denote the first observation, B the second. It follows
that the amount of information in A&B is the amount of information
in the completed sequence 11001, namely, 5 bits. On the other
hand, the amount of information in A alone is the amount of information
in the incomplete sequence 1100*, namely 4 bits. Similarly, the
amount of information in B alone is the amount of information
in the incomplete sequence *1001, also 4 bits. This time information
doesn't add up: 5 = **I**(A&B) ≠ **I**(A)+**I**(B)
= 4+4 = 8.

Here A and B are correlated. Alice knows all but the last bit
of information in the completed sequence 11001. Thus when Bob
gives her the incomplete sequence *1001, all Alice really learns
is the last bit in this sequence. Similarly, Bob knows all but
the first bit of information in the completed sequence 11001.
Thus when Alice gives him the incomplete sequence 1100*, all Bob
really learns is the first bit in this sequence. What appears
to be four bits of information actually ends up being only one
bit of information once Alice and Bob factor in the prior information
they possess about the completed sequence 11001. If we introduce
the idea of conditional information, this is just to say that
5 = **I**(A&B) = **I**(A)+**I**(B|A) = 4+1. **I**(B|A),
the conditional information of B given A, is the amount of information
in Bob's observation once Alice's observation is taken into account.
And this, as we just saw, is 1 bit.
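The coin-toss example can be replayed numerically. The following sketch (an illustration under the paper's assumption of a fair coin) reproduces the figures 5 = 4 + 1:

```python
from math import log2

def info_bits(p):
    return -log2(p)

# Five fair coin tosses: each full sequence has probability (1/2)**5.
p_joint = (1/2)**5       # P(A&B): the completed sequence 11001
p_alice = (1/2)**4       # P(A): Alice's observation 1100*
p_b_given_a = 1/2        # P(B|A): given 1100*, only the fifth toss is open

print(info_bits(p_joint))        # 5.0 bits: I(A&B)
print(info_bits(p_alice))        # 4.0 bits: I(A)
print(info_bits(p_b_given_a))    # 1.0 bit:  I(B|A)
# Conditional additivity: I(A&B) = I(A) + I(B|A) = 4 + 1 = 5
```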

**I**(B|A), like **I**(A&B), **I**(A), and **I**(B),
can be represented as the negative logarithm to the base two of
a probability, only this time the probability under the logarithm
is a conditional as opposed to an unconditional probability. By
definition **I**(B|A) =def -log₂**P**(B|A), where **P**(B|A)
is the conditional probability of B given A. But since **P**(B|A)
=def **P**(A&B)/**P**(A), and since the logarithm of
a quotient is the difference of the logarithms, log₂**P**(B|A)
= log₂**P**(A&B) - log₂**P**(A), and so -log₂**P**(B|A)
= -log₂**P**(A&B) + log₂**P**(A), which is just **I**(B|A)
= **I**(A&B) - **I**(A). This last equation is equivalent
to

(*) **I**(A&B) = **I**(A)+**I**(B|A)

Formula (*) holds with full generality, reducing to **I**(A&B)
= **I**(A)+**I**(B) when A and B are probabilistically independent
(in which case **P**(B|A) = **P**(B) and thus **I**(B|A)
= **I**(B)).

Formula (*) asserts that the information in both A and B jointly
is the information in A plus the information in B that is not
in A. Its point, therefore, is to spell out how much additional
information B contributes to A. As such, this formula places tight
constraints on the generation of new information. Does, for instance,
a computer program, call it A, by outputting some data, call the
data B, generate new information? Computer programs are fully
deterministic, and so B is fully determined by A. It follows that
**P**(B|A) = 1, and thus **I**(B|A) = 0 (the logarithm of
1 is always 0). From Formula (*) it therefore follows that **I**(A&B)
= **I**(A), and therefore that the amount of information in
A and B jointly is no more than the amount of information in A
by itself.
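As a toy illustration of this point (my example, not the paper's): a deterministic program yields the same output on every run, so conditioning on the program leaves nothing uncertain about its output:

```python
from math import log2

def program_a() -> str:
    # A deterministic program: same output on every run.
    return "0110" * 8

# Determinism means P(B|A) = 1, where A is the program and B its output,
# so I(B|A) = -log2(1) = 0 bits.
p_b_given_a = 1.0
i_b_given_a = abs(-log2(p_b_given_a))   # abs() only avoids Python's "-0.0" display

print(program_a())      # identical on every run
print(i_b_given_a)      # 0.0: the output adds nothing beyond the program itself
```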

For an example in the same spirit consider that there is no
more information in two copies of Shakespeare's *Hamlet*
than in a single copy. This is of course patently obvious, and
any formal account of information had better agree. To see that
our formal account does indeed agree, let A denote the printing
of the first copy of *Hamlet*, and B the printing of the
second copy. Once A is given, B is entirely determined. Indeed,
the correlation between A and B is perfect. Probabilistically
this is expressed by saying the conditional probability of B given
A is 1, namely, **P**(B|A) = 1. In information-theoretic terms
this is to say that **I**(B|A) = 0. As a result **I**(B|A)
drops out of Formula (*), and so **I**(A&B) = **I**(A).
Our information-theoretic formalism therefore agrees with our
intuition that two copies of *Hamlet* contain no more information
than a single copy.

Information is a complexity-theoretic notion. Indeed, as a
purely formal object, the information measure described here is
a complexity measure (cf. Dembski, 1998, ch. 4). Complexity measures
arise whenever we assign numbers to degrees of complication. A
set of possibilities will often admit varying degrees of complication,
ranging from extremely simple to extremely complicated. Complexity
measures assign non-negative numbers to these possibilities so
that 0 corresponds to the most simple and ∞ to the most complicated.
For instance, computational complexity is always measured in terms
of either time (i.e., number of computational steps) or space
(i.e., size of memory, usually measured in bits or bytes)
or some combination of the two. The more difficult a computational
problem, the more time and space are required to run the algorithm
that solves the problem. For information measures, degree of complication
is measured in bits. Given an event A of probability **P**(A),
**I**(A) = -log₂**P**(A) measures the number of bits associated
with the probability **P**(A). We therefore speak of the "complexity
of information" and say that the complexity of information
increases as **I**(A) increases (or, correspondingly, as **P**(A)
decreases). We also speak of "simple" and "complex"
information according to whether **I**(A) signifies few or
many bits of information. This notion of complexity is important
to biology since not just the origin of information stands in
question, but the origin of complex information.

Given a means of measuring information and determining its
complexity, we turn now to the distinction between *specified*
and *unspecified* information. This is a vast topic whose
full elucidation is beyond the scope of this paper (the details
can be found in my monograph *The Design Inference*). Nonetheless,
in what follows I shall try to make this distinction intelligible,
and offer some hints on how to make it rigorous. For an intuitive
grasp of the difference between specified and unspecified information,
consider the following example. Suppose an archer stands 50 meters
from a large blank wall with bow and arrow in hand. The wall,
let us say, is sufficiently large that the archer cannot help
but hit it. Consider now two alternative scenarios. In the first
scenario the archer simply shoots at the wall. In the second scenario
the archer first paints a target on the wall, and then shoots
at the wall, squarely hitting the target's bull's-eye. Let us
suppose that in both scenarios where the arrow lands is identical.
In both scenarios the arrow might have landed anywhere on the
wall. What's more, any place where it might land is highly improbable.
It follows that in both scenarios highly complex information is
actualized. Yet the conclusions we draw from these scenarios are
very different. In the first scenario we can conclude absolutely
nothing about the archer's ability as an archer, whereas in the
second scenario we have evidence of the archer's skill.

The obvious difference between the two scenarios is of course that in the first the information follows no pattern whereas in the second it does. Now the information that tends to interest us as rational inquirers generally, and scientists in particular, is not the actualization of arbitrary possibilities which correspond to no patterns, but rather the actualization of circumscribed possibilities which do correspond to patterns. There's more. Patterned information, though a step in the right direction, still doesn't quite get us specified information. The problem is that patterns can be concocted after the fact so that instead of helping elucidate information, the patterns are merely read off already actualized information.

To see this, consider a third scenario in which an archer shoots
at a wall. As before, we suppose the archer stands 50 meters from
a large blank wall with bow and arrow in hand, the wall being
so large that the archer cannot help but hit it. And as in the
first scenario, the archer shoots at the wall while it is still
blank. But this time suppose that after having shot the arrow,
and finding the arrow stuck in the wall, the archer paints a target
around the arrow so that the arrow sticks squarely in the bull's-eye.
Let us suppose further that the precise place where the arrow
lands in this scenario is identical with where it landed in the
first two scenarios. Since any place where the arrow might land
is highly improbable, in this as in the other scenarios highly
complex information has been actualized. What's more, since the
information corresponds to a pattern, we can even say that in
this third scenario highly complex patterned information has been
actualized. Nevertheless, it would be wrong to say that highly
complex specified information has been actualized. Of the three
scenarios, only the information in the second scenario is specified.
In that scenario, by *first* painting the target and *then*
shooting the arrow, the pattern is given independently of the
information. On the other hand, in this, the third scenario, by
first shooting the arrow and then painting the target around it,
the pattern is merely read off the information.

Specified information is always patterned information, but
patterned information is not always specified information. For
specified information not just any pattern will do. We therefore
distinguish between the "good" patterns and the "bad"
patterns. The "good" patterns will henceforth be called
*specifications*. Specifications are the independently given
patterns that are not simply read off information. By contrast,
the "bad" patterns will be called *fabrications*.
Fabrications are the *post hoc* patterns that are simply
read off already existing information.

Unlike specifications, fabrications are wholly unenlightening. We are no better off with a fabrication than without one. This is clear from comparing the first and third scenarios. Whether an arrow lands on a blank wall and the wall stays blank (as in the first scenario), or an arrow lands on a blank wall and a target is then painted around the arrow (as in the third scenario), any conclusions we draw about the arrow's flight remain the same. In either case chance is as good an explanation as any for the arrow's flight. The fact that the target in the third scenario constitutes a pattern makes no difference since the pattern is constructed entirely in response to where the arrow lands. Only when the pattern is given independently of the arrow's flight does a hypothesis other than chance come into play. Thus only in the second scenario does it make sense to ask whether we are dealing with a skilled archer. Only in the second scenario does the pattern constitute a specification. In the third scenario the pattern constitutes a mere fabrication.

The distinction between specified and unspecified information may now be defined as follows: the actualization of a possibility (i.e., information) is specified if independently of the possibility's actualization, the possibility is identifiable by means of a pattern. If not, then the information is unspecified. Note that this definition implies an asymmetry between specified and unspecified information: specified information cannot become unspecified information, though unspecified information may become specified information. Unspecified information need not remain unspecified, but can become specified as our background knowledge increases. For instance, a cryptographic transmission whose cryptosystem we have yet to break will constitute unspecified information. Yet as soon as we break the cryptosystem, the cryptographic transmission becomes specified information.

What is it for a possibility to be identifiable by means of
an independently given pattern? A full exposition of specification
requires a detailed answer to this question. Unfortunately, such
an exposition is beyond the scope of this paper. The key conceptual
difficulty here is to characterize the independence condition
between patterns and information. This independence condition
breaks into two subsidiary conditions: (1) a condition of stochastic
conditional independence between the information in question and
certain relevant background knowledge; and (2) a tractability
condition whereby the pattern in question can be constructed from
the aforementioned background knowledge. Although these conditions
make good intuitive sense, they are not easily formalized. For
the details refer to my monograph *The Design Inference.*

If formalizing what it means for a pattern to be given independently of a possibility is difficult, determining in practice whether a pattern is given independently of a possibility is much easier. If the pattern is given prior to the possibility being actualized-as in the second scenario above where the target was painted before the arrow was shot-then the pattern is automatically independent of the possibility, and we are dealing with specified information. Patterns given prior to the actualization of a possibility are just the rejection regions of statistics. There is a well-established statistical theory that describes such patterns and their use in probabilistic reasoning. These are clearly specifications since having been given prior to the actualization of some possibility, they have already been identified, and thus are identifiable independently of the possibility being actualized (cf. Hacking, 1965).
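As a toy illustration of such a prespecified rejection region (my example; the threshold of 70 is chosen arbitrarily for illustration): fix the pattern "70 or more heads in 100 tosses" before any coin is flipped, and compute how improbable it is that chance alone lands in that region:

```python
from math import comb

n = 100          # coin tosses
threshold = 70   # rejection region, fixed BEFORE any tosses are observed

# P(at least `threshold` heads in n tosses of a fair coin)
p_region = sum(comb(n, k) for k in range(threshold, n + 1)) / 2**n
print(f"P(chance lands in the rejection region) = {p_region:.2e}")
# On the order of 10**-5: a sequence falling in the pre-specified region
# warrants rejecting the chance hypothesis.
```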

Many of the interesting cases of specified information, however,
are those in which the pattern is given *after* a possibility
has been actualized. This is certainly the case with the origin
of life: life originates first and only afterwards do pattern-forming
rational agents (like ourselves) enter the scene. It remains the
case, however, that a pattern corresponding to a possibility,
though formulated after the possibility has been actualized, can
constitute a specification. Certainly this was not the case in
the third scenario above where the target was painted around the
arrow only after it hit the wall. But consider the following example.
Alice and Bob are celebrating their fiftieth wedding anniversary.
Their six children all show up bearing gifts. Each gift is part
of a matching set of china. There is no duplication of gifts,
and together the gifts constitute a complete set of china. Suppose
Alice and Bob were satisfied with their old set of china, and
had no inkling prior to opening their gifts that they might expect
a new set of china. Alice and Bob are therefore without a relevant
pattern whither to refer their gifts prior to actually receiving
the gifts from their children. Nevertheless, the pattern they
explicitly formulate only after receiving the gifts could be formed
independently of receiving the gifts-indeed, we all know about
matching sets of china and how to distinguish them from unmatched
sets. This pattern therefore constitutes a specification. What's
more, there is an obvious inference connected with this specification:
Alice and Bob's children were in collusion, and did not present
their gifts as random acts of kindness.

But what about the origin of life? Is life specified? If so, to what patterns does life correspond, and how are these patterns given independently of life's origin? Obviously, pattern-forming rational agents like ourselves don't enter the scene till after life originates. Nonetheless, there are functional patterns to which life corresponds, and which are given independently of the actual living systems. An organism is a functional system comprising many functional subsystems. The functionality of organisms can be cashed out in any number of ways. Arno Wouters (1995) cashes it out globally in terms of viability of whole organisms. Michael Behe (1996) cashes it out in terms of the irreducible complexity and minimal function of biochemical systems. Even the staunch Darwinist Richard Dawkins will admit that life is specified functionally, cashing out the functionality of organisms in terms of reproduction of genes. Thus Dawkins (1987, p. 9) will write: "Complicated things have some quality, specifiable in advance, that is highly unlikely to have been acquired by random chance alone. In the case of living things, the quality that is specified in advance is . . . the ability to propagate genes in reproduction."

Information can be specified. Information can be complex. Information can be both complex and specified. Information that is both complex and specified I call "complex specified information," or CSI for short. CSI is what all the fuss over information has been about in recent years, not just in biology, but in science generally. It is CSI that for Manfred Eigen constitutes the great mystery of biology, and one he hopes eventually to unravel in terms of algorithms and natural laws. It is CSI that for cosmologists underlies the fine-tuning of the universe, and which the various anthropic principles attempt to understand (cf. Barrow and Tipler, 1986). It is CSI that David Bohm's quantum potentials are extracting when they scour the microworld for what Bohm calls "active information" (cf. Bohm, 1993, pp. 35-38). It is CSI that enables Maxwell's demon to outsmart a thermodynamic system tending towards thermal equilibrium (cf. Landauer, 1991, p. 26). It is CSI on which David Chalmers hopes to base a comprehensive theory of human consciousness (cf. Chalmers, 1996, ch. 8). It is CSI that within the Kolmogorov-Chaitin theory of algorithmic information takes the form of highly compressible, non-random strings of digits (cf. Kolmogorov, 1965; Chaitin, 1966).
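The Kolmogorov-Chaitin point about compressibility can be illustrated with an off-the-shelf compressor standing in, crudely, for algorithmic information; this is only a sketch, since true Kolmogorov complexity is uncomputable:

```python
import os
import zlib

patterned = b"01" * 500          # 1000 bytes following a simple pattern
random_bytes = os.urandom(1000)  # 1000 bytes, almost surely incompressible

print(len(zlib.compress(patterned)))     # far fewer than 1000 bytes
print(len(zlib.compress(random_bytes)))  # about 1000 bytes or slightly more
```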

Nor is CSI confined to science. CSI is indispensable in our
everyday lives. The 16-digit number on your VISA card is an example
of CSI. The complexity of this number ensures that a would-be
thief cannot randomly pick a number and have it turn out to be
a valid VISA card number. What's more, the specification of this
number ensures that it is your number, and not anyone else's.
Even your phone number constitutes CSI. As with the VISA card
number, the complexity ensures that this number won't be dialed
randomly (at least not too often), and the specification ensures
that this number is yours and yours only. All the numbers on our
bills, credit slips, and purchase orders represent CSI. CSI makes
the world go round. It follows that CSI is a ripe field for criminality.
CSI is what motivated the greedy Michael Douglas character in
the movie *Wall Street* to lie, cheat, and steal. CSI's total
and absolute control was the objective of the monomaniacal Ben
Kingsley character in the movie *Sneakers*. CSI is the artifact
of interest in most techno-thrillers. Ours is an information age,
and the information that captivates us is CSI.

Whence the origin of complex specified information? In this
section I shall argue that intelligent causation, or equivalently
design, accounts for the origin of complex specified information.
My argument focuses on the nature of intelligent causation, and
specifically, on what it is about intelligent causes that makes
them detectable. To see why CSI is a reliable indicator of design,
we need to examine the nature of intelligent causation. The principal
characteristic of intelligent causation is *directed contingency*,
or what we call *choice*. Whenever an intelligent cause acts,
it chooses from a range of competing possibilities. This is true
not just of humans, but of animals as well as extra-terrestrial
intelligences. A rat navigating a maze must choose whether to
go right or left at various points in the maze. When SETI (Search
for Extra-Terrestrial Intelligence) researchers attempt to discover
intelligence in the extra-terrestrial radio transmissions they
are monitoring, they assume an extra-terrestrial intelligence
could have chosen any number of possible radio transmissions,
and then attempt to match the transmissions they observe with
certain patterns as opposed to others (patterns that presumably
are markers of intelligence). Whenever a human being utters meaningful
speech, a choice is made from a range of possible sound-combinations
that might have been uttered. Intelligent causation always entails
discrimination, choosing certain things, ruling out others.

Given this characterization of intelligent causes, the crucial
question is how to recognize their operation. Intelligent causes
act by making a choice. How then do we recognize that an intelligent
cause has made a choice? A bottle of ink spills accidentally onto
a sheet of paper; someone takes a fountain pen and writes a message
on a sheet of paper. In both instances ink is applied to paper.
In both instances one among an almost infinite set of possibilities
is realized. In both instances a contingency is actualized and
others are ruled out. Yet in one instance we infer design, in
the other chance. What is the relevant difference? Not only do
we need to observe that a contingency was actualized, but we ourselves
need also to be able to specify that contingency. The contingency
must conform to an independently given pattern, and we must be
able independently to formulate that pattern. A random ink blot
is unspecifiable; a message written with ink on paper is specifiable.
Wittgenstein (1980, p. 1e) made the same point as follows:
"We tend to take the speech of a Chinese for inarticulate
gurgling. Someone who understands Chinese will recognize *language*
in what he hears. Similarly I often cannot discern the *humanity*
in man."

In hearing a Chinese utterance, someone who understands Chinese not only recognizes that one from a range of all possible utterances was actualized, but is also able to specify the utterance as coherent Chinese speech. Contrast this with someone who does not understand Chinese. In hearing a Chinese utterance, someone who does not understand Chinese also recognizes that one from a range of possible utterances was actualized, but this time, because lacking the ability to understand Chinese, is unable to specify the utterance as coherent speech. To someone who does not understand Chinese, the utterance will appear gibberish. Gibberish-the utterance of nonsense syllables uninterpretable within any natural language-always actualizes one utterance from the range of possible utterances. Nevertheless, gibberish, by corresponding to nothing we can understand in any language, also cannot be specified. As a result, gibberish is never taken for intelligent communication, but always for what Wittgenstein calls "inarticulate gurgling."

The actualization of one among several competing possibilities, the exclusion of the rest, and the specification of the possibility that was actualized encapsulates how we recognize intelligent causes, or equivalently, how we detect design. Actualization-Exclusion-Specification, this triad constitutes a general criterion for detecting intelligence, be it animal, human, or extra-terrestrial. Actualization establishes that the possibility in question is the one that actually occurred. Exclusion establishes that there was genuine contingency (i.e., that there were other live possibilities, and that these were ruled out). Specification establishes that the actualized possibility conforms to a pattern given independently of its actualization.

Now where does choice, which we've cited as the principal characteristic of intelligent causation, figure into this criterion? The problem is that we never witness choice directly. Instead, we witness actualizations of contingency which might be the result of choice (i.e., directed contingency), but which also might be the result of chance (i.e., blind contingency). Now there is only one way to tell the difference-specification. Specification is the only means available to us for distinguishing choice from chance, directed contingency from blind contingency. Actualization and exclusion together guarantee we are dealing with contingency. Specification guarantees we are dealing with a directed contingency. The Actualization-Exclusion-Specification triad is therefore precisely what we need to identify choice and therewith intelligent causation.

Psychologists who study animal learning and behavior have known of the Actualization-Exclusion-Specification triad all along, albeit implicitly. For these psychologists-known as learning theorists-learning is discrimination (cf. Mazur, 1990; Schwartz, 1984). To learn a task an animal must acquire the ability to actualize behaviors suitable for the task as well as the ability to exclude behaviors unsuitable for the task. Moreover, for a psychologist to recognize that an animal has learned a task, it is necessary not only to observe the animal making the appropriate behavior, but also to specify this behavior. Thus to recognize whether a rat has successfully learned how to traverse a maze, a psychologist must first specify the sequence of right and left turns that conducts the rat out of the maze. No doubt, a rat randomly wandering a maze also discriminates a sequence of right and left turns. But by randomly wandering the maze, the rat gives no indication that it can discriminate the appropriate sequence of right and left turns for exiting the maze. Consequently, the psychologist studying the rat will have no reason to think the rat has learned how to traverse the maze. Only if the rat executes the sequence of right and left turns specified by the psychologist will the psychologist recognize that the rat has learned how to traverse the maze. Now it is precisely the learned behaviors we regard as intelligent in animals. Hence it is no surprise that the same scheme for recognizing animal learning recurs for recognizing intelligent causes generally, to wit, actualization, exclusion, and specification.

Now this general scheme for recognizing intelligent causes coincides precisely with how we recognize complex specified information: First, the basic precondition for information to exist must hold, namely, contingency. Thus one must establish that any one of a multiplicity of distinct possibilities might obtain. Next, one must establish that the possibility which was actualized after the others were excluded was also specified. So far the match between this general scheme for recognizing intelligent causation and how we recognize complex specified information is exact. Only one loose end remains-complexity. Although complexity is essential to CSI (corresponding to the first letter of the acronym), its role in this general scheme for recognizing intelligent causation is not immediately evident. In this scheme one among several competing possibilities is actualized, the rest are excluded, and the possibility which was actualized is specified. Where in this scheme does complexity figure in?

The answer is that it is there implicitly. To see this, consider again a rat traversing a maze, but now take a very simple maze in which two right turns conduct the rat out of the maze. How will a psychologist studying the rat determine whether it has learned to exit the maze? Just putting the rat in the maze will not be enough. Because the maze is so simple, the rat could by chance just happen to take two right turns, and thereby exit the maze. The psychologist will therefore be uncertain whether the rat actually learned to exit this maze, or whether the rat just got lucky. But contrast this now with a complicated maze in which a rat must take just the right sequence of left and right turns to exit the maze. Suppose the rat must take one hundred appropriate right and left turns, and that any mistake will prevent the rat from exiting the maze. A psychologist who sees the rat take no erroneous turns and in short order exit the maze will be convinced that the rat has indeed learned how to exit the maze, and that this was not dumb luck. With the simple maze there is a substantial probability that the rat will exit the maze by chance; with the complicated maze this is exceedingly improbable. The role of complexity in detecting design is now clear since improbability is precisely what we mean by complexity (cf. section 2).
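Under the toy assumption that each turn is an independent 50/50 choice, the two mazes can be compared directly; the sketch below (mine, not the paper's) shows why only the complicated maze rules out luck:

```python
from math import log2

p_simple = (1/2)**2      # two correct turns by chance
p_complex = (1/2)**100   # one hundred correct turns by chance

print(p_simple)          # 0.25: lucky success is quite plausible
print(p_complex)         # ~7.9e-31: lucky success is negligible

# The corresponding complexities, in bits:
print(-log2(p_simple), -log2(p_complex))   # 2.0 vs. 100.0
```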

This argument for showing that CSI is a reliable indicator of design may now be summarized as follows: CSI is a reliable indicator of design because its recognition coincides with how we recognize intelligent causation generally. In general, to recognize intelligent causation we must establish that one from a range of competing possibilities was actualized, determine which possibilities were excluded, and then specify the possibility that was actualized. What's more, the competing possibilities that were excluded must be live possibilities, sufficiently numerous so that specifying the possibility that was actualized cannot be attributed to chance. In terms of probability, this means that the possibility that was specified is highly improbable. In terms of complexity, this means that the possibility that was specified is highly complex. All the elements in the general scheme for recognizing intelligent causation (i.e., Actualization-Exclusion-Specification) find their counterpart in complex specified information-CSI. CSI pinpoints what we need to be looking for when we detect design.

As a postscript, I call the reader's attention to the etymology
of the word "intelligent." The word "intelligent"
derives from two Latin words, the preposition *inter*, meaning
between, and the verb *lego*, meaning to choose or select.
Thus according to its etymology, intelligence consists in *choosing
between*. It follows that the etymology of the word "intelligent"
parallels the formal analysis of intelligent causation just given.
"Intelligent design" is therefore a thoroughly apt phrase,
signifying that design is inferred precisely because an intelligent
cause has done what only an intelligent cause can do-make a choice.

Evolutionary biology has steadfastly resisted attributing CSI
to intelligent causation. Although Manfred Eigen recognizes that
the central problem of evolutionary biology is the origin of CSI,
he has no thought of attributing CSI to intelligent causation.
According to Eigen natural causes are adequate to explain the
origin of CSI. The only question for Eigen is which natural causes
explain the origin of CSI. The logically prior question of whether
natural causes are even in-principle capable of explaining the
origin of CSI he ignores. And yet it is a question that undermines
Eigen's entire project. Natural causes are in-principle incapable
of explaining the origin of CSI. To be sure, natural causes can
explain the flow of CSI, being ideally suited for transmitting
already existing CSI. What natural causes cannot do, however,
is originate CSI. This strong proscriptive claim, that natural
causes can only transmit CSI but never originate it, I call the
Law of Conservation of Information. It is this law that gives
definite scientific content to the claim that CSI is intelligently
caused. The aim of this last section is briefly to sketch the
Law of Conservation of Information (a full treatment will be given
in *Uncommon Descent*, a book I am jointly authoring with
Stephen Meyer and Paul Nelson).

To see that natural causes cannot account for CSI is straightforward.
Natural causes comprise chance and necessity (cf. Jacques Monod's
book by that title). Because information presupposes contingency,
necessity is by definition incapable of producing information,
much less complex specified information. For there to be information
there must be a multiplicity of live possibilities, one of which
is actualized, and the rest of which are excluded. This is contingency.
But if some outcome B is necessary given antecedent conditions
A, then the probability of B given A is one, and the information
in B given A is zero. If B is necessary given A, Formula (*) reduces
to **I**(A&B) = **I**(A), which is to say that B contributes
no new information to A. It follows that necessity is incapable
of generating new information. Observe that what Eigen calls "algorithms"
and "natural laws" fall under necessity.

Since information presupposes contingency, let us take a closer look at contingency. Contingency can assume only one of two forms. Either the contingency is a blind, purposeless contingency-which is chance; or it is a guided, purposeful contingency-which is intelligent causation. Since we already know that intelligent causation is capable of generating CSI (cf. section 4), let us next consider whether chance might also be capable of generating CSI. First notice that pure chance, entirely unsupplemented and left to its own devices, is incapable of generating CSI. Chance can generate complex unspecified information, and chance can generate non-complex specified information. What chance cannot generate is information that is jointly complex and specified.

Biologists by and large do not dispute this claim. Most agree that pure chance-what Hume called the Epicurean hypothesis-does not adequately explain CSI. Jacques Monod (1972) is one of the few exceptions, arguing that the origin of life, though vastly improbable, can nonetheless be attributed to chance because of a selection effect. Just as the winner of a lottery is shocked at winning, so we are shocked to have evolved. But the lottery was bound to have a winner, and so too something was bound to have evolved. Something vastly improbable was bound to happen, and so, the fact that it happened to us (i.e., that we were selected-hence the name selection effect) does not preclude chance. This is Monod's argument and it is fallacious. It fails utterly to come to grips with specification. Moreover, it confuses a necessary condition for life's existence with its explanation. Monod's argument has been refuted by the philosophers John Leslie (1989), John Earman (1987), and Richard Swinburne (1979). It has also been refuted by the biologists Francis Crick (1981, ch. 7), Bernd-Olaf Küppers (1990, ch. 6), and Hubert Yockey (1992, ch. 9). Selection effects do nothing to render chance an adequate explanation of CSI.

Most biologists therefore reject pure chance as an adequate explanation of CSI. The problem here is not simply one of faulty statistical reasoning. Pure chance is also scientifically unsatisfying as an explanation of CSI. To explain CSI in terms of pure chance is no more instructive than pleading ignorance or proclaiming CSI a mystery. It is one thing to explain the occurrence of heads on a single coin toss by appealing to chance. It is quite another, as Küppers (1990, p. 59) points out, to follow Monod and take the view that "the specific sequence of the nucleotides in the DNA molecule of the first organism came about by a purely random process in the early history of the earth." CSI cries out for explanation, and pure chance won't do. As Richard Dawkins (1987, p. 139) correctly notes, "We can accept a certain amount of luck in our [scientific] explanations, but not too much."

If chance and necessity left to themselves cannot generate CSI, is it possible that chance and necessity working together might generate CSI? The answer is No. Whenever chance and necessity work together, the respective contributions of chance and necessity can be arranged sequentially. But by arranging the respective contributions of chance and necessity sequentially, it becomes clear that at no point in the sequence is CSI generated. Consider the case of trial-and-error (trial corresponds to necessity and error to chance). Once considered a crude method of problem solving, trial-and-error has so risen in the estimation of scientists that it is now regarded as the ultimate source of wisdom and creativity in nature. The probabilistic algorithms of computer science (e.g., genetic algorithms-see Forrest, 1993) all depend on trial-and-error. So too, the Darwinian mechanism of mutation and natural selection is a trial-and-error combination in which mutation supplies the error and selection the trial. An error is committed after which a trial is made. But at no point is CSI generated.

Natural causes are therefore incapable of generating CSI. This broad conclusion I call the Law of Conservation of Information, or LCI for short. LCI has profound implications for science. Among its corollaries are the following: (1) The CSI in a closed system of natural causes remains constant or decreases. (2) CSI cannot be generated spontaneously, originate endogenously, or organize itself (as these terms are used in origins-of-life research). (3) The CSI in a closed system of natural causes either has been in the system eternally or was at some point added exogenously (implying that the system though now closed was not always closed). (4) In particular, any closed system of natural causes that is also of finite duration received whatever CSI it contains before it became a closed system.

This last corollary is especially pertinent to the nature of science for it shows that scientific explanation is not coextensive with reductive explanation. Richard Dawkins, Daniel Dennett, and many scientists are convinced that proper scientific explanations must be reductive, moving from the complex to the simple. Thus Dawkins (1987, p. 316) will write, "The one thing that makes evolution such a neat theory is that it explains how organized complexity can arise out of primeval simplicity." Thus Dennett (1995, p. 153) will view any scientific explanation that moves from simple to complex as "question-begging." Thus Dawkins (1987, p. 13) will explicitly equate proper scientific explanation with what he calls "hierarchical reductionism," according to which "a complex entity at any particular level in the hierarchy of organization" must properly be explained "in terms of entities only one level down the hierarchy." While no one will deny that reductive explanation is extremely effective within science, it is hardly the only type of explanation available to science. The divide-and-conquer mode of analysis behind reductive explanation has strictly limited applicability within science. In particular, this mode of analysis is utterly incapable of making headway with CSI. CSI demands an intelligent cause. Natural causes will not do.

**References**

Barrow, John D. and Frank J. Tipler. 1986.
*The Anthropic Cosmological Principle*. Oxford: Oxford University
Press.

Behe, Michael. 1996. *Darwin's Black Box: The Biochemical Challenge
to Evolution*. New York: The Free Press.

Bohm, David. 1993. *The Undivided Universe: An Ontological
Interpretation of Quantum Theory*. London: Routledge.

Chaitin, Gregory J. 1966. On the Length of Programs for Computing
Finite Binary Sequences. *Journal of the ACM*, 13:547-569.

Chalmers, David J. 1996. *The Conscious Mind: In Search of
a Fundamental Theory*. New York: Oxford University Press.

Crick, Francis. 1981. *Life Itself: Its Origin and Nature*.
New York: Simon and Schuster.

Dawkins, Richard. 1987. *The Blind Watchmaker*. New York:
Norton.

Dembski, William A. 1998. *The Design Inference: Eliminating
Chance through Small Probabilities*. Forthcoming, Cambridge
University Press.

Dennett, Daniel C. 1995. *Darwin's Dangerous Idea: Evolution
and the Meanings of Life*. New York: Simon & Schuster.

Devlin, Keith J. 1991. *Logic and Information*. New York:
Cambridge University Press.

Dretske, Fred I. 1981. *Knowledge and the Flow of Information*.
Cambridge, Mass.: MIT Press.

Earman, John. 1987. The Sap Also Rises: A Critical Examination
of the Anthropic Principle. *American Philosophical Quarterly*,
24(4): 307-317.

Eigen, Manfred. 1992. *Steps Towards Life: A Perspective on
Evolution*, translated by Paul Woolley. Oxford: Oxford University
Press.

Forrest, Stephanie. 1993. Genetic Algorithms: Principles of Natural
Selection Applied to Computation. *Science*, 261:872-878.

Hacking, Ian. 1965. *Logic of Statistical Inference*. Cambridge:
Cambridge University Press.

Hamming, R. W. 1986. *Coding and Information Theory*, 2nd
edition. Englewood Cliffs, N. J.: Prentice-Hall.

Kolmogorov, Andrei N. 1965. Three Approaches to the Quantitative
Definition of Information. *Problemy Peredachi Informatsii*
(in translation), 1(1): 3-11.

Küppers, Bernd-Olaf. 1990. *Information and the Origin
of Life*. Cambridge, Mass.: MIT Press.

Landauer, Rolf. 1991. Information is Physical. *Physics Today*,
May: 23-29.

Leslie, John. 1989. *Universes*. London: Routledge.

Mazur, James. E. 1990. *Learning and Behavior*, 2nd edition.
Englewood Cliffs, N.J.: Prentice Hall.

Monod, Jacques. 1972. *Chance and Necessity*. New York:
Vintage.

Schwartz, Barry. 1984. *Psychology of Learning and Behavior*,
2nd edition. New York: Norton.

Shannon, Claude E. and W. Weaver. 1949. *The Mathematical Theory
of Communication*. Urbana, Ill.: University of Illinois Press.

Stalnaker, Robert. 1984. *Inquiry*. Cambridge, Mass.: MIT
Press.

Swinburne, Richard. 1979. *The Existence of God*. Oxford:
Oxford University Press.

Wittgenstein, Ludwig. 1980. *Culture and Value*, edited
by G. H. von Wright, translated by P. Winch. Chicago: University
of Chicago Press.

Wouters, Arno. 1995. Viability Explanation. *Biology and Philosophy*,
10:435-457.

Yockey, Hubert P. 1992. *Information Theory and Molecular Biology*.
Cambridge: Cambridge University Press.
