
The Lovelace Test in More Detail

To begin to see how LT works, we start with a scenario that is close to home for Bringsjord and Ferrucci, given their sustained efforts to build story generation agents: Assume that Jones, a human AInik, attempts to build an artificial computational agent A that doesn't engage in conversation, but rather creates stories -- creates in the Lovelacean sense that this system originates stories. Assume that Jones activates A and that a stunningly belletristic story o is produced. We claim that if Jones cannot explain how o was generated by A, and if Jones has no reason whatever to believe that A succeeded on the strength of a fluke hardware error, etc. (which entails that A can produce other equally impressive stories), then A should at least provisionally be regarded as genuinely creative. An artificial computational agent passes LT if and only if it stands to its creator as A stands to Jones.

LT relies on the special epistemic relationship that exists between Jones and A. But `Jones,' like `A,' is of course just an uninformative variable standing in for any human system designer. This yields the following rough-and-ready definition.

DefLT 1
Artificial agent A, designed by H, passes LT if and only if
1. A outputs o;
2. A's outputting o is not the result of a fluke hardware error, but rather the result of processes A can repeat;
3. H (or someone who knows what H knows, and has H's resources9) cannot explain how A produced o.

Notice that LT is actually what might be called a meta-test. The idea is that this scheme can be deployed for any particular domain. If conversation is the kind of behavior wanted, then merely stipulate that o is an English sentence (or sequence of such sentences) in the context of a conversation (as in, of course, TT). If the production of a mathematical proof with respect to a given conjecture is what's desired, then we merely set o to a proof. In light of this, we can focus LT on the particular kind of interaction appropriate for the digital entertainment involved.
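To make the meta-test character a bit more vivid, here is a minimal sketch, in Python, of LT parameterized by a domain check. Every predicate name below (in_domain, is_repeatable, designer_can_explain, is_story, is_proof) is a hypothetical placeholder for the informal clauses of DefLT 1, not part of any actual implementation.

    # A minimal sketch of LT as a meta-test, parameterized by the kind of output o.
    # All predicates are hypothetical stand-ins for the informal clauses of DefLT 1.

    def passes_LT(output, in_domain, is_repeatable, designer_can_explain):
        """True iff the three clauses of DefLT 1 hold for output o."""
        return (in_domain(output)                    # clause 1: A outputs an o of the right kind
                and is_repeatable(output)            # clause 2: not a fluke; A can do it again
                and not designer_can_explain(output))  # clause 3: H cannot explain how

    # Instantiating the meta-test for two of the domains mentioned in the text.
    # The domain checks are crude stubs, not real classifiers.
    def is_story(o):
        return isinstance(o, str) and len(o) > 0     # stub: "o is a story"

    def is_proof(o):
        return isinstance(o, list)                   # stub: "o is a proof (sequence of steps)"

    def story_version_of_LT(o, rep, exp):
        return passes_LT(o, is_story, rep, exp)

    def proof_version_of_LT(o, rep, exp):
        return passes_LT(o, is_proof, rep, exp)

The point of the parameterization is simply that nothing in DefLT 1 turns on the kind of artifact o is; only the epistemic relation between H and A matters.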

Obvious questions arise at this point. Three are:

Q1. What resources and knowledge does H have at his or her disposal?
Q2. What sort of thing would count as a successful explanation?
Q3. How long does H have to cook up the explanation?

The answer to the third question is easy: H can have as long as he or she likes, within reason. The proffered explanation doesn't have to come immediately: H can take a month, months, even a year or two. Anything longer than a couple of years strikes us as perhaps unreasonable. We realize that these temporal parameters aren't exactly precise, but then again we should not be held to standards higher than those pressed against Turing and those who promote his test and variants thereof.10 The general point, obviously, is that H should have more than ample time to sort things out.

But what about Q1 and Q2? The answer to Q1 is that H is assumed to have at her disposal knowledge of the architecture of the agent in question, knowledge of the KB of the agent, knowledge of how the main functions in the agent are implemented (e.g., how TELL and ASK are implemented), and so on (recall the summary of intelligent agents above). H is also assumed to have resources sufficient to pin down these elements, to ``freeze" them and inspect them, and so on. We confess that this isn't exactly precise. To clarify things, we offer an example; this example is also designed to provide an answer to Q2.
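To give a concrete (if toy) picture of what H is assumed to be able to inspect, here is a minimal Python sketch of a knowledge-based agent in the standard TELL/ASK style. The class and method names are illustrative assumptions of ours, not the interface of any agent discussed in the text.

    # A toy agent of the sort H is assumed to understand: an explicit knowledge
    # base (KB), TELL and ASK as its main functions, and a simple architecture
    # wrapping them. Illustrative only; not the design of any system in the text.

    class KnowledgeBasedAgent:
        def __init__(self):
            self.kb = set()          # the KB: a set of sentences H can inspect

        def tell(self, sentence):
            """TELL: add a sentence to the KB."""
            self.kb.add(sentence)

        def ask(self, query):
            """ASK: a trivially inspectable query mechanism (membership check)."""
            return query in self.kb

        def act(self, percept):
            """Architecture: TELL the percept, then ASK whether a rule fires."""
            self.tell(percept)
            return "open_door" if self.ask("door_is_locked") else "walk_through"

    agent = KnowledgeBasedAgent()
    agent.tell("door_is_locked")
    print(agent.act("at_door"))      # -> "open_door"; every step is inspectable by H

Because H can ``freeze" the KB, the TELL/ASK functions, and the control loop, any output of such an agent can be traced back, step by step, to these components.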

To fix the context for the example, suppose that the output from our artificial agent A' is a resolution-based proof which settles a problem that human mathematicians and logicians have grappled with unsuccessfully for decades. This problem, suppose, is to determine whether or not some formula $\phi$ can be derived from some (consistent) axiom set $\Gamma$. Imagine that after many years of fruitless deliberation, a human H' encodes $\Gamma$ and $\neg \phi$ and gives both to OTTER (a well-known theorem prover; it's discussed in (Bringsjord & Ferrucci, 2000)), and OTTER produces a proof showing that this encoding is inconsistent, which establishes $\Gamma \vdash \phi$, and leads to an explosion of commentary in the media about ``brilliant" and ``creative" machines, and so on.11 In this case, A' doesn't pass LT. This is true because H', knowing the KB, architecture, and central functions of A', will be able to give a perfect explanation for the behavior in question. We routinely give explanations of this sort. The KB is simply the encoding of $\Gamma \cup \{\neg\phi\}$, the architecture consists in the search algorithms used by OTTER, and the main functions consist in the rules of inference used in a resolution-based theorem prover.
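To show in miniature what such an explanation looks like, here is a toy propositional resolution refutation in Python (this is emphatically not OTTER, its input language, or its search strategy): we encode a tiny $\Gamma \cup \{\neg\phi\}$ as clauses and derive the empty clause, which is exactly the kind of step-by-step account H' can give.

    # Toy illustration: refute Gamma U {not phi} by resolution, where
    # Gamma = {p, p -> q} and phi = q. Clauses are frozensets of literals;
    # a literal is a pair (name, polarity).

    from itertools import combinations

    def resolve(c1, c2):
        """Return all resolvents of two clauses (the main inference rule)."""
        out = []
        for (name, pol) in c1:
            if (name, not pol) in c2:
                out.append((c1 - {(name, pol)}) | (c2 - {(name, not pol)}))
        return out

    def refute(clauses):
        """Naive saturation (the 'architecture'); True iff the empty clause is derived."""
        clauses = set(clauses)
        while True:
            new = set()
            for c1, c2 in combinations(clauses, 2):
                for r in resolve(c1, c2):
                    if not r:
                        return True      # empty clause: contradiction found
                    new.add(frozenset(r))
            if new <= clauses:
                return False             # nothing new: no refutation
            clauses |= new

    # The KB: Gamma in clause form, plus the negated conjecture not-q.
    gamma_plus_not_phi = [
        frozenset({("p", True)}),                  # p
        frozenset({("p", False), ("q", True)}),    # not-p or q   (i.e., p -> q)
        frozenset({("q", False)}),                 # not-q        (negation of phi)
    ]
    print(refute(gamma_plus_not_phi))   # True, so Gamma |- q

Here the KB, the main inference rule, and the search loop are all laid bare, which is precisely why the output, however impressive, is fully explainable by H'.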

Here, now, given the foregoing, is a better definition:

DefLT 2
Artificial agent A, designed by H, passes LT if and only if
1. A outputs o;
2. A's outputting o is not the result of a fluke hardware error, but rather the result of processes A can repeat;
3. H (or someone who knows what H knows, and has H's resources) cannot explain how A produced o by appeal to A's architecture, knowledge-base, and core functions.
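For a compact rendering, DefLT 2 can be put schematically as follows; the predicate names (PassesLT, Outputs, Repeatable, ExplainsBy) are merely our shorthand for the informal clauses above, not notation used elsewhere in the paper:

$\mathrm{PassesLT}(A,H) \leftrightarrow \exists o\, [\, \mathrm{Outputs}(A,o) \wedge \mathrm{Repeatable}(A,o) \wedge \neg\, \mathrm{ExplainsBy}(H,A,o)\, ]$

where $\mathrm{Repeatable}(A,o)$ rules out fluke hardware errors (clause 2), and $\mathrm{ExplainsBy}(H,A,o)$ holds just in case $H$ (or someone who knows what $H$ knows, and has $H$'s resources) can explain how $A$ produced $o$ by appeal to $A$'s architecture, knowledge-base, and core functions (clause 3).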

