C H A P T E R  1
Introduction

1.1 INTRODUCTION

Games of strategy, such as chess, couple intellectual activity with competition. We can exercise and improve our intellectual skills by playing such games. The competition adds excitement and allows us to compare our skills to those of others. The same motivation accounts for interest in Computer Game Playing as a testbed for Artificial Intelligence. Programs that think better should be able to win more games, and so we can use competitions as an evaluation technique for intelligent systems.

Unfortunately, building programs to play specific games has limited value in AI. (1) To begin with, specialized game players are very narrow. They can be good at one thing but not another. Deep Blue may have beaten the world chess champion, but it has no clue how to play checkers; it cannot even balance a checkbook. (2) A second problem with specialized game playing systems is that they do only part of the work. Most of the interesting analysis and design is done in advance by their programmers. The systems themselves might as well be tele-operated.

All is not lost. The idea of game playing can be used to good effect to inspire and evaluate good work in Artificial Intelligence, but it requires moving more of the mental work to the computer itself. This can be done by focusing attention on General Game Playing.

General game players are systems able to accept declarative descriptions of arbitrary games at runtime and able to use such descriptions to play those games effectively (without human intervention).

Unlike specialized game players, such as Deep Blue, general game players cannot rely on algorithms designed in advance for specific games. General game playing expertise must depend on intelligence on the part of the game player and not just intelligence of the programmer of the game player. In order to perform well, general game players must incorporate various Artificial Intelligence technologies, such as knowledge representation, reasoning, learning, and rational decision making; and these capabilities have to work together in integrated fashion.

Moreover, unlike specialized game players, general game players must be able to play different kinds of games. They should be able to play simple games (like Tic-Tac-Toe) and complex games (like Chess), games in static or dynamic worlds, games with complete and partial information, games with varying numbers of players, with simultaneous or alternating play, with or without communication among the players.

While general game playing is a topic with inherent interest, work in this area has practical value as well. The underlying technology can be used in a variety of other application areas, such as business process management, electronic commerce, and military operations.

1.2 GAMES

In General Game Playing, we consider finite, synchronous games. These games take place in an environment with finitely many states, with one distinguished initial state and one or more terminal states. In addition, each game has a fixed, finite number of players; each player has finitely many possible actions in any game state, and each state has an associated goal value for each player. The dynamic model for general games is synchronous update: all players move on all steps (although some moves could be "no-ops"), and the environment updates only in response to the moves taken by the players.

In its most abstract form, we can think of a finite, synchronous game as a state machine. Figure 1.1 shows a state machine for a general game with eleven states (named s1, ... , s11), with one initial state (s1), and with three terminal states (s3, s9, and s11). The shading of states indicates that these are highly valued states for the player of the game. Figure 1.1 exhibits the transition function with double arrows labeled with the set of moves made by the players on a step of the game. This is a two-player game, and each player can perform actions x or y. Note that it is not the case that every state has an arc corresponding to every action pair: in a particular state, each player can make only the actions that are legal for that player in that state. For example, from state d, one player can legally play both x and y, while the other player's only legal move is x.


Figure 1.1 - State Machine for a simple game

This conceptualization of games is similar to the traditional extensive form definition in game theory, with a few exceptions. In extensive form, a game is modeled as a tree with actions of one player at each node. In state machine form, a game is modeled as a graph, and all players' moves are synchronous. State machine form has a natural ability to express simultaneous moves; with extensions, extensive form could also do this, albeit with some added cost of complexity. Additionally, state machine form makes it possible to describe games more compactly, and it makes it easier for players to play games efficiently.
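The state machine view can be made concrete with a few tables. The following is a minimal sketch in Python; the state names, transitions, and goal values are invented for illustration and do not reproduce Figure 1.1, and the names ROLES, LEGAL, UPDATE, GOAL, joint_moves, and step are all hypothetical.

```python
from itertools import product

# Hypothetical two-player game (not the game of Figure 1.1).
ROLES = ("p1", "p2")
INITIAL = "s1"
TERMINAL = {"s3"}

# LEGAL[state][role] -> set of actions that role may take in that state
LEGAL = {
    "s1": {"p1": {"x", "y"}, "p2": {"x"}},
    "s2": {"p1": {"x"}, "p2": {"x", "y"}},
}

# UPDATE[(state, joint move)] -> next state; a joint move has one action per role
UPDATE = {
    ("s1", ("x", "x")): "s2",
    ("s1", ("y", "x")): "s3",
    ("s2", ("x", "x")): "s1",
    ("s2", ("x", "y")): "s3",
}

# GOAL[state][role] -> goal value of the state for that role
GOAL = {"s3": {"p1": 100, "p2": 0}}


def joint_moves(state):
    """All combinations of legal actions, one per role, in role order."""
    return list(product(*(sorted(LEGAL[state][role]) for role in ROLES)))


def step(state, joint_move):
    """Synchronous update: every role moves, then the state changes."""
    for role, action in zip(ROLES, joint_move):
        if action not in LEGAL[state][role]:
            raise ValueError(f"{action!r} is not legal for {role} in {state}")
    return UPDATE[(state, joint_move)]
```

Note that UPDATE has no entry for the pair (x, y) in s1: as in the figure, arcs exist only for combinations of legal actions.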

1.3 GAME DESCRIPTION

Since all of the games that we are considering are finite, it is possible, in principle, to describe such games in the form of lists (of states and actions) and tables or graphs (to express legality, goals, termination, and update). Unfortunately, such explicit representations are not practical in all cases. Even though the numbers of states and actions are finite, they can be extremely large; and the tables relating them can be larger still. For example, in chess, there are thousands of possible moves and more than 10^30 states.

In the vast majority of games, states and actions have composite structure that allows us to define a large number of states and actions in terms of a smaller number of more fundamental entities. In chess, for example, states are not monolithic; they can be conceptualized in terms of pieces, squares, rows and columns and diagonals, and so forth.

By exploiting this structure, it is possible to encode games in a form that is more compact than direct representation. The Game Description Language (GDL) supports this by relying on a conceptualization of game states as databases and by relying on logic to define the notions of legality and so forth.

As an example of GDL, let us look at a definition of Tic-Tac-Toe based on the conceptualization of states introduced in the last section. We begin with an enumeration of roles.

role(white)
role(black)

Next, we characterize the initial state. In this case, all cells are blank.

init(cell(1,1,b))
init(cell(1,2,b))
init(cell(1,3,b))
init(cell(2,1,b))
init(cell(2,2,b))
init(cell(2,3,b))
init(cell(3,1,b))
init(cell(3,2,b))
init(cell(3,3,b))
init(control(white))

Next, we define legality. A player may mark a cell if that cell is blank and he has control. Otherwise, so long as there is a blank cell, the only legal action is noop.

legal(W,mark(X,Y)) :-
  true(cell(X,Y,b)) &
  true(control(W))

legal(white,noop) :-
  true(cell(X,Y,b)) &
  true(control(black))

legal(black,noop) :-
  true(cell(X,Y,b)) &
  true(control(white))

Next, we look at the update rules for the game. A cell is marked with an X or an O if the appropriate player marks that cell. If a cell contains a mark, it retains that mark on the subsequent state. If a cell is blank and is not marked, then it remains blank. Finally, control alternates on each play.

next(cell(M,N,x)) :-
  does(white,mark(M,N)) &
  true(cell(M,N,b))

next(cell(M,N,o)) :-
  does(black,mark(M,N)) &
  true(cell(M,N,b))

next(cell(M,N,W)) :-
  true(cell(M,N,W)) &
  distinct(W,b)

next(cell(M,N,b)) :-
  does(W,mark(J,K)) &
  true(cell(M,N,b)) &
  (distinct(M,J) | distinct(N,K))

next(control(white)) :-
  true(control(black))

next(control(black)) :-
  true(control(white))

Goals. A state is a win for white if there is a line of x's. It is a win for black if there is a line of o's. The line relation is defined below.

goal(white) :- line(x)

goal(black) :- line(o)

Supporting concepts. A line is a row of marks of the same type or a column or a diagonal. A row of marks means that there are three marks, all with the same first coordinate. The column and diagonal relations are defined analogously.

line(X) :- row(M,X)
line(X) :- column(M,X)
line(X) :- diagonal(X)

row(M,X) :-
  true(cell(M,1,X)) &
  true(cell(M,2,X)) &
  true(cell(M,3,X))

column(N,X) :-
  true(cell(1,N,X)) &
  true(cell(2,N,X)) &
  true(cell(3,N,X))

diagonal(X) :-
  true(cell(1,1,X)) &
  true(cell(2,2,X)) &
  true(cell(3,3,X))

diagonal(X) :-
  true(cell(1,3,X)) &
  true(cell(2,2,X)) &
  true(cell(3,1,X))

Termination. A game terminates whenever either player has a line of marks of the appropriate type.

terminal :- line(x)

terminal :- line(o)

Note that, under the full information assumption, any of these relations can be assumed to be false if it is not provably true. Thus, we have complete definitions for the relations legal, next, goal, terminal in terms of true and does. The true relation starts out identical to init and on each step is changed to correspond to the extension of the next relation on that step.
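To illustrate how the true relation evolves from init through successive applications of next, here is a hand translation of the Tic-Tac-Toe rules into Python. It is only a sketch: a general game player would interpret the GDL rules rather than hard-code them, and the function names (init_state, blanks, legal, next_state, line, terminal) and the tuple encoding of facts are assumptions made for this example.

```python
MARK = {"white": "x", "black": "o"}  # the mark each role writes

# The eight lines: three rows, three columns, two diagonals.
LINES = (
    [[(m, n) for n in (1, 2, 3)] for m in (1, 2, 3)]
    + [[(m, n) for m in (1, 2, 3)] for n in (1, 2, 3)]
    + [[(1, 1), (2, 2), (3, 3)], [(1, 3), (2, 2), (3, 1)]]
)


def init_state():
    """The state is a database of facts: nine blank cells plus control."""
    cells = {("cell", m, n, "b") for m in (1, 2, 3) for n in (1, 2, 3)}
    return frozenset(cells | {("control", "white")})


def blanks(state):
    return sorted((f[1], f[2]) for f in state if f[0] == "cell" and f[3] == "b")


def legal(state, role):
    """Mark any blank cell when in control; otherwise noop while a blank exists."""
    if ("control", role) in state:
        return [("mark", m, n) for m, n in blanks(state)]
    return ["noop"] if blanks(state) else []


def next_state(state, moves):
    """Compute the facts that hold after the joint move {role: action}."""
    marked = {a[1:]: MARK[r] for r, a in moves.items() if a != "noop"}
    facts = set()
    for fact in state:
        if fact[0] == "cell":
            _, m, n, mark = fact
            if mark == "b" and (m, n) in marked:
                facts.add(("cell", m, n, marked[(m, n)]))  # newly marked
            else:
                facts.add(fact)  # marks persist; unmarked blanks stay blank
        else:
            # control alternates on each play
            facts.add(("control", "black" if fact[1] == "white" else "white"))
    return frozenset(facts)


def line(state, mark):
    return any(all(("cell", m, n, mark) in state for m, n in cells)
               for cells in LINES)


def terminal(state):
    return line(state, "x") or line(state, "o")
```

Each call to next_state yields the database that the true relation describes on the following step, starting from init_state().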

Although GDL is designed for use in defining complete information games, it can be extended to partial information games relatively easily. Unfortunately, the resulting descriptions are more verbose and more expensive to process. This extension to GDL is the subject of a separate document.

1.4 GAME MANAGEMENT

The process of running a game goes as follows. Upon receiving a request to run a match, the Game Manager first sends a "Start" message to each player to initiate the match. Once game play begins, it sends "Play" messages to each player to get their plays and simulates the results. This part of the process repeats until the game is over. The Manager then sends "Stop" messages to each player.

The "Start" message lists the name of the match, the role the player is to assume (e.g. white or black in chess), a formal description of the associated game (in GDL), and the startclock and playclock associated with the match. The startclock determines how much time remains before play begins. The playclock determines how much time each player has to make each move once play begins.

Upon receiving a Start message, each player sets up its data structures and does whatever analysis it deems desirable in the time available. It then replies to the Game Manager that it is ready for play.

Having sent the "Start" message, the Game Manager waits for replies from the players. Once it has received these replies or once the startclock is exhausted, the Game Manager commences play.

On each step, the Game Manager sends a "Play" message to each player. The message includes information about the actions of all players on the preceding step. (On the first step, the argument is "nil".)

On receiving a "Play" message, players spend their time trying to decide their moves. They must reply within the amount of time specified by the match's playclock.

The Game Manager waits for replies from the players. If a player does not respond before the playclock is exhausted, the Game Manager selects an arbitrary legal move. In any case, once all players reply or the playclock is exhausted, the Game Manager takes the specified moves or the legal moves it has determined for the players and determines the next game state. It then evaluates the termination condition to see if the game is over. If the game is not over, the Game Manager sends the moves of the players to all players and the process repeats.

Once a game is determined to be over, the Game Manager sends a "Stop" message to each player with information about the last moves made by all players. The "Stop" message allows players to clean up any data structures for the match. The information about previous plays is supplied so that players with learning components can profit from their experience.

Having stopped all players, the Game Manager then computes the rewards for each player, stores this information together with the play history in the Arcade database, and ceases operation.
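The control flow of a match can be sketched as a simple loop. The code below is an in-process illustration only: it does not model the actual message format or network transport, it does not enforce the clocks (a player that returns None stands in for one that missed the playclock), and it treats an illegal reply the same way as a missing one, which is an assumption. The classes CountingGame and ScriptedPlayer and the function run_match are hypothetical.

```python
import random


class CountingGame:
    """Hypothetical toy game: the state is a counter; each 'inc' adds one."""
    roles = ("a", "b")
    initial = 0

    def legal(self, state, role):
        return ["inc", "noop"]

    def next(self, state, moves):
        return state + sum(1 for m in moves.values() if m == "inc")

    def terminal(self, state):
        return state >= 3

    def goal(self, state, role):
        return 100 if state >= 3 else 0


class ScriptedPlayer:
    """Always answers with the same move; None simulates a missed deadline."""

    def __init__(self, move):
        self.move = move
        self.log = []

    def start(self, match, role, game, startclock, playclock):
        self.log.append("start")
        return "ready"

    def play(self, match, previous_moves):
        self.log.append(("play", previous_moves))
        return self.move

    def stop(self, match, previous_moves):
        self.log.append(("stop", previous_moves))
        return "done"


def run_match(match, game, players, startclock=10, playclock=5, rng=random):
    # Start: every player receives its role, the game, and both clocks.
    for role, player in players.items():
        player.start(match, role, game, startclock, playclock)
    state, previous, history = game.initial, None, []
    # Play: repeat until the termination condition holds.
    while not game.terminal(state):
        moves = {}
        for role, player in players.items():
            move = player.play(match, previous)  # previous is None on step one
            if move not in game.legal(state, role):
                move = rng.choice(game.legal(state, role))  # arbitrary legal move
            moves[role] = move
        state = game.next(state, moves)
        previous = moves
        history.append(moves)
    # Stop: report the last moves, then compute rewards.
    for player in players.values():
        player.stop(match, previous)
    rewards = {role: game.goal(state, role) for role in players}
    return rewards, history
```

For example, run_match("m1", CountingGame(), {"a": ScriptedPlayer("inc"), "b": ScriptedPlayer(None)}) plays the toy game with one responsive player and one that never answers in time.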

1.5 GAME PLAYING

Having a formal description of a game is one thing; being able to use that description to play the game effectively is something else. In this section, we examine some of the problems of building general game players and discuss strategies for dealing with these difficulties.

Let us start with automated reasoning. Since game descriptions are written in logic, it is obviously necessary for a game player to do some degree of automated reasoning.

There are various choices here. (1) A game player can use the game description interpretively throughout a game. (2) It can map the description to a different representation and use that interpretively. (3) It can use the description to devise a specialized program to play the game. This is effectively automatic programming. There may be other options as well.

The good news is that there are powerful reasoners for first-order logic freely available. The bad news is that such reasoners do not, in and of themselves, solve the real problems of general game playing, which are the same whatever representation for the game rules is used, viz. dealing with indeterminacy and size and multi-game commonalities.

The simplest sort of game is one in which there is just one player and the number of states and actions is not too large. For such cases, traditional AI planning techniques are ideal. Depending on the shape of the search space, the player can search either forward or backward to find a sequence of actions / plays that convert the initial state into an acceptable goal state. Unfortunately, not all games are so simple.
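As a minimal sketch of forward search in a single-player game, the following Python function performs breadth-first search from the initial state and returns a shortest sequence of actions reaching a goal state. The function name plan and the toy counter puzzle used to exercise it are hypothetical; breadth-first search is just one of the planning techniques that could be used here.

```python
from collections import deque


def plan(initial, legal, update, is_goal):
    """Breadth-first forward search.

    legal(state) lists the actions available in a state, update(state, action)
    gives the successor state, and is_goal(state) recognizes acceptable goal
    states. Returns a shortest list of actions, or None if no goal is reachable.
    """
    frontier = deque([initial])
    parent = {initial: None}  # state -> (previous state, action taken)
    while frontier:
        state = frontier.popleft()
        if is_goal(state):
            actions = []
            while parent[state] is not None:
                state, action = parent[state]
                actions.append(action)
            return actions[::-1]
        for action in legal(state):
            successor = update(state, action)
            if successor not in parent:  # visit each state at most once
                parent[successor] = (state, action)
                frontier.append(successor)
    return None


# Hypothetical toy puzzle: starting from 1, reach 10 by doubling or adding one.
def toy_legal(n):
    return ["double", "inc"] if n < 20 else []


def toy_update(n, action):
    return n * 2 if action == "double" else n + 1


solution = plan(1, toy_legal, toy_update, lambda n: n == 10)
```

Because the search records every visited state, it is practical only when the number of states is small, which is exactly the restriction stated above.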

To begin with, there is the indeterminacy that arises in games with multiple players. Recall that the succeeding state at each point in a game depends on the actions of all players, and remember that no player knows the actions of the other players in advance.

Of course, in some cases, it is possible for a player to find sequences of actions guaranteed to achieve a goal state. However, this is quite rare.

More often, it is necessary to create conditional plans in which a player's future actions are determined by its earlier actions and those of the other players. For such cases, more complex planning techniques are necessary.

Unfortunately, even this is not always sufficient. In some cases, there may be no guaranteed plan at all, not even a conditional plan. Tic-Tac-Toe is a game of this sort. Although it can be won, there is no guaranteed way to win in general. It is not really clear what to do in such situations. The key to winning in such situations is to move and hope that the moves of the other players put the game into a state from which a guaranteed win is possible. However, this strategy leaves open the question of which moves to make prior to arrival at such a state. One can fall back on probabilistic reasoning. However, this is not wholly satisfactory since there is no justifiable way of selecting a probability distribution for the actions of the other players. Another approach, of primary use in directly competitive games, is to make moves that create more search for the other players so that there is a chance that time limitations will cause those players to err.

Another complexity, independent of indeterminacy, is sheer size. In Tic-Tac-Toe, there are approximately 5000 distinct states. This is large but manageable. In Chess there are approximately 10^30 states. A state space of this size, being finite, is fully searchable in principle but not in practice. Moreover, the time limit on moves in most games means that players must select actions without knowing for sure whether they are any good.
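The Tic-Tac-Toe figure can be checked by brute force. The sketch below enumerates every board reachable from the empty board, assuming that x moves first and that play stops as soon as either player completes a line; the function name count_reachable_states and the flat tuple encoding of the board are choices made for this example.

```python
def count_reachable_states():
    """Count the distinct Tic-Tac-Toe boards reachable from the empty board."""
    lines = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
             (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
             (0, 4, 8), (2, 4, 6)]              # diagonals

    def has_line(board):
        return any(board[a] != "b" and board[a] == board[b] == board[c]
                   for a, b, c in lines)

    start = ("b",) * 9
    seen = {start}
    stack = [start]
    while stack:
        board = stack.pop()
        if has_line(board):
            continue  # terminal state: no further moves
        # x moves when both players have placed the same number of marks
        mark = "x" if board.count("x") == board.count("o") else "o"
        for i in range(9):
            if board[i] == "b":
                successor = board[:i] + (mark,) + board[i + 1:]
                if successor not in seen:
                    seen.add(successor)
                    stack.append(successor)
    return len(seen)
```

Under these assumptions the function returns 5478, consistent with the approximate figure above. The same exhaustive enumeration is out of the question for a game the size of Chess.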

In such cases, the usual approach is to conduct partial search of some sort, examining the game tree to a certain depth, evaluating the possible outcomes at that point, and choosing actions accordingly. Of course, this approach relies on the availability of an evaluation function for non-terminal states that is roughly monotonic in the actual probability of achieving a goal. While, for specific games, such as chess, programmers are able to build in evaluation functions in advance, this is not possible for general game playing, since the structure of the game is not known in advance. Rather, the game player must analyze the game itself in order to find a useful evaluation function.
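A depth-limited search of this kind might look as follows for a two-player, alternating, directly competitive game. This is a sketch of plain minimax with a depth cutoff, not a description of any particular general game player. The toy game Nim and its constant evaluate method are hypothetical; a constant heuristic carries no information and merely marks the place where a game-specific or automatically derived evaluation function would go.

```python
class Nim:
    """Hypothetical toy game: players alternately take 1 or 2 stones, and
    whoever takes the last stone wins. A state is (stones, player to move)."""

    def moves(self, state):
        stones, _ = state
        return [n for n in (1, 2) if n <= stones]

    def result(self, state, move):
        stones, player = state
        return (stones - move, 1 - player)

    def is_terminal(self, state):
        return state[0] == 0

    def payoff(self, state):
        # Goal value for player 0: it wins if it took the last stone,
        # in which case player 1 is the one left to move.
        return 100 if state[1] == 1 else 0

    def evaluate(self, state):
        # Placeholder heuristic for non-terminal states (an assumption).
        return 50


def minimax(game, state, depth):
    """Value of a state for player 0, searching at most `depth` plies."""
    if game.is_terminal(state):
        return game.payoff(state)
    if depth == 0:
        return game.evaluate(state)  # cut off: estimate instead of searching
    values = [minimax(game, game.result(state, m), depth - 1)
              for m in game.moves(state)]
    return max(values) if state[1] == 0 else min(values)


def best_move(game, state, depth):
    """Best move for player 0, who is assumed to be the one to move."""
    return max(game.moves(state),
               key=lambda m: minimax(game, game.result(state, m), depth - 1))
```

With a depth limit of 0 the search returns only the heuristic estimate; as the limit grows, more of the values come from actual terminal states.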

Another approach to dealing with size is abstraction. In some cases, it is possible to reformulate a state graph into a more abstract state graph with the property that any solution to the abstract problem has a solution when refined to the full state graph. In such cases, it may be possible to find a guaranteed solution or a good evaluation function for the full graph. Various researchers have proposed techniques along these lines, but more work is needed.

The third issue is not so much a problem as an opportunity, viz. multi-game commonalities. After playing multiple instances of a single game or after playing multiple games against a given player, it may be possible to identify common lessons that can be transferred from one game instance to another. A player that is capable of learning such lessons and transferring them to other game instances is likely to do better than one without this capability.

One difficulty with this approach is that, in our current framework, players are not told the names of games, only the axioms. In order to transfer such lessons, a player must be able to recognize that it is the same game as before. If it is a slightly different game, the player must realize which lessons still apply and which are different.

Another difficulty, specific to this year's competition, is that players are not told the identity of the other players. So, lessons specific to players cannot be transferred, unless a player is able to recognize players by their style of play. (In future years, the restriction on supplying identity information about players may be removed, making such learning more useful.)

1.6 DISCUSSION

While general game playing is a topic with inherent interest, work in this area has practical value as well. The underlying technology can be used in a variety of other application areas, such as business process management, electronic commerce, and military operations.

General Game Playing is a setting within which AI is the essential technology. It certainly concentrates attention on the notion of specification-based systems (declarative systems, self-aware systems, and, by extension, reconfigurable systems, self-organizing systems, and so forth). Building systems of this sort dates from the early years of AI.

It was in 1958 that John McCarthy invented the concept of the "advice taker". The idea was simple. He wanted a machine that he could program by description. He would describe the intended environment and the desired goal, and the machine would use that information in determining its behavior. There would be no programming in the traditional sense. McCarthy presented his concept in a paper that has become a classic in the field of AI.

The main advantage we expect the advice taker to have is that its behavior will be improvable merely by making statements to it, telling it about its environment and what is wanted from it. To make these statements will require little, if any, knowledge of the program or the previous knowledge of the advice taker.

An ambitious goal! But that was a time of high hopes and grand ambitions. The idea caught the imaginations of numerous subsequent researchers -- notably Bob Kowalski, the high priest of logic programming, and Ed Feigenbaum, the inventor of knowledge engineering. In a paper written in 1974, Feigenbaum gave his most forceful statement of McCarthy's ideal.

The potential use of computers by people to accomplish tasks can be "one-dimensionalized" into a spectrum representing the nature of the instruction that must be given the computer to do its job. Call it the what-to-how spectrum. At one extreme of the spectrum, the user supplies his intelligence to instruct the machine with precision exactly how to do his job step-by-step. ... At the other end of the spectrum is the user with his real problem. ... He aspires to communicate what he wants done ... without having to lay out in detail all necessary subgoals for adequate performance.

Some have argued that the way to achieve intelligent behavior is through specialization. That may work so long as the assumptions one makes in building such systems are true. For general intelligence, however, general intellectual capabilities are needed, and such systems should be capable of performing well in a wide variety of tasks. To paraphrase the words of Robert Heinlein:

A human being should be able to change a diaper, plan an invasion, butcher a hog, conn a ship, design a building, write a sonnet, balance accounts, build a wall, set a bone, comfort the dying, take orders, give orders, cooperate, act alone, solve equations, analyze a new problem, pitch manure, program a computer, cook a tasty meal, fight efficiently, die gallantly. Specialization is for insects.

It is our belief that general game playing offers an interesting application area within which general AI can be investigated.