In this chapter, we describe a general game player capable of functioning in a competition setting like the one described in the preceding chapter. We begin with an overview of the overall architecture of the our player. We then show how to instantiate the architecture to build two search-free players, viz. a legal player and a random player. We finish with an example of the player in operation.
A competition general game player consists of three parts. (1) The player's executive program is a body of code that manages communication between the player and the game manager (2) The player code consists subroutines called by the executive program to handle the various types of messages received from the game manager. (3) The code base is a set of standard subroutines for processing states and game descriptions.
The executive program in a competition general game player is a never-ending loop. It listens for message from the game manager, processes any messages it receives, and then repeats. Upon receiving a message, the executive program calls the appropriate player subroutine with the relevant components of the message as arguments. If the subroutine returns an answer other than false, the executive program puts the value in a reply message and sends that message to the game manager. If the value of the subroutine is false, the executive program does nothing and starts listening for more messages.
The player code for our player consists of subroutines for handling game playing messages from the game manager. There are five subroutines, one for each type of message. The subroutines are described below.
(1) The ping subroutine is called in response to a ping message. The calling pattern is shown below.ping()
If the player is ready to play, it returns ready. If the player is not ready to receive messages, it returns either busy or false as value. (In the former case, the executive program sends a bust message to the game manager. In the latter case, it simply doe snot reply. The consequences are the same in both cases.
(2) The start subroutine is called in response to a start message. The calling pattern is shown below.
Upon receipt of a start message, a player prepares itself to play the match. Once it is done, it replies ready as value.
(3) the play subroutine is called in response to a play message. The calling pattern is shown below.
Upon receipt of a play message, a player uses the move information to update its state as necessary. If it is the player in control, it then computes its next move and returns that move as value. Otherwise, it returns false.
Note that, it is the responsibility of the play subroutines to respond before the plan clock runs out. If it replies late, the game manager will substitute a random legal move of its own.
(4) The stop subroutine is called in response to a stop message. The calling pattern is shown below.
The stop subroutines cleans up after playing a match. No response is required, so the subroutine should return false.
(5) The abort subroutine is called in response to a stop message. The calling pattern is shown below.
The stop subroutines cleans up after playing a match. Again, no response is required, so the subroutine should return false.
The code base for a player consists of subroutines for processing game states and game descriptions. The Gamemaster code base contains definitions for the subroutines described below. Here state is a sequence of facts true in a state and state is a sequence of rules describing the game.
That's it. Our job is building a general game player consists of (a) selecting an appropriate executive program, loading the code base, and writing the five subroutines described above using the subroutines supplied by the code base.
4.3 Creating a Legal Player
A legal player is the simplest form of game player. In each state, a legal player selects an action based solely on its legality, without consideration of the consequences. Typically, the choice of action is consistent - it selects the same action every time it finds itself in the same state. (In this way, a legal player differs from a random player, which selects different legal actions on different occasions.)
Legal play is not a particularly good general game playing strategy. However, it is a worthwhile exercise to build a legal player (and a random player) just to get familiar with the concepts described above and to have a basis of comparison for more intelligent players.
Our player utilizes some global variables to retain information about a match while it is being played. The variables used in our implementation are shown below. (Properly, we should create a data structure for each match; and we should attach these values to this data structure. However, we are striving for simplicity of implementation in these notes. This does not mean that you should do the same.)
Using the basic subroutines provided in the GGP starter pack, building a legal player is very simple.
The ping handler simply returns true, since our legal player is always ready to play..
The start subroutine initializes our global variables and then returns ready.
The play subroutine takes a move as argument and uses the simulate subroutine to compute the resulting state. If the move is nil, the subroutine simply returns the current state (i.e. the initial state). Otherwise, it uses simulate to compute the state resulting from the specified move and the current state.
The play event handler takes a match identifier and a move as arguments. It first uses the simulate subroutine to compute the current state. If the move is nil, the subroutine returns the current state. Otherwise, it uses findnext to compute the state resulting from the specified move and the specified state. Once our player has the new state, it uses findlegalx to compute a legal move.
The stop subroutine does nothing. It simply returns false.
The abort subroutiner also does nothing and simply returns false.
4.4 Creating a Random Player
A random player is similar to a legal player in that it selects an action for a state based solely on its legality, without consideration of the consequences. A random player differs from a legal player in that it does not simply take the first legal move it finds but rather selects randomly from among the legal actions available in the state, usually choosing a different move on different occasions.
The implementation of a random player is almost identical to the implementation of a legal player. The only difference is in the play method. In selecting an action, our player first computes all legal moves in the given state and then randomly selects from among these choices (using the randomindex subroutine). One way of writing the code for the play handler is shown below.
Random players are no smarter than legal players. However, they often appear more interesting because they are unpredictable. Also, they sometimes avoid traps that befall consistent players like legal, which can sometimes maneuver themselves into a corner and be unable to escape. They are also used as standards to show that general game players or specific methods perform better than chance.
A random player consumes slightly more compute time than a legal player, since it must compute all legal moves in each state rather than just one. For most games, this is not a problem; but for games with a large number of possible actions, the difference can be noticeable.