Tic-Tac-Toe

A program written in Pascal for playing Tic-Tac-Toe or "Five in a Row" (an extended version of Tic-Tac-Toe), which learns from its mistakes.


I wrote this Pascal program in the times of yore, inspired by the movie WarGames (1983), where a computer on the verge of starting a global thermonuclear war is made to play tic-tac-toe against itself so that it may learn that there are some games which cannot be won.

My program starts by playing tic-tac-toe at random and whenever it loses a game it analyses and learns from it, improving its way of playing until it becomes unbeatable.

Alas as it becomes unbeatable it declares that the game of tic-tac-toe can always be won :) … this is due to a particular situation not being dealt with by the program (see below).

The program can also play five in a row and start learning and improving on it as well, although the improvement may perhaps lead to problems of combinatorial explosion of the information that the program has to handle.

At the time I was running this program on a "jurassic" unix mainframe, whose name I have long forgotten, but I remember that while learning to play five in a row it got to a point where there were more than a handful of new schemes to be generated and analysed and the program had to be left running overnight to get over them.

Some of the procedures in this program also deal with a rather laughable problem by today's standards, i.e. that of sending characters to an ASCII terminal to try and make it look like a graphical display.

What could be worth resuscitating from this program is the basic idea of using schemes and backward analysis, which might turn out to be a viable alternative to the forward search approach commonly adopted by engines playing these kinds of games.

Downloadable files:

External links:


Extracts from Tic-Tac-Toe.p

Game

Two players in turn lay their mark (a circle or cross symbol) on a grid. The player that manages to align a certain number of marks, horizontally, vertically or diagonally, wins the game.

The dimentions of the grid and the number of marks to be aligned are selectable (see boardrows, boardcols and winalign). The standard Tic-Tac-Toe can be played by selecting a 3x3 grid with 3 marks to align and Five-in-a-Row can be played by selecting a larger grid with 5 marks to align.

Main Procedures

The program contains two main procedures:

  • Procedure playthegame is for playing the game.
  • Procedure improveschemes is for improving the schemes used by the program to decide on the moves to play, based on the game just played.

Schemes

A scheme represents a set of conditions which determine a threat, i.e. the possibility of executing a sequence of moves which leads to win the game.

The schemes used by the program can be found in the record schemes of linked list list. In schemes, elements from 1 to totfree in the matrix free and from 1 to totheld in the matrix held indicate the squares on the grid which must satisfy respectively the conditions of 'being empty' and 'being marked with a sign by the player that makes the threat'. When all conditions are satisfied the move being threatened on the grid corresponds to the (0, 0) square in the scheme.

A single scheme represents multiple positions on the grid and the program derives those positions by centering, rotating and flipping the scheme. To do this the program tracks the orientation of the scheme through side and degrees and takes into account the number of rotations and flips (see usefulturns and usefuloverturns) that can be performed to derive all the positions.

To speed up calculations the program also handles the scheme boundaries in up, down, left and right, which are mainly used by the fits function. For instance up is the maximum integer value among the rows of all the squares in free[] and held[].

Each scheme in list has an associated level. The level of a scheme defines the number of moves to win the game when following the shortest sequence, assuming that the opponent does not hinder its execution (if he does, it will still be possible to win the game by following other longer sequences of moves). Schemes in list which share the same level are grouped in sub-lists that can be followed using the list.next pointers. The firstschemeoflevel matrix is indexed by the level and can be used to access the relative list.

The program does not need to handle lists of moves because the procedure improveschemes constructs new schemes from lower level schemes, so that after applying the move indicated by a scheme, it will be always be possible to find other schemes in list which apply to the new position and indicate the following moves.

Data Files

The schemes generated by the program can be saved on Tic-Tac-Toe-Data files, to be maintained through application restart. The program uses a non standard reset available in the Unix environment to differentiate between the nominal parameter DataFile and the actual file name, which is defined in schemesfilename.

The type of schemes that are to be constructed depends on the number of marks that need to be aligned to win the game, hence schemefilename defines different file names depending on winalign.

Limitation

The program assumes that if there are two victoryplans that cannot be immediately thwarted by the same move then it is possible to win the game. This however is not the case, as there can be a move that thwarts one victoryplan and at the same time threatens a winning sequence. Then it may become necessary to respond with a countermove which is different from the move requested by the other victoryplan, so that the opponent may then thwart the other victoryplan as well.

For instance in the standard tic-tac-toe this position may occur (where X = user and O = computer):

X
O
X

and the program may be using this scheme (where # = held and ~ = free):

#
~
~ ~ #

In this case the program will not recognise that placing a O on a side square creates a threat which allows to thwart the two victory plans generated by the scheme (used as is and symmetrically with respec to the centre square). Hence the program will consider the position lost, and through improveschemes this incorrect interpretation will be rippled back to the start of the game, which will be considered lost when the opponent places its mark on a corner.

As it happens, the program will still end up making the correct move in this particular case, but this limitation is most certainly having an adverse effect for extended variations of this game.

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License