Andrew Lea FBCS explains the different approaches to programming chess computers. Along the way, he explores the many historical attempts at creating a chess playing machine and asks philosophical questions about the nature of artificial intelligence.

In this article, we will consider the significance of computer chess and similar games. We will start with a reasonably technical exploration of how computer chess is typically implemented. This will provide a good framework for understanding chess’ history, development, significance and possible future.

How computer chess works

Technically, chess is a ‘perfect information, no chance, two-player, turn-based, adversarial board game’. In that regard, it is similar to Go, Connect-Four, draughts or noughts-and-crosses.

There are two main ways of programming chess. The first is effectively a set of ‘rules’ determining the best move to play. Whilst such a rule set can exist for simple games, such as noughts-and-crosses, none is known for chess (which doesn’t preclude its existence, of course).

The second strategy is the ‘Turing-Shannon’ paradigm: I consider all my moves, consider all your replies, consider all of my replies to those moves…and so forth, ideally until a winning, drawn or losing position is found. This is the minimax algorithm: white seeks to maximise the score, and black seeks to minimise it. The game tree is explored recursively, assigning the maximum descendant value to each ‘white to move’ position and the minimum descendant value to each ‘black to move’ position.
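As a concrete illustration, the sketch below applies that recursion to noughts-and-crosses, with X as the maximiser, O as the minimiser, and terminal positions scoring +1, -1 or 0. It is a minimal stand-alone sketch, not the downloadable engine mentioned later in this article.

    /* A minimal minimax sketch for noughts-and-crosses. X is the maximiser,
       O the minimiser; terminal positions score +1, -1 or 0. */
    #include <stdio.h>

    static const int LINES[8][3] = {
        {0,1,2},{3,4,5},{6,7,8},{0,3,6},{1,4,7},{2,5,8},{0,4,8},{2,4,6}
    };

    /* +1 if X has completed a line, -1 if O has, 0 otherwise */
    static int winner(const char b[9]) {
        for (int i = 0; i < 8; i++) {
            char a = b[LINES[i][0]];
            if (a != '.' && a == b[LINES[i][1]] && a == b[LINES[i][2]])
                return a == 'X' ? 1 : -1;
        }
        return 0;
    }

    /* Explore the game tree recursively: X maximises the score, O minimises it. */
    static int minimax(char b[9], int x_to_move) {
        int w = winner(b);
        if (w != 0) return w;

        int best = x_to_move ? -2 : 2;    /* worse than any achievable score */
        int moved = 0;
        for (int sq = 0; sq < 9; sq++) {
            if (b[sq] != '.') continue;
            moved = 1;
            b[sq] = x_to_move ? 'X' : 'O';
            int score = minimax(b, !x_to_move);
            b[sq] = '.';
            if (x_to_move ? score > best : score < best) best = score;
        }
        return moved ? best : 0;          /* board full with no winner: a draw */
    }

    int main(void) {
        char board[10] = "....X....";     /* X has taken the centre */
        printf("value with O to move: %d\n", minimax(board, 0));  /* 0: a draw */
        return 0;
    }

Noughts-and-crosses is small enough to be searched right to the end of the game; chess, as we shall see, is not.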

Is ‘exploring the game space’ viable? For a small game, such as noughts-and-crosses, it is. For bigger games, it depends on the size of the game space to be explored.

To characterise the game space size, we can use the typical branching factor — the number of possible moves from each position. Chess has a branching factor of almost 36, so to explore to a depth of ten — namely, five turns ahead — 36^10, or 3.7 x 10^15, positions would have to be explored; clearly far too many.

Fortunately the ‘alpha-beta’ heuristic can reduce the branching factor, in the ideal circumstances of exploring the best moves first, to its square root: six. A depth of 10 is now viable: 6^10 is about 60 million positions. Unfortunately, we don’t know which moves are best — if we did, we wouldn’t need to explore the game tree — but fortunately, alpha-beta helps even if the moves are not perfectly ordered. ‘Iterative deepening’, which first explores the tree to a shallower depth and uses the results to order the moves, helps with that ordering.
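To illustrate the pruning, here is one way the minimax() function above might be rewritten with alpha-beta cut-offs; it reuses the winner() helper from the previous sketch, and the bounds and names are purely illustrative.

    /* Alpha-beta variant of minimax(): alpha is the score the maximiser is
       already guaranteed elsewhere in the tree, beta the score the minimiser
       is guaranteed. Once they cross, the remaining moves here cannot affect
       the result and are skipped. */
    static int alphabeta(char b[9], int x_to_move, int alpha, int beta) {
        int w = winner(b);
        if (w != 0) return w;

        int moved = 0;
        for (int sq = 0; sq < 9; sq++) {
            if (b[sq] != '.') continue;
            moved = 1;
            b[sq] = x_to_move ? 'X' : 'O';
            int score = alphabeta(b, !x_to_move, alpha, beta);
            b[sq] = '.';
            if (x_to_move) { if (score > alpha) alpha = score; }
            else           { if (score < beta)  beta  = score; }
            if (alpha >= beta) break;     /* cut-off: the opponent will avoid this line */
        }
        if (!moved) return 0;             /* board full with no winner: a draw */
        return x_to_move ? alpha : beta;
    }

    /* Initial call, with bounds outside any achievable score:
       alphabeta(board, 1, -2, +2); */

The earlier a good move is tried, the more of the remaining moves are cut off, which is why the move ordering described above matters so much.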

Heuristics — procedures which yield good but not necessarily exactly correct answers — enable even deeper exploration in a given time. Many chess programs extend the search where pieces are en prise. ‘Transposition tables’, which can be very big, store the scores of positions already calculated, and since many move combinations reach the same position, this further reduces the number of positions to examine.
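Below is a minimal sketch of the transposition-table idea, assuming a 64-bit position hash (real engines typically compute one with Zobrist hashing); the names, sizes and the simple always-replace policy are illustrative rather than taken from any particular engine.

    /* A minimal transposition-table sketch: a fixed-size, direct-mapped cache
       keyed by a 64-bit position hash. */
    #include <stdint.h>
    #include <stdio.h>

    #define TT_SIZE (1u << 16)            /* 65,536 entries; real tables are far bigger */

    typedef struct {
        uint64_t key;                     /* full hash, to detect index collisions */
        int      score;                   /* score previously computed for this position */
        int      depth;                   /* depth to which that score was searched */
    } TTEntry;

    static TTEntry tt[TT_SIZE];

    static void tt_store(uint64_t key, int score, int depth) {
        TTEntry *e = &tt[key & (TT_SIZE - 1)];
        e->key = key; e->score = score; e->depth = depth;
    }

    /* Returns 1 and fills *score if a result searched at least as deeply is cached. */
    static int tt_probe(uint64_t key, int depth, int *score) {
        const TTEntry *e = &tt[key & (TT_SIZE - 1)];
        if (e->key == key && e->depth >= depth) { *score = e->score; return 1; }
        return 0;
    }

    int main(void) {
        tt_store(0x123456789abcdefULL, 42, 6);
        int s;
        if (tt_probe(0x123456789abcdefULL, 4, &s)) printf("cached score: %d\n", s);
        return 0;
    }

A real engine’s entries would typically also record the best move found and whether the stored score is exact or only a bound.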

All of this depends on being able to score a position. Ideally, we reach a proven win or loss (for example, checkmate) and assign plus or minus infinity, or a proven draw (for example, stalemate) and assign zero; but except in end-games, we mostly do not. A ‘static evaluation function’ endeavours to score a position without exploring deeper. This is typically where positional knowledge and chess theory are embedded.
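As a toy illustration of the shape such a function can take, the sketch below scores material in centipawns and adds a small bonus for central pawns; the board layout and the weights are assumptions made for this example, not those of any real engine.

    /* A toy static evaluation sketch: material count in centipawns plus a
       small bonus for pawns on the four centre squares. Board layout assumed
       here: 64 chars, uppercase = white, lowercase = black, '.' = empty. */
    #include <ctype.h>
    #include <stdio.h>

    static int piece_value(char p) {
        switch (tolower((unsigned char)p)) {
            case 'p': return 100;
            case 'n': return 320;
            case 'b': return 330;
            case 'r': return 500;
            case 'q': return 900;
            default:  return 0;           /* kings are handled by mate detection */
        }
    }

    /* Positive scores favour white, negative favour black. */
    static int evaluate(const char b[64]) {
        int score = 0;
        for (int sq = 0; sq < 64; sq++) {
            char p = b[sq];
            if (p == '.') continue;
            int v = piece_value(p);
            if (tolower((unsigned char)p) == 'p' &&
                (sq == 27 || sq == 28 || sq == 35 || sq == 36))
                v += 20;                  /* small positional term: central pawns */
            score += isupper((unsigned char)p) ? v : -v;
        }
        return score;
    }

    int main(void) {
        const char start[65] =
            "rnbqkbnr" "pppppppp" "........" "........"
            "........" "........" "PPPPPPPP" "RNBQKBNR";
        printf("start position: %d centipawns\n", evaluate(start));  /* 0: balanced */
        return 0;
    }

Real evaluation functions add many more terms, such as mobility, king safety and pawn structure, with carefully tuned weights.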

The final ingredient of a chess program is a large library of opening moves, the opening book, often derived from human games.

A chess engine’s power depends on both the quality of the static evaluation and how deep it can explore, which for a given amount of time depends on how fast it runs. For this reason, chess programs are highly optimised (often written in C) where the need for speed exceeds the need for clarity.

The online version of this article has a C implementation of noughts-and-crosses, written in the style of a chess engine, including gnarly fast ‘bit-board’ representations (a sketch of that idea follows the list below). Typical components include:

  • Piece and board representations, often using several in parallel
  • A move generator, responsible for generating candidate moves from any position. The different move patterns and exceptions, such as en-passant and castling, make this complex in chess
  • Recursive ‘mini-max’ function(s) to explore the game tree
  • Input/output functions
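
By way of illustration, a noughts-and-crosses bit-board can pack each side’s marks into nine bits of an integer, with wins detected by masking against the eight line patterns; the sketch below is in that spirit, though again it is not the downloadable engine itself.

    /* A minimal bit-board sketch for noughts-and-crosses: each side's marks
       occupy nine bits of an integer, and a win is any of the eight line
       masks being fully covered. */
    #include <stdint.h>
    #include <stdio.h>

    static const uint16_t LINE_MASKS[8] = {
        0007, 0070, 0700,                 /* rows (octal: bits 0-2, 3-5, 6-8) */
        0111, 0222, 0444,                 /* columns */
        0421, 0124                        /* diagonals */
    };

    static int has_won(uint16_t side) {
        for (int i = 0; i < 8; i++)
            if ((side & LINE_MASKS[i]) == LINE_MASKS[i]) return 1;
        return 0;
    }

    int main(void) {
        uint16_t x = 0;
        x |= 1u << 0; x |= 1u << 4; x |= 1u << 8;   /* X on a full diagonal */
        printf("X wins? %d\n", has_won(x));          /* prints 1 */
        return 0;
    }

For chess, the same idea uses 64-bit words, typically one per piece type and colour, so that move generation and attack detection become bit operations.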

Our toy noughts-and-crosses program lacks these major ingredients of a chess program:

  • A static evaluation function, to estimate how ‘good’ the position is
  • An opening book library
  • A transposition table handler, to cache position scores for repeating positions

Milestones in computer checkers and chess

In any computer chess history article, it is traditional to reference ‘The Turk’ of 1770, which was simply a small chap in a disguised-as-a-machine box. Far from being the first instance of chess computing, it was more the first instance of ‘fake AI’.

The first chess playing machine was El Ajedrecista, a remarkable automaton created in 1912 by Leonardo Torres Quevedo to play a King-and-Rook vs King end-game.

The first chess program was ‘Anti-Clerical Chess’, written at Los Alamos in 1956: resources were so limited that it used a 6 by 6 board and had no bishops. Nonetheless, it was the first chess program to beat a human player.

Arthur Samuel’s checkers program of 1958 actually used a self-improving learning strategy, conceptually not dissimilar from some of the recent advances in machine learning.

With the arrival of micro-computers, many programmers worked on computer chess, developing powerful techniques such as transposition tables.

Deep Blue, with its dedicated chess hardware, was the first machine to beat a reigning world champion, winning its famous 1997 match against Garry Kasparov.

In 2007 Chinook ‘solved’ checkers, proving that perfect play by both sides results in a draw.

Stockfish is an open source chess engine, incorporating some of the very best human-written chess knowledge. Nonetheless, it was beaten in 2017 by AlphaZero, which trained itself through self-play.

Significance in computing

Computer chess programs share complexity with other ‘large’ computer programs, such as language compilers and operating systems. Subtle design concepts and interactions across components of the entire program can give rise to unforeseen and unfortunate behaviours: curious move choices (was that genius or bug?), an erroneous optimisation, or even disc corruption in rare edge cases. Chess programs must be fast, compilers must be laser-accurate, and operating systems must be responsive with low latency. All three are therefore complex, making it hard to ensure correctness.

Chess research has helped develop these facets of AI and software engineering:

  • Procedural knowledge representation: where (for example) positional factors are lovingly hand crafted into the evaluation function
  • Development of AI techniques, such as mini-max and alpha-beta heuristics. Some may not consider these to be AI techniques, but when developed they were all about getting a computer to ‘think’ — play chess — and therefore were considered AI.
  • Knowledge learning, such as in trained evaluation functions
  • Representation of ‘infinite’ problem spaces
  • Search space exploration
  • Efficient and fast algorithms
  • Writing reliable software: it is hard to know where a chess program might be going wrong, so most chess programs have extensive self-testing

But is non-learning chess AI? It certainly used to be regarded as such, but nowadays, with large language models in particular and machine learning in general being so fashionable, it would probably not (generally) be regarded as AI. Leaving aside the observation that machine learning is only a sub-set of AI — non-learning AI exists — perhaps we could award traditional computer chess the category of pioneering AI.

Future history

So, is computer chess ‘over’ as an AI endeavour, as a vehicle for learning more about AI? Hopefully not, as some computer chess challenges remain:

  • Unbeatable chess: the best chess programs are very good… but are they unbeatable?
  • Perfect chess: beyond unbeatable is optimum, winning in the minimum number of moves, or postponing defeat to the maximum
  • Solving chess, as checkers has been by Chinook — is it a forced win or draw?
  • Chess which follows a particular player’s style, or Turing Chess which plays chess in a way indistinguishable from a human, complete with mistakes
  • Teaching chess: a machine which plays chess in a way designed to help the player learn good chess, and adapting to the pupil’s play

Does any of this render human chess pointless? Clearly not: we can’t outrun horses or even bicycles, but athletics is alive and well. There is strong evidence that playing chess helps keep the brain ‘fit’ and fend off mental decline.

Every serious programmer should write a chess engine, compiler, mini operating system, or program of similar complexity once in their career. They all demand project management and documentation skills, advanced algorithms, insightful heuristics, complex data structures, accurate programming, and techniques that the master programmer will want to, well, master.

And finally, could we improve chess, making it human-winnable again? Is it possible to design games which people can win, but digital computers cannot? If we could, it would show that the human brain is not a symbol-processor in the sense that a digital computer is, and imply that a digital computer AI can never share all the attributes of the human mind.

About the author

Andrew Lea is a Fellow of the BCS and a member of its AI specialist interest group steering committee. He studied Natural Sciences at Cambridge and Computing at London, and has worked in AI for the last four decades. His current AI projects include applying AI to project management (with projectscience.co.uk), and symbolic regression.

Download Andrew’s chess engine written in C