Rules of the Game

The Context

Some people are not as impressed as they should be by the achievement of #AlphaZero in mastering three independent games from scratch to world-champion standard, given only the rules, in a matter of hours. (For details, see the paper DeepMind published on 5 December 2017.) I have heard it said that computers have always been better than human beings at some things, not least calculating, crunching data, and compiling statistics. But this is different: it is not just about doing things faster; it is about learning to play games that have been around for thousands of years while ignoring everything that human beings have ever said about them, thought about them, or suggested might be a good way to play them.

#AlphaZero plays Go differently from #AlphaGoZero; it plays Chess differently too, making moves that few humans would even consider – to say nothing of the fact that it reinvented the entire treasure-house of standard opening theory from scratch.

This massive achievement, coming on the heels of earlier successes in simple Atari games and Backgammon, will almost certainly soon be followed by mastery of Blizzard’s StarCraft II, a game claimed to be far more difficult than Go, Chess or Shogi.

The Challenge

This blog is not about any of this; it is about something altogether different: the question of how #AlphaZero might become a master of human activities that are not obviously games, but can probably be conceived and modelled as such.

It would be comparatively easy to determine the rules of Chess and Go by watching a few games; it is clearly far harder to determine the rules of economics, of the world’s stock markets and currency exchanges, or of the weather, traffic systems, and countless other things humans engage with and are affected by on a daily basis – systems that are at least notionally bounded, that is to say contained within a definable range of activities or events.

Nobody knows the rules of any of these “games”. That is not to say that there are none, or that there have not been substantial and persistent attempts to ascertain what they are: economic and political theories try to state the rules, and the various theories of economics that have been articulated, if not precisely codified (Marxism, Keynesianism, Monetarism, and so on), try to model economic activity within the scope of a set of principles. An economic theory that successfully abstracted a full set of rules allowing us to plot and model economic activity would “clean up”; it is just that nobody has ever managed to articulate one.


Mathematicians have been doing something similar for decades in the part of the discipline called axiomatics, and the process they use is called axiomatisation: taking a mathematical system (such as the natural numbers {1, 2, 3, …}) and ascertaining a set of axioms, or fundamental rules, that govern the behaviour of that system. Such sets of axioms are not unique, but once we choose one we can proceed to regenerate the mathematical system, prove theorems in the system that predict its properties, and so forth. There are complications, not least questions of completeness and consistency that have been discussed profoundly by the likes of Kurt Gödel; the details are not relevant here, but may eventually become so if the axiomatisation of other disciplines can be achieved.
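To make the idea of axiomatisation concrete, here is a minimal sketch, in Python, of how two Peano-style axioms for addition are enough to regenerate ordinary arithmetic over the natural numbers. The encoding and the names are illustrative choices for this example, not drawn from any particular formal system:

```python
# Natural numbers built from two constructors: Zero and Succ(essor).
from dataclasses import dataclass

class Nat:
    pass

@dataclass
class Zero(Nat):
    pass

@dataclass
class Succ(Nat):
    pred: Nat

def add(a: Nat, b: Nat) -> Nat:
    # Axiom 1: a + 0 = a
    if isinstance(b, Zero):
        return a
    # Axiom 2: a + S(b) = S(a + b)
    return Succ(add(a, b.pred))

def to_int(n: Nat) -> int:
    # Convert a Peano numeral back to an ordinary int for display.
    return 0 if isinstance(n, Zero) else 1 + to_int(n.pred)

two = Succ(Succ(Zero()))
three = Succ(two)
print(to_int(add(two, three)))  # 2 + 3 computed purely from the two axioms
```

Everything the system “knows” about addition follows from those two rules alone; that is the sense in which a small axiom set regenerates a whole system.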

Suppose, now, that some descendant of #AlphaZero – let’s call it #AlphaZeroPlus for convenience – becomes adept at abstracting the rules of a game simply from observing instances of the game being played. We might then show it, for example, the operations of a stock market or a traffic system and ask it to generate the set of rules governing the way the game is “played”. If it could do that, then given the achievements of #AlphaZero we would expect it to master the game – stock-market investment or traffic management in our examples – relatively quickly.

Were #AlphaZeroPlus able to determine the rules of economics itself, either in some relatively isolated part or perhaps, if we are optimistic, of the whole discipline, then again we would expect it to learn to play the game fairly quickly, depending on how complex the rules proved to be, and to make moves using the available instruments – the things central authorities such as the Bank of England and the Federal Reserve can change, like interest rates or the money supply – that would achieve some putative desirable outcome.

That outcome is not, of course, determined by the rules of the game, any more than the winning conditions of Chess or Go are determined by the rules governing legal moves in those games; one could play Chess using exactly the same moves with the capture of the opponent’s queen, rather than checkmate, as the criterion of victory. To some extent the idea of varying a game while keeping its moves intact has already been implemented in “Chess960”, otherwise known as Fischer Random Chess, where the starting positions of the pieces on the back row of the board are randomised subject to certain constraints.
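As a toy illustration of what rule-abstraction might look like in miniature, the following sketch proposes candidate rules for a noughts-and-crosses-like game and keeps only those that no observed move violates. Everything here – the game, the observations and the candidate rules – is invented for the example; a real system would generate and test vastly richer hypotheses:

```python
# An observation is (board, move): board is a tuple of 9 cells
# (' ', 'X' or 'O'), move is the index of the cell played.
observations = [
    ((' ',) * 9, 4),
    ((' ', ' ', ' ', ' ', 'X', ' ', ' ', ' ', ' '), 0),
    (('O', ' ', ' ', ' ', 'X', ' ', ' ', ' ', ' '), 8),
]

# Candidate rules an observer might hypothesise about legal moves.
candidates = {
    "cell must be empty":    lambda board, move: board[move] == ' ',
    "cell must be a corner": lambda board, move: move in (0, 2, 6, 8),
    "cell must be centre":   lambda board, move: move == 4,
}

# Keep every candidate rule that no observation violates.
inferred = {name for name, rule in candidates.items()
            if all(rule(board, move) for board, move in observations)}

print(inferred)  # only "cell must be empty" survives these observations
```

The surviving rule set is exactly what we mean by an abstracted rule: a constraint consistent with all play seen so far, held provisionally until some future game refutes it.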

So one next step for AI could be to apply itself to the considerable task of determining the rules governing the behaviour of an arbitrary system of human activity. Were it able to do that, then depending on how wide a range of human activity proved susceptible to such axiomatisation – and my suspicion is that more would prove susceptible than we might at first imagine, or wish to imagine – the kinds of achievements #AlphaZero has demonstrated would prove extraordinarily, even world-changingly, powerful.

The team at #DeepMind have already been applying their technology to the analysis of medical data, hoping to discern in it clues that will help diagnostic medicine. If they, or some team with similar skills, were able to crack axiomatisation, there is no reason why the same technology could not be applied to environmental issues such as climate change, where knowing the rules would let us alter our behaviour more efficiently to achieve a desired outcome, or to politics, where understanding the rules governing human behaviour might permit us to achieve desired political outcomes such as resolving deadlocks. And yes, of course there are dangers: knowing the rules, and being able to play the game to achieve any desired outcome, would give those controlling such power enormous influence over the trajectory of the world. But the fact that progress can be abused is not a new discovery, and that AI can be abused should surprise nobody.

Partial Axiomatisations

There is a further stage in this evolution of AI power that presents even more intriguing prospects. Suppose, as seems likely, that there are human activities and systems of such complexity that they resist axiomatisation, whether because they cannot be axiomatised in principle or because doing so requires levels of skill beyond even our fabled #AlphaZeroPlus. We might still find ourselves in possession of partial axiomatisations of such systems that could explain some aspects of their evolution but not all. Such sparse axiom sets could still be used to generate probabilistic models, rather like weather maps, showing the likelihoods of different scenarios emerging from current situations.
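A minimal sketch of how a partial axiomatisation might be used probabilistically: one rule of an invented economic toy model is assumed known, everything left unaxiomatised is treated as noise we can only sample, and Monte Carlo simulation yields scenario likelihoods. All of the numbers and the rule itself are illustrative assumptions, not real economics:

```python
import random

def simulate(price_rise: float, rng: random.Random) -> float:
    """Return the simulated % change in demand after a price rise."""
    # Known (partial) rule: demand falls roughly in proportion to price.
    elasticity = -1.5                       # assumed-known "axiom"
    # Unknown residual: everything we failed to axiomatise, as noise.
    noise = rng.gauss(0.0, 2.0)
    return elasticity * price_rise + noise

rng = random.Random(42)                     # fixed seed for reproducibility
runs = [simulate(price_rise=4.0, rng=rng) for _ in range(10_000)]
p_fall = sum(r < 0 for r in runs) / len(runs)

print(f"Probability demand falls: {p_fall:.2f}")
```

The output is not a prediction but a likelihood map over scenarios – exactly the weather-map style of forecast a sparse axiom set can still support.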

Super-Sensitive Systems

Of course, as has been known for some decades now, many supposedly predictable systems are super-sensitive to their initial conditions – so-called “chaotic” systems – so even with an axiomatisation in hand we, in collaboration with our superintelligent AIs, might still find ourselves unable to determine the initial conditions accurately enough to make reliable predictions of a system’s evolution. For some systems, indeed, there is no “sufficient” degree of accuracy: any change in the initial conditions, even in the millionth or billionth decimal place, will eventually send the system off on divergent trajectories. But while we should be aware of such intractable cases, we should still be able to make some kinds of predictions of how systems will evolve in most cases or, where they are markedly unstable, to identify them as such.
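The standard illustration of such super-sensitivity is the logistic map, which is chaotic at r = 4. The sketch below follows two trajectories whose starting points differ only in the tenth decimal place; they track each other for a while and then diverge completely:

```python
def logistic_trajectory(x0: float, steps: int, r: float = 4.0):
    """Iterate the logistic map x -> r * x * (1 - x) from x0."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs

a = logistic_trajectory(0.2, 50)
b = logistic_trajectory(0.2 + 1e-10, 50)  # perturbed in the 10th decimal

# Early on the trajectories are indistinguishable; by the end the
# largest gap between them is of order 1, i.e. total decorrelation.
print(max(abs(x - y) for x, y in zip(a, b)))
```

Knowing the rule perfectly (here we do: it is one line of code) is no help, because no physically attainable measurement of the initial condition pins down the long-run trajectory.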


Education

Much of education consists of interactions between teachers and learners. Many of those interactions have the form of moves in a game: a student does this, a teacher does that; students ask questions, teachers answer them or refer students to resources that answer them; students make mistakes, teachers correct them; and so forth. It seems perfectly plausible that one day the kinds of game-oriented AI technology we have been discussing could formulate rules that shape the way learning happens (Richard Feynman, for example, tried some decades ago to do just that). If that gives rise to Educational Artificial Intelligence Engines (EAIEs), as has been predicted in this blog before, then the kinds of abstractive, axiomatic processes described here will be crucial in ascertaining how those educational systems and processes work. Every student could then be allocated a personalised EAIE that would track, and provide input and feedback on, all his or her activities, questions and projects.

The Game of Life

And of course the “Holy Grail” of such abstraction and axiomatisation would be to build an #AlphaOmega that could determine the rules governing the Game of Life itself (and we don’t mean the John Conway version). But since #AlphaOmega would be part of the very game it was attempting to model, that would take us into quite another realm of self-referential computational complexity.

