University of Manchester School of Computer Science Third Year Project Report
Tron AI
Adam Gill, BSc Computer Science
Supervisor: Dr. Jonathan Shapiro
Page 1 of 43
Abstract

A number of artificial intelligence bots exist to play the game of Light Cycles from the science fiction universe of the Tron films, most notably those entered in the Google AI Challenge in 2010. This project implements a game engine along with a graphical interface for users to interact with. An AI is developed to work with the game engine and act as an opponent for users to play against. Furthermore, interactivity is added so that the Google challenge bots can be used as players within the system. This report describes how this was achieved and evaluates the performance of the project AI against the existing bots.
Contents:
1. Introduction
   1.1 The Game of Light Cycles
   1.2 Distinguishing Light Cycles from other two player games
      1.2.1 The need for low computation time
      1.2.2 Endgame
      1.2.3 Variable arena size
   1.3 Introduction to the Google AI Challenge
   1.4 Project milestones
   1.5 What this report will contain
   1.6 Defining some terminology
2. Background
   2.1 What is a game?
   2.2 Simultaneous move games
   2.3 Zero sum games
   2.4 Game trees
   2.5 Game solving with minimax
   2.6 Alpha-Beta pruning
   2.7 Evaluation functions and heuristics
   2.8 Space-filling
      2.8.1 Checkerboard
      2.8.2 Articulation points
3. Design
   3.1 Basic design of game
   3.2 Tree node design: memory vs computation time
      3.2.1 Approach 1
      3.2.2 Approach 2
   3.3 Handling simultaneous moves
   3.4 Voronoi heuristic
   3.5 Endgame detection
   3.6 Exhaustive space filling
   3.7 Arena generation
   3.8 Graphically representing arena
   3.9 User interface
   3.10 Interactivity with Google Challenge Bots
   3.11 Designing levels of difficulty for human opponents
   3.12 Designing title screen
   3.13 Designing evaluation software
4. Results
   4.1 Solving the game of Light Cycles
   4.2 Alpha-beta pruning's improvement over basic minimax
   4.3 Selecting difficulty levels for human opponents
   4.4 Interacting with final software
5. Evaluation
   5.1 Selecting arenas for evaluation
   5.2 Easy, medium and hard difficulties
   5.3 Evaluating against Google bots - Deciding level of AI
   5.4 Selecting opponent bots
   5.5 Evaluation results
   5.6 Evaluation against ExperimentGarden
   5.7 Evaluation against luv2run
   5.8 Evaluation against dmj
6. Conclusion
7. References
List of figures
Figure 2.1: A game tree of the first move of 4x4 Light Cycles
Figure 2.2: Game tree with values for leaf nodes
Figure 2.3: The tree after the first max decision
Figure 2.4: The tree after the first min decision
Figure 2.5: The final tree with value for root node
Figure 2.6: Minimax on a simple Light Cycles tree
Figure 2.7: Alpha-Beta pruned tree
Figure 2.8: Checkerboard space filling
Figure 2.9: Checkerboard space filling
Figure 2.10: How an articulation point cuts off parts of the arena
Figure 3.1: Example board position for a node
Figure 3.2: Storing player moves in game tree
Figure 3.3: Generating game tree without simultaneous moves
Figure 3.4: Generating game tree with simultaneous moves
Figure 3.5: An example arena split by a Voronoi evaluation
Figure 3.6: Light Cycle design
Figure 3.7: Shneiderman's Golden Rules
Figure 3.8: Palette for Tron colour scheme
Figure 3.9: User interface design
Figure 4.1: Arena used to test pruning
Figure 4.2: Screenshot of the starting screen for the Light Cycles game
Figure 4.3: Screenshot of game paused in player vs player mode
Figure 4.4: Screenshot of the options for AI Light Cycles
Figure 4.5: Screenshot of the ‘Bot’ menu with arena walls
Figure 5.1: Results from AI with endgame vs AI without separated by size
Figure 5.2: Results from AI with endgame vs AI without separated by frequency of walls
Figure 5.3: Match against ExperimentGarden on Arena 21
Figure 5.4: Match against ExperimentGarden on Arena 29
Figure 5.5: Results from AI without endgame vs luv2run separated by size
Figure 5.6: Results from AI with endgame vs luv2run separated by size
Figure 5.7: Results from AI without endgame vs luv2run separated by frequency of walls
Figure 5.8: Results from AI with endgame vs luv2run separated by frequency of walls
Figure 5.9: Results from AI with endgame vs dmj separated by size of arena
Figure 5.10: Results from AI without endgame vs dmj separated by size of arena
Figure 5.11: Results from AI without endgame vs dmj separated by frequency of walls
Figure 5.12: Results from AI with endgame vs dmj separated by frequency of walls
List of tables
Table 4.1: How increasing arena size exponentially increases processing time
Table 4.2: Comparing minimax to minimax with alpha-beta
Table 4.3: Maximum time taken for a single move over 150 games at different depths of a minimax game tree
Table 5.1: Easy vs Medium difficulty
Table 5.2: Medium vs Hard difficulty
Table 5.3: Without endgame
Table 5.4: Without endgame
1. Introduction

1.1 The Game of Light Cycles

Light Cycles is a game which appeared in the 1982 science fiction film Tron. Players control Light Cycles, futuristic vehicles resembling motorbikes that produce walls of coloured light. The Light Cycles are constantly moving inside an arena, with the aim of the game being to survive longer than your opponent. Players may choose to change direction at each tick of the game, with the Light Cycle continuing in the same direction if no choice is made. A player is out of the game when they hit a wall of the arena or a wall generated by a Light Cycle. The aim of the project was to create a playable version of the game, along with an AI for human players to play against.

1.2 Distinguishing Light Cycles from other two player games

1.2.1 The need for low computation time

In other two player games that have been studied in the field of AI, such as Chess, there is usually a longer time available to make a move. In contrast, in the game of Light Cycles an AI or a player has very little time to decide on the next move. In addition, it would be expected that for the game to be playable by human players, it would offer different speeds to play at. These characteristics mean that a high importance must be placed on low computation time when calculating the next move.

1.2.2 Endgame

Light Cycles also has an interesting property: if the players are separated from each other, then when computing the next move you no longer need to take into account the possible moves your opponent can make. The game changes into one in which both players aim to fill their space as efficiently as possible in order to survive longer than the opponent. This raises a further interesting issue, which is that an early game move, made in anticipation that the players will become separated, should be one that leaves the player with an area better suited to space filling.
This will be discussed in the “Space-filling” section of the report.

1.2.3 Variable arena size

In comparison to Chess, which has a set starting position for every game, Light Cycles can be played on arenas of different shapes and sizes. In Chess, many AIs use a move table to select a response to a move made by the opponent, meaning that early moves can be played quickly with little computation. This is not possible in Light Cycles, unless a set arena is used every time.
1.3 Introduction to the Google AI Challenge

The Light Cycles game has been studied in detail through a Google AI Challenge in 2010 [1]. The aim of the challenge was to create a bot which played the game against other entrants. The contest ran for one month with over 700 entrants. A number of bots from the challenge are still available online, and will be used in this project to evaluate the performance of the AI. In addition, the forums for the competition and the write-ups of the participants provide valuable information on the techniques for developing a high performing bot.

1.4 Project milestones

There were three main milestones for the development of the project. The first was to create a game engine for the Light Cycles game. The second was to implement an AI for the game. The final milestone was to add a GUI to make the game playable by humans.

1.5 What this report will contain

This report looks at the background to minimax and alpha-beta pruning in relation to game trees. The Tron Google AI Challenge will be used to give insight into past work done in the area. How this project was designed, and the results of that design, will then be discussed. Finally, the implemented AI will be evaluated against bots available from the Google challenge.

1.6 Defining some terminology

Arena: The grid through which the players navigate their Light Cycles will be referred to as the arena.
AI: The artificial intelligence developed in this project will be referred to as the AI.
Bot: The participants of the Google AI Challenge will be referred to as bots. Although the bots use an AI, they will be called bots to avoid confusion with the AI developed in this project.
2. Background

2.1 What is a game?
In the scope of game theory, a game can be thought of as something that satisfies three requirements, taken from the COMP34120 notes. Firstly, “A number of agents interact with each other. These are known as the players”. Secondly, “At the end of their interactions there is an outcome that can be described numerically - each player receives a pay-off which may be negative. We assume that the pay-offs are given as real numbers”. Thirdly, “there are clearly defined rules as to what actions each agent can take at which point in time and once all such decisions have been made the outcome is uniquely determined” [2]. The game of Light Cycles can be shown to satisfy these requirements. The agents in the game each take control of a Light Cycle and interact with each other in the arena. The numerical payoff for each player can be seen by representing a win, loss, or draw numerically. The rules for the actions an agent can make are that they can move in up to three directions at a point in time depending on their surroundings, and the outcome is either a victory for one player, or a draw.

2.2 Simultaneous move games

A simultaneous move game is one in which the players have to make their moves at the same time, with no knowledge of what the other player is choosing to do. An example is the Prisoner's Dilemma, where the players have to decide whether or not to talk to the police. Both players talking results in a negative payoff for each player, while both staying silent results in a higher, but still negative, payoff. If one player talks and the other does not, then the talking player has a higher payoff than the silent one. When making the choice, the players do not know what the other will pick. Light Cycles is a simultaneous move game as the players both make their moves at the same time.

2.3 Zero sum games

A game is zero sum when one player's gain or loss is balanced by the other player's.
Light Cycles is zero sum as when one player wins, the other player has lost, and vice versa. In addition, a draw for one player is a draw for the other.

2.4 Game trees

Game trees are based on the requirement of games that they have decision points, where we know what options are available to a player at that time. A node of a tree is where one of these decisions is made, originating from the root node of the tree.
In simultaneous move games, the second player does not know what move the first picked when deciding theirs. This imperfect information is shown on a game tree by bubbling nodes that are in the same information set. A game tree for the first move of a simple 4x4 game of Light Cycles is shown in figure 2.1.
Figure 2.1: A game tree of the first move of 4x4 Light Cycles

Visualising simultaneous move games in this way lets us say that one player moves first and the other has imperfect information for their move which follows.

2.5 Game solving with minimax

Minimax is a way of deciding which move to make by assuming the opponent will always pick the move at their decision points which has the minimum score for us (and the maximum score for themselves). Likewise, we pick the move at our decision points which has the maximum value for us (and the minimum for them). At decision points belonging to us, we call the player max, as in we pick the child node with the maximum value. At decision points belonging to the opponent, we call the player min and pick the child node with the minimum value. Only the leaf nodes of the tree start with a value. We then work up to the parents, assigning a value to each parent node based on whether the decision node is a min or max node. When we reach the root node, which is a max node where we need to make a decision, we pick the child with the maximum value, and that
move is the one which we make. For zero sum games, the value for one player is +X and the value for the other player is -X. It is customary to just record +X as the value for a node, as the value for the second player is simply the negative. A walkthrough of the minimax process is demonstrated in figures 2.2, 2.3, 2.4 and 2.5.
Figure 2.2: Game tree with values for leaf nodes
Figure 2.3: The tree after the first max decision
Figure 2.4: The tree after the first min decision
Figure 2.5: The final tree with value for root node

In the game of Light Cycles, decision points for our Light Cycle are called max, and decision points for the opponent are called min. The difference when applying this method to Light Cycles is that initially we only have three possible values for a node: a win, a loss, or a draw. Figure 2.6 shows minimax applied near the end of a game of 5x5 Light Cycles. A score of 1 indicates a win, -1 indicates a loss, and 0 indicates a draw.
Figure 2.6: Minimax on a simple Light Cycles tree

Minimax works from the bottom up, so for the decision point on the bottom left, the MIN player chooses -1, as it is lower than 0. Similarly, for the bottom right decision, they choose 0. The MAX player then picks the maximum out of -1 and 0, which is 0. This means that minimax would tell the blue player to move right at the initial arena position of the root node.

2.6 Alpha-Beta pruning

Alpha-beta pruning aims to reduce the number of nodes that minimax evaluates. This is done by pruning the game tree when we know that there is no point exploring the children of a node further. To know this, additional values of alpha and beta are added to a node. The alpha value is never greater than the actual score of the node. For leaf nodes it is the same as the score at the node, and for non-leaf nodes alpha is initialised to negative infinity. When performing minimax, at a max node alpha is set to the largest value of the children explored so far, and at a min node it is set to the alpha value of its parent. The beta value is never less than the actual score of the node. For leaf nodes it is set to the score of that node, and for non-leaf nodes it is initialised to infinity. During minimax, at a min node it is set to the current smallest of the scores of its children, and at a max node it is set to the beta value of its parent.
It is important to note that the game tree is created depth first, allowing branches of the game tree to be pruned when the alpha value of a node is greater than its beta value. This is shown in figure 2.7, using the same game tree that was used in the minimax description; the nodes are labelled alphabetically in the order they are visited, and the red branches of the tree are never explored. At node F, we are selecting the maximum value from the children, and have a first value of 8. So, F being a max node, we know that the minimum possible score for node F is 8. In addition, we know that at node B we will be selecting the lowest value, as it is a min node. Currently the lowest value seen at B is 5, so the value of B can only change if the value of F is below 5. It is these two facts, that the minimum score for node F is 8 while the value of its parent is already at most 5, that cause the pruning. For node H the opposite occurs: the minimum value of A is 5 and the maximum value of H is 2, therefore pruning occurs. Pruned nodes are never generated or visited. It is best to generate the node for the best move for each player first, as this leads to the most pruning.
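As an illustration of the procedure described above, the following Java fragment runs minimax with alpha-beta pruning over an explicitly built tree. This is a minimal sketch: the Node class and its fields are illustrative stand-ins, not part of the project's engine, and scores are from the MAX player's point of view.

```java
// Minimax with alpha-beta pruning over an explicit tree (illustrative sketch).
import java.util.List;

public class AlphaBeta {
    static class Node {
        int score;             // only meaningful for leaf nodes
        List<Node> children;   // null for leaf nodes
        Node(int score) { this.score = score; }
        Node(List<Node> children) { this.children = children; }
        boolean isLeaf() { return children == null || children.isEmpty(); }
    }

    // Returns the minimax value of node. alpha is the best score MAX can
    // already force, beta the best MIN can force; a branch is cut as soon
    // as alpha meets or exceeds beta.
    static int alphaBeta(Node node, int alpha, int beta, boolean maximising) {
        if (node.isLeaf()) return node.score;
        if (maximising) {
            int best = Integer.MIN_VALUE;
            for (Node child : node.children) {
                best = Math.max(best, alphaBeta(child, alpha, beta, false));
                alpha = Math.max(alpha, best);
                if (alpha >= beta) break;  // beta cut-off: MIN avoids this branch
            }
            return best;
        } else {
            int best = Integer.MAX_VALUE;
            for (Node child : node.children) {
                best = Math.min(best, alphaBeta(child, alpha, beta, true));
                beta = Math.min(beta, best);
                if (alpha >= beta) break;  // alpha cut-off
            }
            return best;
        }
    }

    public static void main(String[] args) {
        // Two MIN nodes over the leaves {5, 8} and {-1, 2}; MAX sits at the root.
        Node left  = new Node(List.of(new Node(5), new Node(8)));
        Node right = new Node(List.of(new Node(-1), new Node(2)));
        Node root  = new Node(List.of(left, right));
        System.out.println(alphaBeta(root, Integer.MIN_VALUE, Integer.MAX_VALUE, true));
    }
}
```

In the example, the right-hand MIN node is cut off after its first leaf (-1), because MAX is already guaranteed 5 from the left branch; the leaf with value 2 is never visited.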
Figure 2.7: Alpha-Beta pruned tree

2.7 Evaluation functions and heuristics

In many games it is not possible to reach all the leaf nodes of the game tree when applying minimax and alpha-beta pruning, therefore a way of estimating the value of a node in the game tree is needed. An estimation is obtained by running an evaluation function based on heuristics. This allows us to go to a certain depth of the game tree, use an evaluation function to value the leaf nodes, then apply minimax to get an estimation of the value of the root node.
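The depth-then-evaluate pattern can be sketched as follows. Position and the evaluate function here are hypothetical stand-ins for the project's arena representation and heuristic, not its actual API:

```java
// Depth-limited minimax sketch: when the depth budget runs out, a heuristic
// evaluation function stands in for the true value of the position.
import java.util.List;
import java.util.function.Function;

public class DepthLimited {
    interface Position {
        boolean isTerminal();
        int terminalScore();           // e.g. +1 win, -1 loss, 0 draw for MAX
        List<Position> successors();
    }

    static int minimax(Position p, int depth, boolean maximising,
                       Function<Position, Integer> evaluate) {
        if (p.isTerminal()) return p.terminalScore();
        if (depth == 0) return evaluate.apply(p);   // heuristic cut-off
        int best = maximising ? Integer.MIN_VALUE : Integer.MAX_VALUE;
        for (Position next : p.successors()) {
            int v = minimax(next, depth - 1, !maximising, evaluate);
            best = maximising ? Math.max(best, v) : Math.min(best, v);
        }
        return best;
    }
}
```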
Heuristics often have to strike a balance between speed and complexity: a complex evaluation function may give a better estimation of the arena, but a quick evaluation function may allow you to go deeper into the tree. The type of game normally dictates the balance. In a game where going a little deeper in the game tree could reveal a victory or loss, or where no heuristic exists that gives a good prediction of the value of a node, it is better to use a simple evaluation function and go deeper into the game tree. The game of Kalah, which has been studied in detail and even solved at smaller sizes [3], has these properties: going one level deeper in the Kalah game tree often yields a very different value, as the scores for each player can change substantially in one move. In a game where heuristics exist that give a good estimation of the value of the arena, or where it is unlikely that going deeper into the game tree will greatly change the valuation of a position, a more complex evaluation function is favoured. A complex evaluation function gives better performance in a Chess AI, as shown by Deep Blue, the famous chess playing AI that defeated Chess world champion Garry Kasparov in 1997; it used a complicated evaluation function, as described by Campbell et al. [4] For the game of Light Cycles, the winner of the Google AI Challenge wrote that “a deep minimax search using a flawed evaluation heuristic is self-deluded about what its opponent is actually going to do” [5], suggesting that Light Cycles falls into the second category of requiring a good evaluation function over going deeper in the tree.

2.8 Space-filling

In many games of Light Cycles the players become separated from each other, and a good AI will ignore the other player and try to calculate the best way of filling the space left. We will explore some background as to what makes a good space-filling bot, primarily based on the Google AI Challenge.
These were methods implemented by the top bots in the competition.

2.8.1 Checkerboard

The idea that a space being filled can be viewed as a checkerboard was first brought up on the forums of the AI Challenge [6] and expanded on by the winner of the competition, a1k0n [5]. The theory is to imagine that the space that needs to be filled has black and white squares like a checkerboard. You can only move from a black square to a white square and vice versa, therefore if there are more squares of one colour than the other, the extra squares are ‘wasted’. If you count
the number of black and white squares, there is an upper bound on the number of squares it is possible to fill. Figures 2.8 and 2.9 show examples of this, with an optimum path shown in orange.
Figure 2.8: Checkerboard space filling
Figure 2.9: Checkerboard space filling

In figure 2.8 there are 6 black and 6 white squares, with an optimum path filling all 12 of them. Figure 2.9 has 8 black and 5 white squares, with an optimum route of length 11. The consequence is that when deciding which area to move into, the ‘extra’ checkerboard squares should not be counted.

2.8.2 Articulation points

In many situations during the endgame of Light Cycles it is not possible to fill every available square, as moving into one square may cut off part of the space. Such squares are called articulation points [3]. Figure 2.10 shows an example of an articulation point: entering the square marked with an X cuts off the green squares from the pink squares.
Figure 2.10: How an articulation point cuts off parts of the arena

Similarly to the checkerboard concept, articulation points were taken into account when evaluating the arena in the higher performing bots. These bots also made early moves which led to favourable positions later in the game, with areas to fill having an even balance of white and black squares.

3. Design

3.1 Basic design of game

The language used for the project will be Java. This is to enable interactivity with the bots that competed in the Google challenge, which ran on a Java engine. A Java representation of the arena was used by these bots, and any conversion and communication methods are simplified by using the same language. The Light Cycles will start in two corners of the arena, with blue in the top left and orange in the bottom right. The decision has been made not to make the start positions movable, so as to avoid confusing human players interacting with the game. Furthermore, the two most used sets of keys for directional input on a keyboard, WASD and the arrow keys, are situated at the top left and bottom right of a traditional keyboard. Variety will be introduced into the game arena by having different sizes and the ability to add internal walls. The maximum size of the arena should be one that still allows the Light Cycles to be distinguished by a player. Internal walls will act as obstacles for players of the game, and give the additional benefit of being able to generate unique arenas. To ensure a fair arena, the walls will be mirrored, giving the arena symmetry of order two.

3.2 Tree node design: memory vs computation time

There is a trade-off between computation time and memory when generating nodes of the game tree in Light Cycles. This can be explained by showing two alternative approaches. Let us use the example of a node where the arena currently looks as it does in figure 3.1.
Figure 3.1: Example arena position for a node

3.2.1 Approach 1: Storing the moves made for that node and the parent node

This approach would involve storing the move for our cycle and the move for the opposing cycle, as well as a reference to the parent node. In order to calculate the moves which could follow this position and go to the next depth of the tree, we would first need to build a representation of the arena, and to do this we would need to visit the parent nodes recursively until we reach the root node. Only then will we know which moves would not result in a cycle crashing. Figure 3.2 shows the node and its parents for this position, and how travelling down from the parent constructs a view of the arena, from which we can work out the moves available for each player. The advantage is that with less memory used per node, a larger game tree can be kept within memory, and in theory we could go deeper into the game tree.
Figure 3.2: Storing player moves in game tree
3.2.2 Approach 2: Store a representation of the arena at each node

The alternative approach is to store a representation of the arena at each node. This uses more memory, but greatly reduces the time taken to generate a child node, as the representation of the arena is readily available. Given the characteristics of the Light Cycles game, this is the approach that will be chosen.

3.3 Handling simultaneous moves

For minimax with alpha-beta pruning, it is beneficial to have a game tree level for each player, the MAX player and the MIN player, as this simplifies the pruning of the tree. In addition, when generating the children for a node, a method which returns the available moves for a player should be used. The problem this causes is demonstrated in figure 3.3, where some of the children would not be generated by a method which calculates the moves available at the next level of the tree. A solution is shown in figure 3.4: the bubbling present in simultaneous moves is preserved by calculating the moves each player could make at the same time and passing them to the child nodes.
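A simplified sketch of these two design choices combined is given below: each node carries its own copy of the arena (Approach 2), and the legal moves for both players are computed from the parent position before either move is applied, so a child never sees a half-updated arena. For brevity the sketch pairs the two players' moves in a single expansion rather than alternating MAX and MIN levels; class and method names are illustrative, not the project's actual API.

```java
// Sketch: a game-tree node storing its own arena copy, generating children
// from simultaneous move pairs. (Head-on collisions are ignored for brevity.)
import java.util.ArrayList;
import java.util.List;

public class SimultaneousNode {
    static final int[][] DIRS = {{0, 1}, {0, -1}, {1, 0}, {-1, 0}};

    final boolean[][] walls;   // true = arena wall or light trail
    final int[] max, min;      // head positions {row, col} of the two cycles

    SimultaneousNode(boolean[][] walls, int[] max, int[] min) {
        this.walls = walls; this.max = max; this.min = min;
    }

    // Moves that do not immediately crash a cycle, computed from THIS node's
    // arena, i.e. before either player's move for this tick is applied.
    List<int[]> legalMoves(int[] head) {
        List<int[]> moves = new ArrayList<>();
        for (int[] d : DIRS) {
            int r = head[0] + d[0], c = head[1] + d[1];
            if (r >= 0 && r < walls.length && c >= 0 && c < walls[0].length
                    && !walls[r][c])
                moves.add(new int[]{r, c});
        }
        return moves;
    }

    // One child per PAIR of moves, preserving the simultaneous structure.
    List<SimultaneousNode> children() {
        List<SimultaneousNode> out = new ArrayList<>();
        for (int[] m : legalMoves(max))
            for (int[] n : legalMoves(min)) {
                boolean[][] copy = new boolean[walls.length][];
                for (int i = 0; i < walls.length; i++) copy[i] = walls[i].clone();
                copy[max[0]][max[1]] = true;   // old heads become trail
                copy[min[0]][min[1]] = true;
                out.add(new SimultaneousNode(copy, m, n));
            }
        return out;
    }
}
```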
Figure 3.3: Generating game tree without simultaneous moves
Figure 3.4: Generating game tree with simultaneous moves

You can see the difference this makes in the game tree: in the first situation, blue would make a move assuming a victory, when in reality that move would result in a draw. In addition, incorrect branches of the tree could be pruned.

3.4 Voronoi heuristic

It has been established that a heuristic is needed when the game tree generated by minimax reaches its maximum depth and there are further moves both players could make. The heuristic chosen for this project is based on Voronoi diagrams, which assign portions of an area to the closest of a set of points. Relating this to Light Cycles, we split the available cells left in the arena into three areas: those closest to our Light Cycle, those closest to the opponent's, and those which are the same distance from the heads of both Light Cycles. Distance in this context refers to the number of moves needed to reach a cell. Figure 3.5 shows an arena split in such a way. The light blue squares are reachable by the blue Light Cycle first, and similarly the light orange squares are reachable by the orange Light Cycle first. The white squares are those which the Light Cycles would reach at the same time.
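The Voronoi split described here can be sketched with a pair of breadth-first searches, one from each head; each cell is then credited to whichever player reaches it first. This is a sketch only, with illustrative names rather than the project's actual API:

```java
// Voronoi heuristic sketch: BFS distances from each Light Cycle head, then
// score = (#cells we reach first) - (#cells the opponent reaches first).
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Queue;

public class VoronoiHeuristic {
    static final int[][] DIRS = {{0, 1}, {0, -1}, {1, 0}, {-1, 0}};

    // Breadth-first distances from (r, c); walls and unreachable cells get -1.
    static int[][] distances(boolean[][] walls, int r, int c) {
        int[][] dist = new int[walls.length][walls[0].length];
        for (int[] row : dist) Arrays.fill(row, -1);
        Queue<int[]> queue = new ArrayDeque<>();
        dist[r][c] = 0;
        queue.add(new int[]{r, c});
        while (!queue.isEmpty()) {
            int[] cell = queue.remove();
            for (int[] d : DIRS) {
                int nr = cell[0] + d[0], nc = cell[1] + d[1];
                if (nr >= 0 && nr < walls.length && nc >= 0 && nc < walls[0].length
                        && !walls[nr][nc] && dist[nr][nc] == -1) {
                    dist[nr][nc] = dist[cell[0]][cell[1]] + 1;
                    queue.add(new int[]{nr, nc});
                }
            }
        }
        return dist;
    }

    static int score(boolean[][] walls, int[] us, int[] them) {
        int[][] ours = distances(walls, us[0], us[1]);
        int[][] theirs = distances(walls, them[0], them[1]);
        int score = 0;
        for (int r = 0; r < walls.length; r++)
            for (int c = 0; c < walls[0].length; c++) {
                if (ours[r][c] == -1 && theirs[r][c] == -1) continue; // wall/unreachable
                if (theirs[r][c] == -1
                        || (ours[r][c] != -1 && ours[r][c] < theirs[r][c])) score++;
                else if (ours[r][c] == -1 || theirs[r][c] < ours[r][c]) score--;
                // equidistant cells contribute nothing
            }
        return score;
    }
}
```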
Figure 3.5: An example arena split by a Voronoi evaluation

To calculate which squares can be reached first, the 2D boolean array representation of the arena can be converted into a 2D integer array of distances for each player. A breadth first search can be performed using a queue, starting with the cell belonging to the head of the player being calculated for. The two integer arrays can then be subtracted from each other, and the score for the heuristic is the number of positive valued cells minus the number of negative valued cells. By deducting the cells that the opponent can reach first from the cells we can reach first, we get a numerical value which gives a good estimation of which player a particular arena position favours, and which can be used as the score for a node in a minimax calculation.

3.5 Endgame detection

Endgame detection needs to determine whether the arena has become separated. To do this, it is possible to re-use the breadth first traversal of the arena from a Light Cycle's position, returning true if the opposing Light Cycle is encountered. This will be run at the start of our move, switching the AI to “endgame” mode when the Cycles can no longer reach each other and are in separate areas of the arena.

3.6 Endgame tactic: Exhaustive space filling
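As a sketch, the exhaustive depth-first fill might look as follows. Names are illustrative, and a real implementation would also stop after a set time budget:

```java
// Exhaustive endgame fill sketch: depth-first over every available move,
// returning the length of the longest route found.
public class SpaceFiller {
    static final int[][] DIRS = {{0, 1}, {0, -1}, {1, 0}, {-1, 0}};

    // Longest route starting at (r, c) through free cells. The cell is marked
    // occupied on the way down and restored on backtracking.
    static int longestFill(boolean[][] walls, int r, int c) {
        int best = 0;
        walls[r][c] = true;                    // occupy this cell
        for (int[] d : DIRS) {
            int nr = r + d[0], nc = c + d[1];
            if (nr >= 0 && nr < walls.length && nc >= 0 && nc < walls[0].length
                    && !walls[nr][nc])
                best = Math.max(best, longestFill(walls, nr, nc));
        }
        walls[r][c] = false;                   // backtrack
        return best + 1;
    }
}
```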
The endgame algorithm to be used for this project is relatively simple: recursively make every available move depth first, storing the current longest route. The algorithm ends after a set length of time, or when every possible combination of moves has been tried.

3.7 Arena generation

One of the interesting features of the Light Cycles game is the difference in performance of an AI on arenas with different layouts of walls. In order to later evaluate how well an AI performs at the game, a way of generating a variety of arenas is beneficial. The design for this functionality is to give each cell of an arena a 1/X chance of being a wall, with a higher X leading to a lower overall frequency of walls. In order for the players to be in a fair starting position, the arena has to be mirrored. This can be achieved by generating half of the arena, then copying the generated wall arrangement into the second half. In arenas of an odd size, the middle row/column can be left blank. This project will only use square arenas, to keep a consistent aesthetic for the user interface. This raises the issue that there is a chance for the arena to start separated, with both players unable to reach each other, essentially turning the match into a space filling game. To prevent this, a further check is needed to see whether the Light Cycles can reach each other, with the arena regenerated if they cannot.

3.8 Graphically representing arena

There needs to be a graphical representation of the arena for players to look at as they play the game. A simple way of implementing this would be to use a grid of coloured squares, with each Light Cycle represented by a solid block, followed by a trail of a similar colour. An alternative would be to have a pictorial representation of the Light Cycles and trails.
The advantage of using solid blocks of colour is that it would be easier to implement, as Java comes with APIs to fill areas with solid colour. On the other hand, using images to represent the Cycles would improve the look of the game, giving a higher quality finish. The design decision will be made to use images instead of block colour for this reason. Figure 3.6 shows the design for the two Light Cycles and trails. Solid colour blocks will still be used for the background of the arena and the walls.
Figure 3.6: Light Cycle design

3.9 User interface

A goal of this project is to develop a user interface for players to use when interacting with the game, therefore it is appropriate to look at what makes a good user interface. Shneiderman [7] established eight golden rules for interface design, as shown in figure 3.7.
Shneiderman's Golden Rules
1. Strive for consistency.
2. Enable frequent users to use shortcuts.
3. Offer informative feedback.
4. Design dialog to yield closure.
5. Offer simple error handling.
6. Permit easy reversal of actions.
7. Support internal locus of control.
8. Reduce short-term memory load.

Figure 3.7: Shneiderman's Golden Rules

One design for the user interface is to adopt a modular approach. The concept is that upon starting the game, the user is presented with a choice of the possible game modes. Upon selecting the desired mode, the required modules are loaded. Some modules would be present in all modes, such as a control area to start the game and change the speed. In addition, there would be modules for setting up an AI, and for displaying player information. A second design would be to have a single interface, with the options to change play modes only taking up part of the space. The second approach will be used, as the first design is overcomplicated, presenting the user with a mass of information to take in and making the system confusing to understand. The buttons for the user to interact with will
not be universally sized. Instead, those which are expected to be pressed more often will be given a larger size. In addition, areas of the interface which cannot be interacted with (such as messages for users and control instructions) will be distinguished from interactive components by different colours. The colour scheme used is representative of the Tron universe, based on the palette in figure 3.8.
Figure 3.8: Palette for Tron colour scheme [7]

A final issue is whether to make the software usable on PC screens of different resolutions. One option would be to have a recommended resolution, with set sizes for the interface components. The disadvantage is that users without the recommended resolution may not be able to fully interact with the software if part of the interface is cut off or rendered incorrectly. A second option would be to have a dynamically sized interface, so that all users, regardless of screen size, can use the software. This is the option that will be used in the project, as it would be unreasonable to force a user into one resolution to use the software. This second option is seen in figure 3.9. A is where the game arena will be; it will be interactive and allow the user of the software to place walls as obstacles. B is the game message, which will change depending on the last action performed by the user. C is where the game controls will be situated, such as start, stop and pause. D and E show where the user will change the input methods for both Light Cycles, with options of Player, AI and Bot. Finally, F is where the user may change options regarding the arena, such as the size and wall options.
Figure 3.9: User interface design 3.10 Interactivity with Google Challenge Bots In the Google challenge engine, communication between the engine and the bots occurred through input and output streams, and the design for this project is to communicate in a similar way. A problem is that the Google challenge used a different representation of the arena from the one planned for this project, so a way of translating arenas generated by this engine into the format the Google challenge engine used is needed. 3.11 Designing levels of difficulty for human opponents The software will include an option to play against the AI at various levels of difficulty. The design is to have three levels: easy, medium and hard. The game is designed to be played at variable speeds, so a requirement of these AI levels is to make a move within the time of the smallest game tick implemented. The higher levels of difficulty will be expected to perform better. This can be achieved simply by increasing the depth to which they search the game tree when deciding moves; work will have to be done to see how going deeper in the tree affects performance. In addition, it would be possible to use a different endgame strategy, with the lower levels of difficulty not switching to an endgame
strategy. An alternative would be to experiment with evaluation functions, using worse-performing evaluation functions at the lower levels of difficulty. 3.12 Designing title screen To give the game a polished feel, a title sequence will play when it is first started. The design is for both Light Cycles to spell out the name using their light trails. The name will need to be readable and appear in a reasonable time. A decision needs to be made between using a continuous trail for the Light Cycles, essentially writing the title in cursive, and implementing a way of turning off the trails, allowing the Cycles to reposition themselves. The benefit of the first option is that it would be easier to program, but it may produce a result that is difficult to read; the second option presents an additional programming challenge but would give a clearer result. 3.13 Designing evaluation software A way to run the Light Cycles game headlessly, without an interface, will be required so that multiple games can be run consecutively without user input when evaluating the performance of the AI. In addition, a way of generating multiple arenas to evaluate with is needed to give more confidence in the judgement of performance. The design is to have a separate program which generates a number of arenas based on arguments and stores them in a file to be used by a second, evaluation program. The evaluation program reads the arenas from the file and plays two opponents against each other on all of them, recording results in a log file. These opponents may be the project AI or bots from the Google challenge. Furthermore, a way of reading the logs and separating the results based on characteristics of the arenas will aid in determining where the AI performed well or badly.
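Where a Google challenge bot is used as one of these opponents, the evaluation program must drive it over the stream protocol described in section 3.10. The sketch below shows one plausible shape for such a wrapper; the map format (a width/height line followed by rows of '#', ' ', '1' and '2' characters) follows the challenge's convention as best understood, and the bot command and data-structure choices are illustrative assumptions, not the project's actual code.

```python
import subprocess

class BotProcess:
    """Wraps an external challenge bot, communicating over stdin/stdout
    streams (the technique described in section 3.10). The text map
    format used here is an assumption based on the challenge protocol."""

    def __init__(self, command):
        # e.g. command = ["java", "-jar", "bots/somebot.jar"]  (hypothetical path)
        self.proc = subprocess.Popen(
            command, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
        )

    def request_move(self, arena, me, them):
        """Translate this engine's arena (a 2D list of booleans, True = wall)
        into the challenge's text format, send it, and read the bot's move."""
        height, width = len(arena), len(arena[0])
        lines = [f"{width} {height}"]
        for y in range(height):
            row = []
            for x in range(width):
                if (x, y) == me:
                    row.append("1")
                elif (x, y) == them:
                    row.append("2")
                else:
                    row.append("#" if arena[y][x] else " ")
            lines.append("".join(row))
        self.proc.stdin.write("\n".join(lines) + "\n")
        self.proc.stdin.flush()
        # the bot replies with a single direction number per turn
        return int(self.proc.stdout.readline())
```

The `request_move` translation step is exactly the arena-format conversion that section 3.10 identifies as necessary.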
Finally, a way of viewing the logs and giving a visual representation of a match in a game viewer will aid in understanding individual matches and analysing potential mistakes the AI makes. 4. Results 4.1 Solving the game of Light Cycles Solving a game involves being able to correctly predict the outcome from a position if both players play perfectly. To play perfectly in Light Cycles is to play the move from which you can guarantee a victory, or at least a draw. To know this, the game tree has to be constructed to the end of the game, reaching leaf nodes of win, lose or draw. Evaluation functions can't be used as they only provide an estimate of the value, unlike the true leaf nodes of the tree, which give the
actual value. Table 4.1 shows how long it takes a minimax AI with alpha-beta pruning to reach the bottom of the game tree for Light Cycles on arenas of increasing size, along with the total number of nodes generated. When the arena was sized 9x9, the program was run for two hours and then stopped. The exponential effect of increasing the size is clear, demonstrating the need for evaluation functions to estimate the value of a node on large arenas.
Arena Size   Nodes Generated   Time taken (seconds)
4x4          460               0.004
5x5          8368              0.024
6x6          160387            0.092
7x7          6856663           3.358
8x8          577202409         31.5304
9x9          ?                 7200+
Table 4.1: How increasing arena size exponentially increases processing time 4.2 Alpha-beta pruning's improvement over basic minimax The implementation of alpha-beta pruning leads to improvements over a basic minimax implementation. For the arena shown in figure 4.1, table 4.2 shows the total number of nodes generated by a minimax implementation searching to a depth of 12, and the time it took, compared with an implementation that included alpha-beta pruning. With pruning, only around 63% as many nodes were generated and the search finished roughly a second faster, demonstrating the improvement that pruning gives.
Figure 4.1: Arena used to test pruning
AI used                            Nodes generated   Time taken (seconds)
Minimax                            362346            2.936
Minimax with alpha-beta pruning    229802            1.975
Table 4.2: Comparing minimax to minimax with alpha-beta
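The pruning being compared here can be illustrated with a minimal, game-agnostic sketch. Note that the project's search handles simultaneous moves (section 3.3); this sketch uses the standard alternating-move formulation for clarity, and all names are illustrative. The node counter lets the comparison in table 4.2 be reproduced in miniature.

```python
def alphabeta(node, depth, alpha, beta, maximizing, evaluate, children, counter):
    """Depth-limited minimax with alpha-beta pruning (sketch).
    `children(node)` returns successor positions, `evaluate(node)` scores
    a leaf, and `counter[0]` counts every node generated."""
    counter[0] += 1
    kids = children(node)
    if depth == 0 or not kids:
        return evaluate(node)
    if maximizing:
        value = float("-inf")
        for child in kids:
            value = max(value, alphabeta(child, depth - 1, alpha, beta,
                                         False, evaluate, children, counter))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # beta cut-off: the minimizer will never allow this branch
        return value
    else:
        value = float("inf")
        for child in kids:
            value = min(value, alphabeta(child, depth - 1, alpha, beta,
                                         True, evaluate, children, counter))
            beta = min(beta, value)
            if beta <= alpha:
                break  # alpha cut-off: the maximizer already has a better option
        return value
```

On the classic textbook tree `((3, 5), (2, 9))` the pruned search visits 6 nodes where plain minimax would visit 7, the same qualitative saving that table 4.2 shows at scale.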
4.3 Selecting difficulty levels for human opponents The design was for an AI with multiple levels of difficulty of increasing performance, each able to make moves in a short length of time so that the game can be played at multiple speeds. Table 4.3 shows the results from testing how changing the depth of the game tree affected computation time. 150 games in total were run between the different levels and the maximum time
to make a single move was recorded. The arenas used were the same ones used in the later evaluation sections. The maximum speed the game runs at is 5 moves a second, so each game tick lasts 200ms; if both players are a version of the AI, the maximum expected processing time for a move would therefore be 100ms, one tenth of a second. The table shows that the highest depth that can be processed within 100ms is 6, so a depth of 6 is used for the hard difficulty; depths of 2 and 4 are used for the easy and medium difficulties respectively. There were not enough human users of the software to gain a good understanding of how the levels of difficulty affected the win rate, so the levels of difficulty are subjective in terms of performance against humans. Evaluation was run to gauge whether the increased depth made much of a difference in performance.
Depth of game tree   Maximum time taken for a single move (ms)
2                    1
3                    6
4                    22
6                    85
7                    209
8                    860
9                    2114
Table 4.3: Maximum time taken for a single move over 150 games at different depths of a minimax game tree
4.4 Interacting with the final software
Figure 4.2: Screenshot of the starting screen for the Light Cycles game Figure 4.2 shows the screen the user is presented with when launching the game. There are three ways in which the user can proceed: the first is to start playing player vs player; secondly, they can change the input of both Light Cycles, with options of 'Player', 'AI' and 'Bot'; thirdly, they may change the speed of the game or customize the game arena. When the 'Customize Arena' button is pressed, the section expands to present the user with the options, as shown in the following screenshots. It should be noted that buttons that can't currently be pressed are greyed out, such as the pause button in this screenshot, and the options which would change the arena whilst the game is playing, as shown in figure 4.3. The start button is larger as it is likely to be the button interacted with most by the user. Figure 4.3 shows what the game looks like whilst two players are playing and the game is paused. The game message at the top shows the status of the game and gives instructions to the player when different parts of the interface are interacted with. It is possible to change speed during a game; however, changing the input methods is disabled. As shown by the stop and pause buttons, the display labels change depending on the state of the game. The options for changing the arena are to enable/disable walls, randomize walls, clear walls and change the size of the arena. The size can be changed from 6 to 100, as any larger would result in the Light Cycle images being compressed beyond recognition. In addition, clicking on the arena will toggle a wall in the cell that is interacted
with; note that to ensure symmetry, a wall is placed in the corresponding position on the opposite side of the arena.
Figure 4.3: Screenshot of game paused in player vs player mode
Figure 4.4: Screenshot of the options for AI Light Cycles Page 30 of 43
Figure 4.4 shows the options available for customizing an 'AI'. The user has the choice of an 'advanced' view, where they can select the depth to which the AI will search the game tree, or a simple view, where they just select a difficulty of easy, medium or hard. In addition, there is an option to toggle the endgame space filler for the AI.
Figure 4.5: Screenshot of the 'Bot' menu with arena walls Figure 4.5 shows what the arena looks like when populated with walls and with the Bot menu visible. A user can select from the numerous bots available in the 'bot' directory, and can add new ones by placing bots in that directory. The rotational symmetry of order 2 is evident in the generation of the arena walls. 5. Evaluation 5.1 Selecting arenas for evaluation The rules for the Google challenge stated that "The maps will typically range from 15x15 to 30x30" [12], so it is sensible to use this range when generating arenas for evaluation. Arenas with no internal walls will be used, along with arenas that have lower and higher frequencies of internal walls, so that performance can be evaluated with respect to initial obstacles.
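The order-2 rotational symmetry noted above (and designed in section 3.7) might be generated along the following lines; the function name and sampling scheme are illustrative assumptions rather than the project's actual generator.

```python
import random

def generate_arena(size, wall_frequency, seed=None):
    """Generate a size x size arena (True = wall) whose internal walls
    have order-2 rotational symmetry. Only the first half of the cells
    is sampled; each wall is mirrored to the rotationally opposite cell
    so that neither starting position is disadvantaged."""
    rng = random.Random(seed)
    arena = [[False] * size for _ in range(size)]
    for y in range(size):
        for x in range(size):
            # visit each cell/opposite-cell pair exactly once (even sizes)
            if y * size + x < (size * size) // 2 and rng.random() < wall_frequency:
                arena[y][x] = True
                arena[size - 1 - y][size - 1 - x] = True  # rotational twin
    return arena
```

With even sizes there is no unpaired centre cell, which is consistent with the report's reason for excluding odd-sized arenas from evaluation.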
A number of variants of each arena size and wall frequency were used, to give a wider range of arenas. Odd-sized arenas were not used because the arena generation method always leaves the middle square without a wall; due to its central position, most games would involve the players heading towards it, almost always resulting in a collision at the start of the game. In total 150 arenas were generated, ranging in size from 10 to 28, with the frequency of internal walls being 0, 1/6 or 1/8, and four variants of each. Each arena is used twice, with the starting positions of the players switched. 5.2 Easy, medium and hard difficulties Tables 5.1 and 5.2 show the results of the three levels of difficulty playing against each other.
Easy wins   Medium wins   Draws
18          218           64
Table 5.1: Easy vs Medium difficulty
Medium wins   Hard wins   Draws
54            124         122
Table 5.2: Medium vs Hard difficulty This shows a clear increase in performance as the difficulty level rises. 5.3 Evaluating against Google bots - deciding the level of AI The AI selected for evaluation against the bots of the Google challenge should be one that does not break the rules of the challenge, specifically not exceeding the processing time of one second for an individual move. Table 4.3 shows the results of tests run to determine the time taken to make a move at different depths. A depth of 8 is selected for the evaluations, as it keeps below the one-second processing limit. Evaluation will be run with an AI that uses an endgame space-filling strategy as well as one that doesn't, to see the difference in performance that adding an endgame achieves.
The graph in figure 5.1 shows the performance of these two AIs against each other. Although the AI with an endgame won more games, it is interesting that performance dropped on the larger arenas, suggesting that increasing the arena size beyond that of the evaluation arenas would result in the endgame AI ultimately losing more than the other.
Figure 5.1: Results from AI with endgame vs AI without, separated by size The graph in figure 5.2 splits the games based on the frequency of walls; it is interesting to note that every game on an arena free of internal walls ended in a draw.
Figure 5.2: Results from AI with endgame vs AI without, separated by frequency of walls 5.4 Selecting opponent bots The source code for a number of bots is still available online. To evaluate how the AI performs against these bots, we select the lowest-scoring bot available and move up the table whilst our AI still performs well, so that an accurate final position can be concluded. The bots selected were ExperimentGarden [9], luv2run [10] and dmj [11].
5.5 Evaluation results
Opponent (position)      AI wins   Opponent wins   Draws
ExperimentGarden (196)   290       4               6
luv2run (78)             133       15              152
dmj (37)                 24        99              177
Table 5.3: Without endgame
Opponent (position)      AI wins   Opponent wins   Draws
ExperimentGarden (196)   280       6               10
luv2run (78)             124       26              150
dmj (37)                 26        98              176
Table 5.4: With endgame 5.6 Evaluation against ExperimentGarden ExperimentGarden is a bot written in Java that placed in position 196. Our AI wins the vast majority of games; however, it is interesting to see what happened in the two games that were lost without an endgame. In both of these arenas, shown in figures 5.3 and 5.4, ExperimentGarden was the blue Light Cycle. In figure 5.3 it appears that an articulation point (marked with a yellow cross) caused the loss: the Voronoi heuristic would have miscalculated the number of squares that our AI could reach. In figure 5.4 it appears that the space could have been filled better by our orange Light Cycle, demonstrating the improvement that adding an endgame gives the AI.
Figure 5.3: Match against ExperimentGarden on Arena 21
Figure 5.4: Match against ExperimentGarden on Arena 29
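The miscalculation described above stems from the Voronoi heuristic (section 3.4) counting squares each player can reach first, without accounting for articulation points. A minimal sketch of such a heuristic, with illustrative names and a simplified arena representation (True = wall), makes the limitation concrete:

```python
from collections import deque

def voronoi_score(arena, me, them):
    """Voronoi heuristic sketch: BFS from both Light Cycles, then
    score = (cells we reach first) - (cells the opponent reaches first).
    Ties are contested and counted for neither player. Because it only
    counts reachable squares, it overestimates territory hanging off an
    articulation point -- the failure mode seen in figure 5.3."""
    def distances(start):
        dist = {start: 0}
        queue = deque([start])
        while queue:
            x, y = queue.popleft()
            for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if (0 <= ny < len(arena) and 0 <= nx < len(arena[0])
                        and not arena[ny][nx] and (nx, ny) not in dist):
                    dist[(nx, ny)] = dist[(x, y)] + 1
                    queue.append((nx, ny))
        return dist

    mine, theirs = distances(me), distances(them)
    score = 0
    for cell in set(mine) | set(theirs):
        d_me = mine.get(cell, float("inf"))
        d_them = theirs.get(cell, float("inf"))
        if d_me < d_them:
            score += 1
        elif d_them < d_me:
            score -= 1
    return score
```

An endgame space filler, or an articulation-point check (section 2.8.2), corrects for the squares this count claims but cannot actually be filled.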
5.7 Evaluation against luv2run The bot luv2run was written in Lua and came 78th in the challenge. The graphs in figures 5.5 to 5.8 show the matches split by size and by frequency of walls.
Figure 5.5: Results from AI without endgame vs luv2run separated by size
Figure 5.6: Results from AI with endgame vs luv2run separated by size
Figure 5.7: Results from AI without endgame vs luv2run separated by frequency of walls
Figure 5.8: Results from AI with endgame vs luv2run separated by frequency of walls
5.8 Evaluation against dmj The bot dmj came 37th. The graphs in figures 5.9 and 5.10 show the matches split by size for an AI with an endgame and without. Figures 5.11 and 5.12 show the matches split by frequency of walls.
Figure 5.9: Results from AI with endgame vs dmj separated by size of arena
Figure 5.10: Results from AI without endgame vs dmj separated by size of arena
Figure 5.11: Results from AI without endgame vs dmj separated by frequency of walls
Figure 5.12: Results from AI with endgame vs dmj separated by frequency of walls It is clear that the dmj bot performs better than the AI developed in this project.
6. Conclusion The evaluation shows that the endgame method used does increase performance on smaller maps, but causes a drop in performance on larger maps. This is most likely due to the exponential nature of the exhaustive search used by the route-finding method; a better-performing AI would only use this method when the arena is small. The addition of walls led to more interesting results, with the matches that used walls mostly ending in draws; draws between players dropped significantly as the size of the arenas increased. Further evaluation, with more matches and more arena variants, would have given more conclusive results; however, each match takes a considerable length of time to run and it was not possible to run more due to time constraints. Against the bots that were evaluated, our AI's performance would place it between positions 37 and 78, inside the top 10% of bots that entered the Google challenge.
7. References
[1] Google AI Challenge. http://tron.aichallenge.org/ (referenced 30/4/14)
[2] A. Schalk, COMP34120 Lecture Notes, 2013. http://www.cs.man.ac.uk/~schalk/34120/index.html
[3] Irving, Geoffrey, H. H. L. M. Donkers, and J. W. H. M. Uiterwijk. "Solving Kalah." ICGA Journal 23.3 (2000): 139-148.
[4] Campbell, Murray, A. Joseph Hoane Jr, and Feng-hsiung Hsu. "Deep Blue." Artificial Intelligence 134.1 (2002): 57-83.
[5] Google AI Challenge post-mortem. http://www.a1k0n.net/2010/03/04/google-ai-postmortem.html (referenced 30/4/14)
[6] Google AI Challenge forums. http://forums.aichallenge.org/viewtopic.php?f=8&t=319&start=10#p1568 (referenced 30/4/14)
[7] Ben Shneiderman. Designing the User Interface. Prentice Hall, 1987.
[8] Tron palette. http://www.colourlovers.com/palette/1406402/Tron_Legacy_2 (referenced 30/4/14)
[9] ExperimentGarden write-up. http://www.experimentgarden.com/2010/02/google-ai-challenge-final-java-tron-bot.html (referenced 30/4/14)
[10] luv2run source code. https://gist.github.com/anonymous/317677 (referenced 30/4/14)
[11] dmj source code. https://bitbucket.org/dmj111/tronbot/ (referenced 30/4/14)
[12] AI Challenge contest information. http://tron.aichallenge.org/contest_information.php (referenced 30/4/14)