In this lab you will be writing agents that use depth-bounded Minimax search with Alpha-Beta pruning to play Nim, Mancala and Breakthrough.
In Nim, players take turns grabbing sticks from a pile, up to three at a time; the objective is to be the one who takes the last stick in the pile.
In Mancala, players take turns grabbing all of the stones from one house on their side of the board and sowing them counterclockwise. The objective is to end the game with the most pieces in one's scoring house.
In Breakthrough, players take turns advancing pawn-like pieces on a rectangular board. The objective is to get a single piece to the opponent's end of the board. The examples below show a mid-game board state from each game.
Nim:
There are 6 sticks in the pile: ||||||

Mancala:
 -----------------------------------
 |    | 0 | 7 | 1 | 0 | 1 | 0 |    |
 | 13 |-----------------------| 14 |
>|    | 0 | 1 | 3 | 3 | 5 | 0 |    |<
 -----------------------------------

Breakthrough:
 ---------
|· · · · ●|
|· · ● · ●|
|· ● ● ● ●|
|● ● · · ·|
|● · ● ● ●|
|· · ● ● ·|
|· · · · ●|
 ---------
The following Wikipedia pages have complete rule sets for each game. Note that Mancala has many, many variants, so if you have played it before, you might have used different rules.
As in the previous lab, use Teammaker to form your team. You can log in to that site to indicate your partner preference. Once you and your partner have specified each other and the lab has been released, a GitHub repository will be created for your team.
The objectives of this lab are to:
You will need to modify these files:
You should also look over the following files:
To see the available command-line options for game play try running:
./PlayGame.py -h

Then try playing Breakthrough against a really terrible opponent by running:
./PlayGame.py breakthrough random human --show

And try playing Mancala against your lab partner by running:
./PlayGame.py mancala human human --show

Try playing several games (and refer to the Wikipedia links above) to make sure you understand the rules of each game.
Open the files Nim.py, Mancala.py, and Breakthrough.py to review the methods that have been provided. Notice that two of the games represent the board as arrays of integers, and in every game the 'turn' is represented as either +1 or -1, but otherwise they have different internal semantics.
In Nim, the state-of-play is just the number of sticks in the pile, which can be stored as a single integer. In Breakthrough, the board is a 2D array where blank spaces are 0s, the first player's pieces are +1s, and the second player's pieces are -1s. In Mancala, the non-scoring houses are represented by one array, and the scoring houses are represented by another.
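For a concrete picture, here is a rough sketch of what these representations might look like as plain Python values; the variable names below are purely illustrative and are not the attribute names used by the actual game classes:

# Illustrative only -- these names are not the real attributes in the game files.

nim_state = 6                  # Nim: the whole state is just the number of sticks left

breakthrough_board = [         # Breakthrough: 2D array of integers
    [-1, -1, -1, -1, -1],      #   -1 = second player's pieces
    [ 0,  0,  0,  0,  0],      #    0 = empty square
    [ 0,  0,  0,  0,  0],      #   +1 = first player's pieces
    [ 1,  1,  1,  1,  1],
]

mancala_houses = [0, 7, 1, 0, 1, 0, 0, 1, 3, 3, 5, 0]   # non-scoring houses in one array
mancala_scores = [13, 14]                                # scoring houses in another

turn = 1                       # whose turn it is: +1 for player 1, -1 for player 2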
These games provide several methods and attributes that you should make use of in your search:
Open the files BasicPlayers.py and MinMaxPlayers.py to see how we will be implementing game-playing agents. The HumanPlayer and RandomPlayer classes are provided to make your testing easier.
All players must implement a getMove() method that takes a game instance representing the current state and returns one of the legal moves from that state.
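For example, a bare-bones player might look something like the sketch below. The availableMoves() call is a hypothetical stand-in for however the real game classes expose their legal moves; check BasicPlayers.py and the game files for the actual method names.

class FirstMovePlayer:
    """Toy player: always chooses the first legal move it is given."""
    def getMove(self, game):
        # game.availableMoves() is a hypothetical name used only for this sketch
        moves = game.availableMoves()
        return moves[0]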
In the file StaticEvaluators.py, implement the basic static evaluation methods for both Mancala and Breakthrough.
Your static evaluators should:
The comments within each basic method explain how you should evaluate the boards. At the bottom of the file there is a testing section. Some test code has been provided for the Mancala basic static evaluator. Add your own test code for Breakthrough.
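As a very rough illustration of the shape such a function takes (the function name, the scores attribute, and the scoring convention below are all assumptions; follow the comments in StaticEvaluators.py for the real requirements):

def mancalaBasicEval(state):
    # Sketch only: assumes the scoring houses live in a two-element array and
    # that a common basic heuristic (difference between the scoring houses,
    # positive when the maximizing player is ahead) is what is wanted.
    return state.scores[0] - state.scores[1]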
Once you are confident that your basic static evaluators are working you can move on to implementing the search itself. Save the better Breakthrough evaluator for later.
Next focus on the MinMaxPlayers.py file, and complete the following steps:
In the file MinMaxPlayers.py, implement the MinMaxPlayer.getMove() method, which should call a helper method that conducts a bounded Minimax search (pseudocode is given below). Minimax returns the best value found for the current player, but getMove() must return the move associated with that value. To make this possible, have the Minimax search update a class variable holding the best move found so far; getMove() then returns that move.
bounded_min_max(state, depth)
    if depth limit reached or state is terminal
        return staticEval(state)
    # init bestValue depending on whose turn it is
    bestValue = turn * -infinity
    for each move from state:
        determine the next_state
        # Recursive call
        value = bounded_min_max(next_state, depth+1)
        if player is maximizer
            if value > bestValue
                bestValue = value
                if depth is 0, update bestMove to current move
        else  # player is minimizer
            if value < bestValue
                bestValue = value
                if depth is 0, update bestMove to current move
    return bestValue
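A rough Python rendering of this pseudocode is shown below. Every game/player method and attribute it touches (isTerminal(), availableMoves(), makeMove(), self.depth_limit, self.staticEval, self.bestMove, state.turn) is a hypothetical stand-in; adapt it to the names actually used in MinMaxPlayers.py and the game classes.

import math

def bounded_min_max(self, state, depth):
    # Sketch only -- all game/player method and attribute names are stand-ins.
    if depth == self.depth_limit or state.isTerminal():
        return self.staticEval(state)
    # +1 (maximizer) starts at -infinity, -1 (minimizer) starts at +infinity
    bestValue = state.turn * -math.inf
    for move in state.availableMoves():
        next_state = state.makeMove(move)
        value = self.bounded_min_max(next_state, depth + 1)
        if state.turn == 1:              # maximizer
            if value > bestValue:
                bestValue = value
                if depth == 0:
                    self.bestMove = move
        else:                            # minimizer
            if value < bestValue:
                bestValue = value
                if depth == 0:
                    self.bestMove = move
    return bestValue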
Thoroughly test Minimax. While testing, play single games and add print statements to track the values coming back from recursive calls. Remember that positive values are good for the maximizer and negative values are good for the minimizer.
It is strongly recommended that you do your initial testing using the game of Nim, since it is a smaller/simpler game and therefore it's easier to see what's going on if you have a lot of print statements. It's also small enough that you can construct search trees by hand to check your work. Remember that you can control the initial size of the pile using command-line arguments in order to make the tree bigger or smaller; e.g. to have the starting state contain only 4 sticks, you could do:
./PlayGame.py nim minmax random --show -game_args 4

Note that if you don't specify a depth-bound on the command line, PlayGame.py will use a default depth bound of 4. If you actually want to force it to build the entire game tree, you'll need to manually put in a large number here; however, keep in mind that this may take a very long time. It's fine for Nim (assuming a reasonably small starting pile size), but don't expect to expand the entire tree for the other games in a reasonable timeframe.
Once you are confident that Minimax is working properly, remove the print statements and do more extensive tests using repeated play of the more complex games. With a depth limit of 2, your MinMaxPlayer should win the significant majority of games of Breakthrough or Mancala against a random player. In general, an agent with depth D should lose or at best tie against an agent with depth D+2.
You can set the depth using the -d1 and -d2 arguments (for player 1 and player 2 respectively). By default the depths are set to 4. The following command plays a game between two minmax agents where player 1 uses depth 2 and player 2 uses depth 4:
./PlayGame.py mancala minmax minmax -d1 2 -d2 4 --show

You can also have your agent play several games (alternating sides) and report the results:
./PlayGame.py mancala minmax random -d1 2 -games 10

You should also try playing against it yourself!
./PlayGame.py mancala minmax human -d1 4 --show
Next, you should implement Minimax search with alpha-beta pruning in the PruningPlayer class using the pseudocode we discussed in class. Note that alpha-beta pruning should always return the same moves that Minimax would, but it can potentially do so much more efficiently by cutting off search down branches that will not change the outcome of the search.
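If a reminder of the general shape helps, here is a sketch of the pruning version; it mirrors the Minimax sketch above (same hypothetical interface names) and is not meant to replace the pseudocode from class. The initial call would pass alpha = -infinity and beta = +infinity.

import math

def bounded_alpha_beta(self, state, depth, alpha, beta):
    # Sketch only -- same hypothetical method/attribute names as the Minimax sketch.
    if depth == self.depth_limit or state.isTerminal():
        return self.staticEval(state)
    bestValue = state.turn * -math.inf
    for move in state.availableMoves():
        value = self.bounded_alpha_beta(state.makeMove(move), depth + 1, alpha, beta)
        if state.turn == 1:                  # maximizer raises alpha
            if value > bestValue:
                bestValue = value
                if depth == 0:
                    self.bestMove = move
            alpha = max(alpha, bestValue)
        else:                                # minimizer lowers beta
            if value < bestValue:
                bestValue = value
                if depth == 0:
                    self.bestMove = move
            beta = min(beta, bestValue)
        if alpha >= beta:                    # the remaining moves cannot affect the result
            break
    return bestValue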
You should make sure that your agents are exploring moves in the same order and breaking ties in the same way so that you can check the correctness of alpha-beta pruning by comparing it to standard Minimax. You can run two games between your pruning and Minimax players as follows:
./PlayGame.py mancala minmax pruning -games 2

Note that these agents should be equally matched; pruning should just make decisions faster.
Once again, it is recommended that you do your initial testing using Nim (with a lot of print statements) so you can check your work and debug things easily; once you're reasonably confident things are working, then move on to the harder test cases using the games with higher branching factors.
Come up with an improved static evaluator for the breakthrough game and implement it in the file StaticEvaluators.py.
Your goal is to create an evaluator that will beat the basic evaluator function when tested using minimax players with equal depth limits.
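One direction you might explore (offered only as an example of the kind of thing to try, not the required answer): reward pieces for how far they have advanced, not just for existing. A minimal sketch, assuming the board is the 2D array of +1/0/-1 values described earlier, that the +1 player advances toward higher row indices, and that the names breakthroughBetterEval and state.board are placeholders:

def breakthroughBetterEval(state):
    # Sketch only: weights each piece by its progress toward the far side of
    # the board, so both material and advancement contribute to the score.
    rows = len(state.board)
    score = 0
    for r, row in enumerate(state.board):
        for piece in row:
            if piece == 1:                    # assumed to move toward higher rows
                score += 1 + r
            elif piece == -1:                 # assumed to move toward lower rows
                score -= 1 + (rows - 1 - r)
    return score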
Write a clear and thorough comment with your betterEval method to describe how it works. If you add helper functions, be sure to include comments describing these as well.
You can tell your agent which static evaluator to use via the -e1 and -e2 command line options. The default is to use the basic evaluator. The following command plays 2 games with agents that both search to depth 4 but use different static evaluators:
./PlayGame.py -games 2 breakthrough pruning pruning -d1 4 -d2 4 -e1 better -e2 basic
Player 1 is using the better static evaluator, so it should win both games.
NOTE: If for some reason your alpha-beta pruning is not working, you can also test the better evaluator using minmax for both players.