Navigation Fritz Dooley Home Mancala Home Mancala Game Rules Mancala Resources Mancala Game Theory Thesis Introduction Terminology Structural Analysis of Mancala Human versus Computer Mancala Mancala Strategies Best Opening Move in Mancala Conclusion Exhibit 1: Rules of the Game Exhibit 2: Number of Game States email me: firstname.lastname@example.org
Best Opening Move in Mancala
Having given thought to soft strategy, we were now prepared to identify an opening move as a candidate for our computer test, to see whether we could prove it as a “sure win” strategy by backward induction and our recursive computer program. After several hours of playing the actual game, we stumbled across a particular opening that seemed especially robust. (Note that, though there are only six bins, there are a total of ten possible first turns, as the opener goes again if she plays from bin C.)
Our candidate for the ideal opening is the sequence CF, which leaves the board configured as in Figure 19.
This opening has the following strengths:
§ It empties bin F early in the game, allowing Mi to play a single pebble from F into her mancala each time a single pebble lands there.
§ It empties bin F at a time when at least two of the pebbles from F (those landing in J and I) will likely eventually come back around the board to Mi’s side. Furthermore, if Yo makes the defensive G play, all five of the pebbles from G are ultimately likely to wind back up on Mi’s side. For example, after playing G, Yo has six pebbles in H. If she plays H on a subsequent turn, the sixth pebble in H will land in A, back on Mi’s side.
§ It allows Mi to play from A, B, C, and F without repopulating Yo’s side of the board. It therefore sets Mi up for an effective “starving” strategy.
§ Because of the large number of pebbles in each of Yo’s bins, it is unlikely that Yo will be able to counter with a stalling or starving strategy herself.
§ Most importantly, it puts Yo in a defensive position at her very first move of the game. Yo has an immediate glaring vulnerability to being raided in bin G, and must therefore seriously contemplate the defensive G or HG responses. Either of these responses, however, leaves her in the weakened board position of having each of her remaining bins populated with enough pebbles to play around to Mi’s side, thus reinforcing a starving strategy by Mi.
We further refined the opening strategy by proposing that, if Yo plays I, J, K, or L, Mi responds with a raid from A. If, instead, Yo plays the defensive G or HG, Mi responds by playing E. This is in order to unload E before it accumulates too many more pebbles, and gives her yet another bin from which she can play as she attempts to starve Yo. It places the pebbles on Yo’s side in such a way that D should be safe from a raid in the near future, though Mi will have to be careful of a looping play by Yo. Mi will attempt a hoarding strategy with bin D.
We had thus significantly narrowed the portion of the tree which we would need to test with the computer in order to prove CF-?-[A/E] as a sure-win opening.
Rick and I then set up six computers to begin processing all of Yo’s responses to the CF opening, restricting Mi’s second turn to A or E depending on Yo’s first turn. Yo had nine possible responses to CF, namely G, HI, HJ, HK, HL, I, J, K, and L. By this time Rick had upgraded the program with a counter feature that showed, for each branch of the tree explored, how many terminal states the computer reached in solving that branch; i.e., how many games the computer had to play in order to determine the character of a move. We discovered the computer was processing millions of games in brief seconds, and billions of games per hour. Hope mounted.
After four days of processing, the computers had tested some 500 billion games and had shown that the CF opening was indeed a sure-win strategy for most responses by Yo. Only two of Yo’s responses remained unproven. Then Rick discovered a bug in the way the program evaluated a terminal state, and all our testing thus far was invalidated. That was not so bad; we still had time to recover from a four day set-back. Rick corrected the bug immediately. The discovery was more disconcerting when we found that the corrected program had a much slower execution speed. Branches of the tree which had taken minutes to solve previously were now taking hours, and some which had taken hours were now still unresolved after days. Our only hope would be to seek to build more intelligence into the program.
We had already observed that the program in its present form spent a lot of its time exploring branches of the tree that would never be played in a real game. For instance, if Mi has the option of winning the game in the next move by playing A, or extending the game by another fifty turns by playing from B, the computer might spend hours processing B, C, D, E, and F before coming around to spotting the easy win of A. If we could define some simple rule by which the computer would eliminate “stupid” moves, or at least attempt more promising moves first, without slowing execution speed too much by adding an evaluation step to each node, we might see significant performance improvement.
We had observed that a starving strategy is often effectively employed in the last few moves of the game. Rick added a feature to the program by which the computer could be instructed to test all moves for both players up to depth n, and then only to test moves beyond n which did not result in Mi (or Yo, as specified by the user) placing additional pebbles on the opponent’s side. We called this a “reduced game.” The player whose plays were thus limited is called the “bold player.” If the computer then returned “win” for the bold player, it would be a definitive win. If it returned “lose,” the game would be inconclusive; perhaps the bold player could have won by making a play beyond depth n which placed pebbles on the opponent’s side. The program could then be run again, setting n to a higher value, or testing the same branch of the tree with the other player being bold. This required more hands-on manipulation, but did speed the processing of many branches of the tree substantially. A later refinement caused the computer to return not “lose” but “????” if the bold player did not have a decisive win, to help distinguish definitive losses from “you should have tried harder” losses.
After another week or so of running tests and refining the program, we had the happy results: CF is, indeed, a sure-win strategy for Mi as the opening player according to the backward-induction method, taking the full game tree in scope. Furthermore, our strategy of playing E as Mi’s second turn if Yo plays G or HG, and A for a raid if Yo plays anything else on her first turn, was also proven. We had scored a major victory in our experiment.
A few nagging questions remained. Just because we proved a single opening strategy was a sure-win by backward induction didn’t necessarily mean there was any correlation between opening strategies that looked strong from a soft-player perspective and those that were backward-induction winners. We only had a sample size of one. Furthermore, I still wondered about the relationship of these two distinctly different approaches to the third approach of look-aheads.
Time for the analysis was running out, and I was preparing to settle for the findings we had made thus far, when Rick emailed me a jewel – a new program that allowed a user to play out a game of mancala with the computer interactively, with a built-in “look-ahead” function (see Exhibit 4 for instructions on downloading and running Mancala.exe). The look-ahead function could be set to any depth of turns, but execution speed made any depth beyond nine impractical. The look-ahead function returned an integer for each of the available next moves, showing the advantage of the current player at the specified depth, assuming each player made her best available moves and that both viewed payoffs at the same level of depth.
Now we could test correlation among the three approaches to the game. Furthermore, if look-aheads did correlate to backward-induction strategies, then the look-ahead computer program could give us hints to helping the computer solve backward induction games more efficiently. For example, if we wanted the search program (backward induction) to test a particular branch of the tree, we could use the look-ahead program to find the best candidate for a sure-win. We could then tell the search program to test that best candidate move first.
The results were quite surprising. After a weekend of running such back-and-forth tests, using the look-ahead function to find a best candidate and the search program to prove it, I was able to solve conclusively all but two of the ten opening moves of the game. The remaining two are still slowly in process. The following table shows the results:
Very surprisingly, of the eight openings solved, I found perfect correlation between winners and players who have positive advantage using 10-deep look-aheads. Even more surprisingly, the opening move we had identified as strongest according to soft strategy turns out to yield the highest advantage of all the possible opening plays, using nine-deep and ten-deep look-aheads! Also, the two openings which have proven hardest to solve (CB and CD) have nine-deep advantages near zero. The opening move D, which was finally proven to be a sure-win for Yo, was the third hardest opening to solve, and has a nine-deep advantage of zero.
After using the look-ahead function extensively to help the computer “guess” which move to attempt first as I solved the remaining opening plays, I continued to discover high correlation between look-ahead values and backward induction winners. (I used the look-ahead function not just for the opening move, but often several moves deep into the game when the search program seemed to have particular difficulty getting beyond a given node.) As expected, there were exceptions and discontinuities, but I discovered that the higher the look-ahead advantage of a given move, the higher was the likelihood that that move was a backward induction winner.
This implies some level of “stickiness” or persistence of advantage through the game. Though much more could be done to study this, I wanted at least to see a plot of look-ahead advantage of the opening moves through the first ten levels. The following table shows the values:
A graph provides a more meaningful way to observe the relationships:
This graph provides unfortunately only one data point – a view from only one node in the game – and thus any general conclusions drawn from this graph about persistence of advantage at all nodes throughout the game would be invalid. However, the pattern, even at this one data point, is interesting to observe. At this single point at least, it does appear that advantage seems to be not only persistent, but largely cumulative (“the rich get richer, the poor get poorer”).
(c) 2008 by Fritz Dooley