
Any info on how the AI works?


Jimmy V.:
Thanks for your input Christian. Interesting thoughts.

Anne M.:
I just wanted to say thank you for such a great response! I've only been playing for a few weeks, but this gave me a great idea of how the bots work without giving too much away.  I've done some work with stochastic decision processes and I couldn't figure out how the bots were computing so quickly.  Thanks for the great read!

Jimmy V.:
Great to read Anne. Thanks!

Now it is your turn to teach me something. How do stochastic decision processes work? What did you apply them to?

Anne M.:
Sure! You may know them as Markov decision processes (MDPs). Essentially, an MDP is a decision support framework for games (or anything else) that have a random element and a decision element. An objective function is created to optimize the score at the end of the game. In Lost Cities, the objective function would likely be simple: maximize the differential in the end-game scores between the bot and the player. The strict form of MDPs can rely on explicit enumeration of the entire decision space, but this is obviously not ideal for large problems. As a way to get around explicitly naming the entire decision space, there are approximate decision processes (ADPs) that trade optimal solutions for shorter processing time, and I initially thought this might be what the bots are using. Thinking this through now, though, I think it would be hard to use MDPs here because you have to adhere to the "memoryless" property, i.e. the probability of what happens next can depend only on the current state, not on what happened in the past. Clearly it would take a lot to force Lost Cities into that structure.

Typically we see MDPs where policies can be set based on the current state. For example, given the current rainfall and weather patterns, how many crops should be planted?
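
To make the "policy based on the current state" idea concrete, here is a minimal sketch of value iteration on a toy planting problem. Everything in it is an assumption made up for illustration: the two soil states, the two actions, the transition probabilities, the profits, and the discount factor. It says nothing about how the Lost Cities bots actually work; it only shows the strict, fully enumerated form of an MDP.

```python
# Toy MDP solved by value iteration. All states, actions, and numbers
# below are hypothetical and chosen only to illustrate the technique.
STATES = ["dry", "wet"]                 # current soil condition
ACTIONS = ["plant_few", "plant_many"]   # decision made each season

# P[s][a][s2]: probability of moving to state s2 after taking action a
# in state s. It depends only on the current state and action, which is
# the "memoryless" (Markov) property.
P = {
    "dry": {"plant_few": {"dry": 0.6, "wet": 0.4},
            "plant_many": {"dry": 0.9, "wet": 0.1}},
    "wet": {"plant_few": {"dry": 0.3, "wet": 0.7},
            "plant_many": {"dry": 0.6, "wet": 0.4}},
}

# R[s][a]: expected profit this season for action a in state s.
R = {
    "dry": {"plant_few": 1.0, "plant_many": 0.5},
    "wet": {"plant_few": 2.0, "plant_many": 5.0},
}

GAMMA = 0.9  # discount factor: how much future seasons count


def q_value(s, a, V):
    """Immediate profit plus discounted expected value of the next state."""
    return R[s][a] + GAMMA * sum(P[s][a][s2] * V[s2] for s2 in STATES)


def value_iteration(tol=1e-9, max_iters=10_000):
    """Repeat the Bellman update until the values stop changing."""
    V = {s: 0.0 for s in STATES}
    for _ in range(max_iters):
        new_V = {s: max(q_value(s, a, V) for a in ACTIONS) for s in STATES}
        delta = max(abs(new_V[s] - V[s]) for s in STATES)
        V = new_V
        if delta < tol:
            break
    # The policy maps each state to the action with the best q-value.
    policy = {s: max(ACTIONS, key=lambda a: q_value(s, a, V)) for s in STATES}
    return V, policy


if __name__ == "__main__":
    values, policy = value_iteration()
    print(values)
    print(policy)
```

With these made-up numbers the policy settles on planting few when the soil is dry and many when it is wet. Note that the loops run over every state and every action, which is exactly the explicit enumeration that stops scaling once the state space gets large.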

Elle D.:
I suspect the real trick to differentiating the difficulty of the different bots rests on the fact that there is not really a fixed draw pile or a fixed opponent/bot hand. In other words, the only cards the human player knows of are those that have been observed in their hand or on the board. The computer algorithm can select any remaining non-observed card to be the next card the bot plays onto the board, and can select any remaining non-observed card to be the next one you draw on each subsequent turn. Think about it: that makes it so much easier to control the experience the human player will have. The aim is to make each game a 50/50 outcome for the human player, by revealing a more or less challenging starting deal and then determining the sequence of subsequently revealed cards as the game progresses, based on the player's performance.

This is why the experience is so different when you face human opponents.

It is so clever to do it this way!!

