
09 February 2012

Conditional Dice Rolls with Partial Information

Last time I wrote about Conditional Dice Rolls, or using conditional probability to correct the results of an earlier roll that was made incorrectly. The rest of this will make more sense if you read that first.
To do this you need to know 2 things --- what the probability of success was for your first roll, and what that probability should have been. But wait, there is one more thing, an important assumption I rather glossed over last time --- you need to have completely forgotten everything about the first roll except whether you succeeded or failed.

And that's harder than you might think. ---> More after the fold --->

05 February 2012

Conditional Dice Rolls

It happened to me again at the Wednesday Battletech game, one of those honest mistakes that can happen in a complicated situation. My Griffin fired two medium lasers at a Jenner needing 7's or better, and both hit. A short time later I realized my mistake: I had miscounted, and the roll should have been 8 or better. I could not remember what the original rolls had been.
There is a way to fix this, to repair my mistake, to determine if my shots really should have missed, and that is fair to both players. It's a good trick, and you don't even have to do any math to use it.

But you might have to roll a lot of dice.  ---> More after the fold --->

02 April 2011

Sicherman's Dice

There is something different about these dice - can you spot it? I'm guessing you'll get it right away ...
Image found at Chuck-A-Con *
You can't see the non-facing sides, but the d6 on the left is labeled with 1-2-2-3-3-4, and on the right labeled with 1-3-4-5-6-8 (like this). That's not our standard 1-2-3-4-5-6, and if someone rolled these on the gaming table the 8-pip is a dead giveaway that something is off.

Now here's the trick: The probability distribution for the sum of these Sicherman Dice is identical to the distribution of the standard 2d6, so if you only see the results (the sum) there is no difference at all.



The mathematics for this gets into Generating Functions and Combinatorics, but essentially it involves doing the algebra to show that:

(x + x^2 + x^3 + x^4 + x^5 + x^6)^2 = (x + 2x^2 + 2x^3 + x^4)(x + x^3 + x^4 + x^5 + x^6 + x^8)

Where the left-hand-side is the generating function for the sum of two standard 6-sided dice, and the right-hand-side is the appropriately factored generating function for the sum of Sicherman's dice. (OK, maybe a little harder than that.) There is only one way of doing this with 6-sided dice, but such variations exist for other polyhedral dice. It seems to be possible in general to do this with N-sided dice, and there might be multiple ways of doing this for some. The Mathematics Magazine article "Renumbering of the Faces of Dice" by Duane Broline (1979) goes into some detail, but I cannot access the full article from home. If I can grab it at work maybe there will be an addendum.

The Hard Way

I tried working out possible numberings for Sicherman-type 2d8 dice by scribbling with pencil and paper until I found a combination that worked. On my third-and-a-half attempt I came up with 1-2-2-3-3-4-4-5, and 1-3-4-6-6-8-9-11. CORRECTION: TPC checked more carefully than I did, and offers 1-3-5-5-7-7-9-11 in its place. It took me a while to work this out "the hard way", but it was probably still faster than I could have factored a 16th-order polynomial**.
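If you would rather let a computer do the checking, a brute-force convolution works without any polynomial factoring. Here is a quick Python sketch (the labelings are the ones discussed above; the function name is mine):

```python
from collections import Counter

def sum_distribution(die_a, die_b):
    """Count how many face-pairs produce each possible sum."""
    return Counter(a + b for a in die_a for b in die_b)

standard_d6 = [1, 2, 3, 4, 5, 6]
sicherman_d6 = ([1, 2, 2, 3, 3, 4], [1, 3, 4, 5, 6, 8])

standard_d8 = list(range(1, 9))
tpc_d8 = ([1, 2, 2, 3, 3, 4, 4, 5], [1, 3, 5, 5, 7, 7, 9, 11])  # TPC's correction

# Both should print True: the sums have exactly the same distribution.
print(sum_distribution(*sicherman_d6) == sum_distribution(standard_d6, standard_d6))
print(sum_distribution(*tpc_d8) == sum_distribution(standard_d8, standard_d8))
```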

[Hat-Tip to The Endeavor/John Cook. Again!]
[As seen on Eon.]
Sicherman Dice are available from Amazon, or directly from Gamestation. Gamestation.net is likely the original source for the image I used above.
** "Dammit Jim, I'm a statistician, not a combinatrician!"

12 December 2010

Granular Skill Checks, and Interpolation

I just read a discussion in the CBT forums (Thanks PiP!) about the "granularity" of skill rolls in Battletech. The basic problem is that skill level changes in the boardgame make a BIG difference in play, but if you are running A Time Of War (Battletech roleplaying) then you may want small changes so that characters can improve gradually in many small steps - as opposed to 3-steps to a Superman.

I'll keep this discussion about the Battletech RPG, but these comments should apply to any game where characters have skill levels that seem too granular. I originally posted this whole thing in the CBT forums, then UN-posted it because what I had was broken. Now it is fixed, but much longer, so I hope nobody minds that I'm linking back to myself.

One way of doing this would be to use a different random distribution for skill checks. The second edition of the Battletech roleplaying game (Mechwarrior) used 2d6 rolls just like the boardgame, and all skills were very granular. The 3rd edition of the game switched to 2d10 "exploding" dice, which greatly reduced the granularity problem, but suffered because it was difficult to make meaningful improvements in character skills.

There are also various "house rules" for doing the same sort of thing, but these generally require changing other aspects of the game to balance out the change in probability distribution. For instance you might switch to a 2d10 or 4d6 to-hit roll. There is less granularity now, which is good for your RPG, but the game has changed! On this new scale a +1 or -2 modifier will have relatively less effect on results, potentially "breaking" the usual balance of the game. You might fix this by adjusting all these modifiers, but you won't ever get the original balance back this way.

I have an alternate suggestion: Add a decimal point to the skill levels, and an extra 1d10 roll when rolling for a skill check. Differences between Battletech skill levels are BIG changes, so the idea is to add steps in-between. For example, instead of Gunnery 5 and 4, some possible skill levels are 5.0, then 4.9, 4.8, 4.7, 4.6, 4.5, 4.4, 4.3, 4.2, 4.1 and finally 4.0. The same works for any skill level; just add a fractional part to it. There is a word for this - "Interpolation". We can interpolate between whole number skill levels, filling in with smaller changes in probability.

To use this, calculate the target number (TN) normally adding the skill level and any modifiers, and round the final number down. Make the usual 2d6 roll;
if this is less than the TN, you fail;
if this is more than the TN, you succeed;
if you roll exactly the target number, then you must also roll the 1d10, read it as 0-9, and this must be equal to or greater than the decimal in your skill to succeed (if the decimal is "0" then this always succeeds, no need to roll).

Example: Suppose the base gunnery skill is 3.6, and after various modifiers the target number to-hit is 8.6, which rounds to 8. You roll 2d6 and ...
On a 9 or better you hit,
On a 7 or less you miss,
On exactly 8, you roll 1d10 (0-9), and if this is a 6,7,8, or 9 then you hit, otherwise you miss.

With TN=9 the probability of success would be 0.278, and with TN=8 it would be 0.417. The effect of the decimal in the skill level and the 1d10 die roll is to interpolate, or smooth out, between those two probabilities. The final probability of success for a TN of 8.6 is 0.333.
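If you want to check these numbers, or tinker with other target numbers, here is a short Python sketch of the procedure (the function name is mine, not anything official):

```python
from collections import Counter
from fractions import Fraction

# How many of the 36 ways to roll 2d6 give each sum.
TWO_D6 = Counter(a + b for a in range(1, 7) for b in range(1, 7))

def hit_probability(target):
    """Success chance for a fractional target number like 8.6."""
    whole = int(target)                    # the rounded-down TN for the 2d6 roll
    tenths = round((target - whole) * 10)  # the decimal resolved by the 1d10
    ways_over = sum(n for roll, n in TWO_D6.items() if roll > whole)
    ways_exact = TWO_D6[whole]
    # On an exact TN, the 1d10 (read 0-9) succeeds with chance (10 - tenths)/10.
    return Fraction(ways_over, 36) + Fraction(ways_exact, 36) * Fraction(10 - tenths, 10)

print(float(hit_probability(9)))    # 0.278
print(float(hit_probability(8)))    # 0.417
print(float(hit_probability(8.6)))  # 0.333
```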

This gives 10 steps of skill improvement to every 1 in the regular rules, which ought to be fine-grained enough to satisfy the pickiest Game Master. In fact it may be too fine, and you might want to restrict it to just 5 steps (.0, .2, .4, .6, .8) or even 2 (.0, .5). Further, you will need to adjust the experience needed for fractional skill improvements accordingly. If it cost 100 experience points to improve Gunnery skill from 5 to 4, then it should cost about 10 to improve from 5.0 to 4.9. Most GM's love to tinker with this sort of thing anyway, so I'll leave the application in your capable hands (or fangs, tentacles, whatever).

Now the really good news - interpolating skills does not "break" any other parts of the game by changing the probability distribution the game is designed on, it just smooths it out, so a +1 or -2 modifier still has the same effect it always did. There is nothing special about using 2d6 with this either, so you might easily apply this to any RPG with granular skills.

Here is a chart with the probabilities of success on a 2d6 skill check. The way I have set this up makes it look a bit like a wavy staircase:
Probability of success on 2d6 with standard skills
You could walk up those steps! And the granularity is obvious. This chart has target numbers up to 13 because I need to show the probability going all the way down to zero for the next chart. Now a chart showing the same for interpolated "Skills with a decimal":
Probability of success with interpolated skills
Nice and smooth. If you look really hard you might notice this is actually 11 straight line segments joined together. This method of interpolation really just connects the dots between probabilities for whole number skill levels.

Now we have wiped out the granularity problem, but at the cost of some extra dice rolls. If you don't want to roll so many extra dice, you could make a single 1d10 interpolation roll (after fire declaration) and apply it to all skill check results for that turn. This will be weirdly granular, because it is like changing your skill level randomly from turn to turn, but it will average out to the same effect over the course of many rolls.


10 December 2010

Dice and Information ... So What?

After I finished my previous post on Dice and Information I had the thought, "These are some pretty neat numbers, but So What? What's it good for?"

One way to look at information is as a measure of uncertainty in the game, or at least the uncertainty in the outcome of a single move that is part of a larger game (the information in an entire game, start to finish, is a matter for another day). Consider the game of Chess, which has nothing random about it. There is no information at all in a chess move; you just make your move, perhaps taking another piece, and it always works. Suppose now we change Chess so that it requires an attack roll if you want to take another piece, with a 50% chance of success (put the piece back where it started if the attack fails). Now there is uncertainty with each move (1 bit of entropy) and the outcome of any move is far from certain. We might change this to 90% success, which works out to about 0.47 bits* of entropy per attack (corrected from an earlier 0.08 - thanks Bradley!). This means that any attack is quite likely to succeed. If the chance of success is 10%, then the entropy is again 0.47 bits*, and the attack is quite likely to fail.

So entropy is measuring uncertainty in the sense of the predictability of results, but NOT the predictability of a preferred result, such as a successful attack roll.

* It might help to think of 10% or 90% success as the flip of an unbalanced coin. Information is maximized, and the result is most uncertain, when the coin is fair (50% success).


This post hasn't gone the direction I thought it would - I was thinking (incorrectly) I could describe Entropy as a measure of the uncertainty in the outcome, but this is rather different. Consider a player making three to-hit rolls in a game, at 90%, 50%, and 10% success. The first (90%) has just a little entropy (0.47 bits) and the outcome is quite likely. The player has a high degree of control, because the decision to attack is very likely to succeed. At 50% the entropy of this roll is maximized at 1 bit, and the player will be most uncertain of the result either way. At 10% entropy is again 0.47 bits, but the player is very likely NOT to succeed. Now the player has very little control, or very little influence on the outcome of the game (with this single roll).

Back to the drawing board? The paragraph above hits a few rough spots, because Entropy and player control over outcomes in a game are maybe two different things. This gives me something new to think about.

...

And before I can finish hammering the bugs out of this post, Ashley has her response up over at Paint-it-Pink. At risk of quoting Ashley out-of-context ...

Ashley: Now when one applies modifiers to a 2D6 roll, if one knows that it only encodes 3.27 bits, then a modifier of one is equivalent to one bit. If I'm correct then a modifier of three is equal to three bits of information, which has the effect of reducing surprise in the diced for result? Such modification of the base 2D6 roll is therefore highly significant, which seems to me to support my proposition about the modifiers for targeting computers and pulse lasers in the Battletech game being too coarse.
Not quite right, but Ashley's intuition is basically correct. We need to sort this out a bit. First, the modifiers she mentions are for a Battletech "To-Hit" roll with Hit-or-Miss outcomes, and this is the same situation as flipping an unbalanced coin. As in my "battle chess" example above, modifying this roll doesn't necessarily decrease the information. Second, the 3.27 bits for a 2d6 hit-location roll** is separate from the To-Hit roll. However, we might combine the To-Hit and Hit-Location rolls by considering a "miss" to be a no-location result and grouping it with actual location results.

** Irrelevant quibble: this is actually about 3.0, because there are multiple ways to roll "Arm" hits.

Here is a new table, similar to my earlier table where I calculated the Entropy of a 2d6 result, but now including the possibility of a miss, with a 2d6 roll of 7 or less counting as a miss.



Now suppose the To-Hit roll is more difficult. Ashley's intuition says the entropy ought to decrease. Here's another table with a "12" needed to-hit.


Sure enough, the entropy has decreased. Unlike the battle chess example, here the Entropy of hit-location (including no-location) will usually decrease with more difficult to-hit rolls (it hits maximum entropy with a to-hit roll of 3+).
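Since the tables are easier to recompute than to read here, this sketch redoes the calculation in Python. One simplification to note: I treat every 2d6 sum as a distinct hit location (the 3.27-bit version), so the numbers will differ slightly from a table that groups the arm hits:

```python
import math
from collections import Counter

TWO_D6 = Counter(a + b for a in range(1, 7) for b in range(1, 7))

def entropy(probs):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def combined_entropy(to_hit_target):
    """Entropy of the merged miss / hit-location outcome."""
    p_hit = sum(n for roll, n in TWO_D6.items() if roll >= to_hit_target) / 36
    outcomes = [1 - p_hit]                                 # the no-location "miss"
    outcomes += [p_hit * n / 36 for n in TWO_D6.values()]  # each location, given a hit
    return entropy(outcomes)

for tn in (3, 8, 12):
    print(tn, round(combined_entropy(tn), 2))   # roughly 3.37, 2.34, and 0.27 bits
```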

Lesson learned: When thinking about entropy, it is important to include all possible outcomes of the random result.

Somehow I think this topic is not done yet, but that is all I have time for today.

A small update (12/12/2010, 4 PM): I just made the following comment on Ashley's blog, and I'm copying it here so I might remember to come back to the idea later.
Now here is a brain bender - Suppose you could roll one set of dice to resolve a whole turn of Battletech play, or a whole game - How much information would be in that? I don't know myself, but I'll think on it. My intuition is a simpler game should have less total information than a complex one, but I'm not sure yet what it would mean to compare them.

07 December 2010

Dice and Information

There is a concept in statistics and the information sciences called information. Several concepts actually, as there are different types of information, but I want to focus specifically on Shannon information or Entropy. Entropy is a way of measuring the amount of variability or uncertainty in a probability distribution, and a simple way to illustrate this is with the example of a coin flip.

But first, a comment on notation, since the Blogger editor is not too equation friendly. Calculating entropy requires a logarithm function, usually denoted ln(x) or loge(x) for base-e or natural-log, and Shannon Information specifically uses a base-2 logarithm, which I denote here as log2(x). If my equations are not clear, any mention of the log function (outside this paragraph) always means the base-2 logarithm. If you are following along with a calculator, you probably have a natural-log button ln(x), but can calculate the base-2 log as log2(x) = ln(x)/ln(2).

Image source, and quite interesting in itself.
Assuming a fair coin with a 0.50 probability of heads or tails, the first step is to calculate a quantity called the Self-information or "surprisal" of all events. This is a measure of how surprising a given event is relative to the other possible events in the distribution. The less likely the event, the higher the value of its surprisal.

Surprisal is equal to -log2(p), where p is the probability of a given outcome. Calculating ...
log2(.5) = -1, 
-(-1) = 1
... and the NOT so surprising result here is that heads and tails are equally surprising, with a value of 1 each.

Shannon Information is measured in "bits", the basic unit of information used in calculation by computers. To relate this to games it might help to think of one bit of information being equal to the amount of variability in the flip of a coin. Now that we have the surprisal, we can calculate the Entropy as the average or expected value of the surprisal over the entire distribution. This is p times the surprisal -log2(p) of each event, summed over all events. For this example the calculation is trivial; 0.5 times the surprisal of 1 (for heads) plus another 0.5 times a surprisal of 1 (for tails), is just 1, so a fair coin flip has 1 bit of entropy.
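In code the whole calculation is only a few lines. A minimal Python sketch:

```python
import math

def entropy(probs):
    """Shannon entropy in bits: p times the surprisal -log2(p), summed over all events."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))    # a fair coin: 1.0 bit
print(entropy([1/6] * 6))     # 1d6: about 2.58 bits
print(entropy([n / 36 for n in (1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1)]))  # 2d6: about 3.27 bits
```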

Here I have a table representing the information in discrete uniform distributions from 1 to N. In gaming terms this is the information in single N-sided dice, with each face of the die being equally likely as all others. I included all the values representing true polyhedral dice, and some additional values for comparison (most of these are powers of 2 or 10).
The second column p(x) gives the probability of each "face", the third the surprisal, and the fourth the entropy.
Here we can see that a 2-sided die (a coin!) again has 1 bit of entropy, a 4-sided die (d4) has 2 bits, a d8 has 3 bits, and a hypothetical d16 has 4 bits, following powers of 2 as you might expect. I put in some extreme values just for fun - the final row, a one million-sided die, would have nearly 20 bits (or 20 coin-flips) of entropy.

As in the example of the fair coin, when all outcomes are equally likely, the surprisal and entropy are equal. This also maximizes the value of the entropy - meaning that if any result was more or less likely than another, the result can only become more predictable, and the value of the entropy must be less, as will be seen in the next example.

For the second example I'm calculating the entropy of the sum of two six-sided dice. This table shows the possible results from 2 to 12, and the probability of each result two ways, as the odds-in-36 and as a probability. Next (4th column) is the surprisal of each result, and unlike the uniform distributions this value varies with the probability of the outcome. A roll of 7 has a surprisal of 2.58 bits, and a roll of 12 (or 2) 5.17 bits; a 12 is the more surprising result, relatively speaking.
The final column is the surprisal multiplied by the probability, and these are summed to determine the Entropy at the bottom, which is 3.27.
In terms of information, a 2d6 roll is in-between the d9 and d10 rolls from the first table. This doesn't mean they are the same, but that they have a similar amount of variability.

For the third table I have calculated the entropy for some commonly used dice-rolls in games and listed them in order of increasing entropy. The 2d10- designates the difference of two ten-sided dice, as used for penetration damage in Squadron Strike.

Note the entropy of 4d6 is less than twice that of 2d6, and likewise 2d6 is not twice that of 1d6. As numbers from single dice are summed, the distribution becomes less uniform, more like the bell curve of the normal distribution, is more predictable, and therefore has less entropy. If we were using two separate d6 rolls to generate a uniform random number between 1 and 36, we should expect the d36 entropy to be twice that of a d6, and it is; log2(36) = 5.17. We also see this with the entropy of d100 being twice that of d10.

What strikes me from this is that rolls of 1d6, 3d6, and everything in-between vary by only about 1 coin-flip of entropy, so maybe the many variations of dice used in games really don't make so much difference in terms of the variability of play.

A final note: Just because there might be more information in some combinations of dice does not mean the game takes full advantage of that variability. For instance if you are making a to-hit roll with some probability of success (hit or miss), then there is at most 1 bit of information in that result no matter what kind of dice you roll it with. There is only a full 6.64 bits of information in a d100 roll if there are 100 unique outcomes.

More:
Dice and Information, So What?
More Dice, More Information ...

17 June 2010

It's All In The Cards

Two articles related to Collectible Card Games, with some math and game design aspects.


Gotta Catch 'em All? Just how many of those collectible card packs do you have to buy in order to get a complete collection, or maybe just that one card you are looking for? Not About Apples has a good discussion of the problem:

[Image kotaku.com]
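For a rough sense of scale, the classic "coupon collector" result gives the expected number of cards to draw before seeing them all, under the very generous assumptions that each purchase is one uniformly random card and there is no trading (rarity tiers only make it worse). The 150-card set size here is just for illustration:

```python
def expected_draws_to_complete(n_cards):
    # Coupon-collector expectation: n * (1 + 1/2 + ... + 1/n).
    return n_cards * sum(1 / k for k in range(1, n_cards + 1))

print(expected_draws_to_complete(150))  # about 839 cards for a 150-card set
```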



Once you figure out how many cards you might have to buy, you could start to wonder if getting into such a game is a good idea in the first place. Ethan Ham's article at GameStudies.org discusses problems and answers in creating balance in a CCG.

Rarity and Power: Balance in Collectible Object Games

For collectible card games (CCGs), game designers often limit the availability of cards that have a particularly powerful gameplay effect. The conventional wisdom is that the more powerful a card is, the more rare it should be. The long-term implications of such an approach can have negative consequences on a game’s suitability for casual play. Digital Addiction (a company that produced online, collectible card games in the 1990s) developed a different game design philosophy for balancing collectible card games. The approach called for the most obviously and generally useful cards to be the most common and to equate rarity to specialization rather than raw power.

If you need more math about cards, then John Cook has it for you at The Endeavour:



Finally, this doesn't have anything directly to do with cards, but it's still a good read on understanding probability:




28 February 2010

More Exploding Dice: T&T Saving Throws

I received a request from Christian Lindke, author of Cinerati, about the probability distribution of a different sort of "exploding" dice: Saving throws in Tunnels and Trolls. I have previously written about Exploding D10, and this is a closely related problem. I'll let Christian explain it ...

I am getting ready to do a few posts on Tunnels and Trolls.  Specifically, I will be proposing an alternate combat resolution system -- one based on an existing system within the game.  I want to play around with the concept of using T&T "Saving Throws" as the basis for all mechanical resolutions in the game.

In T&T the saving throws are resolved by comparing a statistic to a difficulty # and using the result as the basis for a 2d6 roll.  The actual equation is [15 + (level of challenge x 5)] - Statistic = Target Number.  So a character with a Luck of 12 attempting a 1st level challenge would need a result of (15+5-12=8) eight or better to succeed on the attempt.  Figuring out the probabilities on a basic 2d6 roll is a simple affair, but in T&T a player who rolls doubles keeps the result, re-rolls and adds the result to the prior sum until the player no longer rolls doubles.  It's an open ended doubles system.  What would the basic equation be to determine probabilities in this case?  Logically, given that the chance of doubles is 1/6, it seems that the probabilities would be the same as a normal open ended d6 roll for each die (which I believe produces an average of 4.3 per die), but I'd like an equation I can use to determine the game balance as characters advance etc.  I'd like to use the base probability of an "average" character of a given level attempting a task as the basis for my new system.

If you could be of assistance, it would be greatly appreciated.

A friend of mine had noted that asking me questions like this is like teasing a small child with a shiny toy, holding it just above my reach; You just know I'm going to jump up and try to get it - and I did. ;-)

As I said, this is very similar to how the probabilities for Exploding D10 work in the MechWarrior 3rd edition RPG. With D10X you count the roll if it is in the range 1-9, and if it is a "10" you count the 10 and roll again (that's the explosion). Rinse and repeat until done. In T&T we have a 2d6 roll that explodes if a tie is rolled on the two dice instead of the highest value on either one.

As with D10X, this breaks down into three parts - a geometric series, the value of the tied rolls, and the value of the final no-ties roll.

The probability of rolling a tie on 2d6 is 1/6, so the geometric series starts off like this:
0 ties with probability = 5/6 = 0.8333
1 tie with probability = (1/6)*(5/6) = 5/36 = 0.1389
2 ties with probability = (1/6)*(1/6)*(5/6) = 5/216 =0.02315
3 ties with probability = (1/6)*(1/6)*(1/6)*(5/6) = 5/1296 = 0.003858

... and so on out to infinity. The average number of ties in a series is (1/6)*1/(5/6) = 1/5 = 0.2, which is nice if you only want to know the average 2d6X roll (which is 8.4, btw), but we need the entire probability distribution.

Now there are six different ways to roll a tie, and they have values of 2,4,6,8,10,12, all with 1/6 probability, which is just the same as 2 times the value of a 1d6 roll. I will note this as 2*1d6. Combining this with the geometric series, we get this:

a "0" with probability = 5/6 = 0.8333

a 2*1d6 roll with probability = (1/6)*(5/6) = 5/36 =0.1389
a 2*2d6 roll with probability = (1/6)*(1/6)*(5/6) = 5/216 = 0.02315
a 2*3d6 roll with probability = (1/6)*(1/6)*(1/6)*(5/6) = 5/1296 = 0.003858
... and so on out to infinity.


NOTE: The probability of zero ties is really 0.8333 (the first bar); I chopped off the graph to show what the ties actually add in, which really isn't very much. I calculated the probabilities out to 6 consecutive ties, which is only a little bit of overkill. It could go on forever, but unless you are infinitely lucky (and have a lot of spare time on your hands) your streak of ties will eventually come to an end, where you add the sum of the final not-tied dice to your previous total. That probability is a modification of the usual discrete triangular 2d6 distribution, and it looks like this (2d6-ties):

The final step to pull this all together is a bit messy, so pardon my hand-waving over the details (if you really want the nitty-gritty, look in the accompanying spreadsheet). You multiply each probability in the geometric series with each of the probabilities in the 2d6-ties distribution and sum the corresponding values of the roll, then tally up the total probability for all combinations with the same sum. It looks like this:

This looks a lot like the 2d6 distribution with a long "tail" tacked onto the right hand side. Here's the cumulative probability:



Finally, a table of the rolls and cumulative probabilities, though if you really want the numbers it's probably easier to grab them from the spreadsheet. Again, the table theoretically should go to infinity, but in practice you will only rarely see any rolls greater than 20.
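If you would rather check these numbers without the spreadsheet, a quick Monte Carlo simulation does the job. A Python sketch (the function name is mine):

```python
import random

def tt_roll(rng):
    """Roll 2d6; keep adding and re-rolling for as long as doubles come up."""
    total = 0
    while True:
        a, b = rng.randint(1, 6), rng.randint(1, 6)
        total += a + b
        if a != b:                  # not doubles, so the streak ends here
            return total

rng = random.Random(2012)
rolls = [tt_roll(rng) for _ in range(1_000_000)]
print(sum(rolls) / len(rolls))                   # about 8.4, the average noted above
print(sum(r > 20 for r in rolls) / len(rolls))   # rolls over 20 are rare (a couple percent)
```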

I'm curious to see what Christian does with this, and I'll link to his post(s) on the topic when he has them up.

Have fun rolling the dice!


[Edit: Typo corrected, added decimal probabilities.]
[Edit2: Added link to spreadsheet, which I swear I did once already.]


23 October 2009

Games and Reality are Probably Different, Part 4

In the previous posts in this series (1 2 3) I have been describing the probability distributions generated by dice and trying to describe why that doesn't quite match what we experience in reality. Not all games have dice though; some games use physics to simulate the real world, and the only random element might be the actions of the player themselves. Do these games suffer the same problem? - I think they do - but first, I need to tell you about my favorite TV show.

My favorite TV show - Top Gear on BBC television - is a mix of fast cars, testosterone, and the best of British absurdist humour. The show is co-hosted by Jeremy Clarkson, BBC television host and professional overgrown child. I can call him that because I am horribly jealous of his job, which seems to consist entirely of driving fast cars and making snarky comments. Here is his mini biography:
Jeremy has often been described as 'the most influential man in motoring journalism', mainly by himself. Estimates suggest that he is slightly over nine feet tall, owns 14,000 pairs of jeans and has destroyed almost 4.2 million tyres in his lifetime. He is best known for possessing a right foot apparently consisting of some sort of lead-based substance, for creating some of the most tortured similes ever committed to television, and for leaving the world's longest pauses between two parts... of the same sentence. He has never taken public transport.
In a recent (recent to me) segment of the show Jeremy takes on "The Corkscrew" at Laguna Seca, perhaps the most difficult corner of any race track in the world. Jeremy first practices with Gran Turismo 4 to get a good track time, and then tries the same track in real life. (What is there about his job not to be jealous of?) See how well he does:




[The video is broken, but try one of these links:
http://videosift.com/video/Top-Gear-Real-life-racing-vs-Gran-Turismo
http://en.wikipedia.org/wiki/Mazda_Raceway_Laguna_Seca#Automotive
http://www.streetfire.net/video/top-gear-nsx-laguna-seca_208766.htm
http://www.kewego.com/video/iLyROoaft0ZG.html]

The Gran Turismo games are great simulations, but they miss some of the little things that make race driving harder. While there is no random dice rolling in this game or in driving a car (1), the limitations of human reactions add an element of uncertainty and randomness. Most of the time that random aspect is too small to notice, but when it comes to doing something really hard those little things start to matter. The Game is no longer a good representation of the Reality. One of the things Jeremy points out is that a game can't make you afraid of spinning off the track, and so fear adds another layer of difficulty in the real car.

That's OK, it's supposed to be a game. If every player had to learn all the skills of a real race driver it wouldn't be much fun. As pointed out in the comments to Part 1 of this series, games don't need to have a perfect representation to give players a challenging task and tough decisions to make.


Footnotes:
  1. If you want to get picky, then for practical purposes it's not possible to measure or simulate every last detail, and this error could well be described as "random".
Related Post: Physics of Racing, and Gran Turismo 2.

19 October 2009

Games and Reality are Probably Different, Part 3

In Part 2 I did not give an adequate explanation of what I was showing, so I want to go over some of this again more carefully. I also want to re-visit my original question: If additive and proportional representations of probability are so different, and games represent probability inaccurately, why don't we notice?

I also need to dig myself out of a bit of trouble, because I have been confusing two separate issues. The first is the proportional representation of probability; the second is how to represent difficulty on a meaningful scale, which I am saying should also be proportional.

First the probability: The probability issue is clearly defined. On the additive side the Uniform distribution is the ultimate example. It has a limited range and you can get from probability zero to one in a finite series of steps (but not infinitely small steps!). When we make a graph of the cumulative probability the uniform distribution forms a straight line.

On the proportional side the best example is the logistic distribution, which I had intentionally left out for simplicity. It has an infinite range (negative to positive infinity), but when you go left or right on the scale you never quite get to probability zero or one (though it may be arbitrarily close). When we graph odds on the logarithmic scale (Log Odds, or "LO") they form a straight line. I have redone my graphs from Part 2 to include the logistic distribution. You can see that the logistic PDF looks very different from all the others, but in the CDF and Log-Odds charts it looks very much like the Laplace distribution. This is perhaps deceptive, because the logistic distribution has very heavy tails. (Click to see a larger image).
There is an additional issue here because there is no obvious reference probability; 0 and 1 are good reference probabilities for the uniform, but that doesn't work for distribution with an infinite range. I have arbitrarily chosen 0.5 as the reference probability for these graphs, which occurs at Z=0.

It will help to have an example to think through this; imagine that you have a set of dice that will generate random numbers from each of these distributions. For the uniform and triangular (2d6, 2d10, 2dX) this is very familiar; any single die is uniform, and any pair of dice is a triangular distribution. For the normal, Laplace, and Logistic distributions we need to imagine we have some "magic dice" that will do what we need. These would be very unusual dice indeed, but it is helpful to compare them to the behavior of 2dX dice we know. The Normal distribution dice will be most like the 2dX dice. The Laplace dice will tend to roll very close to the average most of the time, but will occasionally roll very high or very low. The Logistic dice will tend to roll farther from the average than the other dice, and will generate relatively more extreme high or low rolls.
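If you want to actually roll these "magic dice", one way is to push uniform random numbers through each distribution's quantile function (or, for the Laplace, give an exponential a random sign). A Python sketch, with each die scaled to a standard deviation of 1 to match the standardized charts:

```python
import math
import random

rng = random.Random(1)

def normal_die():
    return rng.gauss(0, 1)

def laplace_die():
    b = 1 / math.sqrt(2)                  # scale giving standard deviation 1
    magnitude = rng.expovariate(1 / b)    # an exponential tail...
    return magnitude if rng.random() < 0.5 else -magnitude  # ...with a random sign

def logistic_die():
    s = math.sqrt(3) / math.pi            # scale giving standard deviation 1
    u = rng.uniform(1e-12, 1 - 1e-12)     # avoid log(0) at the extremes
    return s * math.log(u / (1 - u))      # the logistic quantile function

for name, die in [("normal", normal_die), ("laplace", laplace_die), ("logistic", logistic_die)]:
    xs = [die() for _ in range(100_000)]
    mean = sum(xs) / len(xs)
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))
    biggest = max(abs(x) for x in xs)
    print(f"{name:9s} sample sd {sd:4.2f}   most extreme roll {biggest:5.2f}")
```

The sample standard deviations all come out near 1, but the most extreme rolls show off the heavier tails.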

Now the difficulty: The X-axis on the charts is standardized to a common scale of difficulty, the Z of the standard normal distribution. Think of your dice again, and imagine you are rolling with a "+1" modifier (-1 if you like). On the Z-scale a "+1" standard deviation, or one common unit of variability, is the same for all of these dice.

One more graph - This is the same zoom-in chart from Part 2, but I have annotated it for discussion (except no logistic distribution here). Between log-odds of -1 to +1 is about 45% of the total probability of all these distributions. That means nearly half of the rolls of your dice will be within this range. Within this limited range the cumulative probabilities for these distributions are very similar. The uniform and Laplace distributions are practically on top of each other here, though the shapes of these two distributions (see PDF chart) could hardly be more different (the logistic distribution would be very close to these). Likewise for the 2dX and normal distributions; these are barely distinguishable within this range. Although these distributions might in fact be very different, the differences in the cumulative probabilities only matter at the high and low ends, not in the middle.


It has been a long haul, but I can finally (FINALLY!) start discussing why I think we don't notice the difference between additive and proportional probability.
  1. As I demonstrated above, for the middle range of difficulty there isn't much difference in the cumulative probabilities, and one distribution might do about as well as any other. Games tend to emphasize tasks of medium difficulty because they are interesting - it's not much fun to play a game where you are trying to do something that is practically impossible or incredibly easy. On one hand hundreds of rolls might be needed to succeed, and on the other success is not a challenge. Good games avoid this by keeping difficulty in the middle where the chance of success or failure is interesting.
  2. In games it is common for the dice rolls for success to be identical in similar situations. This is not the case in reality; In the real world things are constantly changing, and many tasks are never exactly the same twice. It seems likely that we perceive the average difficulty of many tasks, which may mask the proportional relationship. There is a mathematical question here about the average difficulty of tasks and whether this means that the normal distribution better represents how we perceive difficulty. It seems possible, but I don't know how to justify it mathematically.
  3. Oops? Meaning maybe the example that led me to the Laplace distribution as a motivating example is wrong. There is a simplifying assumption I made in that example that may not be quite right, and I'll have to work it through again to examine it carefully. There are a lot of things I didn't define very carefully that might come back to bite me here, but it's a blog, not a textbook, and I think my main points are essentially correct.

A closing thought: What does it mean to measure difficulty on a scale from negative to positive infinity? There probably are some tasks that are too easy to fail, or too difficult to ever succeed, yet on an infinite scale there is always some probability of each. This seems to border on a philosophical question, but if I work at it perhaps I can pin it down better than I have so far.

In Part 4, I have a game versus reality example to show you. Stay tuned!

Related:
The Endeavour: Sums of uniform random values

12 October 2009

Games and Reality are Probably Different, Part 2

In Part 1 I tried to describe what I consider to be a basic difference between games and reality in the scale used to represent the difficulty of tasks. When you are playing a game and need to roll a certain target number or higher on the dice to succeed, this defines a scale of difficulty that is additive, and adding or subtracting 1 from the target number changes the probability of success in a certain way. I would argue that real world difficulty and probabilities for success are better represented on a proportional scale.
I have prepared some graphs to illustrate what I'm getting at. It's a bit of a difficult concept, and I find it hard to describe in simple terms. Hopefully this will help get my idea across.

Below is a graph of the probability density functions (PDF) for some common probability distributions, including several dice distributions (1), with two important changes: On the X-axis, the units here are not in the numbers you might roll on the dice, but instead in standard deviations as are used to describe the spread of the standard normal distribution (represented by Z). All of these have been "centered" so the most likely roll (or event) is at 0 (zero), and "scaled" so that they are spread out in an equal way. The other change is I have "inflated" the distributions by re-scaling the probability inversely proportional to the standard deviation of that distribution (2). By standardizing these distributions onto the same X & Y scale, I hope it will make them easier to compare. You might want to consider popping the image out to another window for reference.


Working my way down the legend:

Uniform distribution - This represents random numbers between 0 and 1 (any range, actually) where every number is equally likely to occur. Thus, this line is perfectly flat across the graph. Compared to the other distributions, the most extreme events (easiest and hardest) are much more probable, but the scale of difficulty doesn't go out as far, limiting the range of smallest and largest probabilities.

1d10 - The dots on top of the uniform distribution represent the distribution of a 10-sided die. This is a discrete uniform distribution with a range from 1 to 10. In fact, the distribution of probabilities for any single die roll will fall on this line, though the dots would of course be spread differently (any regular-sided die, that is).

2d10 - This represents the distribution of the sum of two dice, and you might recognize the distinctive triangular shape from Kit's recent post about the Math of 2DX Systems. This random distribution is much more "central" than the uniform, but it also extends out to smaller probabilities. Note this distribution is closer to the normal distribution than any other presented here.

2d6 - Due to the way I have standardized the distributions this has just the same shape as the 2d10 distribution. The sum of any two regular dice will look much the same.

(Standard) Normal distribution - This is here partly as a reference for comparison, because I have tweaked the other distributions to the same scale. It's also a useful reference because it shows up in many real world applications. This distribution can describe very extreme events, but the probabilities become very close to zero rapidly as you move away from the middle of the distribution.

Laplace distribution - This is the mathematical relation I originally worked out for my shooting-at-a-target example in Part 1 to demonstrate proportional probabilities (the problem that started me thinking about all this in the first place). This distribution is very "central"; if you could have a die that rolled numbers with a Laplace distribution, most of the rolls would be fairly close to the average. Most, but not all, because the remainder of the rolls would tend to be very high or very low. This distribution has "heavy tails", meaning that the probability of the most extreme events gets smaller very slowly as you move out from the middle (the tails are "heavier" than the normal distribution).

The next chart shows the cumulative distribution functions (CDF) for the same distributions (3). By statistical convention I have created this so the probabilities accumulate from left to right, so if you think about this as trying to roll your dice higher than a target number, the higher numbers start on the left and go down to the right.
This manner of presenting distributions tends to squish everything together in the middle, but you can see that the heavy tails of the Laplace distribution really stand out from those of the normal.


So far I've shown these probabilities on the usual scale from zero to one. However, to demonstrate the proportional relationship, it helps to present it on a logarithmic scale; just the thing for presenting proportional relationships. This requires converting from the usual 0-1 probability scale to an "odds scale" that ranges from 0 to infinity, and then taking the natural log. This really requires a separate description to fully explain, which is the reason for my previous post on Probability versus Odds. Reading that first may be helpful.

Here is the previous chart again, except that now instead of probability, the Y-axis is the log-odds:


On the logarithmic scale proportional relationships appear as straight lines, and look what has happened here with the Laplace distribution; it is very nearly a straight line. Everything is crunched together in the middle, so I made a "zoom in" of the middle portion:


It looks as if this post is headed for Part 3, because it's getting late and I have to get up very early. I still need to discuss what I think this really means about the differences between games and reality, and this is a good place to break for comments. Stay tuned for Part 3.

Footnotes:
  1. For dice, which can only generate discrete numbers in a limited range, these are technically probability mass functions
  2. This is a weird thing to do, but I have re-scaled probability by dividing by the standard deviation. I could come up with distributions that looked like this in the first place, but they would be harder to compare directly. I really ought to redo that plot to label the Y-axis correctly.
  3. With one additional tinker: I shifted the discrete distributions so that a 0.50 probability lines up at Z=0 for all distributions.

08 October 2009

Probability versus Odds

I keep referencing the odds and log-odds as ways to express probability, so it's worth the time to explain this concept by itself. In the future I can reference this post rather than re-explain the same idea every time.

Probabilities are numbers between zero and one [0,1]. This is sometimes also expressed as a percentage between 0% and 100%, but percentages are sometimes used to represent proportions less-than zero or greater-than one, so I generally present probabilities as a number between zero and one to avoid that confusion (and if you ever teach intro-stats, it IS a confusion for some).

Odds are another way of expressing probability. For some event A that occurs with probability p, the "odds of A" are the ratio of (the probability of) A happening to A not-happening, so the odds of event A are p/(1-p). The odds transform a probability p between zero and one into a number that is between zero and positive infinity, and can represent any probabilities except for zero and one exactly. Fortunately, this isn't much of a limitation, because random events that never occur or always occur are not really random.

The odds are also often expressed as the ratio of two whole numbers. For example: if the probability of event A is p=0.6, then the odds of A are 0.6/(1-0.6) = 0.6/0.4 = 1.5. In whole numbers, 1.5 is equal to 3/2, and the odds are expressed as "3-to-2" odds, or sometimes just "3:2 odds". It's OK to skip the whole number step and just express the odds as "1.5-to-1" or "1.5:1" or just plain old "1.5". (Some people just don't like fractions I guess.)

Odds ratios are the ratio of two odds (I bet you didn't need me to tell you that). These might also be important, but not for today. Maybe I will come back and fill this in later.

Statistics are what I do to fund my gaming habit. Totally unrelated, I just thought I would throw this in for fun. :-)

Now, statisticians like to do regression models, and that usually means fitting a line equation to data that may range between negative and positive infinity. Numbers that are probabilities or odds present a problem here because they have limited ranges, which regression lines do not respect.
Enter the logarithmic transform, that funny little button on your calculator that most everyone learned about in school and promptly forgot about because they never use it. Logarithms transform numbers between zero and positive infinity to numbers between negative and positive infinity. They have some other nice properties too, like changing equations that are a series of multiplications into a series of sums, which are often easier to deal with mathematically.

Taking the logarithm of the odds turns this number into something statisticians know well. This facilitates regression models predicting the probability of an event occurring in much the same way as we might create any other regression model. Usually we use the natural log (log base e) for this, but the base doesn't matter too much. There are other functions that can also be used for this purpose (i.e.: Probit), but that is a tale for another day.

Back to the example: We started with a probability of p=0.6, which gave an odds of 1.5. The log-odds are then log(1.5)=0.4055. The charts below show the relationships between probability, odds, and log-odds.


This chart isn't very useful, because on a linear (our usual) scale the odds are relatively "flat", and then they explode to infinity as the probability of success approaches one.



Here is the same chart with the Y-axis changed to a logarithmic scale. Here it is easy to see what the odds are doing on the low end of the scale, and the symmetry of the relationship is clear.


Now the log-odds. Surprise! (well, maybe not.) This chart is identical to the last. All I have done is to switch the Y-axis back to our familiar linear scale, and substitute the natural log of the odds in place of the odds. Six of one, or a half-dozen of the other.

It is interesting to note that for probabilities between 0.25 and 0.75, the log-odds are nearly a straight line on this graph. In this range you can use a simple no-calculator conversion as a pretty good approximation between the two: log-odds = 4*(p-0.25) - 1, and p = 0.25 + (log-odds + 1)/4.
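A quick check of that rule of thumb against the exact value, in Python:

```python
import math

def log_odds(p):
    return math.log(p / (1 - p))     # natural log of the odds

def approx_log_odds(p):
    return 4 * (p - 0.25) - 1        # the no-calculator approximation above

for p in (0.25, 0.4, 0.5, 0.6, 0.75):
    print(p, round(log_odds(p), 3), round(approx_log_odds(p), 3))
```

The two columns stay within about 0.1 of each other across that range, and agree exactly at p = 0.5.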

07 October 2009

Games and Reality are Probably Different, Part 1

I've been thinking about how games represent reality. Realistic games are used to depict historical scenarios and as training for real strategy and tactics. They do it well enough that gamers get into discussions (even arguments) about how realistic a game is, and how the rules might be made better. The same applies to how games represent fantasy and fiction; A game about battling against dragons or Giant Battling Robots can represent that particular sort of fantasy very well, and players get into the same sort of discussion about how the game could be made more realistic (though in this case maybe I ought to say "more fantastic"). There is one aspect of games (boardgames, miniatures games, and RPG’s in particular) that I think is not realistic, and in all the various discussions of games I’ve been a part of no one has ever raised this complaint: Probability distributions generated by dice do not accurately represent the difficulty of real world tasks. (see footnote 1)

When you roll dice in a game to determine success or failure of an action, there is some predetermined probability of success. This probability is modified by various conditions; it might be range, terrain, type of weapon or armor, or any number of things. Each one of these things will add (or perhaps subtract) from the difficulty of the task. Add enough of these modifiers and the task becomes impossible (or impossible to fail). This is what I will refer to as “additive” probability, because difficulty modifiers add (or subtract) from the probability of success.

Now let’s consider a real world task; my example task will be shooting a weapon to hit a target of a fixed size. Suppose you are shooting a weapon (gun, bow and arrow, laser, PGMP-15, etc) at a target that has an area of 1 square meter. Hitting within that area is a success, and a miss is a failure. Also suppose the target is located at a distance D such that your probability of hitting within the 1 meter area is 50%. I’m assuming there is a bit of inherent randomness to the aiming process here that can be represented as a probability.

Now consider a second target of the same size but twice as far away. When we aim our weapon at this target, its apparent size is going to be 0.25 square meters, because it will appear half as tall and half as wide, and present ¼ the visible area of the closer target. At one quarter of the apparent size, it should be 4 times harder to hit the target (2,3). If we double the range again, we will get another proportional reduction in apparent target size, and a proportional increase in difficulty. The effect of increasing range has a proportional (or multiplying) effect on difficulty. We might find plenty of other examples where adding difficulty has this proportional effect on the probability of success.
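To make that concrete, here is the range example worked on the odds scale (see footnote 3 for the probability-to-odds conversion). This is a sketch of the simplified proportional model above, not a claim about any particular weapon:

```python
def prob_to_odds(p):
    return p / (1 - p)

def odds_to_prob(o):
    return o / (1 + o)

base_odds = prob_to_odds(0.5)             # 1:1 odds of a hit at distance D
for doublings in range(4):                # D, 2D, 4D, 8D
    odds = base_odds / 4 ** doublings     # each doubling of range is "4 times harder"
    print(f"{2 ** doublings}x distance: p(hit) = {odds_to_prob(odds):.3f}")
```

The hit probability falls 0.500, 0.200, 0.059, 0.015: each step divides the odds by 4 rather than subtracting a fixed amount from the probability.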

This is the difference I wanted to point out, that the real world often, and maybe always, has proportional probability instead of additive probability. Additive and proportional models of probability are two different ways of representing a probability that depends on other factors. (4)  So, how different are they, really? Does it make any difference? If it does make a difference, why don't we hear more about this? If it doesn't, why not?

A demonstration of these differences would be helpful, and I've made up some charts comparing different probability distributions. This is getting long, so I will save those for part 2, where I will also try to answer some of my own questions.

Footnotes:
(1) I’m intentionally being a bit contrary in this post to make a point. Please feel free to disagree with me.
(2) The exact form of the relationship depends on certain assumptions that I am not stating, but proportionality still holds.
(3) When I write “4 times harder” this means I am representing probability on a scale where this make sense (Odds). If you have a 50% chance of success, making this 4 times easier is nonsense (200% success?). If we represent 50% chance as 1:1 odds (read as “one-to-one odds”) of  success, it now makes sense to talk about 4:1 odds. For probability p, the corresponding odds = p/(1-p).
(4)  I use these proportional representations regularly in my work, and it is a standard statistical method (logistic regression, proportional odds and proportional hazards models).
(5) Bonus points if you figured out that the title of this post is a play-on-words. :-)

05 October 2009

Farkle Probabilities

I've been playing Farkle on Facebook, so I started getting curious about the probabilities involved with the game. I'm not the only one doing this, as I've found several others doing much the same (1,2,3). You can also play Farkle at TADMAS. It took me quite a bit of time tinkering with a spreadsheet until I understood how to think about this. I'm already up past my bedtime tonight, so I'm going to spare you the gory details and get right to the results. This post will focus on the number of dice you roll and the number of times you roll them. I haven't completely figured out the scoring distribution yet, so that may be a follow-up post.

The following table gives the probabilities for how many dice will remain after you remove all the dice which score points (though you might choose not to remove all of them). For instance, when you roll 6 dice, there is a 2.31% chance of not scoring at all, or a "Farkle" (marked in red), and a 15.43% chance that you will be able to score exactly one die (a 1 or a 5). The percentages (marked in green) are the probabilities that all remaining dice can be scored, thus gaining a "new roll".



If you score one die and re-roll the remaining 5 dice, there is (for example) a 30.86% chance that there will be 3 dice left (and therefore the other two are 1's or 5's).
As you play, there is a choice to not score all of the dice. You might do this in hopes of getting a better throw on the next try, and you might use this table to consider the risk of choosing to re-roll 1's and 5's, instead of scoring them immediately.
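Those percentages can be checked by brute force, since 6 dice have only 46,656 possible outcomes. Here is a Python sketch using the scoring rules as I understand them (1's and 5's score, three or more of a kind score, and three pairs score); house variants will shift these numbers:

```python
from collections import Counter
from itertools import product

def is_farkle(roll):
    """True if the roll contains nothing that scores."""
    counts = Counter(roll)
    if counts[1] or counts[5]:                   # 1's and 5's always score
        return False
    if any(n >= 3 for n in counts.values()):     # three or more of a kind
        return False
    if sorted(counts.values()) == [2, 2, 2]:     # three pairs
        return False
    return True

for n_dice in range(1, 7):
    rolls = list(product(range(1, 7), repeat=n_dice))
    p = sum(map(is_farkle, rolls)) / len(rolls)
    print(f"{n_dice} dice: P(Farkle) = {p:.2%}")   # 6 dice gives the 2.31% above
```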

That first table looked at the game from the "one roll at a time" perspective. The next table is a little more complicated because it "looks ahead" at what will happen if you score all the dice you can and keep rolling until you Farkle or get a new roll. Win/lose percentages marked in red and green are as before. Numbers in shaded gray are intermediate probabilities used in my calculations and have no simple interpretation.



These (red and green percentages) are conditional probabilities for what might happen. If you roll 6 dice, AND score one of those, AND roll the 5 remaining dice, there is a 1.19% chance of a Farkle.

My final table presents some further conditional probability calculations, and gives the overall probabilities of either Farkle-ing or getting a new roll, if you score all possible dice and re-roll all that remain.


Starting with 6 dice, there is a 68.63% chance of Farkling before getting a new roll, assuming you choose not to "cash-in" your points first. If you want to think about the possibility of getting several re-rolls (thus scoring a large number of points), you can look at each group of 6 dice as a geometric series - a probability distribution which I have mentioned a few times before.

You can use these tables to inform yourself about the risk of Farkling as you play the game. This might help you understand the game, but by itself it probably won't help you achieve a high-score to beat all your friends. To do that will require an understanding of the relationship between the risk of losing your points versus the probability of achieving your target high-score. When I figure that out, I'll let you know.
[some revisions, 3/26/2011]