Simulating Expected Values

(Sections 6.1 and 6.3)

Random Variables and Expected Values

Definition of “Random Variable”

Random Variable

A number whose value depends on the outcome of a chance process.

Random Variable Examples

Random Process Random Variable \(X\)
selecting three balls # red balls drawn
ten years in Florida number of Category 5 storms that hit
one year in Florida total insurance claims filed
Kentucky Derby 2017 your winnings ($)
10 people toss hats # getting own hat

Expected Value

What can you expect \(X\) to be, on average, in the long run?

Expected Value of a Random Variable

The long-run average of the values of a random variable, where the underlying chance process is repeated many, many times.

Estimating an Expected Value

Monte Carlo simulation works here, too!

  • repeat the random process many times
  • each time:
    • find the value of \(X\)
    • store it
  • find the mean of the stored values

Example: A Dice Game

Game Scenario

You are about to play a game: you will flip a fair coin twice.

  • If you get Tails both times, you lose a dollar.
  • If you get exactly one Head, nothing happens.
  • If you get Heads both times, you win two dollars.

Let \(W\) be the number of dollars you will win.

\(W\) is a random variable. What is its expected value?

What Can Happen

When you flip a fair coin twice, there are four equally likely outcomes:

  • Tails/Tails (\(W = -1\))
  • Tails/Heads (\(W = 0\))
  • Heads/Tails (\(W = 0\))
  • Heads/Heads (\(W = 2\))

Therefore:

  • The chance that \(W=-1\) is 0.25.
  • The chance that \(W=0\) is 0.50.
  • The chance that \(W=2\) is 0.25.

We Know the Distribution of W!

Distribution of a Random Variable

A statement of the probabilities for a random variable to assume its various possible values.

\(w\) \(P(W = w)\)
-1 0.25
0 0.5
2 0.25

Model and Repeat

Back to Monte Carlo simulation. sample() is still good for this!

# model the process:
outcomes <- c(-1, 0, 2)
probs <- c(0.25, 0.50, 0.25)
# repeat and store results:
winnings <- sample(outcomes, size = 10, replace = TRUE, prob = probs)
winnings
 [1]  0  0  0 -1  0  0  0  2  0  0

Parameter prob specifies the chance of getting each of the elements of the first argument outcomes

Summarize

mean(winnings)
[1] 0.1

This is our estimate of the expected value of \(W\).

Better With More Repetitions

How about repeating 10,000 times?

winnings <- sample(outcomes, size = 10000, replace = TRUE, prob = probs)
mean(winnings)
[1] 0.2382

Encapsulate Into a Function

coin_game_sim<- function(reps) {
  possible_winnings <- c(-1, 0, 2)
  probabilities <- c(0.25, 0.50, 0.25)
  ## get the winnings all in a flash:
  winnings <- sample(
      possible_winnings,
      size = reps,
      prob = probabilities,
      replace = TRUE ## <-- very important!
  )
  mean(winnings)
}

Try It Out

coin_game_sim(reps = 100000)
[1] 0.2488

The exact expected value is 0.25.

Total Winnings

Suppose you could play the coin game two thousand times. Would you want to do this?

To answer this, consider what your NET winnings might be.

One Way to Reason: Using Expected Value

  • Expected value is 0.25.
  • So on average you win 0.25 dollar per play.
  • So in 2000 plays you should win about:

\[2000 \times 0.25 = 500 \text{ dollars}.\]

Another Way to Reason: Simulation

Simulate playing 2000 times and compute net winnings:

winnings <- sample(outcomes, size = 2000, replace = TRUE, prob = probs)
sum(winnings)
[1] 531

Try a Few More Times

winnings <- sample(outcomes, size = 2000, replace = TRUE, prob = probs)
sum(winnings)
[1] 519
winnings <- sample(outcomes, size = 2000, replace = TRUE, prob = probs)
sum(winnings)
[1] 434
winnings <- sample(outcomes, size = 2000, replace = TRUE, prob = probs)
sum(winnings)
[1] 559

Extra Feature: A Table of Simulations

table()

table() that will tally the number of occurrences of each different element of a vector:

some_people <- c(
  "Bettina", "Sita", "Raj", "Ram", 
  "Ram", "Sita", "Raj", "Bettina", 
  "Karl", "Sita", "Sita", "Raj"
)
table(some_people)
some_people
Bettina    Karl     Raj     Ram    Sita 
      2       1       3       2       4 

prop.table()

When you give this function a table, it will compute the proportion of times each element occurs:

my_table <- table(some_people)
prop.table(my_table)
some_people
   Bettina       Karl        Raj        Ram       Sita 
0.16666667 0.08333333 0.25000000 0.16666667 0.33333333 

Rounding

If you like, you can round off to fewer decimal places with the round() function:

round(prop.table(my_table), digits = 3)
some_people
Bettina    Karl     Raj     Ram    Sita 
  0.167   0.083   0.250   0.167   0.333 

Add Option for a Table

coin_game_sim <- function(reps, table = FALSE) {
  possible_winnings <- c(-1, 0, 2)
  probabilities <- c(0.25, 0.50, 0.25)
  winnings <- sample(
      possible_winnings,
      size = reps,
      prob = probabilities,
      replace = TRUE
  )
  if (table) {
    cat("Here is a table of our the simulated winnings:\n\n")
    prop_tab <- prop.table(table(winnings))
    print(round(prop_tab, digits = 3))
    cat("\n")
  }
  mean(winnings)
}

Compare Simulations to Distribution

Try it out:

coin_game_sim(reps = 10000, table = TRUE)
Here is a table of our the simulated winnings:

winnings
   -1     0     2 
0.252 0.505 0.243 
[1] 0.2341

Compare to the Distribution of \(W\):

\(w\) \(P(W = w)\)
-1 0.25
0 0.5
2 0.25

Law of Large Numbers says the two should be close.

Example: Hat Toss

Back to the Hat Toss

\(n\) people toss their hats and scramble.

Let \(X\) be the number who pick up their own hats.

What is the expected value of \(X\)?

Monte Carlo Simulation Again

hat_toss_sim_ev <- function(n, reps) {
  ## setup scenario as before:
  people <- 1:n
  hats <- 1:n
  ## set up results vector:
  results <- logical(reps)
  ## repeat process reps times:
  for (i in 1:reps) {
    ## toss and scramble:
    picked_up_hats <-
      sample(
        hats, size = n, 
        replace = FALSE)
    ## figure how many got their own:
    got_own <- sum(people == picked_up_hats)
    ## store the value:
    results[i] <- got_own
  }
  ## report estimated expected value:
  mean(results)
}

Try It

Try it on one hundred people:

hat_toss_sim_ev(n = 100, reps = 1000)
[1] 0.976

Estimating Expected Value With a Loop

In general, when we use a loop to perform a Monte-Carlo simulation to estimate an expected value, our code will follow this template:

results <- numeric(length = reps_desired)
for (i in 1:reps_desired {
  ## some code to make the random process happen once
  ## more code to find out the value that the random variable took on
  results[i] <- value_that_the_random_variable_took_on
}
mean(results)