Let's Make a Deal.

During my down time this Christmas I watched the 2008 film "21" with Kevin Spacey. The the movie protagonist , Ben Campbell, was endowed with the superpower of mind boggling arithmetic and uncanny statistical reasoning. Kevin Spacey's character, Professor Micky Rosa, discovered Ben's genesus during a class lecture after he challenged the class with the Monty Hall problem. Ben recited the optimal strategy for the problem and in doing so he caught the attention of underground MIT card counting club. The Monty Hall problem was popular with pop culture for it's apparent paradox. It was once purposed by a reader's letter in Parade magazine in 1990:

Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another door, say No. 3, which has a goat. He then says to you, "Do you want to pick door No. 2?" Is it to your advantage to switch your choice?

Ben's solution boiled down to this: It is always advantageous to switch doors. If one switches, then they have a 2/3 chance of guessing the correct door. If they stick with the original door, then they have a 1/3 chance of guessing the correct door. I trusted Ben's explanation but admittedly it didn't seem intuitive at first. How does removing one door change the probability? Shouldn't it stay 1/3 regardless? The logic started to make more sense as I started to think about the impact of opening a door after the first choice. Opening the door reveals information that was not available at the beginning of the game and this new information changes everythng. In his book, Naked Statistics- Charles Wheelan demonstrates this insight through exaggeration. Imagine instead of 3 doors there are 100. After picking one door, the host opens 98 doors with goats. The two remaining doors are 1) the original pick which has 1/100 chance of having the car and 2) another door that has a 99/100 chance of having the car. The 100 doors example helped bring clarity but still I wanted to test empirically. I ran 10,000 simulations using a strategy to never switch doors. The "stick with original door" strategy was correct 3,3392 out of 10,000 or 33.92% of the time.

orig1

library(ggplot2)

trials <- 10000
correct_picks <- 0

for(i in 1:trials){
  my_pick <- sample(1:3,1)
  new_car <- sample(1:3,1)
  ifelse(my_pick==new_car,correct_picks <- correct_picks+1,0)
  }

wrong_picks <- trials-correct_picks

sim_results <- data.frame(result=c("Incorrect","Correct"),Simulations = c(wrong_picks,correct_picks))

ggplot(data=sim_results, aes(x=result, y=Simulations,fill=result))+
 geom_bar(stat="identity")+
 coord_flip()+
 geom_bar(stat="identity", color="black", position=position_dodge())+
 theme_minimal()+ 
 scale_fill_manual(values=c('#34ffff','#ff6734'))+
 theme(axis.title.y=element_blank())+ 
 guides(fill=FALSE)+
 geom_text(aes(label=Simulations),hjust=0, vjust=0)+
 ggtitle("Stick with Original Door")


Next, I tested the strategy to always switch doors after the goat is revealed. No big surprise, we were right about 2/3 of the time

switch2

correct_picks <- 0
choices <- c(1,2,3)
resample <- function(x, ...) x[sample.int(length(x), ...)] 

for(i in 1:trials){
  my_first_pick <- sample(1:3,1)
  new_car <- sample(1:3,1)
  
  open_canidate <- choices[choices !=  my_first_pick & choices != new_car]
  
  open_door <- resample(open_canidate,1)
  
  second_pick <- choices[choices != my_first_pick & choices != open_door]
  
  ifelse(second_pick==new_car,correct_picks <- correct_picks+1,0)
}

wrong_picks <- trials-correct_picks

sim_results <- data.frame(result=c("Incorrect","Correct"),Simulations = c(wrong_picks,correct_picks))

ggplot(data=sim_results, aes(x=result, y=Simulations,fill=result))+
 geom_bar(stat="identity")+
 coord_flip()+
 geom_bar(stat="identity", color="black", position=position_dodge())+
 theme_minimal()+ 
 scale_fill_manual(values=c('#34ffff','#ff6734'))+
 theme(axis.title.y=element_blank())+ 
 guides(fill=FALSE)+
 geom_text(aes(label=Simulations),hjust=0, vjust=0)+
 ggtitle("Always Switch")

Is this game worth playing?

A game a chance is more fun when the outcomes aren't so transparent. It would be frustrating for me to watch anyone stick with the original door knowing they picked a losing strategy. Perhaps the rules of the game should be tweaked to make it more interesting. For example, what if instead of using two goats and one car they used two cars and one goat. Instead of opening a door to reveal a goat a door is opened to reveal a car. How would the strategy change? The strategy flips. Charles Wheelan's in reverse again helps illustrate the intuition. If a contestant has to pick from 100 closed doors, 99 of which have a car and one with goat, then he has a 99/100 chance of picking a car on the first guess. Switching after the 98 doors leaves us with a 1/100 chance of picking a car. Hence, we should stick with our original guess under the modified rules. Again, the simulations support this strategy.

simw2_3

correct_picks <- 0
sim_results <- NULL

for(i in 1:trials){
  
  my_pick <- sample(1:3,1)
  new_car <- sample(1:3,2)

  ifelse(my_pick==new_car,correct_picks <- correct_picks+1,0)

}

wrong_picks <- trials-correct_picks

sim_results <- data.frame(Strategy = "Keep Original",result=c("Incorrect","Correct"),Simulations = c(wrong_picks,correct_picks))



correct_picks <- 0
choices <- c(1,2,3)
resample <- function(x, ...) x[sample.int(length(x), ...)] 

for(i in 1:trials){
  my_first_pick <- sample(1:3,1)
  new_car <- sample(1:3,2)
  open_canidate <- resample(new_car[new_car != my_first_pick],1)
  open_door <- resample(open_canidate,1)
  
  second_pick <- choices[choices != my_first_pick & choices != open_door] 
  
  ifelse(second_pick==new_car,correct_picks <- correct_picks+1,0)
}

wrong_picks <- trials-correct_picks

sim_results <- rbind(sim_results,data.frame(Strategy = "Switch",result=c("Incorrect","Correct"),Simulations = c(wrong_picks,correct_picks)))

ggplot(data=sim_results, aes(x=Strategy, y=Simulations,fill=result))+
 geom_bar(stat="identity", color="black", position=position_dodge())+
 theme_minimal()+ 
 scale_fill_manual(values=c('#34ffff','#ff6734'))+
 theme(axis.title.y=element_blank())+ 
 ggtitle("Simulations Results with Two Cars")

New Strategy...

The new rules didn't make the game any more interesting per it just flipped the optimal strategy. It still has the problem that no one should rationally do anything other than stick with the original door. So, why not tweak the rules even further? Instead using two goats and one car or two cars and one goat why not randomly use either combination? Now the shrewd reader might point out that after one of the doors was open then the contestant would know which combination was in play and could apply the optimal strategy accordingly. So, I need to add one more rule. In addition to not knowing the initial ratio of goats to cars behind the three doors, the contents of the doors are not revealed until the end. Now that we've stripped information about the initial combination and content's of the original door there is no longer an optimal strategy. 50% of the time you will be using a strategy with 1/3 chance of success and 50% of the time you be using a strategy with 2/3 success rate. The result is that you will guess correctly 50% of the time regardless of which strategy (1/3 x .5 + 2/3 x .5 = 50%) The game becomes a glorified coin flip but at least the outcome isn't as predictable.

simfinal4

correct_picks <- 0
sim_results <- NULL
choices <- c(1,2,3)
resample <- function(x, ...) x[sample.int(length(x), ...)] 

for(i in 1:trials){
  my_first_pick <- sample(1:3,1)
  new_car <- sample(1:3,sample(c(1,2),1))
  
  if(length(new_car)==1){
    open_canidate <- choices[choices !=  my_first_pick & choices != new_car]
    
  } else {
    open_canidate <- resample(new_car[new_car != my_first_pick],1)
  }
  
  open_door <- resample(open_canidate,1)
  
  
  ifelse(my_first_pick==new_car,correct_picks <- correct_picks+1,0)
}

wrong_picks <- trials-correct_picks

sim_results <- data.frame(Strategy = "Keep Original",result=c("Incorrect","Correct"),Simulations = c(wrong_picks,correct_picks))



correct_picks <- 0
choices <- c(1,2,3)
resample <- function(x, ...) x[sample.int(length(x), ...)] 

for(i in 1:trials){
  my_first_pick <- sample(1:3,1)
  new_car <- sample(1:3,sample(c(1,2),1))
  
  if(length(new_car)==1){
    open_canidate <- choices[choices !=  my_first_pick & choices != new_car]
    
  } else {
    open_canidate <- resample(new_car[new_car != my_first_pick],1)
  }
  
  open_door <- resample(open_canidate,1)
  
  second_pick <- choices[choices != my_first_pick & choices != open_door] 

  ifelse(second_pick==new_car,correct_picks <- correct_picks+1,0)
}

wrong_picks <- trials-correct_picks

sim_results <- rbind(sim_results,data.frame(Strategy = "Switch",result=c("Incorrect","Correct"),Simulations = c(wrong_picks,correct_picks)))

ggplot(data=sim_results, aes(x=Strategy, y=Simulations,fill=result))+
 geom_bar(stat="identity", color="black", position=position_dodge())+
 theme_minimal()+ 
 scale_fill_manual(values=c('#34ffff','#ff6734'))+
 theme(axis.title.y=element_blank())+ 
 ggtitle("Simulations Results with One or Two Cars and Blind Removal")

Photo by Jonathan Petersson / Unsplash