Comparing loops by simulating a Squid Game scene
A few weeks ago I did an introductory R workshop where one of the students asked about loops and the difference of for vs while loops. This blogpost is about illustrating these two types of loops using a simulation example.
I think the majority knows the Netflix show “Squid Game”. In this series the contestants have to survive several deadly games. Here I want to focus on the 5. game of the show. There are 16 players who have to pass a bridge of 18 * 2 glass plates. At each step they have to decide on which plate they step. With a 50% chance they jump on the harder glass which is able to hold their body, the other 50% will mean their death. As I was watching the show I thought that this would be a perfect example to answer a statistical question by using loops and simulation.
The question I wanted to answer was:
How many players do we expect to survive the game?
Remark:
After opening twitter, I came across a tweet discussing John Helveston’s blogpost where he basically explained exactly what I wanted to do. I highly recommend his blog. I adapted his run_game
function and where he used data.table
I went with the tidy alternative.
We can solve the question about the number of survivors to expect by simulating the game. When using random simulation to answer statistical problems, this is called Monte Carlo Simulation.
First we load the necessary library and set a custom theme for our plots.
library(tidyverse)
library(microbenchmark)
plotutils::set_custom_theme(base_size = 32)
Then we create a dataframe as an input for the game. In this dataframe the alive
column is set to 1 as in the beginning obviously every player is alive.
# Define number of players
num_players <- 16
players <- tibble(player = seq(num_players),
alive = 1)
# Let's have a look at the dataframe
players
# A tibble: 16 x 2
player alive
<int> <dbl>
1 1 1
2 2 1
3 3 1
4 4 1
5 5 1
6 6 1
7 7 1
8 8 1
9 9 1
10 10 1
11 11 1
12 12 1
13 13 1
14 14 1
15 15 1
16 16 1
Now we are going to create the function for our game. This is a great example to look at the differences between for and while loops.
We start by creating a function using a for loop:
# Define a function for simulating one game using a for loop
run_game_for <- function(players, num_steps) {
lead_player <- 1
for (step in seq(num_steps)) {
# 50% chance that the glass is safe
if (sample(c(TRUE, FALSE), 1)) {
# It is safe, now the player can try the next one!
next
} else {
# The glass broke...
# Before continuing, check if any players are still alive
if (sum(players$alive) == 0) { return(0) }
# The lead player died
players$alive[lead_player] <- 0
lead_player <- lead_player + 1
}
}
return(sum(players$alive))
}
Then we create a function using a while loop. The setup is quite similar to the previously used run_game_for
function.
# Define a function for simulating one game using a while loop
run_game_while <- function(players, num_steps) {
# Initialize starting values
lead_player <- 1
current_step <- 0
game_running <- TRUE
while (game_running) {
# Let's see if the glass holds...
if (sample(c(TRUE, FALSE), 1)) {
# The glass holds and the player can go one step further
current_step <- current_step + 1
} else {
# Check if there are still players alive, if not end the game
if (sum(players$alive) == 0) { return(0)}
# Apparently the glass didnt hold and the current lead player dies
players$alive[lead_player] <- 0
lead_player <- lead_player + 1
# Anyway the player can go one step further
current_step <- current_step + 1
}
if (current_step == num_steps) {
# If they got to the last step, they did it and the game stops
game_running <- FALSE
}
}
# Return the number of remaining players
return(sum(players$alive))
}
Let’s give it a try and see how many survive in our game.
set.seed(001)
# Run one iteration of the game
single_game <- run_game_while(players, num_steps = 18)
single_game
[1] 9
We were interested in the expected value of the outcome. One iteration is not enough, but this is no problem at all. We can simply simulate our game multiple times.
# Set seed value to keep reproducibility and to give a hint who wins the game
set.seed(456)
# Define number of runs or games we want to play
n_runs <- 10000
# Create dataframe with outcome of each game
sims <- tibble(trial = seq(n_runs)) |>
rowwise() |>
mutate(while_loop = run_game_while(players, num_steps = 18),
for_loop = run_game_for(players, num_steps = 18)) |>
pivot_longer(-trial, names_to = "loop")
Have a look at the descriptive statistics:
summary(sims$value)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.000 6.000 7.000 6.976 8.000 15.000
There were games where zero players survived the game and there were games where almost all players managed to survive.
Of course we can also visualize our distribution:
# Visualizing the resulting distribution
sims |>
ggplot(aes(x = value, fill = loop)) +
geom_bar(position = position_dodge()) +
scale_x_continuous(breaks = seq(0, num_players)) +
labs(title = "Distribution of numbers of surviving players",
x = "N surviving players",
fill = NULL)
The two different colours indicate which function was used to calculate the result. From this graph we directly see almost the exact same result from the two functions.
To answer our previously posed question: We would expect 7 players to survive the game.
Now we can also test the performance of the different functions against each other.
set.seed(001)
test <- microbenchmark(
run_game_for(players, num_steps = 18),
run_game_while(players, num_steps = 18)
)
autoplot(test)
Again there is not really a difference…
Of course we can not only simulate the game to get to our solution.
Here is the mathematical formula for n players:
\[ \sum_{i = 0}^{n-1} \binom{18}{i} * 0.5^i * 0.5^{18-i} * (n-i) \]
We can convert it to R Code and calculate the result.
n <- 16
expected_fun <- function(i) choose(18, i) * 0.5^i * 0.5^(18-i) * (n - i)
map_dbl(0:(n-1), expected_fun) |> sum()
[1] 7.000076
Et voilà! The result from the Monte Carlo simulation was confirmed.
If you see mistakes or want to suggest changes, please create an issue on the source repository.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/L-Groeninger/lukas-r.blog, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Gröninger (2021, Dec. 5). Lukas R Blog: Looping over the bridge. Retrieved from https://lukas-r.blog/posts/2021-12-05-looping-over-the-bridge/
BibTeX citation
@misc{gröninger2021looping, author = {Gröninger, Lukas}, title = {Lukas R Blog: Looping over the bridge}, url = {https://lukas-r.blog/posts/2021-12-05-looping-over-the-bridge/}, year = {2021} }