Champions League – All English Ties?
We’re all eagerly anticipating the first legs of the 2019 Champions League quarter final ties this week. But of course there was an even greater buzz of excitement leading up to the draw last month, with four English teams in the pot for the first time in 10 years. This marked only the third time in the history of the Champions League that four teams from the same country have reached the quarter final stage, following English success in both the 2007-08 and 2008-09 campaigns. No other country has managed the feat (Italy, Germany and Spain also have four entrants to the Champions League), reaffirming every unthinking English football fan’s view that the Premier League is indeed the Best League in the World™.
This prevalence of English teams naturally increased the probability of all-English quarter finals, to the interest of football fans, TV companies and half-and-half-scarf producers alike. So the question all true fans were asking was: what exactly was the probability of that all-English glamour tie occurring? In truth, that’s pretty straightforward probability theory, so our article also takes a look at the more interesting question of how we can efficiently simulate the draw.
Mathematical solution
The probability of all-English ties occurring can be worked out with some basic probability theory. There are 8 teams in total – four English and four non-English. There are no restrictions on which teams can draw each other. We assume that each team is equally likely to be drawn with any other – conspiracy theories of the big teams having heated balls will not be addressed in this analysis.
We first note there are only three possibilities for the total number of all-English ties (0, 1 or 2), and treat these cases individually:
0 all-English ties
The key here is to note that this is equivalent to having all four quarter finals consisting of an English team drawn against a non-English team. So, for each quarter final, who is drawn first isn’t important; what matters is that we constrain who is drawn second.
For the first quarter final, suppose an English team is drawn first. This leaves 4 non-English and 3 English teams in the pot, and to ensure 0 all-English ties, we need a non-English team drawn next. There is a 4/7 probability of this occurring. Similarly, had a non-English team been drawn first, we’d need an English team second, which again has a 4/7 probability of occurring. So, we have a 4/7 chance of the first quarter final not being all-English.
We take the same approach for the second quarter final, noting we now have 3 English and 3 non-English teams in the pot. Again, it doesn’t matter who’s drawn first, but we need to constrain who is drawn second. Following the first draw, there are 5 teams left in the pot, 3 of which will ensure the 2nd quarter final is not all-English. So, we have a 3/5 chance of the second quarter final not being all-English.
The same reasoning gives us a 2/3 chance of the third quarter final not being all-English. And once we’ve drawn 3 quarter finals consisting of an English team versus a non-English team, the 4th quarter final must also be an English team versus a non-English team, as they’re the only teams left in the pot.
Multiplying these probabilities, we get a total probability of:
P(0 all English QFs) = 4/7 x 3/5 x 2/3 = 8/35
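As a quick arithmetic check, the product can be evaluated with exact fractions. This is a small sketch in Python (rather than the VBA used later, so that it runs outside Excel); the names `steps` and `p_zero` are ours, chosen for illustration.

```python
from fractions import Fraction

# Chance that each successive quarter final pairs an English team with a
# non-English one, given that all earlier ties did so too.
steps = [Fraction(4, 7), Fraction(3, 5), Fraction(2, 3)]

p_zero = Fraction(1)
for step in steps:
    p_zero *= step

print(p_zero)  # 8/35
```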
2 all-English ties
The next easiest situation to consider is having 2 all-English quarter finals. Since there are only 4 English teams to start with, this means every English team must be drawn against another English team. Consider first the event that the 4 English teams are the first four to be drawn, so that the first two quarter finals are both all-English:
P(4 English teams drawn first) = 4/8 x 3/7 x 2/6 x 1/5 = 1/70
We now note that there are 6 equally likely ways in which the 2 all-English quarter finals can be placed among the 4 quarter finals (choosing 2 ties from 4), and multiply to get our probability of 2 all-English ties:
P(2 all English QFs) = 6 x 1/70 = 3/35
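The same two ingredients can be checked in a few lines. This is an illustrative Python sketch, not part of the spreadsheet; `p_first_four`, `placements` and `p_two` are names we have made up for the quantities described above.

```python
from fractions import Fraction
from math import comb

# Probability that the first four balls out of the pot are all English.
p_first_four = Fraction(4, 8) * Fraction(3, 7) * Fraction(2, 6) * Fraction(1, 5)

# Number of ways to choose which 2 of the 4 quarter finals are all-English.
placements = comb(4, 2)

p_two = placements * p_first_four
print(p_first_four, placements, p_two)  # 1/70 6 3/35
```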
1 all-English tie
There are either 0, 1 or 2 all-English ties, and we know the probability of 0 or 2 occurring. So:
P(1 all English QF) = 1 - P(2 all English QFs) - P(0 all English QFs) = 1 - 3/35 - 8/35 = 24/35
So by far the most likely scenario was that there would be one all-English tie, which of course is how things panned out.
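For a pot this small, the three probabilities can also be confirmed by brute force. The sketch below (Python, with a hypothetical helper `pairings` of our own) lists every way of splitting the 8 teams into 4 ties and counts the all-English ones. It assumes, as the analysis above does, that every set of ties is equally likely.

```python
from collections import Counter
from fractions import Fraction

def pairings(teams):
    """Yield every way of splitting `teams` into unordered pairs."""
    if not teams:
        yield []
        return
    first, rest = teams[0], teams[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for others in pairings(remaining):
            yield [(first, partner)] + others

# True marks an English team, False a non-English one.
pot = [True] * 4 + [False] * 4

counts = Counter()
for draw in pairings(pot):
    counts[sum(a and b for a, b in draw)] += 1

total = sum(counts.values())
distribution = {k: Fraction(n, total) for k, n in sorted(counts.items())}
print(total, distribution)
```

There are 105 possible sets of ties: 24 with no all-English tie, 72 with one and 9 with two, which reduce to the 8/35, 24/35 and 3/35 found above.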
Simulating the draw
We can also write some code (we’ve used VBA) to check the reasonableness of our mathematical solution above. Our code needs to copy the process of the actual draw itself:
- Place the correct number of English and non-English teams in the pot
- Mix the teams around in a random manner
- Produce a set of quarter finals
- Determine how many of these quarter finals are all-English
We’ve attached a spreadsheet that simulates the draw and compares the distribution of all-English ties with a mathematically derived model using similar ideas to the mathematical analysis above. You can play around with the number of teams (English and non-English) and number of simulations. For example, you can calculate the probabilities in a non-VAR (and better) world where PSG make it to the quarter finals instead of Manchester United.
How does the code work? We just use the same process listed above:
1. Place the correct number of English and non-English teams in the pot
noOfIterations = Range("noOfIterations")
noOfTeams = Range("noOfTeams")
noOfEnglish = Range("noOfEnglish")
ReDim teamsArr(1 To noOfTeams)
'Populates array with "TRUE" for English teams and "FALSE" for non-English teams
For counter = 1 To noOfTeams
teamsArr(counter) = (counter <= noOfEnglish)
Next counter
For our Champions League example, we have 4 English teams (represented by TRUE) and 4 non-English teams (represented by FALSE):
teamsArr = (TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE)
2. Mix the teams around in a random manner
'Seed the random number generator once, before the simulations start
Randomize
For iterCounter = 1 To noOfIterations
'Shuffle array
For counter = 1 To noOfTeams
randPosition = Int((noOfTeams + 1 - counter) * Rnd) + counter
temp = teamsArr(counter)
teamsArr(counter) = teamsArr(randPosition)
teamsArr(randPosition) = temp
Next counter
Here, we used a variant of the Fisher-Yates shuffle to mix up the array. On the first loop through, when counter = 1, this makes randPosition a random position between 1 and the number of teams (inclusive), and swaps the team from that position into first place. Next time, when counter = 2, randPosition will be a random position between 2 and the number of teams. This means the team we previously swapped into the first position stays where it is, while we choose a random team from the remaining set to go into second place. The shuffle carries on like this until we have gone through the whole array. For our example, we might now have an array that looks something like:
teamsArr = (FALSE, FALSE, TRUE, FALSE, TRUE, TRUE, FALSE, TRUE)
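For readers without Excel to hand, the same loop can be sketched in Python. This is a translation under our own naming (`fisher_yates_shuffle`, `rng`), using 0-based positions instead of VBA’s 1-based array, and the fixed seed is only there to make the run repeatable.

```python
import random

def fisher_yates_shuffle(items, rng=random):
    """Shuffle `items` in place, mirroring the VBA loop above (0-based here)."""
    n = len(items)
    for i in range(n):
        # Pick a position from i to n - 1 inclusive, then swap it into slot i.
        j = rng.randrange(i, n)
        items[i], items[j] = items[j], items[i]
    return items

pot = [True] * 4 + [False] * 4
rng = random.Random(2019)  # fixed seed, purely so the run is repeatable
print(fisher_yates_shuffle(pot, rng))
```

Whatever order comes out, the pot still holds four English and four non-English teams, since the loop only ever swaps elements.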
The Fisher-Yates shuffle has two important features:
- It ensures the teams are shuffled randomly; each permutation of the 8 teams is equally likely to be produced by the algorithm. This is an important consideration, as some (potentially more intuitive) shuffling algorithms don’t have this property. For those interested, this article goes into more detail.
- It’s asymptotically efficient in terms of time complexity (i.e. it runs in O(n) time) and, while not important in this example given the small number of teams, it is also memory efficient, as the array is reordered ‘in place’ rather than requiring a new array.
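The first point can be illustrated exactly on a tiny array. The Python sketch below (helper name `apply_swaps` is ours) walks through every possible sequence of random choices for 3 items, once for the Fisher-Yates rule and once for a naive variant that swaps each slot with a position picked from the whole array. The naive variant is our own example of a "more intuitive" shuffle, chosen for illustration.

```python
from collections import Counter
from itertools import product

def apply_swaps(n, choices):
    """Start from (0, ..., n-1) and swap slot i with slot choices[i] in turn."""
    items = list(range(n))
    for i, j in enumerate(choices):
        items[i], items[j] = items[j], items[i]
    return tuple(items)

n = 3
# Fisher-Yates: slot i swaps with a position chosen from i..n-1.
fisher_yates = Counter(
    apply_swaps(n, c) for c in product(*(range(i, n) for i in range(n)))
)
# Naive variant: slot i swaps with a position chosen from the whole array.
naive = Counter(apply_swaps(n, c) for c in product(range(n), repeat=n))

print(sorted(fisher_yates.values()))  # [1, 1, 1, 1, 1, 1]
print(sorted(naive.values()))         # [4, 4, 4, 5, 5, 5]
```

Fisher-Yates has 6 equally likely paths, one per permutation. The naive variant has 27 paths, which cannot be spread evenly over 6 permutations, so some orderings come up more often than others.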
3. Produce a set of quarter finals
This doesn’t require much work – we’ll consider consecutive elements in the array as ties drawn. For example, the 1st and 2nd elements of the array represent the first quarter final, the 3rd and 4th elements represent the second quarter final, and so on.
4. Determine how many of these quarter finals are all-English
'Counts number of matches where both teams are English. Each consecutive pair, ie (1,2), (3,4), etc, is considered a tie
noOfMatches = 0
For counter = 2 To noOfTeams Step 2
noOfMatches = noOfMatches + teamsArr(counter - 1) * teamsArr(counter)
Next counter
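We loop through the set of ties produced (ie consecutive pairs of elements in the array) and count the number of all-English ties (represented by (TRUE, TRUE)). Multiplying the two values works because VBA treats TRUE as -1 and FALSE as 0, so the product is 1 only when both teams are English.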
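Putting the four steps together, a minimal Python sketch of the whole simulation might look like the following. It is not the spreadsheet’s code: the function name `simulate_draw` and its parameters are our own, and the standard library’s `shuffle` stands in for the hand-written loop.

```python
import random
from collections import Counter

def simulate_draw(no_of_teams, no_of_english, no_of_iterations, seed=None):
    """Estimate the distribution of all-English ties by repeated random draws."""
    rng = random.Random(seed)
    # Step 1: True for English teams, False for the rest.
    pot = [i < no_of_english for i in range(no_of_teams)]
    counts = Counter()
    for _ in range(no_of_iterations):
        # Step 2: shuffle the pot in place.
        rng.shuffle(pot)
        # Steps 3 and 4: consecutive pairs are ties; count the (True, True) ones.
        ties = sum(pot[i] and pot[i + 1] for i in range(0, no_of_teams, 2))
        counts[ties] += 1
    return {k: counts[k] / no_of_iterations for k in sorted(counts)}

estimate = simulate_draw(8, 4, 100_000, seed=2019)
print(estimate)
```

With 100,000 draws the estimates should land close to 8/35 (about 22.9%), 24/35 (about 68.6%) and 3/35 (about 8.6%), though the exact figures depend on the seed.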
Extensions to the simulation model
This simple model is robust enough to deal with a larger number of teams. For example, consider the FA Cup third round. 64 teams take part in the 3rd round draw (OK, more than that are in the draw because of second round replays having not been played, but you know what I mean), of which 20 are the Premier League teams entering the competition at this stage. So, we can reparametrise, rename ‘English’ to ‘Premier League’, and model the distribution of the number of all Premier League ties in the 3rd round. Here, I’ve used 10,000 simulations:
Number of all PL matches | Observed probability (from simulation) | Mathematical probability |
0 | 1.262% | 1.207% |
1 | 8.936% | 8.819% |
2 | 23.895% | 24.094% |
3 | 32.065% | 32.125% |
4 | 22.897% | 22.839% |
5 | 8.920% | 8.867% |
6 | 1.842% | 1.847% |
7 | 0.178% | 0.194% |
8 | 0.005% | 0.009% |
9 | 0.000% | 0.000% |
10 | 0.000% | 0.000% |
This year, there were 2 all Premier League ties, which is reasonably likely to occur according to our model. If we see 0 next year though, the conspiracy surrounding heated balls is sure to rear its head again …
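The mathematical column can be reproduced with a short counting argument. This is a sketch of one possible closed form, not necessarily the formula used in the spreadsheet. Think of the draw as choosing which slots in the fixture list the Premier League teams occupy. For exactly k all Premier League ties, pick the k ties they fill completely, then the ties they fill halfway, then which half of each. The Python function name `prob_ties` and its parameters are ours.

```python
from fractions import Fraction
from math import comb

def prob_ties(no_of_teams, no_of_group, k):
    """Exact chance of exactly k ties between two teams from the group."""
    no_of_ties = no_of_teams // 2
    mixed = no_of_group - 2 * k  # ties containing exactly one group team
    if mixed < 0 or k + mixed > no_of_ties:
        return Fraction(0)
    ways = comb(no_of_ties, k) * comb(no_of_ties - k, mixed) * 2 ** mixed
    return Fraction(ways, comb(no_of_teams, no_of_group))

print([str(prob_ties(8, 4, k)) for k in range(3)])  # ['8/35', '24/35', '3/35']
print(f"{float(prob_ties(64, 20, 3)):.3%}")
```

The first line recovers the Champions League figures from earlier, and the second evaluates the FA Cup case of 3 all Premier League ties from 64 teams, for comparison with the table.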
The Excel model used in these projections can be downloaded here. Feel free to try out your own scenarios and let us know of any interesting observations or feedback on the model. On the published version we have protected the VBA code, but if you would like a copy of the unprotected version do get in touch with us via our contact page and we’d be happy to send you a copy.
Adam Smith
April 2019