One of my favourite sites is Squiggle, mainly because I like to check out what other people have tipped, what their respective margins are and how closely they line up.
As some of you know, James and I have been working on an AFL R package called fitzRoy.
One of our main goals is to get more people to build out AFL models.
For me that’s mainly because I like reading AFL stats content, albeit probably a bit too much!
Having presented at useR, I found that one of the main things stopping people from giving it a go is the feeling that building their own model is simply too complex and that they are out of their depth.
Well, I am here to say that’s not true. Anyone can give it a go.
I think there are many camps in the AFL stats fanbase, and there is one I would like to try and include more.
That is those who feel as though they can’t contribute or can’t analyse simply because they don’t know R or how to build models.
You might be reading other blogs, and sure, their authors are “coding” it up from “scratch”. What gets missed is that while that’s true, it doesn’t have to be you. Lots of people develop packages and answer questions on sites like Stack Overflow.
So if you have ever wanted to build your own ELO for AFL, why not use the elo package? That’s what it was designed for: people using it.
library(tidyverse)
## ── Attaching packages ───────────────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.2.0 ✔ purrr 0.3.2
## ✔ tibble 2.1.3 ✔ dplyr 0.8.2
## ✔ tidyr 0.8.3 ✔ stringr 1.4.0
## ✔ readr 1.3.1 ✔ forcats 0.4.0
## ── Conflicts ──────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
library(elo)
library(fitzRoy)
library(lubridate)
##
## Attaching package: 'lubridate'
## The following object is masked from 'package:base':
##
## date
results <- fitzRoy::get_match_results()
fixture <- fitzRoy::get_fixture(2019)
head(results)
## # A tibble: 6 x 16
## Game Date Round Home.Team Home.Goals Home.Behinds Home.Points
## <dbl> <date> <chr> <chr> <int> <int> <int>
## 1 1 1897-05-08 R1 Fitzroy 6 13 49
## 2 2 1897-05-08 R1 Collingw… 5 11 41
## 3 3 1897-05-08 R1 Geelong 3 6 24
## 4 4 1897-05-08 R1 Sydney 3 9 27
## 5 5 1897-05-15 R2 Sydney 6 4 40
## 6 6 1897-05-15 R2 Essendon 4 6 30
## # … with 9 more variables: Away.Team <chr>, Away.Goals <int>,
## # Away.Behinds <int>, Away.Points <int>, Venue <chr>, Margin <int>,
## # Season <dbl>, Round.Type <chr>, Round.Number <int>
results <- results %>%
  mutate(seas_rnd = paste0(Season, ".", Round.Number),
         First.Game = Round.Number == 1)
head(results)
## # A tibble: 6 x 18
## Game Date Round Home.Team Home.Goals Home.Behinds Home.Points
## <dbl> <date> <chr> <chr> <int> <int> <int>
## 1 1 1897-05-08 R1 Fitzroy 6 13 49
## 2 2 1897-05-08 R1 Collingw… 5 11 41
## 3 3 1897-05-08 R1 Geelong 3 6 24
## 4 4 1897-05-08 R1 Sydney 3 9 27
## 5 5 1897-05-15 R2 Sydney 6 4 40
## 6 6 1897-05-15 R2 Essendon 4 6 30
## # … with 11 more variables: Away.Team <chr>, Away.Goals <int>,
## # Away.Behinds <int>, Away.Points <int>, Venue <chr>, Margin <int>,
## # Season <dbl>, Round.Type <chr>, Round.Number <int>, seas_rnd <chr>,
## # First.Game <lgl>
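fixture <- fixture %>%
  filter(Date > max(results$Date)) %>%              # games after the last recorded result
  mutate(Date = ymd(format(Date, "%Y-%m-%d"))) %>%  # drop the time component
  rename(Round.Number = Round)                      # match the column name in results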
# Set parameters
HGA <- 30 # home ground advantage, in rating points added to the home team
carryOver <- 0.5 # proportion by which regress() pulls ratings back toward 1500
k_val <- 20 # update weighting factor (the Elo K)
# Rescale a points margin to [0, 1], capping blowouts at +/- 80 points
map_margin_to_outcome <- function(margin, marg.max = 80, marg.min = -80){
  norm <- (margin - marg.min)/(marg.max - marg.min)
  norm %>% pmin(1) %>% pmax(0)
}
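To get a feel for what that function does to a margin, here is a minimal base R sketch of the same rescaling (no pipe, so it runs without the tidyverse). The name map_margin is just a stand-in of mine for illustration; the caps of 80 points either way are the defaults from the function above.

```r
# Base R sketch of the margin rescaling: squash a points margin into [0, 1].
# `map_margin` is a stand-in name for illustration only.
map_margin <- function(margin, marg.max = 80, marg.min = -80) {
  norm <- (margin - marg.min) / (marg.max - marg.min)
  pmax(pmin(norm, 1), 0)  # clamp anything beyond the caps to 0 or 1
}

margins  <- c(-100, -80, 0, 42, 80, 100)
outcomes <- map_margin(margins)
print(data.frame(margin = margins, outcome = outcomes))
```

A draw maps to 0.5, a 42-point home win maps to 0.7625, and anything beyond 80 points either way is treated the same as an 80-point result.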
# Run ELO
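elo.data <- elo.run(
  map_margin_to_outcome(Home.Points - Away.Points) ~
    adjust(Home.Team, HGA) +                # home team gets the HGA bonus
    Away.Team +
    group(seas_rnd) +                       # one as.matrix() row per season-round
    regress(First.Game, 1500, carryOver),   # pull ratings toward 1500 after First.Game rows
  k = k_val,
  data = results
)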
as.data.frame(elo.data) %>% tail()
## team.A team.B p.A wins.A update elo.A
## 15528 Sydney Gold Coast 0.5849803 0.76250 3.550395 1497.135
## 15529 Collingwood North Melbourne 0.5848363 0.22500 -7.196726 1518.862
## 15530 Port Adelaide Footscray 0.5746598 0.34375 -4.618196 1502.893
## 15531 St Kilda Richmond 0.5271466 0.29375 -4.667931 1471.617
## 15532 Brisbane Lions Melbourne 0.5833187 0.70625 2.458626 1508.817
## 15533 Fremantle Carlton 0.5790813 0.47500 -2.081626 1502.480
## elo.B
## 15528 1460.405
## 15529 1503.730
## 15530 1489.859
## 15531 1492.071
## 15532 1475.459
## 15533 1481.227
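The p.A, update and elo.A columns can be reproduced by hand, which is a nice way to check you understand what the package is doing. The sketch below uses the standard Elo expected-score formula and takes the pre-game ratings for Sydney and Gold Coast from row [2795,] of the as.matrix(elo.data) output; the helper elo_prob is a hypothetical name of mine, not a function from the elo package.

```r
# Hypothetical helper: standard Elo expected score for team A,
# with an optional rating bonus (e.g. home ground advantage) for A.
elo_prob <- function(elo_a, elo_b, adjust_a = 0) {
  1 / (1 + 10^((elo_b - (elo_a + adjust_a)) / 400))
}

HGA   <- 30
k_val <- 20

# Pre-game ratings, read off row [2795,] of as.matrix(elo.data)
sydney     <- 1493.585
gold_coast <- 1463.956

p_home  <- elo_prob(sydney, gold_coast, adjust_a = HGA)
outcome <- 0.7625                      # wins.A for this game
update  <- k_val * (outcome - p_home)  # rating points moved from B to A

print(round(c(p.A = p_home, update = update,
              elo.A = sydney + update, elo.B = gold_coast - update), 3))
```

The values should agree with row 15528 of the table above to within rounding, since the inputs here are themselves rounded to three decimals.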
as.matrix(elo.data) %>% tail()
## Adelaide Brisbane Lions Carlton Collingwood Essendon Fitzroy
## [2791,] 1513.852 1499.096 1482.224 1524.114 1504.841 1500
## [2792,] 1513.972 1500.577 1478.602 1522.139 1508.463 1500
## [2793,] 1516.079 1498.933 1480.246 1525.196 1508.463 1500
## [2794,] 1518.601 1498.933 1479.145 1525.196 1509.635 1500
## [2795,] 1518.601 1506.358 1479.145 1526.059 1506.024 1500
## [2796,] 1516.706 1508.817 1481.227 1518.862 1506.512 1500
## Footscray Fremantle Geelong Gold Coast GWS Hawthorn
## [2791,] 1491.370 1502.481 1533.124 1476.197 1517.994 1497.941
## [2792,] 1485.004 1504.456 1533.719 1468.234 1525.957 1496.460
## [2793,] 1485.004 1504.456 1541.934 1464.799 1523.850 1496.460
## [2794,] 1486.105 1506.316 1541.934 1463.956 1526.889 1495.288
## [2795,] 1485.241 1504.562 1540.387 1463.956 1526.889 1493.680
## [2796,] 1489.859 1502.480 1542.282 1460.405 1526.402 1492.532
## Melbourne North Melbourne Port Adelaide Richmond St Kilda Sydney
## [2791,] 1479.339 1492.084 1498.387 1502.193 1492.304 1487.083
## [2792,] 1479.219 1496.137 1507.824 1498.140 1482.867 1486.488
## [2793,] 1476.163 1499.573 1507.824 1489.926 1482.867 1491.976
## [2794,] 1476.163 1496.533 1505.964 1487.403 1483.710 1491.976
## [2795,] 1477.917 1496.533 1507.511 1487.403 1476.285 1493.585
## [2796,] 1475.459 1503.730 1502.893 1492.071 1471.617 1497.135
## University West Coast
## [2791,] 1500 1505.376
## [2792,] 1500 1511.741
## [2793,] 1500 1506.252
## [2794,] 1500 1506.252
## [2795,] 1500 1509.863
## [2796,] 1500 1511.010
final.elos(elo.data)
## Adelaide Brisbane Lions Carlton Collingwood
## 1516.706 1508.817 1481.227 1518.862
## Essendon Fitzroy Footscray Fremantle
## 1506.512 1380.902 1489.859 1502.480
## Geelong Gold Coast GWS Hawthorn
## 1542.282 1460.405 1526.402 1492.532
## Melbourne North Melbourne Port Adelaide Richmond
## 1475.459 1503.730 1502.893 1492.071
## St Kilda Sydney University West Coast
## 1471.617 1497.135 1412.936 1511.010
fixture <- fixture %>%
  mutate(Prob = predict(elo.data, newdata = fixture))
head(fixture)
## # A tibble: 6 x 8
## Date Season Season.Game Round.Number Home.Team Away.Team Venue
## <date> <int> <int> <dbl> <chr> <chr> <chr>
## 1 2019-06-30 2019 1 15 St Kilda Richmond Marv…
## 2 2019-06-30 2019 1 15 Brisbane… Melbourne Gabba
## 3 2019-06-30 2019 1 15 Fremantle Carlton Optu…
## 4 2019-07-05 2019 1 16 Hawthorn Collingw… MCG
## 5 2019-07-06 2019 1 16 Essendon Sydney MCG
## 6 2019-07-06 2019 1 16 Gold Coa… Richmond Metr…
## # … with 1 more variable: Prob <dbl>
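The Prob column is cut off in the printout, so here is a rough sketch of how a home-team probability falls out of the final ratings. It assumes predict() applies the same adjust(Home.Team, HGA) term as the model formula and uses the standard Elo expected-score formula; the ratings vector and the upcoming data frame are typed in by hand from the final.elos() and head(fixture) output rather than pulled from the real objects, and the Tip column is my own addition.

```r
# Final ratings, copied by hand from the final.elos(elo.data) output
ratings <- c("St Kilda" = 1471.617, "Richmond" = 1492.071,
             "Essendon" = 1506.512, "Sydney"   = 1497.135)
HGA <- 30

# Two fixture rows, copied by hand from head(fixture)
upcoming <- data.frame(
  Home.Team = c("St Kilda", "Essendon"),
  Away.Team = c("Richmond", "Sydney"),
  stringsAsFactors = FALSE
)

# Rating gap from the home team's point of view, including HGA
gap <- ratings[upcoming$Home.Team] + HGA - ratings[upcoming$Away.Team]

upcoming$Prob <- unname(1 / (1 + 10^(-gap / 400)))
upcoming$Tip  <- ifelse(upcoming$Prob >= 0.5, upcoming$Home.Team, upcoming$Away.Team)
print(upcoming)
```

With a 30-point HGA both home teams come out as narrow favourites here, even though St Kilda is rated below Richmond.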
So if you run that script end to end you should have a working ELO model that you can play around with.
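Some things to note.
- HGA is a single flat number for every home team, with no adjustment for venue or travel. It is set to 30 rating points here:
HGA <- 30
- Mean reversion is controlled by carryOver, which regress() uses as the proportion by which each rating is pulled back toward 1500. It is set to halfway here:
carryOver <- 0.5
I think that’s right?
- I set k = 20 for no reason other than I read this blog post here, 538 ELO. Is that the best way to decide? What would you do differently?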
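One way to answer the k question is to try a few values and see which one predicts best. Below is a rough base R sketch of that idea on a made-up set of games. The teams, the margins and the score_k helper are all hypothetical and only there so the example runs on its own; with real data you would feed in results from fitzRoy and score on seasons the model has not seen.

```r
# Made-up games purely for illustration; not real AFL results.
toy <- data.frame(
  home   = c("A", "B", "C", "A", "B", "C", "A", "C"),
  away   = c("B", "C", "A", "C", "A", "B", "B", "A"),
  margin = c(12, -30, 5, 40, -8, 22, 15, -3),
  stringsAsFactors = FALSE
)

# Hypothetical helper: run a simple Elo over `games` with a given k and
# return the mean squared error between the pre-game expectation and the
# rescaled margin (lower is better).
score_k <- function(k, games, hga = 30, init = 1500) {
  teams   <- unique(c(games$home, games$away))
  ratings <- setNames(rep(init, length(teams)), teams)
  sq_err  <- numeric(nrow(games))
  for (i in seq_len(nrow(games))) {
    h <- games$home[i]
    a <- games$away[i]
    expected <- 1 / (1 + 10^((ratings[[a]] - ratings[[h]] - hga) / 400))
    actual   <- min(max((games$margin[i] + 80) / 160, 0), 1)
    sq_err[i] <- (actual - expected)^2
    shift <- k * (actual - expected)   # points moved from away to home
    ratings[[h]] <- ratings[[h]] + shift
    ratings[[a]] <- ratings[[a]] - shift
  }
  mean(sq_err)
}

k_grid <- c(5, 10, 20, 40, 80)
errors <- sapply(k_grid, score_k, games = toy)
print(data.frame(k = k_grid, mse = errors))
```

On eight toy games the numbers mean nothing, but the same loop over a grid of k (and HGA, and carryOver) values on real results is a simple way to pick parameters on evidence rather than habit.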