For Robert Nguyen, Monday night’s Brownlow Medal count will have a decided air of déjà vu about it.
You might remember Nguyen from the lead-up to the 2014 count. He’s the young Perth whiz who devised a statistical model to predict the winner of the Brownlow and was writing a university thesis on it.
Last year, Nguyen’s model tipped Gary Ablett (who finished equal third) to poll strongly early and hold on in front of the ineligible Nat Fyfe (second) and Eagles midfielder and eventual winner Matthew Priddis.
This year, it’s predicting $1.95 favourite Fyfe to poll 26 votes and win on the back of his (and Fremantle’s) blistering start to the year - despite a downturn in form and fitness over the season’s second half.
Priddis (25 expected votes) is forecast to be runner-up, with departing Adelaide star Patrick Dangerfield (24 expected votes), Sydney’s Josh Kennedy and West Coast’s Andrew Gaff (both 22) also in the mix.
“Last year was one of the closest counts I could remember - and the three players my model calculated as having the best odds of winning filled out the top three,” Nguyen said.
“I really think Fyfe can do it this year - I think he can build a big enough lead. By my calculations, his odds should be $1.59 and he’s $2 [with some betting agencies] so that’s actually good value.”
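The "good value" arithmetic in that quote can be checked in a few lines of R. This is only a rough sketch: it assumes the $1.59 figure is a fair decimal price (so the model's win probability is its reciprocal) and ignores any bookmaker margin built into the $2 market price. The variable names are illustrative, not Nguyen's.

```r
# Prices quoted in the article, as decimal odds
model_odds  <- 1.59  # what Nguyen's model says Fyfe should pay
market_odds <- 2.00  # what some betting agencies were offering

# Assumption: a fair decimal price is the reciprocal of the win probability
model_prob  <- 1 / model_odds   # about 0.63
market_prob <- 1 / market_odds  # 0.50

# Expected profit on a $1 bet at the market price,
# if the model's probability is taken at face value
expected_profit <- model_prob * market_odds - 1

round(c(model_prob = model_prob,
        market_prob = market_prob,
        expected_profit = expected_profit), 3)
```

A positive expected profit is what "value" means here; it says nothing about whether the model's probability is right.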
A layman’s explanation of the Nguyen method goes a little like this: for each Brownlow count for the past five years, Nguyen has taken a “web scrape” of the preceding five years of stats from individual AFL games. The stats are then lined up with the Brownlow votes for each of those games, enabling the identification of statistical patterns that might correlate to votes awarded.
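For readers who want to see what "lining up" stats with votes might look like, here is a minimal sketch in R. Everything in it is an assumption for illustration: the numbers are made up rather than scraped, the column names are hypothetical, and the linear model is only the simplest possible stand-in, not the model from Nguyen's thesis.

```r
# Hypothetical per-player, per-game stats (made-up numbers)
stats <- data.frame(
  game_id   = c(1, 1, 1, 1, 2, 2, 2, 2),
  player    = c("A", "B", "C", "D", "A", "B", "C", "D"),
  disposals = c(34, 22, 15, 10, 28, 31, 12, 18),
  goals     = c(1, 0, 3, 0, 0, 2, 1, 0),
  stringsAsFactors = FALSE
)

# Hypothetical Brownlow votes; only the three players who polled are listed
votes <- data.frame(
  game_id = c(1, 1, 1, 2, 2, 2),
  player  = c("A", "C", "B", "B", "A", "D"),
  votes   = c(3, 2, 1, 3, 2, 1),
  stringsAsFactors = FALSE
)

# Line the two tables up on game and player, keeping players who did not poll
joined <- merge(stats, votes, by = c("game_id", "player"), all.x = TRUE)
joined$votes[is.na(joined$votes)] <- 0

# One simple "pattern": how strongly disposals track votes
disposal_cor <- cor(joined$disposals, joined$votes)

# Stand-in model relating stats to votes (not Nguyen's actual model)
fit <- lm(votes ~ disposals + goals, data = joined)
coef(fit)
```

With several seasons of real data in place of these eight rows, the fitted relationship is what would then be applied to a new season's stats.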
Using those patterns - and the statistical model they form the basis of - it is possible to then “run” the actual game stats for a given year and come up with “expected standardised game votes” to predict a Brownlow outcome. To ensure random fluctuations are minimised, Nguyen runs each season through his computer 10,000 times.
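The idea of running a season 10,000 times can be sketched in R as below. This is an illustration under loud assumptions, not Nguyen's code: the three players, their per-game vote probabilities and the 22-game season are all hypothetical, and each player is simulated independently, which ignores the fact that only three players can poll in any one game.

```r
set.seed(2015)
n_sims  <- 10000  # number of simulated seasons, as in the article
n_games <- 22     # assumed games per player

# Hypothetical per-game probabilities of polling 0, 1, 2 or 3 votes
vote_probs <- list(
  "Player A" = c(0.45, 0.15, 0.15, 0.25),
  "Player B" = c(0.50, 0.15, 0.15, 0.20),
  "Player C" = c(0.55, 0.15, 0.15, 0.15)
)

# Simulate season vote totals for one player: one row per simulated season
simulate_totals <- function(probs) {
  draws <- matrix(
    sample(0:3, n_sims * n_games, replace = TRUE, prob = probs),
    nrow = n_sims
  )
  rowSums(draws)
}

totals <- sapply(vote_probs, simulate_totals)  # n_sims x 3 matrix

# Average total across simulations: the "expected votes" for each player
expected_votes <- colMeans(totals)

# Share of simulated seasons each player tops the count (ties split evenly)
is_top    <- totals == apply(totals, 1, max)
win_share <- colMeans(is_top / rowSums(is_top))

# Fair decimal odds implied by those win shares
fair_odds <- 1 / win_share

round(rbind(expected_votes, win_share, fair_odds), 2)
```

Repeating the season many times is what smooths out the luck of any single simulated count, so the win shares settle down to stable values.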
His last two attempts have identified the three placegetters (in 2013 in exact order) and the model has found four of the last six winners.
In 2014 Nguyen and his calculations had Priddis polling in 13 games (which turned out to be true) but didn’t expect him to poll as many three-votes (four) as he did.
That outcome makes fellow Eagle Gaff one of Monday night’s most interesting contenders.
“I’ve got Gaff polling in the most games of anyone - 14. So if he has a few best-on-grounds he could be a big danger,” Nguyen said.
Sydney’s Kennedy is also an attractive proposition. Teammate Daniel Hannebery ranks higher in Brownlow Medal markets but Kennedy - $2.80 with the WA TAB to poll the most votes of any Swan - fares better under Nguyen’s model.
“Against Fremantle in the first week of the finals, it was the 11th game in a row Kennedy had more than 30 possessions,” Nguyen said.
“And he finished first in the competition for contested possessions as well.”
Nguyen’s thesis earned him honours with distinction from his applied statistics course at the University of Western Australia.
He has since moved to Sydney to work for the Commonwealth Bank but hopes to one day embark on further study involving sports statistics.
The 25-year-old says he’s actively looking for drinking partners in the Harbour City who are keen to talk footy - and footy numbers - over a beer.
2015 Nguyen Model predictions:
data.frame(
  stringsAsFactors = FALSE,
  Player = c("Nat Fyfe", "Matt Priddis", "Patrick Dangerfield",
             "Josh Kennedy", "Andrew Gaff"),
  Predicted.Votes = c(26L, 25L, 24L, 22L, 22L),
  Predicted.games.polled = c(12L, 10L, 12L, 11L, 14L),
  Nguyen.Odds = c(1.59, 3.98, 4.94, 6.3, 7.11),
  WA.TAB.ODDS = c(1.95, 4, 7.5, 14, 18)
)

##                Player Predicted.Votes Predicted.games.polled Nguyen.Odds
## 1            Nat Fyfe              26                     12        1.59
## 2        Matt Priddis              25                     10        3.98
## 3 Patrick Dangerfield              24                     12        4.94
## 4        Josh Kennedy              22                     11        6.30
## 5         Andrew Gaff              22                     14        7.11
##   WA.TAB.ODDS
## 1        1.95
## 2        4.00
## 3        7.50
## 4       14.00
## 5       18.00
2014 Nguyen Model predictions against actual results:

# Votes stored as integers and finish labels trimmed of stray spaces
data.frame(
  stringsAsFactors = FALSE,
  Predicted.Finish = c("Gary Ablett", "Nathan Fyfe", "Matt Priddis"),
  Predicted.Votes = c(26L, 25L, 24L),
  Actual.Votes = c(22L, 25L, 26L),
  Actual.Finish = c("equal third", "second", "first")
)

##   Predicted.Finish Predicted.Votes Actual.Votes Actual.Finish
## 1      Gary Ablett              26           22   equal third
## 2      Nathan Fyfe              25           25        second
## 3     Matt Priddis              24           26         first
2013 Nguyen Model predictions against actual results:

data.frame(
  stringsAsFactors = FALSE,
  Predicted.Finish = c("Gary Ablett", "Joel Selwood", "Dane Swan"),
  Predicted.Votes = c(30L, 26L, 24L),
  Actual.Votes = c(28L, 27L, 26L),
  Actual.Finish = c("first", "second", "third")
)

##   Predicted.Finish Predicted.Votes Actual.Votes Actual.Finish
## 1      Gary Ablett              30           28         first
## 2     Joel Selwood              26           27        second
## 3        Dane Swan              24           26         third