Tuesday, February 22, 2011

Quantifying the Bias of Arsenal's Referees

A few posts ago, I attempted to quantify the bias of Phil Dowd’s record over the last two years of officiating versus the bias of all other referees who have officiated at least three Arsenal matches over the same time period. Since then, I have received a good bit of constructive feedback, especially regarding the sample sizes used in the study. None of the feedback pointed to any major errors; most of it concerned the subtleties of alternative statistical methods that could be substituted for my Mann-Whitney approach.

Nonetheless, the feedback indicated there was a good bit of demand for a more extensive analysis. Thus, I contacted Tim at 7AM Kickoff and made a deal: he would compile the earlier seasons of data, and I would analyze the data using common statistical methods. This post is the output of that study.

Developing the Model and Collecting The Data

The approach I wished to pursue in this study was a general linear model (GLM), as it allows for the study of the impact of multiple factors (and their interactions) on various outcomes. In this case, Tim and I were interested in studying the effects of Premier League season, referee, and match venue on fouls, cards, and shots. This study grew out of two mutual interests: Tim’s desire to quantify perceived overall referee bias during this Premier League season when compared to previous ones, and our joint desire to understand the most- and least-favorable referees when it comes to our beloved Gunners. I decided to throw in the match venue impacts at the suggestion of a dedicated reader of my blog.

A GLM has one unique requirement that presents challenges when applying it to officiating data: it requires at least one sample of each unique combination of attributes. This means that to build an overall GLM, each referee would need to have officiated at least one home and one away game in every season analyzed. This immediately presents a challenge, because leagues purposefully randomize assignments to minimize the effects of officiating on match outcome, and thus have unbalanced officiating from year to year. Increasing the number of seasons in a desire to increase sample size ends up limiting the types of GLMs that can be created. Tim and I settled on pulling data from the 2006-2007 season through the current season, which provided enough balance in sample size and attribute combinations to allow a two-phase study of officiating of Arsenal’s matches.
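The coverage requirement described above can be checked before any model is fit. The following is a minimal sketch, not the procedure used for this study: the match records, referee names, and both function names are invented for illustration.

```python
from collections import Counter
from itertools import product

# Hypothetical match records: (season, referee, venue), venue 1 = home, -1 = away.
# These rows are invented for illustration and are not the study's data.
matches = [
    (2007, "Referee A", 1), (2007, "Referee A", -1),
    (2008, "Referee A", 1), (2008, "Referee A", -1),
    (2007, "Referee B", 1), (2007, "Referee B", -1),
    (2008, "Referee B", 1),  # Referee B has no away match in 2008
]

def missing_cells(records):
    """Return every (season, referee, venue) combination with no matches."""
    counts = Counter(records)
    seasons = sorted({r[0] for r in records})
    referees = sorted({r[1] for r in records})
    venues = sorted({r[2] for r in records})
    return [cell for cell in product(seasons, referees, venues) if counts[cell] == 0]

def fully_covered_referees(records):
    """Referees with at least one match in every season/venue combination."""
    gaps = {cell[1] for cell in missing_cells(records)}
    return sorted({r[1] for r in records} - gaps)

print(missing_cells(matches))
print(fully_covered_referees(matches))
```

Any referee who appears in a missing cell would have to be dropped, or a factor removed from the model, which is the trade-off described above.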

A table displaying the count of each referee’s matches officiated over the last 4+ seasons is shown below: 178 matches in all (current through the Wolves game on February 12th). The columns across the top indicate the season, where the second half of each season is used to denote the full season (thus, the 2006-2007 data is found in the column labeled “2007”). Each referee has three rows associated with their name: a row indicating their count of home matches (1), a row for away matches (-1), and a row for their total number of matches. The column on the far right, labeled “Grand Total”, shows the total number of home and away matches officiated by each referee.

What becomes immediately clear is that a GLM of seasons 2007 through 2010 by official and by match venue would have an extremely limited data set: only Atkinson, Bennett, Webb, and Wiley have officiated at least one home and one away match in each of those seasons. Most importantly, a study of Dowd’s officiating would be left out of such a model. Such a GLM can be useful in studying a few of the large effects and their interactions, but not in directly evaluating a wider set of referees.

What’s also interesting is that some of the numbers in the “Grand Total” column are greatly skewed. Over time, Bennett, Dowd, Foy, and Riley seem to have officiated more Arsenal home matches than away matches, while Clattenburg, Dean, Marriner, Riley, and Webb have experienced the inverse in their assignments. If Scorecasting’s study on home pitch officiating bias holds true in the Premier League, we might have some confounding of individual referee performance with general bias against a visiting team.

Two GLMs were constructed given the match count shown in the table above:
  1. A wider GLM that looked at the effects of the 2007-2011 seasons and match venue. This will help confirm or deny Scorecasting’s general conclusion regarding referee bias against visiting teams as they apply to the Premier League.
  2. A GLM of the 2007-2010 seasons using the data from Atkinson, Bennett, Dean, Dowd, Foy, Halsey, Webb, and Wiley while ignoring the aspect of match venue. This will help answer the question as to which referees are the most biased for and against Arsenal.
Each GLM will look at four main attributes that officiating can impact: shots taken, the ratio of shots-on-goal to shots taken, fouls, and Premier League points for yellow and red cards (see my last post on this topic for an explanation).  Each of these attributes is expressed as a differential. To be consistent with the direction of the differential in the first post, a negative differential in any attribute indicates Arsenal had the advantage (took more shots, had a higher ratio of shots-on-goal, fewer cards etc.), while the opposite indicates the opponent had the advantage in the match.
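To make the sign convention concrete, here is a small sketch of how the four differentials could be computed for a single match. The `TeamStats` class, the match numbers, and the foul sign (Arsenal fouls minus opponent fouls) are my assumptions; the card weights of 1 for a yellow and 3 for a red are inferred from how card points are described later in this post.

```python
from dataclasses import dataclass

YELLOW_POINTS = 1  # assumed from the post: one point is "essentially one yellow card"
RED_POINTS = 3     # assumed from the post: three points is "a full red card"

@dataclass
class TeamStats:
    shots: int
    shots_on_goal: int
    fouls: int
    yellows: int
    reds: int

    @property
    def on_goal_ratio(self) -> float:
        return self.shots_on_goal / self.shots if self.shots else 0.0

    @property
    def card_points(self) -> int:
        return YELLOW_POINTS * self.yellows + RED_POINTS * self.reds

def differentials(arsenal: TeamStats, opponent: TeamStats) -> dict:
    """Negative values mean Arsenal had the advantage in that metric."""
    return {
        "shots": opponent.shots - arsenal.shots,
        "on_goal_ratio": opponent.on_goal_ratio - arsenal.on_goal_ratio,
        "fouls": arsenal.fouls - opponent.fouls,  # assumed direction
        "card_points": arsenal.card_points - opponent.card_points,
    }

# Invented single-match numbers, for illustration only.
arsenal = TeamStats(shots=16, shots_on_goal=8, fouls=10, yellows=1, reds=0)
opponent = TeamStats(shots=8, shots_on_goal=2, fouls=14, yellows=3, reds=0)
print(differentials(arsenal, opponent))
```

Note that the shot metrics subtract Arsenal from the opponent while fouls and cards subtract the opponent from Arsenal, so that a negative number always reads as good news for Arsenal.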

In the end, the two GLMs should help us understand where the bias lies, and a few of its potential causes.

Addressing the Home Pitch Bias of Referees

The first GLM tests whether the referee bias against visiting teams, as described in Scorecasting, exists in the Premier League.

The GLM used all data from the 2007 through 2011 seasons for all referees, categorizing it as home and away while ignoring the contributions of individual referees. It turned out that none of the interaction effects were statistically significant, so the analysis focused on the main effects.  Main effects plots for each of the four attributes are shown below, with commentary below each plot.

Note: The key to reading main effects plots is to look for the center line traveling across the middle of the graph. This indicates the overall average value for that metric, while the average values associated with the individual levels of the factors (x-axis) are indicated by the discrete points on the graph.
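For readers who prefer numbers to plots, the quantities on a main effects plot are just a grand mean and a set of per-level means. A minimal sketch follows; the rows and the `main_effects` helper are invented for illustration and are not the study's data or software.

```python
from collections import defaultdict
from statistics import mean

# Invented rows: (season, venue, card-point differential); venue 1 = home, -1 = away.
rows = [
    (2009, 1, -2.0), (2009, -1, 0.0),
    (2010, 1, -1.0), (2010, -1, 1.0),
]

def main_effects(records, factor_index, value_index=2):
    """Return (grand mean, {level: mean of the response at that level})."""
    by_level = defaultdict(list)
    for rec in records:
        by_level[rec[factor_index]].append(rec[value_index])
    grand = mean(rec[value_index] for rec in records)
    return grand, {level: mean(vals) for level, vals in sorted(by_level.items())}

grand, by_venue = main_effects(rows, factor_index=1)
_, by_season = main_effects(rows, factor_index=0)
print(grand, by_venue, by_season)
```

The grand mean corresponds to the center line, and each per-level mean corresponds to one of the discrete points plotted for that factor.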

It is clear that while match venue has little impact on the foul differential, Arsenal’s beneficial foul differential has shrunk to nothing over the last 4+ seasons. In fact, this shrinkage is one of the rare statistically significant factors not aligned with match venue in any GLM in this study. Perhaps it’s Arsenal’s increasingly tough response to “kick them off the pitch” tactics that has generated this shift? Whatever the case, Arsenal has gone from being one of the cleanest teams to middle of the pack when it comes to fair play.

When it comes to throwing cards, 2011 does represent a new high point for Arsenal but the trend is not statistically significant. Clearly, though, match venue has a huge impact (statistically significant, in fact!). The average home match (value of 1) sees Arsenal acquiring one less Premier League fantasy soccer penalty point (essentially one less yellow card) than the opposition, while away from home (value of -1) they acquire a similar amount of penalty points as their opponents. Scorecasting’s biased referee observations are alive and well!
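One simple way to sanity-check a home/away gap like this outside of a GLM is a permutation test on the difference in means. To be clear, this is a different technique from the GLM significance tests used in this study, and the numbers below are invented; the function name and parameters are mine.

```python
import random
from statistics import mean

def permutation_p_value(home, away, n_shuffles=10_000, seed=0):
    """Two-sided permutation test on the difference in mean differential."""
    rng = random.Random(seed)
    observed = abs(mean(home) - mean(away))
    pooled = list(home) + list(away)
    hits = 0
    for _ in range(n_shuffles):
        # Shuffle the venue labels and recompute the gap between the two groups.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(home)]) - mean(pooled[len(home):]))
        if diff >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate away from an impossible p of zero.
    return (hits + 1) / (n_shuffles + 1)

# Invented card-point differentials, for illustration only.
home = [-2, -1, -1, 0, -2, -1]
away = [0, 1, 0, -1, 1, 0]
p = permutation_p_value(home, away)
print(p)
```

A small p-value says that a gap this large rarely appears when the home and away labels are assigned at random, which is the same question the venue term in the GLM is answering.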

The behavior witnessed in the shots metric is similar to that seen in the cards category. Arsenal holds a pretty steady seven shots per game advantage over the competition over the last three years, but a statistically significant gap exists between home and away matches.

Finally, it seems Arsenal are improving their effectiveness at putting shots on target versus the competition. The last two seasons have seen the Gunners turn what was a deficiency into a benefit: not only do they take more shots on average, but they also put a greater percentage of them on target. The fact that the difference between home and away performance is not statistically significant suggests that they are using their reduced shot advantage when away from the Emirates in a manner consistent with home performance.

Ultimately, Arsenal seems to be doing pretty well in the shots department, getting worse when it comes to fouls, and consequently suffering from expected referee bias against away teams when it comes to cards. This final conclusion is especially important given the skewed home/away officiating opportunities afforded several of the referees highlighted in the next section.

Identifying Referees Who Are Biased For and Against Arsenal

The second GLM involved using 2007 through 2010 data to observe any possible bias in the following referees: Atkinson, Bennett, Dean, Dowd, Foy, Halsey, Webb, and Wiley. These referees have officiated 56% of Arsenal’s matches over the last four seasons. Main effects plots for each of the four attributes are shown below, with commentary below each plot. In general, none of the factors studied is statistically significant per the standard GLM tests. What is interesting is that the average value for every metric shrinks compared to the full 2007 through 2011 study performed above. Perhaps the big name matches that guys like Webb officiate provide a better balance between the teams, or maybe it's an effect of the "better" referees getting more matches. Either way, the average gap between Arsenal and their opponents closes when these eight referees are involved.

In general, the average foul differential is tilted slightly in Arsenal’s favor when these eight referees officiate a match compared to the overall average seen in the previous section. This may be due to the exclusion of 2011 to achieve a balanced GLM in the first example, where we observed that overall they've seen their foul differential disappear. It also seems as if Dowd is one of the more generous referees when his 2011 performance is dropped and his 2007 through 2009 performance is added. This may indicate a shift in his average officiating in 2011, a subject I will explore later in this post. Conversely, Dean, Webb, and Wiley have their bias against Arsenal confirmed in the foul department. They also are the top three referees when it comes to the number of Arsenal matches officiated, with Wiley and Webb having a pretty even home/away split while Dean has a 2/13 home/away split that may be unduly influencing the results of his officiating in Arsenal matches.

While Wiley has the highest foul differentials against Arsenal, he has one of the lowest card differentials. On the other hand, Webb and Dean follow through on the fouls with the two highest card differentials. With an average card differential of nearly -0.50 across these eight referees, Dean’s and Webb’s card differentials of nearly +0.5 against Arsenal sit about one point above that average, which represents nearly one extra yellow card per match for the Gunners. Again, Dean’s results may be biased based upon his high ratio of away matches, but Webb’s officiating is nearly evenly split between home and away matches.

As with the other responses, average shot differential goes down compared to the total sample of 2007 through 2011 matches. Dean and Webb again lead the pack in terms of officials showing an anti-Arsenal bias. Bennett and Foy appear to be the most pro-Arsenal referees, with Atkinson and Dowd not far behind.

The shots-on-goal ratio shows perhaps the biggest average shift when compared to the full 2007 through 2011 GLM. Webb and Dean rank close to the average, with Dean even showing slight favor to Arsenal. Matches where Foy officiates show the biggest disadvantage for Arsenal, with nearly a 10% gap between their ratio of shots-on-goal versus the opponent’s.

The Increasing Bias of Phil Dowd

So where does this leave Phil Dowd in relation to the Gunners, especially given my last post?  It would appear from the data above that he's actually biased for, and not against, Arsenal!  A closer examination of the data suggests otherwise.

It seems as if there is a shift afoot in Mr. Dowd’s officiating when it comes to the Gunners. We can see this shift when looking at the interaction plot of year vs. referee from the second GLM that was created. The interaction plot shows the average value of each combination of factors, year and official. The key element to look for is non-parallel lines, which indicate an interaction between the two factors: here, an official whose results change from year to year differently from those of the other officials. The interaction plot is shown below.
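The points on an interaction plot are cell means, and "non-parallel" simply means that the season-to-season change differs between referees. A minimal sketch of that calculation, with invented rows and helper names of my own choosing:

```python
from collections import defaultdict
from statistics import mean

# Invented rows: (season, referee, card-point differential). Not the study's data.
rows = [
    (2009, "Referee A", 0.0), (2009, "Referee A", 1.0),
    (2010, "Referee A", -1.0), (2010, "Referee A", 0.0),
    (2009, "Referee B", -1.0), (2010, "Referee B", 1.0),
]

def cell_means(records):
    """Average response for every (referee, season) cell."""
    cells = defaultdict(list)
    for season, referee, value in records:
        cells[(referee, season)].append(value)
    return {key: mean(vals) for key, vals in cells.items()}

def season_changes(records, start, end):
    """Change in each referee's cell mean from `start` to `end`.

    Equal changes would plot as parallel lines; unequal ones are the
    non-parallel lines an interaction plot is read for.
    """
    means = cell_means(records)
    referees = sorted({ref for ref, _ in means})
    return {
        ref: means[(ref, end)] - means[(ref, start)]
        for ref in referees
        if (ref, start) in means and (ref, end) in means
    }

print(season_changes(rows, 2009, 2010))
```

In this made-up example Referee A's average falls while Referee B's rises, so their lines would cross rather than run parallel.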

Looking at the lower left graph shows how nearly every referee's average points based upon yellow and red cards went down from 2009 to 2010. The only referee's score to go up was Dowd's. Things have gotten worse in 2011 as well, with Dowd's average over the three games he's officiated coming in at a whopping 2.33. Dowd is the most biased against Arsenal in the 2011 season and has seen his card point total increase by nearly three points (a full red card!) from last season. In four seasons Dowd has gone from a -2.0 card point total in 2008 to a 2.33 total in 2011 (a full red and yellow card swing). No other referee exhibits this kind of shift. Over that time period Dowd has officiated seven home matches and four away matches, clearly bucking the trend of biased officiating coming against away teams.
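The size of that swing is simple arithmetic on the two averages quoted above. The sketch below assumes the card weights implied in this post (1 point for a yellow, 3 for a red); the variable names are mine.

```python
YELLOW_POINTS = 1  # assumed from the post: one point is "essentially one yellow card"
RED_POINTS = 3     # assumed from the post: three points is "a full red card"

dowd_2008 = -2.0   # average card-point differential quoted in the post
dowd_2011 = 2.33   # average over three matches, quoted in the post

# Swing from favoring Arsenal in 2008 to going against them in 2011.
swing = dowd_2011 - dowd_2008
red_plus_yellow = RED_POINTS + YELLOW_POINTS
print(round(swing, 2), red_plus_yellow)
```

The swing of roughly 4.33 points is slightly more than the 4 points a red card plus a yellow card would be worth under those weights.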


Clearly Webb and Dean are the referees that generally make more calls against Arsenal, with Webb the only one of the two with a reasonably balanced home/away officiating record that would eliminate away team bias from consideration.  Phil Dowd is rapidly approaching Webb's level of bias, with a massive shift in the way he's called matches since 2008, and a higher proportion of home matches officiated that eliminates away team bias as an excuse.

Foy and Atkinson may provide the most reliably pro-Arsenal calls.  Both provide the best combination of foul and red/yellow card advantages to Arsenal after Dowd is eliminated for his time-based shift in officiating results.

While Arsenal have consistently had a penalty and card advantage in years past, such an advantage is almost non-existent this season and last.  However, they have maintained their advantage in number of shots and percentage of shots on goal.

As a Gooner, I hope that we draw guys like Atkinson and Foy for the last quarter of the season. This gives the Gunners the best chance to close the gap to Manchester United. I'll come back to this topic once the season is complete and we hopefully have a higher number of referees with home and away match officiating opportunities. That would allow a more complete GLM of officials vs. venue and year.

Of immediate concern to Gooners is the fact that Dean will be officiating this Sunday's Carling Cup final.  Here's hoping he shows far less bias on the neutral pitch of Wembley than he has in Arsenal's away matches.


  1. I have noticed similar things...this is a very good blog, well researched and really should have more attention.

    I do statistical analysis on referees pre-match for Arsenal games - it seems that the likes of you and me have to do the job of the FA, PGMOL and the world of sports journalism for them!

    My latest post is here:


    I'm sure you could do a guest spot on the site to draw some traffic your way? I think you would be most welcome!

  2. Dog -

    Great post at your site! It appears that you have a wider data set of referees and how they've called matches for multiple teams. How many seasons and teams does it cover?

    Let me think about what analysis I might do with such data, and I'll get in contact with some ideas of what a guest post might contain.

  3. No worries Zach - I have mountains of data for all referees over all matches going back 10+ years (although I only have betting data for the last 5 seasons or so), maybe we should discuss some analysis techniques?

    Send me an email :)
