The top spots in the Castrol rankings provide few surprises, but the important decisions in soccer are usually more difficult than evaluating whether van Persie is a good player. Further down the list, players have a multitude of strengths and weaknesses and varying playing time, and may get lost in the crowd. Once past the top stars, assessing performance becomes more challenging.
Thus, we look to the performance measure for assistance in evaluating performance for the entire population of EPL players. However, our chosen measure has a weakness in describing performance for part-time players. Specifically, a player’s minutes are highly correlated with his Castrol Rating score (a correlation of 0.88). With such a strong correlation, the influence of playing time may drown out actual performance. Opta, the creators of the Castrol Index, admit that the scores of those players with insufficient playing time are punished, with the downward scoring starting with players at about 60% of the overall minutes available.
Using the 60% standard as a guide, the relationship between playing time and performance scores was studied above and below a 2070-minute threshold (equal to 23 of the 38 games in the EPL season, or about 60% of the available minutes). For the 164 players with more than 2070 EPL minutes (the “2070+ group”), there is no significant correlation (a correlation of 0.019) between minutes and Castrol score. For the group below 2070 minutes (the “< 2070” group), the relationship remains strong (correlation of 0.89). Thus, it seems there is a threshold of playing time required before one can get an unbiased evaluation of performance.
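The split-sample check described above can be sketched in a few lines of Python. This is a minimal illustration, not the author's actual workflow: the function names and the sample (minutes, score) pairs are made up, and only the 2070-minute threshold comes from the text.

```python
from math import sqrt

THRESHOLD_MINUTES = 2070  # 23 matches x 90 minutes, as in the text


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_by_group(players, threshold=THRESHOLD_MINUTES):
    """Return (r_below, r_above) for (minutes, score) pairs split at threshold."""
    below = [(m, s) for m, s in players if m < threshold]
    above = [(m, s) for m, s in players if m >= threshold]
    return pearson(*zip(*below)), pearson(*zip(*above))


# Made-up (minutes, score) pairs, purely for illustration.
sample = [(180, 90), (540, 210), (900, 330), (1500, 520), (1980, 640),
          (2100, 700), (2500, 650), (2900, 720), (3300, 660)]

r_below, r_above = correlation_by_group(sample)
print(f"below threshold: r = {r_below:.3f}")
print(f"above threshold: r = {r_above:.3f}")
```

With real data, a strong correlation below the threshold and a near-zero one above it would reproduce the pattern reported here.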
Adjusted Index Scores
Consequently, the < 2070 group’s scores are adjusted so that an unbiased comparison can be made between them and the unbiased 2070+ group. A single regression equation fitted to the < 2070 group gives the expected score for a given amount of playing time. Comparing a player’s Castrol Index score with that expected score shows whether he exceeded expectations; the remaining question is by how much.
To evaluate this, the entire EPL population was divided into roughly a dozen equal groups, and the range of scores was compared around the average for each group. The range around the mean for our unbiased groups was set as the goal and the < 2070 groups were adjusted to have similar variability. Finally, the player’s relative performance in relation to their playing time was taken into account and they were put in the spot of a player with 2070 minutes – the cutoff for unbiased evaluation by the Castrol Index. In short, the goal of the adjustment process was to allow direct comparisons between the < 2070 group and the 2070+ group.
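As a rough sketch of the adjustment, the snippet below follows the formula the author gives in the comment thread (adjusted score = time-predicted score + (actual score minus time-predicted score) * subgroup multiplier). The regression line and the multiplier used here are hypothetical placeholders, since the article does not publish the fitted values, and the sketch does not attempt any step beyond that stated formula.

```python
THRESHOLD_MINUTES = 2070


def adjusted_score(minutes, actual, predict, multiplier):
    """Rescale a player's residual around the time-predicted score.

    predict:    function mapping minutes -> expected Castrol score
                (stands in for the regression fitted on the < 2070 group).
    multiplier: subgroup-specific factor that stretches the residual range
                to resemble the range seen in the 2070+ subgroups.
    """
    if minutes >= THRESHOLD_MINUTES:
        return actual  # scores in the 2070+ group are left unadjusted
    expected = predict(minutes)
    return expected + (actual - expected) * multiplier


def predict(minutes):
    """HYPOTHETICAL regression line; not the fitted equation from the article."""
    return 50 + 0.25 * minutes


# A 900-minute player who beat the hypothetical expectation (275) by 30 points,
# with a hypothetical subgroup multiplier of 2.0.
print(adjusted_score(900, 305, predict, multiplier=2.0))
```

The multiplier only changes how far a player sits from the expected line; a player exactly on the line keeps the expected score whatever the multiplier.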
All in all, the adjusted scores are essentially unrelated to minutes (a correlation of 0.019), which compares favorably to the correlation of 0.88 between minutes and the original Castrol Index ratings. Further, the range of players’ scores is reduced since playing time no longer artificially discriminates amongst performance levels. Finally and most notably, the adjustment process unsettles some of the player rankings and provides a new perspective on performance.

When one turns to the individual players, the adjusted scores seem to better reflect conventional opinion as well. On its face, it is tough to argue that Cissé, Drogba, or Balotelli were not difference makers on the pitch. And with some players moving up due to the adjustment process, some players see their rankings fall in comparison. For example, Newcastle forward Demba Ba falls from 10th in the Castrol rankings to the 21st ranked forward under the Adjusted Index. Theo Walcott moves from the 21st Midfielder to 40th. Some may debate the movements of individual players, but the aggregate movement of the players in each position seems to square better with performance adjusted for playing time.
One should keep in mind that a good score on the index, or the adjusted index, does not eliminate normal variability in performance. One good year does not guarantee a standout performance the next. A low rating may have an explanation that soccer fans can readily identify with such as an injury, or playing out of position, or trouble acclimating to a new team. But a good performance is still a good performance, and awareness of a good performance, even if only for a limited period of time, is useful.
Utility of the Adjusted Index
Using the adjusted formula, one can then locate players that have put in an objectively superior performance, but may not be as recognized due to a relative lack of playing time. For example, Ryan Giggs was impressive in his limited time on the pitch. With the adjustments, his performance score went from 570 to 730 under the Adjusted Index, and his position rank rose from 62nd to 15th, a dramatic increase. And while the professionals will still need to predict how a player would fit in, and the likelihood of improved or decreased performance, the ability to quickly highlight players that deserve extra attention can save time and increase the scout’s success rate. Ryan Giggs’ age may prevent him from going a full 90 minutes in each league match, but the data as re-envisioned does suggest that Manchester United should look to utilize him as much as physically possible.
Some may use the scores as an independent assessment of roles within a team. For example, Dimitar Berbatov has publicly expressed dissatisfaction with his time at Manchester United. His Castrol score is 241, good for 328th overall in the EPL and 58th among EPL Forwards. He has reason to argue that his poor showing is due to lack of playing time. In contrast, his Adjusted Index score is 687, which is 81st in the EPL and 22nd among EPL Forwards. In general, it is a decent showing for the amount of time on the field he was granted. Perhaps the bad news for Berbatov is that he is still fifth among Manchester United Forwards, behind the adjusted scores for Rooney (837), Welbeck (804), Young (796) and Chicharito (708). Thus, we are not surprised that his pleas for more time have not been answered.
The Winning Formula in EPL
In this section, the objective performance data – i.e., the Castrol Index and the Adjusted Scores – is used to analyze the relative contribution of the different positions towards team success, and to model the value of players. Midfielders and Defenders have higher average Castrol Index scores, an initial suggestion that these positions might have more value for winning.

Considering forwards are the highest paid, investing in defenders may be a more productive use of resources. However, we have already discussed the Castrol Index scores' relationship to playing time. And typical playing time differs among the positions because managers are more likely to pull forwards mid-game than their back line. Indeed, forwards average about 400 fewer minutes per season than defenders. With the Adjusted scores, which are unrelated to playing time, we see that forwards are higher on average, a result in direct contrast to the Castrol Index averages.

Position Ratings and Relationship
Using both the Castrol Index and the adjusted scores for EPL players, I set about replicating the Leinwand-Anderson analysis of MLS data [6]. Their analysis created a regression equation using a team’s average Castrol Index rating for each position and the team’s league points (i.e., a mathematical model roughly summarized as: a Constant + Fwd Rating + Mid Rating + Def Rating + GK Rating = Expected League Points, with each rating multiplied by its fitted coefficient). Replicating the analysis using EPL data resulted in an R² value of 0.835, indicating the average position ratings for a team can explain a healthy 83.5% of the variability in league points. But for the individual elements of the equation, only the midfielder and goalkeeper coefficients are statistically significant in this model. Thus, strictly speaking, we would be unable to justify using the model for forwards and defenders.
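Written out, the model summarized above takes the usual multiple-regression form. The symbols are notation chosen here for illustration, not the original authors' own: the barred terms are team t's average rating at each position, the betas are the fitted coefficients, and epsilon is the residual.

```latex
\text{Points}_t = \beta_0 + \beta_F \,\bar{F}_t + \beta_M \,\bar{M}_t + \beta_D \,\bar{D}_t + \beta_{GK} \,\overline{GK}_t + \varepsilon_t
```

With 20 EPL teams there are 20 observations and five parameters to estimate, which is worth keeping in mind when reading the significance results for the individual coefficients.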
Using the adjusted ratings instead, the R² value was 0.857, which adds about another 2 percentage points over the model using the original Castrol Index ratings. In this version, though, only the goalkeeper coefficient was statistically significant. Consequently, the adjusted ratings were a slightly better predictor of team performance, but of less value for individual positions.
Regression Weighted by Playing Time
The next revision of the model sought to address a unique soccer feature of the regression equation. The Leinwand-Anderson equation treats Forwards, Midfielders and Defenders as equal units. But it is known that there are uneven numbers at each position depending on team formation and game situation. Thus, the equation was modified to account for the relative time contribution of each position. To do this, average position score was used from the first analysis, and weighted by that position’s contribution to the overall team minutes. For a mathematical illustration, the Tottenham forwards may have an adjusted index average of 738, and contribute 12% of the team’s minutes. The revised model would then report 0.12 * 738 or 88.56 as a value of that position's point contribution to the team.
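The weighting step can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the sample team are hypothetical, and the position average is taken as a simple mean of player ratings, which is one reading of "average position score" rather than something the text confirms.

```python
from collections import defaultdict


def weighted_contribution(avg_rating, minutes_share):
    """Position term used in the revised model: share of team minutes x average rating."""
    return avg_rating * minutes_share


def position_terms(players):
    """players: iterable of (position, minutes, rating) for one team.

    Returns {position: minutes share x simple mean rating}.
    """
    minutes = defaultdict(float)
    ratings = defaultdict(list)
    for position, mins, rating in players:
        minutes[position] += mins
        ratings[position].append(rating)
    total_minutes = sum(minutes.values())
    return {
        pos: weighted_contribution(sum(ratings[pos]) / len(ratings[pos]),
                                   minutes[pos] / total_minutes)
        for pos in minutes
    }


# The Tottenham illustration from the text: 12% of minutes, average rating 738.
print(round(weighted_contribution(738, 0.12), 2))

# HYPOTHETICAL team, for illustration only.
sample_team = [("F", 1000, 700), ("F", 2000, 800), ("M", 3000, 650),
               ("D", 3000, 600), ("GK", 1000, 500)]
print(position_terms(sample_team))
```

These weighted terms, one per position per team, then replace the plain position averages as the predictors in the regression.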
By weighting each position by playing time, the multiple regression coefficients for the position's adjusted point contribution were significant for Forwards, Midfielders, and Defenders, with a non-significant p-value of 0.13 for Goalkeepers. Thus, this model permits analysis of all field positions. In this model, forwards have a larger regression coefficient than midfielders, and both are larger than defenders. This would support the notion that forwards contribute more to wins than other positions. Further, the R² value was 0.887, which, by explaining 88.7% of the variance in league points, makes this the model with the greatest explanatory power (compared to 0.835 for the Leinwand-Anderson model and 0.857 for the Leinwand-Anderson model using adjusted index ratings).

Considering which equations are the best fit for the 2011/12 EPL data, the trends are that weighting the ratings by minutes is better than a straight position average, and that the adjusted scores are better than the Castrol Index. Of these options, the weighted-adjusted model is the best fit, which mirrors the conclusion reached for the MLS data [7].
Applying the Model to Individual Examples
Although this mathematical model shows each position’s contribution to league points, the model is clumsy for team use. Contracts are determined player by player, and not by position group. Thus, in order to make the equation useful, the regression equation coefficients need to reflect the value of one player. Consequently, it was determined that forwards accounted for 16.4% of all league minutes, midfielders 38.9%, defenders 35.4% and goalkeepers 9.2%. With eleven players on the pitch, it was then calculated how many "players" each position group represents. Using forwards as an example, the position had about 1.8 players' worth of league minutes (11 × 0.164), and this figure was applied to the position's regression coefficient.
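The "players' worth of minutes" figure is simple arithmetic on the league-wide minute shares quoted above. The sketch below reproduces only that step; the variable names are illustrative, and it does not attempt the conversion of regression coefficients, which the article does not spell out.

```python
PLAYERS_ON_PITCH = 11

# League-wide share of minutes by position, as quoted in the text.
minutes_share = {"F": 0.164, "M": 0.389, "D": 0.354, "GK": 0.092}

# Player-equivalents: how many of the eleven on-pitch slots each position fills.
player_equivalents = {
    position: PLAYERS_ON_PITCH * share
    for position, share in minutes_share.items()
}

for position, equivalents in player_equivalents.items():
    print(f"{position}: {equivalents:.2f} player-equivalents")
```

Read this way, the forward group's minutes are spread over roughly 1.8 players while the midfield group's are spread over roughly 4.3, which is why a group-level coefficient and a per-player value can rank positions differently.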
As a result, the estimated impact on the team's league points of inserting a field player with a higher adjusted index score can be calculated. Recalling our model, the forward position group had the highest contribution to league points, about 1.5 times the contribution of midfielders. But when comparing a single player to a single player, the value of a forward was three times greater than that of a midfielder, due in large part to the midfielder group consisting of more players than the forward group.

Which Positions Are Most Important in the EPL
However, the value of a player is defined not only by his contribution to wins, but also by the scarcity of that talent. If there were dozens of van Persie clones seeking transfers to EPL teams, the teams would rightly concentrate on other areas to improve their team (after picking up one of the van Persie clones on the cheap). To illustrate, the adjusted index scores were revisited to provide some examples of the value of each position by considering the benefit of an equivalent player upgrade at each position. Specifically, what is the value of replacing a median position player with the 80th percentile player at that position for one full season?
Upgrading from a median forward to an 80th percentile forward (represented by F. Torres) would be expected to yield an additional 18.68 league points. Upgrading from a median midfielder to an 80th percentile midfielder (S. Pienaar) would be expected to yield an additional 4.21 points. And upgrading from a median to an 80th percentile defender (L. Baines) would be expected to yield an additional 3.03 points. While the hypothetical 50th-to-80th percentile upgrade is worth between 70 and 100 index points at each position, the equation indicates that it would have a much larger impact on league points if it were a forward upgrade.
Team-Specific Use of the Model
For another use of the model, a team may weigh the benefits of a transfer on league results by applying the difference between a desired player and the current incumbent. For example, Clint Dempsey is listed by Opta as a midfielder and has an adjusted score of 742. Dempsey played nearly every minute of the EPL season for Fulham in 2011/12. If Dempsey, and his minutes, replaced an average Liverpool midfielder (633), then Liverpool would expect an additional 6.1 league points under the model. In a person-for-person switch, Dempsey in place of C. Adam (and at Adam's lower level of minutes) would lead to an expected additional 3.33 points. This is still a benefit, but a smaller one, due to Adam's higher performance level (650) and the assumption of less playing time.
If Arsenal considered acquiring Dempsey, the same process can be applied using team-specific data. Dempsey replacing an Arsenal average Midfielder, and with Dempsey’s 2011 minutes, would lead to an expectation for an additional 2.63 points for Arsenal. Should Dempsey replace Theo Walcott and his minutes, Arsenal would expect an additional 3.33 points. Consequently, the value of Dempsey varies depending on the team making the assessment and the intended use. The model would explain some of the varying assessments of value, and is flexible enough to make some adjustments for playing time and role.
And in a pinch, a team’s evaluation staff can insert an “expected” performance level for an unknown (i.e., one can say the target can fulfill a “John Terry” role, thus insert Terry’s adjusted value of 720 into the equation). Alternatively, again relying on technical expertise, one can use a high and low benchmark provided by the staff to create a range of expected performance for a new player (e.g., the staff may conclude the target’s ceiling may be Dempsey (742) and likely no worse than Scholes (686)). Applying the predicted range can provide feedback on whether the transfer costs are worth the expected performance gain and risk. Finally, it’s worth pointing out that although the model relies on the Castrol Index as the basic metric, this analytical process can be repeated with any available continuous variable that a team believes predicts team results.
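The high/low benchmark idea can be sketched as a small calculation. Everything here beyond the quoted scores is an assumption: the function names are hypothetical, the gain is assumed to scale linearly with the score difference and with the fraction of minutes played, and the slope is not a coefficient the article reports. It is backed out of the Dempsey example above (6.1 points for a 742 versus 633 swap at full minutes), so it would apply, at best, to a Liverpool midfielder.

```python
# ASSUMPTION: league points per adjusted-index point for a full-minutes midfielder,
# implied by the article's Dempsey example; not a published model coefficient.
IMPLIED_MID_SLOPE = 6.1 / (742 - 633)


def expected_points_gain(target_score, incumbent_score, slope, minutes_fraction=1.0):
    """Expected extra league points from swapping the incumbent for the target.

    Assumes the gain is linear in the score difference and in playing time.
    """
    return (target_score - incumbent_score) * slope * minutes_fraction


def gain_range(low_benchmark, high_benchmark, incumbent_score, slope,
               minutes_fraction=1.0):
    """Return (low, high) expected gains for a floor and ceiling benchmark."""
    low = expected_points_gain(low_benchmark, incumbent_score, slope, minutes_fraction)
    high = expected_points_gain(high_benchmark, incumbent_score, slope, minutes_fraction)
    return low, high


# Floor of Scholes (686) and ceiling of Dempsey (742), replacing an average
# Liverpool midfielder (633) at full minutes.
low, high = gain_range(686, 742, 633, IMPLIED_MID_SLOPE)
print(f"expected gain: {low:.2f} to {high:.2f} league points")
```

A club would weigh a range like this against the transfer cost, and could rerun it with its own team-specific coefficient and a more cautious minutes assumption.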
Limitations of the Data, and Next Steps
This analysis is based on one season of EPL data. And while the process yields similar results to the MLS analysis, the fact remains that any decisions for EPL players are based on 20 observations (the EPL teams of 2011/12) summarizing the play of over 500 players. Even with strong relationships, the addition of future data will only improve the model. For example, with one season’s worth of data, the variability of a player’s performance is not captured. Two players may have the same average performance rating for the season, but teams may value a player with large highs and incredible lows differently than a player that puts in the same performance game after game. This model does not account for that variation. Further, with the adjusted scores, the relationship to playing time is minimized. However, a player that can remain at a high level over a whole season should rightly be more desirable than a player with a solid run over five or six games. The published Castrol Index attempts to take playing time into consideration, but this model does not for the reasons given above. Thus, while there are benefits to this analysis, one would still need to make their own subjective adjustment for playing time, based on their experience and judgment.
With the caveats aside, an imperfect model does not make the model useless. This is an attempt to determine the value of players based on relationships found in the data. If one uses the adjusted index scores as a quick screening tool, the obligation to follow-up and evaluate the player remains. The screening tool is not meant to replace good scouting, only to efficiently assign resources so that a scout’s hit rate might increase.
Further, when one considers whether a potential replacement player would contribute to winning, the cost of that player (which is not part of the EPL model) needs to be considered against team resources and roster-wide performance consequences. Obviously money matters, but the model creates an analytical framework – based on data – to decide whether a club are getting sufficient value for their money (in additional points). It provides a unique independent data point worthy of consideration.
The hope is that this analysis adds to the debate, and supports the incorporation of analytical processes in team decision-making.
Dave Laidig is a corporate attorney responsible for reviewing and negotiating contracts. Prior to attending law school, he indulged his quantitative predisposition by earning a Masters in Psychology, focusing on research methods, statistics, and measuring human factors. He resides in Minnesota and is a season ticket holder for his Minnesota Stars FC.
In addition to the performance metric, the calculations also require basic player information such as player name, team name, position and minutes played. This information is widely reported by many media outlets. For this analysis, however, I used the player info associated with the Castrol scores and the playing time data available by subscription from the EPL Index. See
The Castrol Rating also includes European competition, which is weighted slightly more than league games. The Castrol Index is meant to determine which European players had the best year, and the authors of the index determined that performance in European competition should be emphasized, which is a reasonable determination for that purpose. Here, we offer an analysis of league results, which renders the Castrol Index a less-than-perfect performance measure for teams with extended European campaigns. However, imperfect data may still be useful, and the analysis continues.
. This calculation relies heavily on the work of Benjamin Leinwand and Chris Anderson as I attempted to replicate their work, and then extend to new areas. I appreciate those willing to share results with the public, allowing for debate to improve and continue.
I'm a bit confused with regards to paragraph 2 of "Adjusted Index Scores". How were the groups made exactly and how did you control the variability?
The adjusted index scores reflect my attempt to compare a player's Castrol rating to the expected rating of someone with similar playing time. Using playing time as the independent variable and Castrol score as the dependent, one gets a line with the actual data points spread around it. Visualizing the graph, the actual points are clustered near the predicted line for the players with little playing time, and quickly spread out to a somewhat equivalent distribution after that. Noting the difference between the time-predicted score and actual score is not uniform between 0 and 2070 minutes, I broke them into groups for analysis of the variability by minutes played.
I organized the players by minutes played, and created groups of 40 or so (from memory, 40 +/- 3) throughout the data set. As a matter of organization and convenience in scanning through the data, I decided to use groups that ended on multiples of 90 minutes (and ensuring an n=40 +/-3). Next, the descriptive data for each subgroup showed the mean and range of data for each subgroup. The 2070+ subgroups had a similar range of actual scores, and the <2070 subgroups had similar ranges save for the subgroups closest to 0.
Based on the initial analysis and a lack of relationship between playing time and scores, the subgroups representing 2070 minutes and above served as the standard. The range for the <2070 groups was set to mimic the range for the 2070+ groups through the use of an appropriate multiplier. Consequently, the adjusted score equals the time-predicted score + ((actual score minus time-predicted score) * subgroup multiplier).
Admittedly, this is my personal attempt to determine the value of a particular Castrol rating given the minutes played, and I realize others may take another route. However, this approach led to an increase in explanatory power over straight Castrol scores for both the MLS data (2011 season) and the EPL data (2011/12 season). Should another approach be desired, I would expect that it adds some incremental validity to the process.
So you split them into the groups. When you are saying you made the variation the same, was this by making the range of each group about the same or by using a multiplier that made the standard deviation/variance the same? So after you have found the multiplier, you multiplied it by the difference between the actual value and the predicted value for the given minutes? And then how did you adjust all of these to make them 2070 minutes? Sorry, just trying to make this recreatable with some other data to see if it comes out to a similar idea. Thanks!
For the calculation of the multiplier for the subgroups, I focused on the range, and having the avg range of the <2070 groups mirror the avg range of the 2070+ groups (with the groups closest to 0 having their own multiplier, as they have a much more truncated range than typical <2070 groups). I did not adjust based on the Std dev of the subgroups.
As the scores for the 2070+ group are not related to playing time, I did not adjust scores for players in this group. Thus, once an appropriate multiplier is determined, the adjustment process is applied only to scores for players below 2070. The process is summed up as the multiplier times the delta (actual - expected) plus the expected value. And this adjustment led to the language that the <2070 group is put in a position on par with the 2070+ group.
And as is evident from the steps listed above, each league would have its own specific adjustment. The MLS formula is different (but the EPL R-squared is higher).