Prior to last week's vacation I cranked out a number of posts at Forbes on Manchester City's performance analysis department. The first two were a transcription of an interview with the club's head of performance analysis, Gavin Fleig. The nearly 6,000 words he and I exchanged provide a good bit of information on how the club uses analytics, from the youngest youth sides all the way up to the senior first team. The third post outlined how the recently launched MCFC Analytics site is attempting to jump-start the public soccer analytics movement by making freely available a substantial set of Opta data that was previously accessible only for a fee. Have a look at all three posts if you're interested in understanding how last year's Premier League champions use analytics to stay on top while giving something back to the soccer analytics community at the same time.
Sunday, August 26, 2012
Tuesday, August 14, 2012
The following is a guest post by Dave Laidig1.
Measuring 2011/12 EPL Player Performance
Soccer is known as the beautiful game, an art form expressed on grass fields instead of canvas. But the results-oriented nature of elite competition removes artistic illusions. Performance matters when literal fortunes are on the line. Teams invest a great amount of resources to identify and nurture players with the greatest ability to perform. They and their fans want to know: who is going to do the most to help the team win?
Historically, this assessment has been made by wise old managers who are experienced in the game and were often players in their former lives, and by the legions of supporters who never fail to share their opinions. In the modern game, assessments have begun to be quantified, allowing for more objective descriptions of player performance. This analysis tests whether a publicly available performance statistic actually relates to winning, and considers the possible uses of that information.
The key component of this analysis is the performance measure2. The Castrol EDGE Rating, published by Opta, which assigns a score to players in all five of the major European soccer leagues, was selected for this analysis3. In addition to the benefit of being publicly accessible, this score is designed to reflect an individual's overall contribution to winning and to identify which players had the best season among the top five European leagues.
The Castrol Rating metric is based upon observations of game data in the 2011/12 EPL season and Opta’s application of an internal calculation incorporating many types of game events and relating them to winning games. While the exact formula for creating the statistic is the proprietary information of Opta, the important thing to know is that the number is intended to represent overall performance (and not just one aspect of the game like goals scored, or tackles, etc.). It should allow for meaningful comparisons between different players and positions.
A review of the Castrol Index scores for 2011/12 shows us that Arsenal forward Robin van Persie earned the highest rating for his work last season, with a rating of 9624. If the acclaim of sportswriters is indicative, van Persie deserved the top spot in the EPL. Scanning the top spots for each position, we see very recognizable players that are not likely to surprise most fans.
The top spots in the Castrol rankings provide few surprises, but the important decisions in soccer are usually more difficult than evaluating whether van Persie is a good player. Further down the list, players have a multitude of strengths and weaknesses, varying playing time, and may get lost in the crowd. Once past the top stars, assessing performance becomes more challenging.
Weaknesses of the Castrol Rating
Thus, we look to the performance measure for assistance in evaluating performance for the entire population of EPL players. However, our chosen measure has a weakness in describing the performance of part-time players. Specifically, a player's minutes are highly correlated with his Castrol Rating score (a correlation of 0.88). With such a strong correlation, the influence of playing time may drown out actual performance. Opta, the creators of the Castrol Index, acknowledge that the scores of players with insufficient playing time are punished, with the downward adjustment beginning for players at about 60% of the overall minutes available5.
Using the 60% standard as a guide, I examined the relationship between playing time and performance scores above and below a 2070-minute threshold (equal to 23 games of the 38 possible in the EPL season, or about 60% of the available minutes). For the 164 players with more than 2070 EPL minutes (the "2070+ group"), there is no significant correlation between minutes and Castrol score (a correlation of 0.019). For the group below 2070 minutes (the "< 2070" group), the relationship remains strong (a correlation of 0.89). Thus, it seems there is a threshold of playing time required before one can get an unbiased evaluation of performance.
Adjusted Index Scores
1Dave Laidig is a corporate attorney responsible for reviewing and negotiating contracts. Prior to attending law school, he indulged his quantitative predisposition by earning a Masters in Psychology, focusing on research methods, statistics, and measuring human factors. He resides in Minnesota and is a season ticket holder for Minnesota Stars FC.
2In addition to the performance metric, the calculations also require basic player information such as player name, team name, position and minutes played. This information is widely reported by many media outlets. For this analysis, however, I used the player info associated with the Castrol scores and the playing time data available by subscription from the EPL Index. See http://www.eplindex.com/.
5See the FAQ section for the Castrol Index.
6Available at http://www.soccerbythenumbers.com. This calculation relies heavily on the work of Benjamin Leinwand and Chris Anderson, as I attempted to replicate their work and then extend it to new areas. I appreciate those willing to share results with the public, allowing the debate to improve and continue.
7See my earlier analysis of MLS data, “Objective Player Analysis, Part 2”, at http://footiebusiness.com/2012/04/10/objective-player-analysis-part-ii/
Monday, August 13, 2012
My latest post at Forbes quantifies which national teams were the biggest over- and under-performers at the recently completed Olympic men's soccer tournament. It covers the standouts of group play (Senegal, Brazil, Honduras), an evaluation of whether or not Team Great Britain actually realized a home pitch advantage, and a breakdown of how the Mexican squad beat the Brazilians.
Monday, August 6, 2012
Over at my Forbes blog I have spent the last three weekdays rolling out a three-part series explaining why Seattle Sounders FC have achieved record US attendance figures. The first post explains the unique history of soccer in the Seattle area, and how the NASL Sounders built a special connection between the city and the Sounders brand. The second post tells the story of how semiprofessional players kept the sport alive in the city without the draw of top-flight soccer, and how the Sounders FC organization laid the groundwork for success before the team played its first game. The third and final post explains how Sounders FC found unique ways to appeal to both old-school Sounders fans and newer fans of the international game. It also attempts to answer the question of whether this success can be replicated elsewhere. Have a look if you're interested in learning more about how the best-attended US soccer club achieves such success.
Friday, August 3, 2012
A week ago I posted my predictions for the likely outcome of the group play stage of the men's Olympic soccer tournament. The overall predictions were based upon a compilation of the Soccernomics model as well as four other sources across the soccer writing community. The result was a "wisdom of crowds" approach that got six of the eight quarterfinalists correct, as none of the sources used foresaw Uruguay's and Spain's failures to make it through to the knockout stages. I'll return to this topic at the conclusion of the tournament to provide a numerical look at which teams over- and underachieved versus expectations.
Thursday, August 2, 2012
I am still catching up on cross posting from my Forbes account. A few weeks ago I had the privilege of interviewing the team at Chimu Solutions that is responsible for building the FootballRating.com site. The site is based upon a player rating system developed in the aftermath of Euro 2008 by a group of Northwestern University researchers, and now those researchers are working with Chimu Solutions to monetize the model. This post profiles the model's background as well as the group's ambitious efforts to make the model useful to clubs and agents while also understandable to the casual user who may be looking for an edge in their fantasy soccer league. Have a look at what they're working on, and see if you can supply some feedback to them on the site's user interface or how they evaluate players. That's the goal of their work - take the risk of putting a model out there, and collect the critical feedback needed to improve it.
I am still catching up on a number of posts that I have made at other sites lately. This post explained why, no matter how much of a genius Arsene Wenger may be, Arsenal must spend nearly £120M to compete with the Manchester Uniteds, Manchester Cities, and Chelseas of the world. The post served as a response to one particular critic's quibbles with data I provided for a post at 7AMKickoff.com. Read on if you'd like to understand a bit more about my statistical models and philosophy, as well as how far behind the competition the Gunners really are.