Monday, May 30, 2011

Using Ordinal Logistic Regression to Predict EPL Match Outcome Probabilities

I just wrapped up a three week break from making statistics related posts.  While I was not posting on the blog, I certainly wasn't sitting idle.  I treated the break much like a professor might treat a sabbatical - as a chance to explore things the daily routine does not allow, all in the pursuit of improving a core skill set or level of understanding of a topic.  I used the break as a chance to begin compiling new statistics databases as well as understand new statistical analysis techniques.

One of those statistical techniques will be featured heavily in the coming weeks, especially when it comes to Premier League match analysis.  In the past I have used binary logistic regression (BLR) to look at how match statistics impact the likelihood of a team winning or not winning a match.  A BLR can only predict one of two outcomes, which provides a bit of a limitation when soccer matches can end in one of three outcomes - loss, tie, or win.  I was challenged by several commenters to explore other statistical models that would allow the prediction of probabilities for all three match outcomes.

Such a modeling environment is called ordinal logistic regression (OLR) (see this example for a more mathematical, but readable, treatment beyond what Wikipedia provides).  As the name suggests, this type of regression model uses the order of the outcomes (low/medium/high, loss/tie/win, etc.) to build the model based upon the factors' impacts on the likelihood of falling into one of the outcome "bins".  The assumptions that must be satisfied and the math behind the model are a bit more complex than a BLR, but the insights an OLR can provide are far more powerful.  When applied to the EPL match data that I have courtesy of DogFace, the model can predict the probability of losing, tying, or winning a match based upon how a team performed relative to the opposition in that match.
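As a rough illustration of how such a model yields three probabilities, the sketch below uses the common proportional-odds form of an ordinal logit. This is an assumption: the post does not say which OLR variant was fitted, and the cutpoints and linear predictor value are made-up numbers, not fitted EPL terms.

```python
import math

def logistic(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))

def outcome_probabilities(eta, cut_loss_tie, cut_tie_win):
    """Proportional-odds ordinal logit for the ordered outcomes loss < tie < win.

    eta is the linear predictor (sum of coefficient * differential).
    The two cutpoints must satisfy cut_loss_tie < cut_tie_win.
    """
    p_loss = logistic(cut_loss_tie - eta)         # P(outcome <= loss)
    p_loss_or_tie = logistic(cut_tie_win - eta)   # P(outcome <= tie)
    p_tie = p_loss_or_tie - p_loss
    p_win = 1.0 - p_loss_or_tie
    return p_loss, p_tie, p_win

# Hypothetical values chosen only for illustration.
p_loss, p_tie, p_win = outcome_probabilities(eta=0.4, cut_loss_tie=-0.8, cut_tie_win=0.5)
print(round(p_loss, 3), round(p_tie, 3), round(p_win, 3))
```

Because both cumulative probabilities share the same linear predictor, raising it shifts probability away from a loss and toward a win, which is the behavior the ordered "bins" description implies.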

An example of such analysis can be found in the two plots below.  These plots utilize data from the 2005/06 through 2009/10 seasons for all clubs in the EPL, and set the values for shot, shots-on-goal, corner, and foul differentials to their averages by venue (home vs. away).  A sweep from the minimum to maximum values for red and yellow card fantasy points is then performed.  The result is the two graphs below that show the effects on the probability of losing, tying, or winning a match based upon card differential (home on top, away on bottom).  Click on either to enlarge.

These two graphs give us a much clearer picture of what goes on in matches home and away.  In home matches, the crossover point where the odds of losing actually exceed the odds of winning doesn't happen until a differential of about 8 fantasy points (the equivalent of nearly three yellow cards or a yellow and a red card).  For away matches it happens a lot earlier, and in fact on the exact opposite side of the neutral line at -8.  Clearly, venue plays a large role in determining the odds of winning.

Even more important than the OLR itself are the subsequent calculations that can come from it.  If we know the odds of all three outcomes, an expected point total can be calculated from the following equation.

Expected Points = 3*P(win) + 1*P(tie) + 0*P(loss)

This now means match statistics can be boiled down to a language with which we're all comfortable: match points.  This is much more intuitive than odds of losing, tying, or winning and provides a very direct comparison between teams, referees, or other factors of interest.
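The expected points calculation can be sketched in a few lines. The probabilities in the example call are hypothetical and only show the arithmetic; they are not model output.

```python
def expected_points(p_win, p_tie, p_loss):
    """Expected match points: 3 for a win, 1 for a tie, 0 for a loss."""
    total = p_win + p_tie + p_loss
    if abs(total - 1.0) > 1e-9:
        raise ValueError("the three outcome probabilities must sum to 1")
    return 3 * p_win + 1 * p_tie + 0 * p_loss

# Hypothetical probabilities, for illustration only.
print(round(expected_points(p_win=0.5, p_tie=0.3, p_loss=0.2), 2))  # 1.8
```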

Applying the above equation to the graphs for home and away performance generates a much simpler graph - what was two graphs with three lines each now becomes a single graph with two lines.  Regression equations for the nearly linear relationship between fantasy points and expected match points are also shown.

The regression equations now provide a direct relationship between cards and points.  For every yellow card (3 fantasy points) a team's expected match points are lowered by 0.1, and for every red card (6 fantasy points) a team's expected match points are lowered by 0.2.  The percentage reduction in points will vary based upon the current fantasy point differential, but let's look at the example of a team playing to a neutral fantasy point differential.  Playing at home they would expect 1.7 points while away they would expect 1.1 points.  This means that for a home team playing to that level, an additional yellow card represents a 6% reduction in expected points while a red card lowers expected points by 12%.  For away teams, the impact is even bigger.  An incremental yellow card reduces their expected points by 9% and each red card reduces the expected points by 18%.
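The percentages above follow directly from the quoted figures. The sketch below reproduces them, assuming the same 0.1 points per yellow card applies at both venues, as the text states; the names are illustrative.

```python
# Figures quoted in the post; the constant and function names are illustrative.
POINTS_LOST_PER_FANTASY_POINT = 0.1 / 3   # 0.1 match points per yellow card (3 fantasy points)
YELLOW, RED = 3, 6                         # fantasy points per card
BASELINE = {"home": 1.7, "away": 1.1}      # expected points at a neutral card differential

def percent_reduction(venue, fantasy_points):
    """Percent drop in expected points from extra card fantasy points."""
    lost = POINTS_LOST_PER_FANTASY_POINT * fantasy_points
    return 100 * lost / BASELINE[venue]

for venue in ("home", "away"):
    print(venue, round(percent_reduction(venue, YELLOW)), round(percent_reduction(venue, RED)))
# home 6 12
# away 9 18
```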

Keep in mind that this data does not include the most recent EPL season.  DogFace has been gracious enough to provide me with such data, so I will be updating my database for each club's OLR terms.  Over the coming weeks of the off season I will make several posts exploring the impacts such match statistics have on a number of teams' match outcome odds, as well as update my referee analysis to reflect the new data and the new approach I have outlined here.

Stay tuned...

Friday, May 27, 2011

Friday Night Links

I took the second week of my three week vacation very seriously and didn't post a Friday Night Links, so this week's post will be longer than normal as it covers two weeks' worth of material. Let's get to it.

New Blogs, Magazines, and Books

This week I have two new things I am reading.

  • Issue 1 of The Blizzard came out on Thursday. I snapped up a copy for myself and am eagerly awaiting its electronic delivery (unfortunately, the guys had a bit of trouble with their ordering system on launch day that they're resolving now). I grabbed Issue 0 when they first published it, and loved every one of its 187 pages. If you don't know about the Blizzard, head over to their website and check it out. I love it so much that I harbor a not-so-secret desire to have an article published in their wonderful quarterly magazine at some point. A boy can dream...
  • Dan Kennett turned me on to Lewis Tong's Squeezing My Skull blog last week. The blog is new and only has a few posts so far, but it has covered topics near and dear to my heart - the econometrics of soccer. Check it out and give Lewis some feedback on his efforts.
Links of the Week
It's a long weekend here in the US, which serves as our unofficial start of summer. I will enjoy the conclusion to the European soccer season like everyone else on Saturday, and then pivot to focus on the US game over the next several months. We have the middle third of our domestic league season to contend with before Europe returns to play in August, as well as the Gold Cup and a slew of friendlies featuring European clubs. It's going to be a good summer.

Next week I also return to regular posts of original statistical analysis. I'll use next week to give you some previews of what I am working on, and ease back into regular posts over that week. I've got a slate of posts planned throughout the summer, so even though the EPL is on break I will not be. Thanks for sticking around and waiting out my much needed three week break. I am convinced it will make my posts over the next several months all the better. Have a great weekend, and I will see you guys on the back side.

Tuesday, May 24, 2011

Google Translate of Blog Now Available

I just added the Google Translate widget in the column to the right of the blog posts.  I hope that it helps my readers enjoy the blog in their native language if English is not their first choice.  Enjoy!

Sunday, May 22, 2011

A Week Of Great Soccer in the Pacific Northwest

I have made it through the second week of a break from statistical analysis on this blog.  While my Gunners have limped their way to a 4th place finish in the league and face yet another off season of insufficient spending, I have been able to enjoy two great weekends of live soccer here in the Northwest via my Seattle Sounders.  Below are my highlights from the two matches.

Seattle Sounders FC vs. Portland Timbers

I have to admit - I was hoping for the full three points in this derby.  Our back line and goalkeeper fell asleep on one free kick, and that cost us the win.

No matter the result, the atmosphere was awesome and I can't wait to see this match on an annual basis.  The crowd was jacked from the get go, with the major supporters groups stoking the flames throughout the match.  Below are a few images and videos that do the best job of capturing the feeling, although nothing is quite like being there.

My hope is that MLS continues to grow and that teams genuinely create rivalries.  I don't know how successful they will be at this given the geographic size of the US and the lack of history many clubs have. I just feel very lucky to be a supporter of a team who has such a storied history with a rival that we don't need to create the derby atmosphere - it existed long before MLS awarded our two cities our franchises.

Here's some tifo from the north end of the field.  The upper right hand corner is the tifo from the 550 away supporters from the Timbers Army.  There were many other Timbers supporters spread out throughout the stadium, a group of which were above us and routinely elicited "PORTLAND SUCKS!" chants from our end of the field.

Here's what the tifo on the south end of the field looked like.  Truly epic!  Grant Wahl referred to it as a "half acre" of tifo.  Four Sounders legends plus Freddy Montero are in the lower bowl, "Decades of Dominance" with a sounder fist crushing a Timbers logo in the upper bowl, and the smiling face in the bottom left is Portland nemesis Roger Levesque.  Tifo that is five sections wide and two bowls high - try and top that!

Here's a video of what the tifo looked like as it came out.  It's too bad the video has some Bach music in the background - far more impressive was the "Welcome to the Jungle" that was played while those banners were flying.  It may be a 25 year-old cliche sporting event song, but it added so much to the mood in the stadium.

More importantly, the videos below capture the scoring action.  The scene after the Fernandez goal was simply electric.

While I was not happy about the equalizer from Portland, the video does a great job capturing the 500+ supporters who made the trip up to Seattle and were packed into the Northeast corner of the stadium.

In all, it was a wonderful derby weekend here in the Northwest, and I am sure we set a new standard for MLS rivalries.  I just wanted those two extra points!

Seattle Sounders FC vs. Sporting Kansas City

In some way you have to feel for the players of Sporting KC.  They've been on one heck of a road trip to begin this season, playing nearly three months on the road while their new stadium is completed in time for the June 9th inaugural match.  The team is certainly investing for the long-run, but it may be doing so at the expense of the near term with a 1-6-1 record and four points to show for it. is showing they have a less than 20% chance of making the playoffs after last night's loss, although my Sounders had even lower chances at a much later stage of last season and still rallied to make the playoffs.  Sporting KC will need something similar this season, and it's not impossible given the benefits of a second half schedule heavy on home matches.

More important to me than the three points was that this was the first Sounders match my older daughter has attended.  She's been watching a number of the matches on TV for nearly a year, so I bought tickets to the match against the Union in July.  This week that match was moved to October to accommodate the Union's friendly against Real Madrid, and I was none too pleased.  I lucked out as I had a friend who had to get rid of tickets to this weekend's match, so I picked them up and my older daughter and I went to it.  The pictures and videos below serve as highlights of a fun evening that I think maybe, just maybe, hooked my older daughter on the live experience.

Below are the two US Open Cups the Sounders have won since moving up to MLS.  It's a bit blurry since the older daughter took it, but it was an awesome experience.

Below is my daughter enjoying the music from Sound Wave, our team's band that plays at every match.

Here's the stoppage time goal from Jeff Parke, who hadn't scored in seven years.  It was an amazing experience - my daughter was picked up off her seat and went for a ride with me as we jumped up and down, high-fived a few people, and generally reveled in an additional two points earned by our boys.

These kinds of endings seem to be par for the course when Kansas City comes to town - they lost last year's match due to a Mike Fucito stoppage time goal.

Post-match, my daughter and I had a fun time leaving the stadium.  On the way out the door we ran into a guy who had a giant green foam Z for our fallen winger, Steve Zakuani.  The Spirit of #11 lives as he rehabs from the horrific leg injury he suffered.

At the end of the walk, we ran into Sound Wave, who was celebrating the win with a rendition of Muse's Knights of Cydonia.  The guy with the giant Z makes another appearance here as well.

My daughter loved every minute of it.  Of course, it definitely helped that the match had neither a nil-nil scoreline nor the drenching rain of the Portland match.  She's a bit disappointed that she must wait until October for her next one, but I'm sure she'll get over it.  It's all I could ask for with a seven-year-old - not absolute rejection, and a good interest in going to another match.

Saturday, May 14, 2011

The Resumption of the Cascadia Derby

It may not have the history of the Merseyside Derby or the level of hatred of the Old Firm, but today marks the resumption of American soccer's ultimate derby - the Cascadia Derby.  Tonight I will be at Qwest Field when the two loci of the US soccer universe meet for the first time in MLS history.  My Seattle Sounders FC will host our neighbors from 174 miles (280 km) to the south, the Portland Timbers, in front of 36,000 supporters and a nationwide TV audience.

I enjoy visiting the city of Portland, often taking a weekend every six months to travel there with my wife.  I've found that our cities are much like siblings - similar enough to get on each other's nerves, yet different enough to greatly dislike each other.  It starts at the city level, and just gets all the more intense when it comes to the soccer teams that represent them.

For those unfamiliar with this weekend's game I offer you three takes on it.
  1. The serious.
  2. The cliched.
  3. The tongue-in-cheek.
I will be there tonight, just outside the south end where the Emerald City Supporters will be their loudest.  I'll have no voice at the end of the night, and I hope we come away with a full three points.  If the 11 PM EST kick isn't past your bed time, I would suggest checking the match out.  You will witness where every MLS fan hopes the rest of the league can get to in terms of attendance, vocal support, and intense rivalries.

Friday, May 13, 2011

Friday Night Links

Here are my favorite links from the last two weeks that were soccer...

Wednesday, May 11, 2011

2011 MLS Salaries: Down from 2010, Up 60% Since 2006, and in Total Equal to the Torres Transfer Fee

So I am only a few days into my vacation from the blog, and I already feel compelled to break my vow and make a post.  Who can resist analyzing the 2011 MLS players' salaries when such a treasure trove of information is so neatly presented by the MLS players' union?  The topic has been pretty well covered already, but I've covered this topic in the past, am integrating the data into a wider MLS database, and haven't seen too many neat graphs yet.  So why not make a quick post on the topic?

Rather than just focus on the change versus 2010 salaries, I've chosen to focus on all the publicly available data I have that goes back to 2006.  The trends and associated conclusions are interesting, and shed light on how MLS is attempting to manage its financial position within the US sports landscape.

Player Salaries

MLS's player salaries are very non-normal, so using an average value to make comparisons is not appropriate.  Rather than use the average, the median is used from each season to make comparisons.  The graph below (click on it to enlarge) shows the median player salary by season, the per cent change versus the previous season, and the overall inflation per season versus 2011.
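To see why the median is the safer summary for skewed pay data, consider the small sketch below. The salary list is entirely hypothetical (a handful of modest contracts plus one designated-player-sized deal) and is not drawn from the union's figures.

```python
from statistics import mean, median

# Hypothetical roster salaries in dollars: six modest deals and one very large one.
salaries = [40_000, 42_000, 45_000, 60_000, 80_000, 120_000, 5_500_000]

print(median(salaries))  # 60000
print(mean(salaries))    # 841000: pulled far above what a typical player earns
```

A single large contract moves the mean by an order of magnitude while leaving the median untouched, which is why season-to-season comparisons here rely on the median.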

The annual rate of change from one year to the next peaked in 2009 at 33%, just prior to the latest CBA. The terms of the CBA and expansion since it was signed seem to have reversed the growth trend to the point the median salary has dropped by 13% in 2011 versus 2010.  While salaries are down 13% from last year, the median pay is up 60% since 2006.

Salaries aren't the only thing that has changed since 2006 - the number of games in a season has also changed.  The 2006 season saw 32 games, 2007 through 2010 had 30 matches, and the 2011 season will have 34 matches.  While not a perfect predictor of the effort required in a season, the number of matches can serve as a proxy for how hard a player is working for their pay.  The graph below (click on it to enlarge) re-plots the data from the graph above, but normalizes it to a per match basis by season.

Given that the 2007 season saw a reduction of 2 matches versus 2006, the pay per match actually went up 7% while gross pay remained flat according to the first graph.  The next three seasons see pay changes identical to the gross numbers shown in the first graph as the number of matches does not change.  The final season (2011) saw a drop in per match pay even bigger than the gross data - 24% reduction versus 13%.
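The per match normalization is simple enough to check by hand. The sketch below uses only the rounded percentages and match counts quoted above. With those rounded inputs the 2011 drop comes out at roughly 23%, so the 24% in the text presumably reflects the unrounded salary data.

```python
def per_match_change(gross_change, matches_before, matches_after):
    """Fractional change in pay per match, given the fractional change in gross pay."""
    return (1 + gross_change) * matches_before / matches_after - 1

# 2006 -> 2007: flat gross pay, 32 matches down to 30.
print(round(100 * per_match_change(0.00, 32, 30), 1))
# 2010 -> 2011: gross pay down 13% (rounded), 30 matches up to 34.
print(round(100 * per_match_change(-0.13, 30, 34), 1))
```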

Jeremiah Oshan at SB Nation details some of the reasons why the median player salary has moved down, so I won't recount them here.  Whatever the reason, the median player is earning less than in 2010 and working harder to earn it.

Team Payroll

So what is each team spending on payroll over the years?  The graph below (click to enlarge) shows team payroll from 2006 to 2011.  The numbers are not adjusted for MLS median player salary (more on that later).

The graph shows a steady increase in the team payrolls from just under $2M in 2006 to over $3M in 2011.  Four teams exceeded $4M in 2011 payroll - Chicago, New York, Los Angeles, and Toronto, although the moves of Juan Pablo Angel from NY to LA and Dwayne De Rosario from Toronto to NY have lowered the peaks those four teams experienced in 2010.  Clearly, the addition of roster spots in 2011 has boosted the bottom line cost of fielding a squad in 2011 even if the median income of a player has gone down.

So what happens if we baseline everything in 2011 MLS dollars?  Similar to Pay As You Play, a player pay inflation factor (in this case based on wages and not transfer fees) is required.  The change in median player salary is used for such an inflation factor.  The inflation factor for each season shown in the first graph was applied to each player's salary in that season to provide what their equivalent 2011 salary would be.  Those salaries in 2011 MLS dollars were then totaled by team, and are presented in the graph below (click on graph to enlarge).
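The adjustment described above can be sketched as follows. The median values and the three-player roster are hypothetical placeholders; the medians were chosen only so that the 2006 factor matches the 60% growth mentioned earlier.

```python
def inflation_factor(median_by_season, season, base_season=2011):
    """Ratio of the base season's median salary to the given season's median."""
    return median_by_season[base_season] / median_by_season[season]

def team_payroll_in_base_dollars(salaries, factor):
    """Sum of a team's salaries after scaling each one into base-season dollars."""
    return sum(salary * factor for salary in salaries)

# Hypothetical medians: 2011 is 60% above 2006, in line with the growth quoted earlier.
medians = {2006: 50_000, 2011: 80_000}
factor_2006 = inflation_factor(medians, 2006)

# Hypothetical 2006 roster salaries.
payroll_2006_in_2011_dollars = team_payroll_in_base_dollars([40_000, 60_000, 100_000], factor_2006)
print(factor_2006, round(payroll_2006_in_2011_dollars))
```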

The inflated 2011 wages from 2006 onwards show two clear conclusions:
  • MLS per team payroll has not increased significantly since 2006 (although the number of teams making such a payroll has).
  • MLS per team payroll was declining from 2008 until this season.
What's also interesting is that the real time payroll differential of LA versus the rest of the league shrinks over time with each year of David Beckham's contract.  Adding Landon Donovan's $2.3M DP contract helped stem the rate of decline in 2010, but it still didn't stop NY from passing LA as the most expensive team in the league.  Of personal note to Sounders fans is the virtual erasure of any payroll cost differential they had the last two years.  They've gone from leading the pack of teams outside the Big Four spenders to falling in line with the rest of that pack.  Parity not only rules on the field, but it rules on the payroll as well.

League Wide Expenditures

So where does this leave the league overall?  The graphs below present two summaries of the same data for the league (click on either to enlarge).  The first looks at the sum of guaranteed contracts over time (unadjusted) and the total count of players with contracts, while the second looks at the rate of change of the total expenditure and number of contracts from one season to the next.

Since the last year of the last CBA (2009), overall league expenditures on pay have risen greatly - an increase of 58% to be exact.  Over the same time, the number of players in the league has increased 34%.  From 2010 to 2011, league pay expenditure increases could not keep pace with the addition of two teams and roster spots on every team meaning that overall pay increased by only 12.5% while the number of players increased by 26%.  This was the first time in the six years of recorded salary data that increases in total team payroll were outpaced by the growth in the total number of contracts.  This is likely a one-time event that we won't see again until another two-team expansion like we witnessed with Portland and Vancouver this year.
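As a consistency check on these growth figures, the sketch below backs out what they imply for 2009 to 2010 and for pay per contract. It takes the rounded percentages in the paragraph above at face value, so the results are approximations rather than values read from the salary data.

```python
def implied_growth(total_growth, later_growth):
    """Growth over the earlier period implied by the total and the later period's growth."""
    return (1 + total_growth) / (1 + later_growth) - 1

# 2009 -> 2011 totals and 2010 -> 2011 changes, as quoted in the text.
pay_2009_to_2010 = implied_growth(0.58, 0.125)
players_2009_to_2010 = implied_growth(0.34, 0.26)

# Total pay grew more slowly than contracts in 2011, so pay per contract fell.
pay_per_contract_2010_to_2011 = (1 + 0.125) / (1 + 0.26) - 1

print(round(100 * pay_2009_to_2010, 1))
print(round(100 * players_2009_to_2010, 1))
print(round(100 * pay_per_contract_2010_to_2011, 1))
```

Read this way, most of the 58% rise in total pay would have come in 2010, with the 2011 expansion adding contracts faster than it added dollars.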


The above data confirms that MLS's single entity model provides for a very interesting balancing act - almost like a social democratic economic model.  On the one hand, they need to have some provisions to attract older, top talent from around the world as well as keep up-and-coming talent in the league.  This is their method for fueling league growth and exposure.  At the same time, they're clearly trying to grow the professional sport in the United States in a controlled manner to avoid an NASL-style meltdown.  In the last few years, they also seem to be taking the approach of improving the bottom end of the salary range by a modest amount, and are willing to keep wage growth in the median contracts to a minimum.  It's almost as if the league has taken a "reasonable income for the greatest number" approach, as there is no doubt an increasing number of players are able to make a modest income in MLS by playing the game they love.

At the same time, the league wide payroll numbers indicate just how far the league has to go to be competitive on the world stage.  The 2011 league-wide payroll of $80.2M equates to the transfer fee Chelsea paid for Fernando Torres.  That's right - the rights to negotiate on a contract with one player in the EPL cost more than the entire wages of MLS for a season!  MLS has more than doubled what they can spend on wages since 2006 - a sign that both average pay and the size of the league have grown.  But they still have a long way to go if they wish to compete for top talent on the world stage, and attract enough worldwide attention to be able to pay for such talent.

Monday, May 9, 2011

Taking a Three Week Break

It has been an amazing six months since I restarted my blog after an extended break last fall.  I have averaged 12 posts a month, have interacted with some other awesome bloggers and stats guys, and have seen readership and interaction on social media go through the roof.  I am truly grateful for being able to write on such a topic as soccer statistics, be trusted with so much data, and be able to interact with everyone on a daily basis.  This blog has led me to have regular conversations with a Cornell professor studying similar statistics as me, a local blogger I've followed for several years, a phenomenal Liverpool writer who has entrusted me with his precious data, an independent data miner who has the Gunners and the game's integrity in his heart, and a manager of a sports desk for a major European stats bureau showing me around Utrecht and eager to collaborate with me.  Not to mention the daily Twitter conversations with all of my followers interested in any facet of the game.  Truly a wonderful experience.

It's with those experiences in mind that I am taking a break for the next three weeks.  I know what you're thinking - if I am having so much fun why am I taking a break?  And how will taking a break enhance such experiences?  Let me explain the several reasons why I feel taking a break will be a positive thing.

  1. It will allow me to focus on a work-related certification for the next three weeks that is critical to my job.  I do all this blogging recreationally, and occasionally my job demands enough of my time that I must forgo other activities during my free time.  My recent long hours in Europe are one example of this, and the upcoming three weeks are another example.  If I am not happy or too stressed at work, it makes blogging difficult.  So focusing a number of my free hours on this certification will help with that.
  2. Everyone needs to take a break to recharge, and it appears this is my best chance to do so.  The main team I follow in the EPL - Arsenal - pretty much has their table position secured at this point.  Most of my analysis is longer-term rather than match-to-match, so I don't have to stay involved in the final two weeks of the season.  And as one Twitter user pointed out, Arsenal seem to have already gone on their summer break so why shouldn't I?  The challenge for me is that I already have several large analyses lined up over the summer months, so I don't see much of an opportunity for a break there.  That's why I am taking my break now.
  3. Beyond recharging, I am looking to upgrade a few of my databases over the next several weeks.  The desire to publish two quality original pieces a week drives the need for good databases.  I've had a good run of luck in acquiring a number of them this season, but as the season is closing down I need to focus on upgrading them for the next season.  That takes time, and it is time I need to take away from making original posts to accomplish the task.
Taking the break from making posts full of original content will allow me to focus on recharging, reloading data, and thinking about subject material I would like to explore in the upcoming summer and season.  My emphasis is on not making posts from original material.  I will still be on Twitter, and will make my Friday Night Links posts of my favorite material from the week that was soccer journalism/blogging.  I won't totally unplug, but I will unplug enough to come back refreshed, recharged, and certified at work and be able to focus on top quality posts.

Thank you for making the last six months very successful in the blog's growth and coverage.  Enjoy the last few weeks of whatever European league you're following, and I will see you in the month of June!

Friday, May 6, 2011

The Curious Case of Corners

In May of 2010 I attended a Seattle Sounders match where they were beaten 4-0. In an attempt to find a silver lining to such a horrible loss, one of the local commentators pointed out that the team had earned a franchise record 12 corners on that day. The obvious implication was that while the defense was porous that day and the offense unable to score any goals, the team was at least making progress by attacking the goal and getting corners.

Fast forward nearly a year, and we're now at the end of yet another EPL season. While Chelsea stumbled through the first two-thirds of the season, they have certainly come on strong over the last third and are now in the improbable position of challenging Manchester United for the championship. As part of this resurgence, Opta noted the following statistic:

220 - Chelsea have won the most corners (220) in the Premier League this season and conceded the fewest (110). Opposite.
What if all of this commentary on corners were wrong? What if the more corners a team got compared to their opponents, the less likely the team was to win a match? This is certainly the case in the EPL, and in fact the effect is even bigger than that identified in my previous post on the effects of shots. This post will quantify how much increasing corner differential lowers the average EPL team's odds of winning a match, and compare the effects of such corner differentials for the Big Six clubs.

The Data and League Model

As in previous posts, I am utilizing data from the 2005/06 through 2009/10 EPL seasons compiled by DogFace. The raw match data has been transformed into differentials for shots, shots-on-goal, corners, fouls, and fantasy points for yellow and red cards. Match venue (home/away) is also noted. Each of these variables is then placed into a binary logistic model (BLR) to assess their effect on the odds of winning a match. BLR's were created for the league, as well as each of the individual teams that played in the league during the years analyzed. In the case of the league, BLR terms with a p-value of 0.05 or less were kept in the model. When it came to individual teams, reduced sample sizes required the use of BLR terms with a p-value of 0.10 or less to ensure a significant number of terms were retained in each team's model.
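The data preparation and model form described above can be sketched as follows. All field names, coefficients, and p-values here are hypothetical placeholders rather than the fitted terms; only the 0.05 and 0.10 thresholds come from the text.

```python
import math

STATS = ("shots", "shots_on_goal", "corners", "fouls", "card_fantasy_points")

def differentials(team, opponent):
    """Team-minus-opponent differential for each match statistic."""
    return {stat: team[stat] - opponent[stat] for stat in STATS}

def win_probability(diffs, coefficients, intercept, home):
    """Binary logistic model: probability of a win given the differentials and venue."""
    eta = intercept + coefficients.get("home", 0.0) * (1 if home else 0)
    eta += sum(coefficients.get(stat, 0.0) * value for stat, value in diffs.items())
    return 1.0 / (1.0 + math.exp(-eta))

def keep_significant(p_values, threshold):
    """Terms whose p-value is at or below the threshold (0.05 league, 0.10 team)."""
    return [term for term, p in p_values.items() if p <= threshold]

# Hypothetical single match and hypothetical coefficients.
team = {"shots": 14, "shots_on_goal": 6, "corners": 7, "fouls": 11, "card_fantasy_points": 3}
opponent = {"shots": 9, "shots_on_goal": 3, "corners": 4, "fouls": 13, "card_fantasy_points": 6}
coefs = {"home": 0.6, "shots_on_goal": 0.30, "shots": -0.10, "corners": -0.10}

diffs = differentials(team, opponent)
print(diffs["corners"], round(win_probability(diffs, coefs, intercept=-1.0, home=True), 3))
print(keep_significant({"corners": 0.03, "fouls": 0.08, "shots": 0.20}, threshold=0.10))
```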

Comparisons at the league and team level were made once the BLR's were created. The graph below represents the league average home and away performance for the range of recorded corner differentials in the data set and utilizing the average values for shot, shots-on-goal, and fantasy points by venue. The dashed lines represent the lower and upper bounds of the 95% prediction interval for the BLR lines. Click on the image to enlarge it.

The slopes of the lines are remarkably similar. In the -5 to +5 corner differential range, where most of the data lies, a home or away team playing to their average form experiences about a 2% reduction in their odds of winning the match for every additional corner they earn versus the competition. Notice that the two lines are separated by about 0.3 at -5 and 0.25 by the time they reach +5. This means that each additional corner earned by a home team playing to their average form closes the gap to the away team playing to their average form by about 0.04 in this region, or 13% to 16% of the gap.
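One reading of the 0.04 figure, assumed here rather than stated in the text, is that an extra corner for the home team is also one fewer in the away team's differential: the home team's odds drop by about 0.02 while the away team's rise by about 0.02. Under that assumption the share of the gap closed per corner works out as follows.

```python
PER_CORNER_CHANGE = 0.02   # approximate change in win odds per corner of differential

def share_of_gap_closed(gap, per_corner_change=PER_CORNER_CHANGE):
    """Percent of the home/away gap closed by one extra home corner.

    Assumes the home line falls and the away line rises by the same amount.
    """
    gap_closed = 2 * per_corner_change
    return 100 * gap_closed / gap

print(round(share_of_gap_closed(0.30)))  # 13, gap at a differential of -5
print(round(share_of_gap_closed(0.25)))  # 16, gap at a differential of +5
```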

Magnitude of Corners On Match Odds

In the last post I compared the effects of shots-on-goal to shots to demonstrate that while shot differential leads to lower odds of winning, its effect in lowering those odds is only 1/3 as strong as the power of shots-on-goal to raise a team's odds of winning. The table below shows a similar relationship for corners to shots-on-goal. Teams with significant BLR terms for corners were included in the table, and it has been sorted from lowest (best) to highest (worst) ratio of coefficients. All terms are negative given the positive coefficient for shots-on-goal and the negative coefficient for corners.

Take note of the number of teams in the table - 16. This represents a 60% increase from the number of teams included in the similar table for shots differential. Corner differential has the second highest team count of any predictor - only shots-on-goal has a greater number of teams. The league average effect is that shots-on-goal have nearly three times the effect on a team's odds of winning a match that corners do. To put it another way, for every three corners that a team has compared to the opposition they must record one additional shot on goal to not negatively impact their odds of winning a match.
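The three-corners-per-shot-on-goal trade-off falls out of the coefficient ratio. The sketch below uses hypothetical coefficients with a ratio of exactly -3 to show the break-even calculation; the actual league terms are in the table and are not reproduced here.

```python
def shots_on_goal_to_offset(corner_diff, sog_coef, corner_coef):
    """Shots-on-goal differential that leaves the log-odds of winning unchanged.

    Solves sog_coef * sog_diff + corner_coef * corner_diff == 0 for sog_diff.
    """
    return -corner_coef * corner_diff / sog_coef

# Hypothetical coefficients: positive for shots-on-goal, negative for corners.
SOG_COEF, CORNER_COEF = 0.30, -0.10

print(round(SOG_COEF / CORNER_COEF, 2))                                   # ratio of coefficients
print(round(shots_on_goal_to_offset(3, SOG_COEF, CORNER_COEF), 2))        # shots on goal needed for +3 corners
```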

How does this compare to the effect of shot differential? The table below summarizes just such a comparison. Readers of my last post on shot differentials will recognize the table from that post, which has had a column added to it to capture the teams' corner differential coefficients.

What becomes immediately clear in studying the table is that on a league-wide basis corner differential is far more damaging to a team's odds than shot differential is - more than 2.5 times as damaging. Also note that a few clubs take a bigger hit than others. Chelsea and Manchester United realize a lower benefit from shots-on-goal vs. corners than any of the other teams in the table, and see the biggest falloff compared to their shots-on-goal to shots coefficient ratio. The other teams in the table stay relatively even or go up.

Impact on the Big Six's Odds

The graphs below are just like the first one in this blog post. They utilize each team's average home and away form in the other BLR variables to allow a sweep of each team's corner differential by venue. Click on either graph to enlarge it.
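For readers who want to reproduce this kind of sweep, here is a minimal sketch of the idea: fix every other BLR variable at the team's average value for the venue, then vary only the corner differential and convert the resulting log-odds to a probability. It assumes the plotted quantity is the model's predicted probability of a win. All coefficients, averages, and variable names below are hypothetical placeholders, not the fitted values for any club.

```python
import math


def win_probability(intercept, coefficients, match_values):
    """Logistic model: convert the linear predictor into a probability."""
    log_odds = intercept + sum(
        coefficients[name] * match_values[name] for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-log_odds))


def sweep_corner_differential(intercept, coefficients, average_form, corner_values):
    """Hold every other variable at its average and vary only corner_diff."""
    curve = []
    for corners in corner_values:
        match_values = dict(average_form, corner_diff=corners)
        curve.append((corners, win_probability(intercept, coefficients, match_values)))
    return curve


# Hypothetical coefficients and home averages, for illustration only.
intercept = -0.5
coefficients = {"sog_diff": 0.30, "shot_diff": -0.05, "corner_diff": -0.10}
home_average_form = {"sog_diff": 1.5, "shot_diff": 2.0}

curve = sweep_corner_differential(
    intercept, coefficients, home_average_form, range(-15, 16)
)
for corners, probability in curve[::5]:
    print(f"corner differential {corners:+3d}: P(win) = {probability:.3f}")
```

Swapping in a different set of averages (for example, a team's away form) shifts the whole curve up or down, while the corner coefficient alone sets how steeply it falls.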

A few general conclusions can be drawn from the graphs:
  • Arsenal and Liverpool pretty much follow each other's form on average, both home and away. They also represent the middle of the pack in both venues.
  • In both cases Manchester City has the lowest overall odds of winning a match. This is likely due to lower average performance in the other BLR attributes that would drag down their overall odds regardless of the corner differential. Recall that in the first table in this post they also had the highest SOG-to-corner differential coefficient ratio of any team. This is due to their low BLR coefficient for corners, which provides their line with the shallowest slope of any team, home or away.
  • Manchester United's convex curves both home and away represent relatively consistent odds at negative corner differentials with quickly diminishing ones as corner differential turns positive. Of all the teams when playing to their average form, Manchester United is the most sensitive to corner differential in their favor.
  • Chelsea's odds of winning a match when they have positive corner differentials while playing to their average form away from home are simply amazing. Their odds of winning a match are nearly double those of any other team as corner differentials approach 10 to 15 corners, and they still have better than the 1-in-3 random chance of winning a match. Their odds of winning still go down, but they are the most robust to the effects of corner differential. Perhaps the Opta tweet would have been far more insightful if it had mentioned that instead of the meaningless statistic regarding overall corner differential.

Overall Conclusions

The data above makes sense if we think about what corners represent within a match. It can be assumed that corners are generated one of two ways:

  1. The goal keeper or a defender makes a play on a shot-on-goal that knocks the ball out of play.
  2. Under intense pressure from the opposition, the goal keeper or defender intentionally knocks the ball out of play to stop the attack before a shot can be taken.

Either way, the defense has stopped either a shot-on-goal or a play that had a high likelihood of generating a shot-on-goal. If we know that shots-on-goal are likely to generate goals, and goals generate wins, then denying the shot attempt in the first place is likely to lower the opposing team's chances of scoring a goal as well as their likelihood of winning a match.

The higher penalty for corner differential vs. shot differential may also make sense if we think of match play in this manner. Shots often represent attempts by a player to score a goal where the shot had little chance of resulting in such a goal as it was not on target (i.e. not a shot-on-goal). Thus, the opportunity lost from simply taking an errant shot may not be that large. But if most teams avoid knocking balls out of play and generating corners for the opposition because of the dangerous nature of set plays from the corner, it stands to reason that they will not look to put a ball out of play for a corner unless it is a defense of last resort or an unintentional consequence of their actions. Thus, a corner for a team represents a bigger deprivation of a scoring opportunity than a random shot does.

What this all says is that passing, pressing, and shots have some merit, but they don't matter much in a match unless they translate into shots-on-goal. And if such pressure is met with a resistance that constantly puts the ball out of play when the club on defense is at most risk, the attacking is all for nought. If the attackers keep plugging away, ensuring that their shots are actually shots-on-goal and that any corners are coming from shots-on-goal and not random bad passes or ricochets from the defense, they will break through. That is because for every shot-on-goal that results in a corner, the team's odds of winning the match go up in a "two-and-a-half steps forward, one step back" manner according to the statistics.