In my last post I used binary logistic regression (BLR) to show the impact of various EPL match attributes on the likelihood of winning a match. Soon after beginning that work I thought of another good use for a BLR model - identifying the factors that truly matter in predicting (as compared to retrospectively assessing) first round MLS playoff success. Regular readers know I am no fan of MLS's playoff system, which bucks the trend of most leagues using a top-of-the-table format to determine their champion. I took a pass at explaining the phenomenon of teams with fewer games played winning a disproportionate share of first round series since 2003, but at the time I could not prove that games played was one of the few statistically significant predictors among the many identified by Climbing the Ladder (CTL). Since then I have used CTL's lineup database to construct a BLR model of the major factors identified by CTL and isolated the statistically significant ones. The results suggest that MLS still has a lot of changes to make if it wants to improve the playoff format it seems to love so much.
The Data Set and a Few Summary Statistics
I utilized CTL's lineup database to create the following statistics:
- Difference in the number of games played
- Overall goal difference
- Difference in coach experience (MLS games only)
- Difference in seeds
- Home record difference
- Away record difference
- Regular season goal differential between the two teams
A few summary statistics for some of these measures:
- Games played difference: The maximum difference came in the 2008 NY/Columbus series, where the Crew played 14 more games than the Red Bulls. This was due to the Crew playing in both the old and new formats of the CONCACAF Champions League (CCL). The median differential is 2 games.
- Overall goal difference: 2005 through 2007 saw some of the highest overall goal differentials between first round playoff participants - one series each with 22 and 23 goals, and two series with 27. The median goal differential was 9 goals.
- Difference in coach experience: Sigi Schmid is the king of mismatches in manager experience with the largest gap of 196 games realized when his 2008 Columbus Crew defeated the Kansas City Wizards. The median difference is 68 matches.
- Regular season goal differential between the two teams: The peak of regular season goal differential between two teams came in the 2006 through 2008 seasons, when an unbalanced schedule saw teams play each other up to four times during the regular season. Chicago's 2008 first round series versus New England had the largest such goal differential - 8 from four regular season games - and Chicago went on to win that series. The median value across first round series is 2.
The Results of the BLR
After compiling the data for every first round playoff series from 2003 through 2010, I constructed a BLR to predict the likelihood of a team winning the series. Dummy variables for each year were included to ensure no special causes were missed. Terms with a p-value greater than 0.05 were successively eliminated until only two terms remained: manager experience differential and games played differential. All other terms - overall goal difference, difference in seeds, home record difference, away record difference, and regular season goal difference - were not significant by a mile (most p-values were equal to or greater than 0.40).
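For readers who want to try this kind of term elimination themselves, here is a minimal sketch in Python. It is not the model or software used for this post: the data are synthetic (and far more numerous than the real first round series), the column names and the coefficients used to generate outcomes are invented for illustration, and the fitting routine is a bare-bones Newton-Raphson logistic regression with Wald p-values built on the standard library only.

```python
import math
import random

def sigmoid(z):
    """Logistic function, clamped so math.exp cannot overflow."""
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def invert(m):
    """Invert a small square matrix with Gauss-Jordan elimination."""
    n = len(m)
    aug = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        p = aug[c][c]
        aug[c] = [v / p for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def fit_logit(X, y, iterations=25):
    """Newton-Raphson logistic fit; returns (coefficients, Wald p-values)."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iterations):
        p = [sigmoid(sum(b * x for b, x in zip(beta, row))) for row in X]
        grad = [sum((yi - pi) * row[j] for row, yi, pi in zip(X, y, p))
                for j in range(k)]
        info = [[sum(pi * (1 - pi) * row[a] * row[b] for row, pi in zip(X, p))
                 for b in range(k)] for a in range(k)]
        cov = invert(info)  # at convergence this is the coefficient covariance
        beta = [b + sum(cov[j][m] * grad[m] for m in range(k))
                for j, b in enumerate(beta)]
    z = [b / math.sqrt(cov[j][j]) for j, b in enumerate(beta)]
    pvalues = [math.erfc(abs(v) / math.sqrt(2)) for v in z]  # two-sided
    return beta, pvalues

def backward_eliminate(names, X, y, alpha=0.05):
    """Drop the least significant term until all remaining terms have p <= alpha."""
    keep = list(range(len(names)))
    while True:
        sub = [[row[i] for i in keep] for row in X]
        _, pvals = fit_logit(sub, y)
        # Position 0 is the intercept, which is never dropped.
        candidates = [(pvals[pos], pos) for pos in range(1, len(keep))]
        if not candidates:
            break
        worst_p, worst_pos = max(candidates)
        if worst_p <= alpha:
            break
        del keep[worst_pos]
    return {names[i]: pv for i, pv in zip(keep, pvals)}

# Synthetic series: only two of the four predictors actually drive the outcome.
# The generating coefficients (-0.30 and 0.010) are invented, not fitted values.
random.seed(2011)
names = ["intercept", "games_played_diff", "coach_experience_diff",
         "seed_diff", "overall_goal_diff"]
rows, outcomes = [], []
for _ in range(400):
    games = random.randint(-6, 6)
    coach = random.randint(-200, 200)
    seed = random.randint(-3, 3)
    goals = random.randint(-27, 27)
    p_win = sigmoid(-0.30 * games + 0.010 * coach)
    rows.append([1.0, games, coach, seed, goals])
    outcomes.append(1 if random.random() < p_win else 0)

kept = backward_eliminate(names, rows, outcomes)
for name, p in kept.items():
    print(f"{name}: p = {p:.4f}")
```

In practice a statistics package would do the fitting; the point of the sketch is only the loop that refits the model and removes the weakest term each time.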
Plots of the changing odds with various manager experience and games played differentials are shown below.
More interesting is the intuitive relationship between the games played differential and the likelihood of winning the first round playoff series. The equation provides a very clear relationship between the games played differential and the resultant odds. The relationship is largely linear from -5 to 5, where each incremental game of difference is worth about a 7.5% change in odds. Once the differential exceeds five matches in either direction, the incremental benefit of a larger or smaller differential is minimal.
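The shape described above (roughly linear in the middle, flattening at the extremes) is what any logistic curve produces. As a rough sketch, assuming the 7.5% figure refers to win probability and that the two teams are otherwise evenly matched (zero intercept), a coefficient of about -0.30 per game would give that slope at the midpoint, since the slope of a logistic curve at its center is the coefficient divided by four. That coefficient is a back-of-the-envelope assumption for illustration, not the fitted value from my model.

```python
import math

# Hypothetical coefficient chosen so the midpoint slope is about 7.5 points
# per game (0.30 / 4 = 0.075). It is NOT the fitted value from the model.
BETA_GAMES = -0.30

def win_probability(games_played_diff, beta=BETA_GAMES, intercept=0.0):
    """Probability of winning the series given the games played differential."""
    return 1.0 / (1.0 + math.exp(-(intercept + beta * games_played_diff)))

# Show how much one fewer game played is worth at different starting points.
for diff in range(-8, 9, 2):
    gain = win_probability(diff - 1) - win_probability(diff)
    print(f"{diff:+d} games: P(win) = {win_probability(diff):.3f}, "
          f"gain from one fewer game = {gain:.3f}")
```

Under these assumptions the gain per game is largest near a differential of zero and shrinks steadily as the differential grows in either direction.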
What does this mean for the 2011 playoffs? The most direct assessment is that MLS missed a golden opportunity to correct this imbalance by not going to a two-game series in the first round of the playoffs. I commented here on how I would have liked to see such a rebalancing prior to MLS's announcement of the 2011 format, and gave my reaction to the announcement that the first round would be a single match while the second round would keep the usual two-match series. If history is any indication, MLS essentially gave the lower seeds a 7.5% advantage in odds of advancing to the conference final by not making them play an extra game in the opening round.
We'll see how things play out in the 2011 playoffs and I will update this study once they're complete, but I am not holding out hope. With my Sounders having one of the league's more experienced managers and their desire to go deep in the US Open Cup and CCL, I think it will be another year without an MLS Cup.
Perhaps MLS really does like this parity that borders on complete unpredictability. Allowing MLS clubs to buy championships like Chelsea is not what I want, but the fact that no prior rational metric correlates with first round playoff success suggests MLS has some major adjustments to make. It seems as if MLS has swung the pendulum so far to the side of parity that no club's supporters know what to expect from regular season to postseason, let alone year in and year out. Ultimately, this holds back the professional game's success and growth in this country.