Saturday, December 24, 2011

An Existential Debate on the Meaning of a Club

I haven't been posting much lately, mostly due to the amount of writing and data processing required for two larger projects on which I am working.  The first is now largely completed - a final draft of an 8,000 word article on the soccer culture in Seattle that has been submitted to The Howler for publication.  It addresses the unique relationship the city has with its professional franchise, Seattle Sounders FC, given the city's often strained relationship with its other professional sports franchises.

A discussion on the same topic happened between Jason Davis and the Sounder at Heart Twitter account on the same day as I submitted my article to The Howler.  It centered around the common conflict for North American soccer fans: whether to call the local team they support a club or a franchise.  This is due to the key differences between North American and European leagues.  Teams in Europe are clubs because that's how they start out when they enter the lowest league in their nation's soccer ladder - an association of amateurs and semi-pros bonded by their love of the sport and the desire to create a sense of family within it.  Teams in the US, whether they be a soccer team or some other major professional sports team, are franchises.  They are granted the exclusive permission to operate as a team within a closed league that determines who is admitted (and, rarely, contracted) and sometimes where a team moves when the league and the team feel they're not getting a fair shake from their local fans.  Just like franchises in the rest of the business world, the will of the individual team is subjugated to the will of the league to the extent that sales of franchises must be approved by the other owners.

Nevertheless, distinctions like these are not nearly as clear cut for North American fans, many of whom are attracted to soccer as a direct rejection of everything to do with American professional sports.  The outlandish salaries.  The threats of owners to move teams when the local population won't shell out $500M+ for a new stadium every twenty years.  The garish broadcasts where the infographics and instant replays are often more important than the actual game being played.  It all produces a feeling that the fans are lucky to have such a club, rather than the other way around.

Soccer fans in the US and Canada know that their local teams, whether they be in MLS or one of the lower tiers of the North American ladder, need them as much as the fans need the local teams.  As the two grow with each other over the years, a local team takes on a virtual Schrodinger's Cat existence: it is both a franchise in a literal sense and a club in practice.

Such duality is tough for some to accept, and can lead to the types of discussions that Sounder at Heart highlighted the other night.  That led Jason Davis to make the following statement, which perfectly sums up how we should look at the duality.

"The franchise is what the owner owns. The club is what the fans and players build around it." (click here for a more detailed discussion)
Some opinions notwithstanding, it is a club atmosphere that we supporters in Seattle are continually creating.  It's what every supporter wants, even if the desire is based upon soccer hipsterism.  The reality is that a club atmosphere created by the supporters is what puts butts in the seats, fills lines in local newspapers, and drives bandwidth on blogs.

Franchises can become more like a club as years pass by.  No one could ever imagine the New York Yankees, Boston Red Sox, Pittsburgh Steelers, or LA Lakers playing anywhere else in the United States.  Their history is too deep, their identity too wrapped up in the cities they represent, and their fans too fervent.  Save for the reality of each of them being franchises in their respective leagues, they've taken on the look of iconic clubs the world over because of the history created by the players, coaches, and fans.

I hate the US closed league/franchise model.  I find that the players' wages are suppressed more than in other leagues, even though the players are the ones doing most of the work and generating the bulk of the TV revenue.  The fact that a team like the LA Clippers in the NBA or Seattle Mariners in Major League Baseball can fail for so long and still get their share of the league's revenue is absurd.  It denies other people the ability to invest in the sport they love, and just as importantly it denies the fans the great team they deserve.  More often than not, the focus is on the league's success, and not on creating a genuine bond between franchises, players, and fans.  Save for the rare examples above, the reality is that whenever the relationship between the franchise and the city goes sour, the league simply looks to move the franchise.  There is no loyalty to the local fans nor to the club atmosphere they've created while the club was operating in or near the city.

This has real implications for newer leagues and expansion franchises.  Passionate supporters get a league or a franchise started, and the large barriers to entry in the North American sports market make such a founding moment both an exciting and precarious event at the same time.  What North American soccer fans need more of is an appreciation of true club atmospheres within their game when they do exist.  The modern professional game is still young at only fifteen years old, and we quickly forget the league went through a near-death contraction only a decade ago.  Its future is bright, but it's still a business where its franchises must be supported by fans who desire a club culture, and must eventually make money.  We may not be able to get the same business structure as those found in other soccer leagues, but we can create a culture around our sport that rivals those found at the clubs in those leagues.  We supporters, the league, and the franchises must give time for that culture to develop.  The alternative is a league that really is just made up of faceless franchises rather than culturally deep clubs.

Tuesday, December 13, 2011

Academic Research on Premier League Managers

At the top of the managerial heap,
no matter how you cut it...

I came across an interesting research paper last week courtesy of fellow statto Paul Tomkins.  It's entitled The Performance of Football Club Managers: Skill or Luck? and is authored by Adrian R. Bell, Chris Brooks, and Tom Markham.  All three are attached to the ICMA Centre at the University of Reading's Henley Business School.  In the paper the three examine which Premier League managers over- and underperform when a number of factors (injuries, suspensions, extra games, transfer spend, wages, etc.) are taken into account.  The paper is a fascinating look at managerial performance.  If you can get access to the full text, I highly recommend a read of it.  Some of the highlights include:
  • Based upon the researchers' model, managerial over- and underperformance can be evaluated by a manager's tenth match in charge of a club.
  • The authors found that net transfer spending year-to-year is not a statistically significant predictor, but readily admit this is likely due to the fact that they do not account for the cumulative effect of transfer spending year-over-year.  This was a criticism I made of the similar conclusion in Soccernomics, and one I've routinely addressed via models based upon the Transfer Price Index, which does take such year-over-year benefits in transfer spending into account.
  • The authors note that a team increasing its wage bill by £100M equates to an increase of 0.8 expected points per match.  Given the 2009/2010 wage bills in the Premier League, where the median wage bill was £54M, only one team (Chelsea) was capable of such a gap against more than half of the teams.  Most teams competed with much less of a financial advantage within a match.
  • It turns out the total number of matches played is significant in one direction and insignificant in another (you'll have to read the paper to get the details).  The authors observe several reasons for the dichotomy, and mention that "large number of non-league games are only an issue for teams that are successful in those competitions..."  This is similar to a criticism I have made of MLS's playoff format and how it penalizes teams that are successful in non-league competitions.
  • At the top of the managerial heap, even when factoring in the expectations that come with player wages, availability, and transfers, are Alex Ferguson, Guus Hiddink, Arsene Wenger, Jose Mourinho, Rafa Benitez, Sam Allardyce, David Moyes, Steve McClaren, and Martin O'Neill.  These men get/got the most out of what was available to them.
  • The authors identify Steve Wigley (Southampton 2004/05), Mick McCarthy (Sunderland 2005/06), and Aidy Boothroyd (Watford 2006/07) as managers who underperformed early enough in their tenures that they should have been sacked much earlier than they eventually were.
  • There are also a number of other managers singled out as being sacked when such a sacking wasn't warranted by the authors' model.  Included in this group are Glenn Roeder (Newcastle United 2006/07), Chris Coleman (Fulham 2006/07), Martin Jol (Tottenham 2007/08), Avram Grant (Chelsea 2007/08), and Sven-Goran Eriksson (Manchester City 2007/08).
Statistical models won't ever override the passions of the supporters or the demanding nature of owners who want trophies sooner rather than later.  Perhaps they can at least inform decision makers as to when such passions are warranted or unwarranted, especially when the tools at a manager's disposal are taken into consideration.  Sackings and constant changes in managerial direction can be very disruptive to everyone's end goal of championship glory.  This paper contributes much to that understanding - I highly recommend giving it a read.
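
To put the wage-bill highlight above in context, here is a rough sketch of the arithmetic.  It assumes the quoted effect (0.8 expected points per match for every £100M of additional wage bill) scales linearly, which is a simplification for illustration and not something the paper is known to claim.  The function name and the example gaps are hypothetical.

```python
# Rough arithmetic only: assumes the quoted effect scales linearly,
# which is a simplification made for illustration.
POINTS_PER_MATCH_PER_100M = 0.8  # figure quoted from the paper


def expected_ppm_gain(wage_gap_millions: float) -> float:
    """Expected points-per-match edge for a wage-bill gap given in £M."""
    return POINTS_PER_MATCH_PER_100M * wage_gap_millions / 100.0


if __name__ == "__main__":
    for gap in (25, 50, 100):  # hypothetical wage-bill gaps in £M
        print(f"£{gap}M gap -> {expected_ppm_gain(gap):.2f} points per match")
```

Under that linear assumption, a gap half the size of the headline £100M would be worth roughly half the edge, which is why most match-ups carry a much smaller financial advantage than the headline number suggests.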

Monday, December 5, 2011

Considering a Switch to SPSS: What are Pitfalls and Benefits?

My well-worn four-year-old Compaq desktop, from which I do most of my statistical analysis, is finally getting slow enough that I need to get a new computer.  The rest of the computers, phones, and tablets in my house are Apple products, so I am looking at getting a Mac Mini.  I'll be reusing all the rest of the peripherals from the Compaq for now as they work fine and I am not hung up on getting their Mac equivalents.  However, this switch in computer hardware and thus OS is presenting an interesting opportunity for me.

I have an older version of Minitab that I run for most back end analysis for this blog.  I then export Minitab's results to Excel 2010 to make the graphical presentation of the data look a whole lot better.  I've learned to use Minitab via my Six Sigma training, and it's very useful for what I do.  The trouble is, I only have a Windows version of it and would really prefer to not have to constantly switch over to the Windows OS via Boot Camp on the Mini whenever I want to do analysis and/or blog.  It kind of defeats the purpose of getting the Mini.  I'd like to do the analysis and blogging while running Mac OS.  Here's the trouble - a Minitab license for a Mac does not exist, so purchasing an update for my Mac is not possible.

Luckily, I can get a steeply discounted (and legal!) copy of SPSS for Mac OS or Windows.  We're talking less than $100 for a package that retails for $2700!  The package available to me comes with the following SPSS modules:

  • SPSS Statistics Base
  • Advanced Statistics
  • Custom Tables
  • Forecasting
  • Regression
  • Tables Original
  • Trends Original
A basic description of each module can be found here.

My concern is, at a minimum, maintaining the analysis capabilities I have in Minitab.  Losing functionality I use routinely today is not an attractive thought.  My most frequently used functions in Minitab are:
  1. Descriptive statistics: normality plot and check, mean, median, standard deviation, quartiles, etc.
  2. Normal sample comparisons: one- and two-sample t-tests (t and p values)
  3. Correlation tests
  4. Box-Cox Transforms
  5. Non-parametric tests such as Mann-Whitney
  6. Linear and Non-Linear Regression with ability to generate prediction (PI) and confidence interval (CI) data off of "new observations"
  7. Binary Logistic Regression (with PI and CI data generation capabilities)
  8. Ordinal Logistic Regression (with PI and CI data generation capabilities)
My read-through of the module descriptions indicates that I'd have much, if not all, of Minitab's functionality.  Items (6) through (8) seem to be pretty clearly called out in the Regression module.  What I can't seem to find much on is (4) and (5), which are critical given the non-normal nature of many data sets with which I work.  Does anyone know if the SPSS modules I listed contain features (4) and (5)?
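
For anyone unfamiliar with items (4) and (5), here is a minimal sketch of what each one does, written in plain Python rather than Minitab or SPSS.  The function names are hypothetical, the Box-Cox lambda is supplied by hand rather than estimated from the data, and the Mann-Whitney code returns only the U statistic with no p-value, so treat it as an illustration of the ideas and not a replacement for either package.

```python
import math


def box_cox(values, lam):
    """Box-Cox transform of positive values for a fixed, user-chosen lambda."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive data")
    if lam == 0:
        # lambda = 0 is defined as the natural log transform
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def mann_whitney_u(sample_a, sample_b):
    """U statistic for sample_a: pairs where a > b, with ties counted as 0.5."""
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


if __name__ == "__main__":
    skewed = [1.0, 2.0, 4.0, 8.0, 16.0]  # made-up, right-skewed data
    print(box_cox(skewed, 0))
    print(mann_whitney_u([1, 2, 3], [4, 5, 6]))
```

The transform reshapes skewed data so that normal-theory tests become more defensible, while the rank-based U statistic sidesteps the normality assumption altogether, which is why both matter for non-normal data sets.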

Finally, beyond the actual calculations available, I am interested in understanding the user interface for SPSS.  My father used SPSS 30 years ago when completing his operations research masters.  It has a good bit of legacy code associated with it, much like Minitab.  That can sometimes limit the GUI overlay capabilities - Minitab can have some frustrating limitations due to the legacy code.  Having no experience with SPSS, I am concerned I might run into different limitations.  Does anyone know if SPSS's UI is as good as Minitab's or better?

Thank you to anyone who does provide feedback, and let me know of any other pitfalls or benefits I may have missed.

Thursday, December 1, 2011

More on Manchester City

The Red Devils Are Always Lurking Close Behind...

Yesterday's post on Manchester City's start to the season and its implications for season ending form prompted a good bit of discussion on Twitter, a few emails to me, and even the rare comment on my blog.  I've responded to nearly every one personally, but there's just too much good data to not follow up with another quick post.

As I noted in my original post, my binary logistic regression wasn't very elegant, and there are far more sophisticated models out there for predicting City's odds of winning the league.  Two of them put City's odds at much more conservative estimates than my quick assessment.
  • As of December 1st, Statto.com has City's odds of finishing at the top at 1.67:1.  Manchester United's odds are listed as 3.25:1.  The next closest team according to Statto is Chelsea (15:1), although they're way off the point pace.  This means that Statto.com sees City's odds of winning the EPL as nearly two times greater than United's.
  • Euro Club Index, which looks at things like future opponents' strength and a club's prior performance over time, rates City's and United's odds of finishing at the top of the table even at 44% apiece.  The next closest club is Tottenham with an 8.5% chance.
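
To compare the two sets of numbers above on the same scale, odds can be converted to implied probabilities.  The sketch below assumes Statto.com's figures are odds against (X:1), which the listing itself does not confirm.  The function name is hypothetical.

```python
def implied_probability(odds_against: float) -> float:
    """Implied probability for odds quoted as X:1 against (an assumption)."""
    return 1.0 / (1.0 + odds_against)


if __name__ == "__main__":
    city, united = 1.67, 3.25  # odds quoted above
    print(f"City:   {implied_probability(city):.1%}")
    print(f"United: {implied_probability(united):.1%}")
    # Ratio of the raw odds, the comparison behind "nearly two times"
    print(f"Odds ratio: {united / city:.2f}")
```

Under that reading, City's implied chance comes out at roughly 37% and United's at roughly 24%.  The "nearly two times" comparison refers to the ratio of the raw odds (about 1.95), not to the ratio of the implied probabilities.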
Tommy Burke asked how Manchester United might respond in the final two-thirds of the season given City's early lead.  Luckily, friend of this blog and fellow Soccer Analyst Omar Chaudhuri has quantified Manchester United's late season performances at his 5 Added Minutes blog.  It turns out that Manchester United really turn things on in the winter months.  In fact, they're quite pedestrian when it comes to the final 9 to 10 matches of the season (scroll down to the comments section of the linked post to see the numbers).  So, if Manchester United is going to catch and pass City, they'll likely do it by the time we get to the beginning of March if history is any indication.

Finally, Syntese commented on the last post and wondered what kind of points lead Manchester City should have at this point in the season given their record-breaking form after 13 matches.  One way to look at this would be a regression of the PPM differential between 1st and 2nd place teams vs. the PPM of the first place team, both taken at match day 13.  Unfortunately, such a relationship fails the statistical test for correlation.  I have plotted the data below to make a graphical comparison between Manchester City and Manchester United as well as the previous 16 seasons' 1st and 2nd place pairs.  A regression plot with an associated R-squared value is shown just to demonstrate how bad a fit it is.


Nonetheless, we can draw a few conclusions from the graph and associated data.  Manchester City has the second highest overall PPM differential to the second place club (+0.38).  Only the 2005/06 Chelsea squad, who opened their first 13 games with a +0.53 differential, started with a bigger lead.  There have been four teams who have had a 0.25 PPM advantage or greater after thirteen matches - 1995/96 Newcastle United, 1997/98 Manchester United, 2005/06 Chelsea, and 2009/10 Chelsea - and only two of them - 2005/06 Chelsea and 2009/10 Chelsea - ended up finishing first in the final table.  The other two squads finished second in their respective seasons.  This is one of the reasons why a binary logistic regression model that uses PPM differential as the single input variable to predict odds of finishing top of the table at season's end results in the relationship not being statistically significant.
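
The PPM differentials above convert to a points lead by multiplying by the number of matches played.  A quick sketch of that arithmetic follows; the function name is hypothetical.

```python
def points_lead(ppm_differential: float, matches_played: int) -> float:
    """Points gap implied by a points-per-match differential."""
    return ppm_differential * matches_played


if __name__ == "__main__":
    # Differentials quoted above, all measured after 13 matches
    for label, diff in [("Manchester City 2011/12", 0.38),
                        ("Chelsea 2005/06", 0.53),
                        ("0.25 PPM threshold", 0.25)]:
        print(f"{label}: {points_lead(diff, 13):.1f} points")
```

City's +0.38 works out to about five points after 13 matches, and the 0.25 PPM threshold corresponds to a lead of a little over three points.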

So, perhaps Manchester City is right where they should be - both in total points and gap to second - but they cannot rest as the perennial winter performers, Manchester United, are only five points back.