October 18, 2014

125 Years of English Football!



    So the Barclays Premier League has been back for quite some time now, and all the other leagues are back in action too. And for the first time, we have an Indian football league as well: the ISL! (I hope it changes Indian football forever!)

    The Football League (as it was known back then) was founded in 1888 by Aston Villa director William McGregor. Since then, English football has grown into thousands of teams playing in hundreds of leagues. The BPL is just the tip of the iceberg. This blog gives complete information on the hierarchy of English football.

    James Curley, assistant professor of psychology at Columbia University, spent his free time cobbling together data from lots of sources and compiling it into what's probably the best collection of English football scores. Sitting quietly on this GitHub page are the scores of nearly 200,000 games played in the top four divisions since 1888. These 14 megabytes can tell us remarkable stories about 125 years of English football!

    I have used R for all the data manipulation. The code below shows how to load the data into R.
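As a minimal sketch (not necessarily the exact code used for this post), loading a CSV of results into R could look like this. The column names (`Date`, `Season`, `home`, `visitor`, `hgoal`, `vgoal`) and the three sample rows with made-up teams are assumptions for illustration only; with the real file you would pass its path to `read.csv()` instead of the inline `text`.

```r
# Made-up sample rows in an ASSUMED column layout (not the real dataset).
csv_text <- "Date,Season,home,visitor,hgoal,vgoal
1888-09-08,1888,Team A,Team B,3,6
1888-09-08,1888,Team C,Team D,2,1
1888-09-15,1888,Team B,Team C,5,2"

# With a real file this would be: read.csv("path/to/scores.csv", ...)
games <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Parse the match date and build a full-time score label such as "3-6".
games$Date <- as.Date(games$Date)
games$FT <- paste(games$hgoal, games$vgoal, sep = "-")

# Quick look at the structure of the loaded data.
str(games)
```

Keeping the score as a single `FT` label makes it easy to count scorelines later on.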

    Take scorelines, for example: in 188,060 games, there were 13,475 0-0 draws. The most common scoreline is 1-1, accounting for roughly 21,000 games (11%).
Top Five Full Time scores
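A ranking like this can be produced with a simple frequency table. The sketch below uses a tiny made-up data frame; the column names `hgoal` and `vgoal` are assumptions, and the numbers are toy values, not the real counts quoted above.

```r
# Toy data: eight made-up matches (ASSUMED column names hgoal / vgoal).
games <- data.frame(
  hgoal = c(1, 0, 1, 2, 1, 0, 3, 1),
  vgoal = c(1, 0, 1, 1, 1, 0, 0, 0)
)

# Full-time scoreline as a label, e.g. "1-1".
games$FT <- paste(games$hgoal, games$vgoal, sep = "-")

# Count each scoreline and sort from most to least frequent.
score_counts <- sort(table(games$FT), decreasing = TRUE)

# Keep the five most frequent scorelines and their share of all games (%).
top_five <- head(score_counts, 5)
share <- round(100 * prop.table(score_counts), 1)

print(top_five)
print(share[names(top_five)])
```

On the full dataset the same few lines would give the top five full-time scores along with their percentages.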
Now, let's talk about goals!

In 188,060 matches played over 125 years, a total of 542,288 goals were scored!
About 330,000 of them were scored by the home team and the remaining 212,000 or so by the visiting team.




The average number of home goals per game has fallen significantly, while away goals keep oscillating.
This drop in home goals, together with the rise in away goals over the past twenty years, explains the graphs below.
Home wins have fallen to about 44%, while away wins are gradually rising. This suggests that playing at home matters less than it used to, and that home dominance is slowly fading.
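For the curious, here is a rough sketch of how the share of home wins, away wins and draws per season could be computed. The data frame is made up, and the column names (`Season`, `hgoal`, `vgoal`) are assumptions, so treat it as an illustration rather than the code behind the graphs.

```r
# Toy data: four made-up matches in each of two seasons (ASSUMED columns).
games <- data.frame(
  Season = c(1950, 1950, 1950, 1950, 2010, 2010, 2010, 2010),
  hgoal  = c(3, 2, 1, 4, 1, 0, 2, 1),
  vgoal  = c(1, 0, 1, 2, 1, 2, 1, 3)
)

# Classify each match: home win (H), away win (A) or draw (D).
games$result <- ifelse(games$hgoal > games$vgoal, "H",
                ifelse(games$hgoal < games$vgoal, "A", "D"))

# Cross-tabulate season by result, then convert each row to proportions
# (margin = 1 means every season's shares sum to 1).
result_share <- prop.table(table(games$Season, games$result), margin = 1)

print(round(100 * result_share, 1))
```

Plotting the `H` and `A` columns against the season would show whether home advantage is shrinking.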

Average goals per game have also fallen.
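From the totals quoted earlier, the all-time average works out to roughly 542,288 / 188,060 ≈ 2.88 goals per game. A per-season version could be sketched as follows; again the data is made up and the column names (`Season`, `hgoal`, `vgoal`) are assumptions.

```r
# Toy data: four made-up matches in each of two seasons (ASSUMED columns).
games <- data.frame(
  Season = c(1950, 1950, 1950, 1950, 2010, 2010, 2010, 2010),
  hgoal  = c(3, 2, 1, 4, 1, 0, 2, 1),
  vgoal  = c(1, 0, 1, 2, 1, 2, 1, 3)
)

# Mean home and away goals per game, one row per season.
avg_goals <- aggregate(cbind(hgoal, vgoal) ~ Season, data = games, FUN = mean)

# Average total goals per game is the sum of the two means.
avg_goals$total <- avg_goals$hgoal + avg_goals$vgoal

print(avg_goals)
```

A line chart of `total` against `Season` is then one `plot()` call away.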


We see huge shifts in average goals around 1925 and 1965, and the reason is rule changes.


1958 - Substitutions were allowed for the first time
This roughly corresponds with the beginning of a steep decline in scoring in the 1960s. This could make for a plausible causal explanation: Perhaps playing with an injured player left teams extremely vulnerable on defense, leading to many goals. The addition of the substitute may have mitigated these effects.

The reduction in goals in the late 1920s isn't well explained, but it is believed to have been driven mainly by tactical changes: teams used to field many forwards, and later shifted towards more defenders and midfielders.


All the code used to manipulate the data and plot the charts above can be found here.


Hope that it was a good read! 
Suggestions and feedback are always welcome! 

Happy Coding :D