PSL 2K17

A cricket tournament visualisation

The Pakistan Super League (PSL) is a cricket league played in Pakistan. Cricket is played between two teams, and in the Twenty20 (T20) format used by the PSL each team gets 120 balls (20 overs) to score as many runs as possible. The team that scores the most runs wins.

The teams take turns batting and bowling. Each team has 11 players, who typically specialise as either batsmen (who bat) or bowlers (who bowl the ball).

I wanted to analyze the performance of each team in the Pakistan Super League played in 2017. Five teams competed in the 2017 edition: Islamabad, Karachi, Lahore, Peshawar and Quetta.

I broke down the analysis by batting performance, bowling performance and finally the overall team performance.

Batting Performance

A batsman's objective is to score as many runs as possible by hitting the ball as far as possible. The further he hits the ball, the more runs he scores. I analyzed the individual batting performance of each batsman based on the following criteria:

  1. Average runs: Average runs scored in each game - Higher the better
  2. Strike rate: Runs scored per 100 balls (How fast a batsman scores) - Higher the better
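As a rough illustration of the two batting metrics above, here is a minimal Python sketch. The function names and the sample innings are hypothetical and are not taken from the PSL data set.

```python
def average_runs(runs_per_game):
    """Average runs scored per game (mean of the per-game run totals)."""
    return sum(runs_per_game) / len(runs_per_game)


def strike_rate(total_runs, balls_faced):
    """Runs scored per 100 balls faced."""
    return 100 * total_runs / balls_faced


# Hypothetical batsman: innings of 30, 12 and 18 runs off 48 balls in total.
innings = [30, 12, 18]
avg = average_runs(innings)            # 60 runs over 3 games -> 20.0
sr = strike_rate(sum(innings), 48)     # 60 runs off 48 balls -> 125.0
print(avg, sr)
```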

The plot below compares runs scored against strike rate for each batsman on each team. Runs increase from left to right and strike rate increases from bottom to top, so the farther right and up a batsman sits (the top right quadrant), the better. Hover over the chart to see the values and the respective teams.

I divided the graphs into quadrants. The top right quadrant contains batsmen with greater than average runs and strike rate. Zooming in to the quadrant, we notice that the teams have roughly 7 players each who perform better than the average batsman. Peshawar has the most with 8 and Karachi the fewest with only 5.
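The quadrant split described above can be sketched as follows. This assumes the dividing lines are the arithmetic means of the two metrics across all batsmen; the rows are made-up placeholders, not real PSL figures.

```python
from statistics import mean

# Hypothetical (name, team, average runs, strike rate) rows, not real PSL data.
batsmen = [
    ("A", "Peshawar", 28.0, 135.0),
    ("B", "Karachi", 9.0, 140.0),
    ("C", "Quetta", 22.0, 98.0),
    ("D", "Lahore", 11.0, 90.0),
]

mean_runs = mean(row[2] for row in batsmen)  # vertical dividing line
mean_sr = mean(row[3] for row in batsmen)    # horizontal dividing line


def quadrant(runs, sr):
    """Label a batsman by which side of each mean he falls on."""
    horizontal = "right" if runs > mean_runs else "left"
    vertical = "top" if sr > mean_sr else "bottom"
    return f"{vertical} {horizontal}"


labels = {name: quadrant(runs, sr) for name, _, runs, sr in batsmen}

# Count the above-average batsmen (top right quadrant) per team.
top_right_by_team = {}
for name, team, runs, sr in batsmen:
    if labels[name] == "top right":
        top_right_by_team[team] = top_right_by_team.get(team, 0) + 1
print(labels, top_right_by_team)
```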

Next, let's compare the distribution of average runs scored by batsmen across teams, that is, how the batsmen within each team are scoring. The more batsmen with high scores, the better. The box plot shows the three quartiles as well as the mean and median of runs scored.

Quetta has a huge spread, meaning a very wide dispersion of runs scored. Lahore, on the other hand, has a much smaller dispersion in the middle, so its players score relatively similarly. Overall, Peshawar has the highest mean and median, with 9 players scoring more than 13 runs, while Karachi has the lowest, with a median of only 10 runs.
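For readers who want the numbers behind a box plot, the sketch below computes the quartiles, mean and interquartile range for one team. The run values are hypothetical, and Python's default "exclusive" quantile method is an assumption; the plotting library behind the charts may interpolate quartiles differently.

```python
from statistics import mean, median, quantiles

# Hypothetical average runs per batsman for one team (not real PSL data).
team_runs = [4, 8, 10, 13, 15, 21, 30]

q1, q2, q3 = quantiles(team_runs, n=4)  # the three quartile cut points
iqr = q3 - q1                           # height of the box, a measure of dispersion
team_mean = mean(team_runs)
team_median = median(team_runs)         # same value as the second quartile here
print(q1, q2, q3, iqr, team_mean, team_median)
```

A wide interquartile range corresponds to a tall box (a large spread), while a narrow one means the middle half of the batsmen score similarly.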

Bowling Performance

A bowler's job is to stop batsmen from scoring runs. Bowlers try to bowl such that the batsman either can't hit the ball or gets out trying to hit it (a batsman's wicket). I compared bowlers on the following criteria:

  1. Economy: Average runs conceded in an over - Lower the better
  2. Total wickets: Total wickets taken - Higher the better
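Economy, as defined above, can be computed from runs conceded and balls bowled. This is a minimal sketch with a hypothetical function name and sample figures; it relies on the standard six balls per over.

```python
BALLS_PER_OVER = 6


def economy(runs_conceded, balls_bowled):
    """Average runs conceded per over."""
    overs = balls_bowled / BALLS_PER_OVER
    return runs_conceded / overs


# Hypothetical spell: 28 runs conceded in a full 4-over (24-ball) quota.
spell_economy = economy(28, 24)  # 28 runs over 4 overs -> 7.0
print(spell_economy)
```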

Similar to the batsman scatter chart, the bowler scatter chart compares bowlers on economy (average runs conceded per over) and total wickets taken. The number of wickets increases along the x-axis and economy increases along the y-axis.

The bottom right quadrant has bowlers with lower than average economy and higher than average wickets. Zooming in to the quadrant, we can see that Peshawar has 3 players, Lahore has 2 and everyone else has just 1 player in the quadrant.

As with the batsman runs distribution, I compared the distribution of bowlers' economy across the teams. In T20, a bowler can bowl a maximum of 4 overs (24 balls), so each team needs at least 5 bowlers to complete the 120 balls. Teams often use more than 5 bowlers to rotate their bowling and rest players, so a team with many low-economy bowlers has plenty of options.
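The "at least 5 bowlers" figure follows directly from the over limits. A small sketch of the arithmetic, with a hypothetical helper name:

```python
import math


def min_bowlers(innings_overs=20, max_overs_per_bowler=4):
    """Smallest number of bowlers needed to cover every over of an innings."""
    return math.ceil(innings_overs / max_overs_per_bowler)


needed = min_bowlers()  # 20 overs / 4 overs per bowler -> 5
print(needed)
```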

We see that Karachi and Quetta have the smallest dispersion, but their bowlers are expensive, conceding on average 7.6 and 7.4 runs per over respectively. Peshawar, on the other hand, is a clear standout, with the lowest median and 6 bowlers who concede fewer than 7 runs an over. The closest teams, Islamabad and Lahore, have only 3.

Overall

Let's combine everything to give each player a cumulative score based on their batting and bowling performance. We can assign a score from 0 to 3 in each of the following categories and sum them to get a final score for each player:

  1. Average runs scored: Average runs a player scores in a game
  2. Strike rate: How many runs a player scores per 100 balls faced
  3. Variance of batsman runs scored: Measures how consistently a batsman scores
  4. Average bowling economy: Average runs a bowler concedes per over
  5. Number of wickets taken: Total number of wickets taken by a bowler
  6. Variance of bowler economy: Measures the consistency of a bowler's economy
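The text does not say how the 0 to 3 score in each category is assigned, so the sketch below assumes a quartile-based rule: a player gets 0 to 3 depending on which quartile of the metric he falls into, with the scale flipped for "lower is better" metrics such as economy and variance. The function names and sample values are hypothetical, and only two of the six categories are shown to keep the example short.

```python
from bisect import bisect_right
from statistics import quantiles


def quartile_scores(values, higher_is_better=True):
    """Score each value 0-3 by quartile (an assumed rule, not the author's)."""
    cuts = quantiles(values, n=4)                      # three quartile cut points
    scores = [bisect_right(cuts, v) for v in values]   # 0 (lowest) .. 3 (highest)
    if not higher_is_better:
        scores = [3 - s for s in scores]               # low economy/variance is good
    return scores


def total_scores(metrics):
    """metrics maps a category name to (values, higher_is_better)."""
    per_metric = [quartile_scores(vals, hib) for vals, hib in metrics.values()]
    return [sum(player) for player in zip(*per_metric)]


# Hypothetical data for five players, listed in the same order in each category.
metrics = {
    "average_runs": ([5, 12, 20, 28, 35], True),
    "economy": ([9.0, 8.0, 7.5, 7.0, 6.0], False),
}
totals = total_scores(metrics)
print(totals)
```

With all six categories each scored 0 to 3, the maximum possible total for a player would be 18.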

Let's chart a histogram to compare the distribution of player scores across the teams.

The histogram paints an interesting picture. Most teams have a positive skew, with the bulk of the distribution bunched toward the left, meaning more low-scoring players. Islamabad has the heaviest concentration of low scores, while Peshawar has a long right tail. Karachi has a balanced, symmetrical distribution with most players in the middle range.

Finally, let's view the heatmap to see a side-by-side comparison of the score distribution. The lighter the color, the higher a player's score.

Similar to the histogram, we notice that Karachi has the fewest low-scoring players and Islamabad the most, shown by a long band of purple. Karachi has a small squad, though, with only 13 players compared to Islamabad's 17. Peshawar is once more a solid team: it has some purple players, but it also has the highest number of players with a score greater than 10. Not surprisingly, Peshawar won the championship that year!