Pay and Performance in Major League Sports

Name

Institution affiliations

Course

date

**Pay and performance in major league sports**

**Purpose Statement and model**

“The dependent variable **revenue** is determined by independent variables **average salary**, **matches won**, **number of goals** and **performance (number of points earned)**.”

The dependent variable, revenue, was chosen because any business or profit-driven organization pays close attention to its revenue. Revenue is the main focus of many organizations (Stephen Hall, 2002).

“The most important independent variable in this relationship is **performance (number of points earned)** because **this determines the position of the team**.” In the English Premier League and La Liga, the performance of a team determines its league position and, in turn, its revenue (Stephen Hall, 2002). This relationship is examined in detail below.

The general form of the regression model is written as below:

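Revenue = β0 + β1(Average salary) + β2(Matches won) + β3(Number of goals) + β4(Performance) + ε

Here β0 is the intercept, β1 to β4 are the coefficients to be estimated, and ε is the error term.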
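A model of this form is usually estimated by ordinary least squares. The sketch below is illustrative only and uses just the Python standard library. The helper names `solve_linear_system` and `fit_ols` are hypothetical, and the seven rows of data are made up to show the mechanics; they are not the league data used in this paper.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b] so that row operations act on both sides.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot_row = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot_row][col]) < 1e-12:
            raise ValueError("singular system: predictors are perfectly collinear")
        m[col], m[pivot_row] = m[pivot_row], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(predictors, response):
    """Return [intercept, b1, ..., bk] from the normal equations (X'X) b = X'y."""
    rows = [[1.0] + list(p) for p in predictors]  # leading 1.0 is the intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, response)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Made-up rows: (average salary, matches won, number of goals, points).
made_up_predictors = [
    (3.0, 20, 60, 68), (1.5, 10, 35, 38), (5.0, 27, 90, 87), (2.0, 14, 45, 50),
    (4.0, 22, 70, 75), (1.0, 7, 28, 30), (2.5, 16, 52, 54),
]
made_up_revenue = [400, 150, 600, 220, 480, 120, 260]  # made-up values

coefficients = fit_ols(made_up_predictors, made_up_revenue)
print([round(c, 2) for c in coefficients])
```

In practice a statistics package or a spreadsheet regression tool would be used instead; the point here is only that the coefficients come from minimising the squared differences between observed and predicted revenue.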

**Definition of variables**

The dependent variable in this case is revenue, which can also be interpreted as the predicted variable: using the independent variables, we can estimate a linear relationship that forecasts the future value of revenue. Any change in the independent variables should influence revenue. For example, the performance of a team directly influences the prize money it receives.

The first and most important independent variable is performance. Performance, in this case, is measured by the number of points earned over the whole league season (Bernd Frick, 2003). A win earns the team three points, a draw earns it one point and a defeat earns it zero points. Hence, a team that wins all its games has the maximum number of points, and the team with the most points wins the trophy and the prize money that comes with it. This is therefore the most important independent variable because it determines the position of the team. The coefficient of this variable is expected to be positive: as performance rises, we expect revenue to increase.
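The points measure can be written as a one-line calculation. The sketch below assumes the standard scoring used in both leagues (three points for a win, one for a draw, none for a defeat) and a 38-match season for a 20-team league; the function name `league_points` is hypothetical.

```python
def league_points(wins, draws, losses):
    """Season points: 3 per win, 1 per draw, 0 per defeat."""
    return 3 * wins + 1 * draws + 0 * losses


MATCHES_PER_SEASON = 38  # 20 teams, each pair meets home and away

# A team that wins every match reaches the maximum possible total.
print(league_points(MATCHES_PER_SEASON, 0, 0))  # 114

# A hypothetical record of 20 wins, 10 draws and 8 defeats.
print(league_points(20, 10, 8))  # 70
```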

The second independent variable is average salary, which here refers to the salary received by players. Our hypothesis is that the club paying the highest salaries is likely to attract the best players. The best players help the team win more games and improve its overall performance, which earns it a higher reward (Bernd Frick, 2003). They also attract more fans, filling the stadium on match days, which directly generates revenue, and they appear in advertisements, which generates further revenue for the club. Even so, we expect the coefficient of this variable to be negative: salary is an expense, so as salary increases the revenue of the club is expected to fall (Bernd Frick, 2003).

The third independent variable is number of goals. The more goals a team scores, the higher its chances of being the best performer. Suppose, for example, that Manchester United has won eight matches with 8 goals and Manchester City has also won eight matches but with 24 goals. The two teams are level on points, so their ranking is decided by goal difference, and the team with more goals is placed higher. Hence, the number of goals influences the revenue of a club. In addition, some awards are paid according to the number of goals scored, which further increases revenue. The coefficient of this variable is expected to be positive because, as the number of goals increases, revenue is likely to increase.
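The Manchester example can be sketched as a ranking rule. This is a minimal illustration that assumes goal difference is the first tie-breaker when points are level, as the paragraph describes; the `TeamRecord` type, the `rank_teams` function and the goals-conceded figures (set equal for both teams) are hypothetical.

```python
from typing import NamedTuple


class TeamRecord(NamedTuple):
    name: str
    points: int
    goals_for: int
    goals_against: int  # hypothetical: the text gives no goals-conceded figures


def rank_teams(records):
    """Order teams by points, then by goal difference, both descending."""
    return sorted(
        records,
        key=lambda t: (t.points, t.goals_for - t.goals_against),
        reverse=True,
    )


# Eight wins each gives 24 points; only the goals scored differ.
table = [
    TeamRecord("Manchester United", 24, 8, 6),
    TeamRecord("Manchester City", 24, 24, 6),
]
ranked = rank_teams(table)
print([team.name for team in ranked])  # ['Manchester City', 'Manchester United']
```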

The fourth and final independent variable is matches won, defined as the number of matches in which a team beats its opponent. The number of matches won strongly influences performance, which in turn influences revenue. Successive wins bring the team closer to winning the league overall. The coefficient of this variable is expected to be positive since, as wins increase, revenue is expected to increase too (Bernd Frick, 2003).

**Data Description**

The data used for this analysis are from the 2017/2018 season of the English Premier League and the 2017/2018 season of La Liga. They are extracted online from the football data website, an open source of football data. Where the data are not organized, they are exported to Excel, reduced to the required columns and prepared for analysis. The data had been archived on these websites by the end of May 2018 and are extracted as of the date of this research. One limitation is that the data combine two leagues, La Liga and the English Premier League, because at least thirty observations were needed to make the sample more representative. Using a single league might have given more precise results.
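The preparation step described above (keeping only the required columns and combining the two leagues) could also be done in a short script rather than by hand. The sketch below is an assumption about how that might look: the column names, the `load_league` function and the one-row placeholder tables are all hypothetical and do not reproduce the actual data.

```python
import csv
import io

# Hypothetical column names for the variables used in the regression.
KEEP = ["Team", "Wins", "Points", "Goals", "AvgSalary", "Revenue"]


def load_league(csv_text, league):
    """Parse one league table, keep only the analysis columns, tag the league."""
    rows = []
    for raw in csv.DictReader(io.StringIO(csv_text)):
        row = {column: raw[column] for column in KEEP}
        row["League"] = league
        rows.append(row)
    return rows


# Placeholder tables with made-up values and one unwanted column (Stadium).
epl_csv = (
    "Team,Stadium,Wins,Points,Goals,AvgSalary,Revenue\n"
    "Team A,Ground A,20,68,60,3.0,400\n"
)
laliga_csv = (
    "Team,Stadium,Wins,Points,Goals,AvgSalary,Revenue\n"
    "Team B,Ground B,18,61,55,2.5,350\n"
)

combined = load_league(epl_csv, "Premier League") + load_league(laliga_csv, "La Liga")
print(len(combined), sorted(combined[0]))
```

Tagging each row with its league keeps the option of checking later whether the two leagues behave differently.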

**Presentation and Interpretation of Results**

The results for the data are presented in the figure below:

Figure 1

From Figure 1 above, the estimated regression equation is:

**Revenue = -138.34 - 28.51(Matches won) + 10.31(Points) + 3.77(Goals) - 2.20(Salary)**
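To show how the fitted equation is used, the sketch below plugs values into it. The coefficients are those reported above; the function name `predict_revenue` and the input values are hypothetical, and the result is in whatever units the revenue and salary data use, which the equation itself does not state.

```python
# Coefficients as reported in the regression equation above.
COEFFICIENTS = {
    "intercept": -138.34,
    "matches_won": -28.51,
    "points": 10.31,
    "goals": 3.77,
    "salary": -2.20,
}


def predict_revenue(matches_won, points, goals, salary):
    """Evaluate the fitted linear equation for one team."""
    return (
        COEFFICIENTS["intercept"]
        + COEFFICIENTS["matches_won"] * matches_won
        + COEFFICIENTS["points"] * points
        + COEFFICIENTS["goals"] * goals
        + COEFFICIENTS["salary"] * salary
    )


# Hypothetical team: 20 wins, 65 points, 60 goals, average salary of 3.0.
example = predict_revenue(matches_won=20, points=65, goals=60, salary=3.0)
print(round(example, 2))  # 181.21
```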

From the same figure, the adjusted R2 is 0.55. This value shows that the independent variables explain about 55% of the variation in the dependent variable, revenue. This is a moderate fit, since almost half of the variation in revenue remains unexplained. From the regression results, the overall p-value is very small (4.52E-05). The significance level is normally 0.05. Since the calculated p-value is less than the 0.05 level of significance, we reject the null hypothesis in favor of the alternative. This means that at least one of the coefficients of the independent variables is different from zero (Statista, 2018).

The coefficient of the first independent variable, matches won, differs from our expectation: we expected it to be positive, but it is negative. The unexpected sign may come from the data themselves rather than from the underlying relationship; one possible cause is the very high correlation between matches won and points reported in the correlation analysis below. The t statistic for this variable is -1.75 with a p-value of 0.09. Since this p-value is greater than the 0.05 significance level, the variable is not statistically significant.

The second variable, points, has a positive coefficient of 10.31, which agrees with our earlier hypothesis. This variable has a test statistic of 1.69 with a corresponding p-value of 0.1. Since the p-value is greater than the 0.05 significance level, the variable is not statistically significant (World Football, 2017/2018).

The third variable, goals, has a positive coefficient, which agrees with our earlier expectation. The variable has a test statistic of 2.87 with a corresponding p-value of 0.008. Since the p-value is smaller than the 0.05 significance level, this variable is statistically significant.

The fourth and last independent variable, salary, has a negative coefficient (Statista, 2018). This is as per our earlier assertion that this variable will have a negative coefficient. This variable has a test statistic of -2.38 with a p-value of 0.02. Since the p-value is less than our significance value, this variable is statistically significant.

From these results, the most significant variable is goals, which has the smallest p-value (0.008).
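The decision rule applied to each coefficient can be summarised in a few lines. The p-values below are the ones reported above and the 0.05 level is the one used in the text; the function name `significant_variables` is hypothetical.

```python
ALPHA = 0.05  # significance level used in the text

# p-values reported for each coefficient in the regression output.
reported_p_values = {
    "matches won": 0.09,
    "points": 0.1,
    "goals": 0.008,
    "salary": 0.02,
}


def significant_variables(p_values, alpha=ALPHA):
    """Names of variables whose p-value is below the significance level."""
    return sorted(name for name, p in p_values.items() if p < alpha)


print(significant_variables(reported_p_values))  # ['goals', 'salary']

# The variable with the smallest p-value is the most significant one.
most_significant = min(reported_p_values, key=reported_p_values.get)
print(most_significant)  # goals
```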

The results for the correlation analysis are presented in figure 2 below:

Figure 2

Multicollinearity is a statistical phenomenon in which the independent variables are highly correlated with each other. Three of the independent variables are highly correlated: matches won, points and goals. The correlation between matches won and points is 0.994, which is very close to 1. The correlation between matches won and goals is almost equal to that between points and goals. This multicollinearity is to be expected because the variables measure closely related things: points are calculated largely from matches won, and teams that score more goals tend to win more matches.

Multicollinearity causes the effects of the independent variables to be confounded, which makes it difficult to separate the individual effect of each variable on revenue.
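The size of the problem can be illustrated with the reported correlation of 0.994. The sketch below computes a Pearson correlation and a variance inflation factor (VIF). The VIF formula used here, 1 / (1 - r²), holds when a predictor is collinear with just one other predictor; with several predictors the full VIF is based on regressing each predictor on all the others, so this is a simplified illustration. The function names and the win and draw counts are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


def vif_from_r(r):
    """Variance inflation factor implied by one pairwise correlation."""
    return 1.0 / (1.0 - r * r)


# Reported correlation between matches won and points.
print(round(vif_from_r(0.994), 1))  # 83.6

# Made-up records showing why the two move together: points = 3 * wins + draws.
wins = [20, 10, 27, 14, 22]
draws = [8, 8, 6, 8, 9]
points = [3 * w + d for w, d in zip(wins, draws)]
print(round(pearson(wins, points), 3))
```

A VIF of this size means the variance of the affected coefficient estimates is inflated many times over, which is consistent with matches won and points both appearing insignificant even though the overall regression is significant.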

**References**

Stephen Hall, Stefan Szymanski and Andrew S. Zimbalist (2002). Testing causality between team performance and payroll: The cases of Major League Baseball and English Soccer, *Journal of Sports Economics*, Vol. 3 Issue 2, pp. 149-168, https://doi.org/10.1177%2F152700250200300204

Bernd Frick, Joachim Prinz and Karina Winkelmann (2003). Pay inequalities and team performance: Empirical evidence from the North American major leagues, *International Journal of Manpower*, Vol. 24 Issue 4, pp. 472-488.

Statista. (2018, May). *Average annual player salary in the English Premier League in 2017/18, by team (in million U.S. dollars)*. Retrieved from Statista: https://www.statista.com/statistics/675303/average-epl-salary-by-team/

World Football. (2017/2018). *Premier League 2 – Division 1*. Retrieved from World Football: http://www.worldfootball.com/s/7988/england/premier-league-2-division-1/2017-2018

**Data**

The data set is in the accompanying Excel sheet.