Unit 8: Term Project Report

Due: Jul 29, 2018 at 10:59 PM

Assessment Rubric for this paper is available under Course Information, Assignments and Grading link. Please make sure that your paper conforms to APA style requirements.

Purpose Statement and Model

1) In the introductory paragraph, state why the dependent variable has been chosen for analysis. Then make a general statement about the model: “The dependent variable _______ is determined by variables ________, ________, ________, and ________.”

2) In the second paragraph, identify the primary independent variable and defend why it is important. “The most important variable in this analysis is ________ because _________.” In this paragraph, cite and discuss the two research sources that support the thesis, i.e., the model.

3) Write the general form of the regression model (less intercept and coefficients), with the variables named appropriately so reader can identify each variable at a glance:

Dep_Var = Ind_Var_1 + Ind_Var_2 + Ind_Var_3

For instance, a typical model would be written:

Price_of_Home = Square_Footage + Number_Bedrooms + Lot_Size

Where
Price_of_Home: brief definition of dependent variable
Square_Footage: brief definition of first independent variable
Number_Bedrooms: brief definition of second independent variable
Lot_Size: brief definition of third independent variable

[Note: student of course replaces these variable names with his/her own variable names.]

Definition of Variables

4) Define and defend all variables, including the dependent variable, in a single paragraph for each variable. Also, state the expectations for each independent variable. These paragraphs should be in numerical order, i.e., dependent variable, X1, then X2, etc. In each paragraph, the following should be addressed:
How is the variable defined in the data source?
Which unit of measurement is used?
For the independent variables: why does the variable determine Y?
What sign is expected for the independent variable’s coefficient, positive or negative? Why?

Data Description

5) In one paragraph, describe the data and identify the data sources. From which general sources and from which specific tables are the data taken? (Citing a website is not acceptable.) Which year or years were the data collected? Are there any data limitations?

Presentation and Interpretation of Results

6) Write the regression (prediction) equation:

Dep_Var = Intercept + c1 * Ind_Var_1 + c2 * Ind_Var_2 + c3 * Ind_Var_3

7) Identify and interpret the adjusted R2 (one paragraph): Define “adjusted R2.” What does the value of the adjusted R2 reveal about the model? If the adjusted R2 is low, how has the choice of independent variables created this result?

8) Identify and interpret the F test (one paragraph): Using the p-value approach, is the null hypothesis for the F test rejected or not rejected? Why or why not? Interpret the implications of these findings for the model.

9) Identify and interpret the t tests for each of the coefficients (one separate paragraph for each variable, in numerical order): Are the signs of the coefficients as expected? If not, why not? For each of the coefficients, interpret the numerical value. Using the p-value approach, is the null hypothesis for the t test rejected or not rejected for each coefficient? Why or why not? Interpret the implications of these findings for the variable. Identify the variable with the greatest significance.

10) Analyze multicollinearity of the independent variables (one paragraph): Generate the correlation matrix. Define multicollinearity. Are any of the independent variables highly correlated with each other? If so, identify the variables and explain why they are correlated. State the implications of multicollinearity (if found) for the model.

11) Other (not required): If any additional techniques for improving results are employed, discuss these at the end of the paper.

Works Cited Page

12) Use the proper format to list the works cited under two headings:
Research: two sources
Data: a separate citation for each of the variables used in the paper.

Refer to Term Research Paper Resource for guidelines on writing the paper.

 

Pay and Performance in Major League Sports

Name

Institution affiliations

Course

date

 

Pay and performance in major league sports

Purpose Statement and Model

“The dependent variable revenue is determined by the independent variables average salary, matches won, number of goals, and performance (number of points earned).”

The dependent variable, revenue, has been chosen because every business or profit-driven organization must consider its revenues. Revenue is the main focus of many organizations (Hall, Szymanski, & Zimbalist, 2002).

“The most important independent variable in this relationship is performance (number of points earned) because this determines the position of the team.” In the English Premier League and La Liga, the performance of a team determines its league position and, through it, its revenue (Hall et al., 2002). We shall look at this in detail.

The general form of the regression model is written as below:

Revenue = Average_Salary + Matches_Won + Number_of_Goals + Performance

Definition of Variables

The dependent variable in this case is revenue. It can also be described as the predicted variable: given the independent variables, we can estimate a linear relationship that forecasts the future value of revenue. Any change in the independent variables should influence revenue. For example, the performance of a team directly influences the amount it receives for winning the trophy.

The first and most important independent variable is performance. Performance, in this case, is measured by the number of points earned over the whole league season (Frick, Prinz, & Winkelmann, 2003). A win earns the team three points, a draw earns it one point, and a defeat earns it zero points. Hence, a team that has won all its games will have the maximum number of points, and the team with the most points wins the trophy and the prize money that comes with it. This is the most important independent variable because it determines the position of the team. The coefficient of this variable is expected to be positive: as performance rises, we expect revenue to increase.
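The points measure described above can be written as a small calculation. This is a minimal sketch: the function name and the per-result point values are parameters introduced here for illustration (defaulting to the standard three points for a win and one for a draw), and the 38-match season length is an assumption about the two leagues rather than something taken from the data set.

```python
def league_points(wins, draws, losses, per_win=3, per_draw=1, per_loss=0):
    """Total league points for a season record (illustrative helper)."""
    return wins * per_win + draws * per_draw + losses * per_loss


# Assumed season length of 38 matches: winning every match gives the maximum.
SEASON_MATCHES = 38
max_points = league_points(SEASON_MATCHES, 0, 0)

# A hypothetical mid-table record: 20 wins, 5 draws, 13 losses.
example_points = league_points(20, 5, 13)
```

Because points are built almost entirely from wins, the two variables move together very closely, which matters for the multicollinearity discussion later in the paper.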

The second independent variable is average salary, which here refers to the salary received by players. As a hypothesis, the club that pays the highest salaries is likely to attract the best players. Having the best players helps a team win more games and improves its overall performance, which earns it a higher reward (Frick et al., 2003). The best players also attract more fans, filling the stadium whenever there is a match, and they appear in advertisements; both generate revenue for the club. Despite these channels, we expect the coefficient of this variable to be negative, on the reasoning that salary is an expense to the club and so weighs against its revenue as salary increases (Frick et al., 2003).

The third independent variable is number of goals. The higher the number of goals scored, the higher the chances of being the best performer. Take, for example, a situation where Manchester United has won eight matches with 8 goals and Manchester City has also won eight matches but with 24 goals. The two teams are level on points, so the ranking depends on goal difference, and the team with more goals is placed higher. Hence, the number of goals helps determine the revenue of a club. In addition, some awards are paid according to the number of goals scored, which further increases revenue. The coefficient of this variable is expected to be positive, because as the number of goals increases, revenue is likely to increase.

The fourth and final independent variable is matches won, defined as the number of matches in which a team beats its opponent. The number of matches won strongly influences performance, which in turn influences revenue. A run of wins moves the team towards the overall league title. This variable is expected to have a positive coefficient, since as wins increase, revenue is expected to increase too (Frick et al., 2003).

Data Description

The data used for this analysis are based on the English Premier League season 2017/2018 and the La Liga season 2017/2018. These data shall be extracted online from the football data website, which is an open source for football data. Where the data are not organized, they shall be exported to Excel, reduced to the desired columns, and then prepared for analysis. The data had already been archived on these websites by the end of May 2018 and shall be extracted as of the date of this research. The limitation of the data is that they combine two leagues, the English Premier League and La Liga, since we needed at least thirty observations to make the data more representative. Using a single league could have given more precise results.
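The preparation step described above (keeping only the desired columns and stacking the two leagues into one data set) can be sketched as follows. Everything in this example is hypothetical: the column names, the team names, and the numbers are placeholders chosen for illustration and are not the values used in the analysis.

```python
import csv
import io

# Placeholder exports for the two leagues (made-up teams and numbers).
EPL_CSV = """team,stadium,matches_won,goals,points,avg_salary,revenue
Team A,Ground A,20,60,65,3.0,200
Team B,Ground B,10,35,40,1.5,120
"""
LALIGA_CSV = """team,stadium,matches_won,goals,points,avg_salary,revenue
Team C,Ground C,25,80,85,4.0,300
"""

# Only these columns are kept for the regression; others are dropped.
NUMERIC_COLUMNS = ["matches_won", "goals", "points", "avg_salary", "revenue"]


def load_league(text, league):
    """Parse one league's CSV text, keep the desired columns, tag the league."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        row = {"league": league, "team": record["team"]}
        for column in NUMERIC_COLUMNS:
            row[column] = float(record[column])
        rows.append(row)
    return rows


# Stack both leagues into a single list of observations.
combined = load_league(EPL_CSV, "EPL") + load_league(LALIGA_CSV, "La Liga")
```

Tagging each row with its league keeps the source of every observation visible after the two tables are combined.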

Presentation and Interpretation of Results

The results for the data are presented in the figure below:

Figure 1

From figure 1 above, the regression equation will be presented as follows:

Revenue = -138.34 - 28.51(Matches won) + 10.31(Points) + 3.77(Goals) - 2.20(Salary)
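To show how the prediction equation is used, the sketch below plugs values into it. The coefficients are the ones reported in the equation above; the function name and the input values are hypothetical, and the units of revenue and salary are whatever units the underlying data set uses.

```python
# Coefficients taken from the estimated regression equation.
INTERCEPT = -138.34
COEFFICIENTS = {
    "matches_won": -28.51,
    "points": 10.31,
    "goals": 3.77,
    "salary": -2.20,
}


def predict_revenue(matches_won, points, goals, salary):
    """Predicted revenue for one team from the fitted linear equation."""
    return (
        INTERCEPT
        + COEFFICIENTS["matches_won"] * matches_won
        + COEFFICIENTS["points"] * points
        + COEFFICIENTS["goals"] * goals
        + COEFFICIENTS["salary"] * salary
    )


# Hypothetical team: 20 wins, 65 points, 60 goals, average salary of 3.0.
example_prediction = predict_revenue(20, 65, 60, 3.0)
```

Each coefficient is the predicted change in revenue for a one-unit increase in that variable, holding the other three constant.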

From the same table, the adjusted R2 is 0.55. The adjusted R2 is the share of the variation in the dependent variable that is explained by the model, adjusted for the number of independent variables. This value shows that the independent variables explain about 55% of the variation in the dependent variable, revenue. This is relatively low, since the maximum possible value is 100%.

From the regression results, the p-value of the F test is very small (4.52E-05). The significance level is normally 0.05. Since the calculated p-value is less than the 0.05 level of significance, we reject the null hypothesis in favor of the alternative. This means that at least one of the coefficients of the independent variables is different from zero, so the model as a whole is statistically significant (Statista, 2018).
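The adjusted R2 and the F statistic are both functions of the ordinary R2, the number of observations n, and the number of independent variables k. The sketch below uses the standard textbook formulas; the inputs (R2 = 0.60 and n = 40) are hypothetical values chosen only to illustrate the calculation, since the paper reports neither figure, while k = 4 matches the four independent variables of the model.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R2 = 1 - (1 - R2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def f_statistic(r2, n, k):
    """Overall F statistic = (R2 / k) / ((1 - R2) / (n - k - 1))."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


# Hypothetical inputs for illustration only.
R2_EXAMPLE = 0.60
N_OBSERVATIONS = 40
K_VARIABLES = 4

adj = adjusted_r2(R2_EXAMPLE, N_OBSERVATIONS, K_VARIABLES)
f_stat = f_statistic(R2_EXAMPLE, N_OBSERVATIONS, K_VARIABLES)
```

The penalty term (n - 1) / (n - k - 1) grows as more variables are added, so the adjusted R2 rises only when a new variable improves the fit by more than chance alone would.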

The coefficient of the first independent variable, matches won, differs from the earlier expectation. We expected it to be positive, but it has turned out to be negative (-28.51). This may reflect the data rather than the hypothesis: matches won is almost perfectly correlated with points (0.994), which can distort the sign of an individual coefficient. The t statistic for this variable is -1.75 with a p-value of 0.09. Since this p-value is greater than the 0.05 significance level, we fail to reject the null hypothesis, and this variable is not statistically significant.

The second variable, points, has a positive coefficient of 10.31, which agrees with our earlier hypothesis. This variable has a test statistic of 1.69 with a corresponding p-value of 0.1. Since the p-value is greater than the 0.05 significance level, we fail to reject the null hypothesis, and the variable is not statistically significant (World Football, 2017/2018).

The third variable, goals, has a positive coefficient of 3.77, which agrees with our earlier expectation. The variable has a test statistic of 2.87 with a corresponding p-value of 0.008. Since the p-value is smaller than the 0.05 significance level, we reject the null hypothesis, and this variable is statistically significant.

The fourth and last independent variable, salary, has a negative coefficient of -2.20 (Statista, 2018). This agrees with our earlier expectation that the variable would have a negative coefficient. This variable has a test statistic of -2.38 with a p-value of 0.02. Since the p-value is less than the 0.05 significance level, we reject the null hypothesis, and this variable is statistically significant.

From these results, the variable with the highest significance is goals, which has the smallest p-value (0.008).
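The p-value decision rule applied to each coefficient can be summarized in a few lines. The p-values below are the ones reported in the paragraphs above; the variable names and the 0.05 threshold constant are labels introduced for this sketch.

```python
ALPHA = 0.05  # significance level used throughout the paper

# p-values of the t tests as reported for each coefficient.
p_values = {
    "matches_won": 0.09,
    "points": 0.10,
    "goals": 0.008,
    "salary": 0.02,
}

# Reject the null hypothesis (coefficient = 0) when p < ALPHA.
significant = [name for name, p in p_values.items() if p < ALPHA]

# The most significant variable is the one with the smallest p-value.
most_significant = min(p_values, key=p_values.get)
```

Under this rule, a variable is compared against the significance level through its p-value, never through its t statistic directly.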

The results for the correlation analysis are presented in figure 2 below:

Figure 2

Multicollinearity is a statistical phenomenon in which the independent variables are highly correlated with each other. Three of the independent variables are highly correlated with each other: matches won, points, and goals. The correlation between matches won and points is 0.994, which is close to 1. The correlations of goals with matches won and with points are almost equal. This multicollinearity arises because the variables measure closely related things: points are earned mainly by winning matches, and matches are won by scoring goals.

Multicollinearity causes the effects of the independent variables to be confounded.
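A correlation matrix of the kind shown in figure 2 can be computed with the Pearson correlation coefficient, flagging any pair above a chosen cut-off. This is a minimal sketch: the five observations per variable are made-up numbers, and the 0.8 threshold is an assumed rule of thumb, not a value taken from the analysis.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data for five teams (illustration only).
wins = [5, 10, 15, 20, 25]
draws = [10, 8, 9, 7, 6]
data = {
    "matches_won": wins,
    "points": [3 * w + d for w, d in zip(wins, draws)],
    "goals": [30, 38, 52, 60, 75],
    "salary": [2.0, 4.5, 1.5, 5.0, 3.0],
}

THRESHOLD = 0.8  # assumed cut-off for "highly correlated"

# Pairwise correlations between every two independent variables.
correlations = {
    (a, b): pearson(data[a], data[b]) for a, b in combinations(data, 2)
}
high_pairs = [pair for pair, r in correlations.items() if abs(r) > THRESHOLD]
```

In this made-up sample, points are computed directly from wins and draws, so the two are almost perfectly correlated, which mirrors the pattern reported for the real data.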

 

References

Hall, S., Szymanski, S., & Zimbalist, A. S. (2002). Testing causality between team performance and payroll: The cases of Major League Baseball and English soccer. Journal of Sports Economics, 3(2), 149-168. https://doi.org/10.1177%2F152700250200300204

Frick, B., Prinz, J., & Winkelmann, K. (2003). Pay inequalities and team performance: Empirical evidence from the North American major leagues. International Journal of Manpower, 24(4), 472-488.

Statista. (2018, May). Average annual player salary in the English Premier League in 2017/18, by team (in million U.S. dollars). Retrieved from Statista: https://www.statista.com/statistics/675303/average-epl-salary-by-team/

World Football. (2017/2018). Premier League 2 – Division 1. Retrieved from World Football: http://www.worldfootball.com/s/7988/england/premier-league-2-division-1/2017-2018

Data

The data set is found in the attached Excel sheet.

 
