Nyke Shoe Company
Final Project
STA-201: Principles of Statistics
10/25/14

Contents
Introduction
Data
Analysis & Results
Conclusions
Appendix
Table 1: Descriptive statistics for Shoe Size and Height
Table 2: Descriptive statistics for Gender
Table 3: Regression analysis considering Shoe Size as dependent variable
Table 4: 95% Confidence interval for Shoe Size
Table 5: Counts for Shoe Size
Graph 1: Histogram and Box plot for Shoe Size
Graph 2: Histogram and Box plot for Height
Graph 3: Scatter plot of Height vs Shoe Size and Sex vs Shoe Size

Introduction:
Statistics and data analysis matter in almost any real-life situation, especially now that corporations rely on statistical methods to analyze their data and make sound, efficient decisions.
The main goal of every firm is to maximize its revenue and, through it, its profit. Statistics is a vital tool for this because statistical methods yield recommendations for real cases based on past data, and because those recommendations rest on data they are hard to dismiss.

In this example we have data for the Nyke Shoe Company. Due to financial hardship, the company feels it needs to make only one size of shoe, regardless of gender or height, in order to minimize the cost of production. Different sizes require different machines or production adjustments (such as adjusting how the leather is cut), which increases production cost. If one can identify a one-size-fits-all shoe size, the company can minimize production costs without losing sales, which will increase profit and eventually help it overcome its financial problems.

The main goal of the statistician is to analyze the data to see whether there exists a shoe size that can be used to make shoes regardless of gender or height and, if such a size exists, to recommend it.

Data:
The dataset contains 3 variables: Shoe Size, Height and Gender. It has 35 data points, and all of the analysis is based on them. Although Shoe Size and Height are recorded in discrete form (Height as integers, Shoe Size in steps of 0.5), I treat both as continuous so that they can be analyzed more appropriately.

Our target is to find the shoe size, so Shoe Size is the dependent variable and the other two, Height and Gender, are the independent variables.

Moreover, Gender is a qualitative variable, so we need to convert it into a quantitative variable for further analysis.
For that reason I am creating the following dummy variable:

Sex = 1 if Gender = Female
Sex = 0 if Gender = Male

All of the analysis in the following parts uses the variables above together with this newly defined dummy variable.

Analysis & Results:
Before the more detailed analysis we need to explore the data. The best first step is to examine the descriptive statistics, which are given in the Appendix (Tables 1 and 2).

These two tables show the basic characteristics of the data. The mean, median and mode are 9.1429, 9 and 7 for Shoe Size and 68.9429, 70 and 70 for Height. Relative to the respective means, Shoe Size is more variable than Height: its standard deviation is about 28% of its mean (2.5827 / 9.1429), against about 6% for Height (4.0289 / 68.9429).

Table 2 shows that the data are split almost evenly by gender: 51.43% female and 48.57% male, a very small deviation from 50%. Thus we should consider both genders’ shoes in the analysis.

We also need to look at the corresponding graphs to understand the data better. Histograms and box plots are both suitable. The output is given in the Appendix (Graphs 1 and 2).

From the graphs, we can see that neither Shoe Size nor Height appears normally distributed. The box plot suggests that the distribution of Shoe Size is slightly positively skewed (supported by the skewness coefficient of 0.3666 in Table 1), whereas the distribution of Height appears negatively skewed, which shows in both the chart and the table (skewness -0.2334).

Next we draw scatter plots with Shoe Size as the dependent variable. The plots are given in Graph 3 in the Appendix.

From these plots, we can see that shoe size varies with both height and sex. Taller people in general need a larger shoe size, and males (Sex = 0) also need a larger shoe size. Thus we can expect a relationship between these variables and shoe size. To be sure, we need to run a regression analysis.
The result is given in Table 3. The regression analysis confirms this suspicion. The p-values of the individual t-tests for Height and Sex are both very small (reported as 0.0000), suggesting that both variables have a significant effect on shoe size. Holding the other variable fixed, each additional unit of height adds about 0.37 to the shoe size, and females wear about 2.43 sizes smaller than males. Thus choosing one size that fits all is not possible. Put simply, people in general need different shoe sizes depending on their sex and height.

But because of its financial hardship the company can no longer afford to produce every size, so we need to settle on a single value. A natural approach is to use a point estimate and a confidence interval for the shoe size. The output is given in Table 4; I am using a 95% confidence interval.

The confidence interval is (8.256, 10.030). Even this does not give a usable result, because Table 5 shows that only 5 of the 35 shoe sizes fall in this interval.

So, in the end, Table 5 shows that the most frequent shoe size is 7 (5 of 35 observations). Since none of the tests gave a very good result, we have to go with size 7 as the best available outcome.

Conclusions:
From the initial results we can see that height and gender both have a significant effect on shoe size. Thus there is no single shoe size that is acceptable for all heights and both sexes, and a perfect result without losing revenue is not possible.
But since a recommendation was needed, I proceeded with the remaining analysis, which showed that size 7 has the highest frequency, so size 7 is the recommended shoe size to produce.

Appendix:

Table 1: Descriptive statistics for Shoe Size and Height

                      Shoe Size    Height
Mean                  9.1429       68.9429
Standard Error        0.4366       0.6810
Median                9            70
Mode                  7            70
Standard Deviation    2.5827       4.0289
Sample Variance       6.6702       16.2319
Kurtosis              -1.0831      -0.3355
Skewness              0.3666       -0.2334
Range                 9            17
Minimum               5            60
Maximum               14           77
Sum                   320          2413
Count                 35           35

Table 2: Descriptive statistics for Gender.

Gender    Count of Gender    % of Gender
Female    18                 51.43%
Male      17                 48.57%
Total     35                 100.00%

Table 3: Regression analysis considering Shoe Size as dependent variable.

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.9465
R Square             0.8959
Adjusted R Square    0.8894
Standard Error       0.8590
Observations         35

ANOVA
              df    SS          MS          F           Significance F
Regression    2     203.1724    101.5862    137.6663    0.0000
Residual      32    23.6133     0.7379
Total         34    226.7857

             Coefficients    Standard Error    t Stat     P-value    Lower 95%    Upper 95%
Intercept    -15.3534        3.2377            -4.7421    0.0000     -21.9484     -8.7584
Height       0.3735          0.0453            8.2475     0.0000     0.2812       0.4657
Sex          -2.4329         0.3598            -6.7623    0.0000     -3.1657      -1.7000

Table 4: 95% Confidence interval for Shoe Size

One-Sample T: Shoe Size

Variable     N     Mean     StDev    SE Mean    95% CI
Shoe Size    35    9.143    2.583    0.437      (8.256, 10.030)

Table 5: Counts for Shoe Size

Row Labels     Count of Shoe Size
5              1
6              2
6.5            4
7              5
7.5            4
8              1
9              1
9.5            2
10             2
10.5           2
11             3
11.5           1
12             3
13             1
13.5           1
14             2
Grand Total    35

Graph 1: Histogram and Box plot for Shoe Size

Graph 2: Histogram and Box plot for Height

Graph 3: Scatter plot of Height vs Shoe Size and Sex vs Shoe Size
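The figures in Tables 3, 4 and 5 can be cross-checked with a short Python sketch. This is only an illustration and is not how the tables above were produced. The counts are transcribed from Table 5 and the coefficients from Table 3. The names `counts`, `T_CRIT` and `predict_size` are made up for this example, and the critical value 2.0322 is an assumption taken from a standard t table for 34 degrees of freedom.

```python
import math
import statistics

# Shoe-size counts transcribed from Table 5 (size -> number of observations).
counts = {5: 1, 6: 2, 6.5: 4, 7: 5, 7.5: 4, 8: 1, 9: 1, 9.5: 2, 10: 2,
          10.5: 2, 11: 3, 11.5: 1, 12: 3, 13: 1, 13.5: 1, 14: 2}

# Expand the frequency table back into the individual observations.
sizes = [size for size, n in counts.items() for _ in range(n)]

n = len(sizes)
mean = statistics.mean(sizes)
sd = statistics.stdev(sizes)      # sample standard deviation (divides by n - 1)
se = sd / math.sqrt(n)            # standard error of the mean

# Assumption: two-sided 95% critical value of Student's t with n - 1 = 34
# degrees of freedom, read from a t table rather than computed here.
T_CRIT = 2.0322
ci_low = mean - T_CRIT * se
ci_high = mean + T_CRIT * se

# Number of observed sizes inside the interval, and the most frequent size.
inside = sum(ci_low <= s <= ci_high for s in sizes)
mode = statistics.mode(sizes)


def predict_size(height, sex):
    """Fitted shoe size from the Table 3 coefficients (sex: 1 = female, 0 = male)."""
    return -15.3534 + 0.3735 * height - 2.4329 * sex


if __name__ == "__main__":
    print(f"n = {n}, mean = {mean:.4f}, sd = {sd:.4f}, se = {se:.4f}")
    print(f"95% CI for the mean: ({ci_low:.3f}, {ci_high:.3f})")
    print(f"observations inside the CI: {inside} of {n}")
    print(f"most frequent size: {mode}")
    print(f"fitted size at height 70: male {predict_size(70, 0):.2f}, "
          f"female {predict_size(70, 1):.2f}")
```

Under these assumptions the sketch should agree with Table 4 to rounding and with the count of 5 observations inside the interval. The last line illustrates the regression result: at the median height of 70, the fitted sizes for a male and a female differ by the Sex coefficient, about 2.43 sizes, which is why no single size suits both groups.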