This article explains the relationship between the revenue collected from individual Big Grocery stores and each store's square footage, by location. It also discusses the model's intercept, which is the estimated revenue of a store with zero square footage.
Running head: DATA-DRIVEN DECISION MAKING

Data-Driven Decision Making
Student’s Name
Institutional Affiliation
Data-Driven Decision Making

According to Mulholland & Jones (2013), the simple regression model is written as

Y = b0 + b1X1 + u

where Y is the predicted (dependent) variable, b0 and b1 are constants, u is the random error, and X1 is the predictor variable.

In this analysis, we are interested in the relationship between the revenue collected from individual Big Grocery stores by location and Big Grocery store square footage by location. It is assumed that the revenue collected from a grocery store depends on the square footage of the store. Revenue is therefore the dependent variable, while size (square footage) is the independent variable. We will try to determine whether the revenue collected is influenced by the size of the grocery store. The sample data for the analysis are shown below.

Table 1: Sample Data

Location       Square Footage (x)    Revenue (y)
Location 1      48,720.39            $23,665,319.22
Location 2      40,778.72            $20,066,838.98
Location 3      21,654.19            $23,508,691.46
Location 4      33,344.11            $11,748,300.32
Location 5     116,006.40            $33,450,105.86
Location 6      44,655.98            $18,248,754.69
Location 7       8,549.08            $10,943,196.86
Location 8     157,424.48            $32,934,788.04
Location 9      63,075.32            $16,821,187.57
Location 10     53,256.79            $19,285,241.45

From the dataset above, a scatter plot was drawn in Excel to show how the data points are distributed. The plot gives a visual display of the direction and likely strength of the relationship between the two variables. According to Figure 1, the data points trend upward from left to right, which indicates a positive slope (a positive relationship).
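The direction and strength that the scatter plot suggests visually can also be summarized with a single number, the Pearson correlation coefficient. The following is a minimal Python sketch, assuming the Table 1 values are typed in as plain lists; the helper name `pearson_r` is illustrative and is not part of the original analysis, which was carried out in Excel.

```python
import math

# Table 1 values (square footage and revenue in dollars), Locations 1-10.
square_footage = [48720.39, 40778.72, 21654.19, 33344.11, 116006.40,
                  44655.98, 8549.08, 157424.48, 63075.32, 53256.79]
revenue = [23665319.22, 20066838.98, 23508691.46, 11748300.32, 33450105.86,
           18248754.69, 10943196.86, 32934788.04, 16821187.57, 19285241.45]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations about the means.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


r = pearson_r(square_footage, revenue)
print(f"Pearson r = {r:.3f}")
```

A positive value of r corresponds to the upward trend in the scatter plot; for the Table 1 data it comes out at about 0.83.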
[Figure 1: Revenue versus Store Size (ft²), scatter plot. Chart title: Revenue Collected Against Square-Footage; x-axis: Size (Square Footage); y-axis: Revenue ($).]

Figure 2 below shows the same scatter plot fitted with the line of best fit, commonly referred to as the trend line. The trend line affirms the existence of a positive relationship between the two variables. The regression equation for this analysis can be written as

Revenue = b0 + b1(Square Footage) + u
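The constants b0 and b1 in the equation above are usually estimated by ordinary least squares, which is also what a spreadsheet trend line computes. The sketch below shows one way to do this in Python from the Table 1 values, using the closed-form formulas for a single predictor. The function name `fit_simple_ols` is hypothetical and the code is an illustration under those assumptions, not the procedure used in the original analysis.

```python
# Table 1 values (square footage and revenue in dollars), Locations 1-10.
square_footage = [48720.39, 40778.72, 21654.19, 33344.11, 116006.40,
                  44655.98, 8549.08, 157424.48, 63075.32, 53256.79]
revenue = [23665319.22, 20066838.98, 23508691.46, 11748300.32, 33450105.86,
           18248754.69, 10943196.86, 32934788.04, 16821187.57, 19285241.45]


def fit_simple_ols(xs, ys):
    """Return (b0, b1, r_squared) for the least-squares line y = b0 + b1*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    b1 = s_xy / s_xx              # slope
    b0 = mean_y - b1 * mean_x     # intercept
    # R-squared = 1 - (residual sum of squares / total sum of squares).
    ss_res = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    r_squared = 1 - ss_res / s_yy
    return b0, b1, r_squared


b0, b1, r_squared = fit_simple_ols(square_footage, revenue)
print(f"intercept b0 = {b0:,.2f}")
print(f"slope b1     = {b1:.2f}")
print(f"R-squared    = {r_squared:.4f}")
```

The intercept is computed from the means so that the fitted line passes through the point (mean square footage, mean revenue), which is a standard property of a least-squares line with a constant term.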
[Figure 2: Revenue versus Store Size (ft²), line of best fit. Chart title: Revenue Collected Against Square-Footage; x-axis: Size (Square Footage); y-axis: Revenue ($). Trend line: f(x) = 140.30906805577x + 12824569.3242446; R² = 0.681726997806569.]

The estimate from the simple linear regression model can be written as

Estimated Revenue = $12,824,569.32 + 140.31 (Square Footage)

The regression equation above implies that the estimated revenue for a grocery store with zero square footage is $12,824,569.32. This estimate should be treated with skepticism, however, because some of the data points, with roughly 8,500 to 33,350 square feet, have revenues lower than $12M. The study is also geared towards finding the relationship between the two variables rather than the constant term, and the constant term is not easy to estimate accurately. The slope of the model was found to be 140.31. This implies that an increase of one square foot in the size of the store is associated with an increase in expected revenue of $140.31. This is in line with our expectation. The
positive relationship shows that revenue increases as the size of the store (square footage) increases, and vice versa. The R-squared value of the model was found to be 0.68. This implies that 68% of the variation in grocery store revenue can be attributed to changes in the stores' square footage (Berenson et al., 2012). The remaining 32% of the variation is attributed to errors and to other factors outside the model that affect revenue, such as distance, security and other social amenities available in the area.

In summary, the findings can be organized under the following assumptions of the simple regression model:

• Linear in Parameters

In regression analysis, the regression line rarely passes through every plotted data point unless there is a perfect correlation (Cox, 2018). Because the y values are predicted while the data used are actual observations, a difference arises between the predicted and the observed values of y. These differences are known as residuals (observed y - predicted y). Points lying above the line of best fit have positive residuals, while those lying below it have negative residuals. In Figure 2, both positive and negative residuals are observed. According to the scatter plot in Figure 1, there is a strong correlation between revenue and store square footage by location. The distances from most points to the line of best fit are relatively small. The assumption of linearity is met, since the data points are clustered along the line of best fit (Larson & Farber, 2019).

• Random Sampling

This is a method of probability sampling in which each item in the population has an equal chance of being selected for the sample. From an overview of the data set provided, randomness was
observed when selecting the sample data to be used. A clear positive relationship can be observed even before analysis, since the highest revenues were realized at the locations with the largest square footage. The data set comprises 10 locations that appear to have been selected randomly within a city, since the values fluctuate from one location to the next. If this were not the case, a systematic pattern would be observed in the data set from Location 1 to Location 10.

• Sample Variation in the Explanatory Variable

Given the relatively small sample size of 10, variation in the data can be observed. As the sample size increases, the sample approaches the population and sampling variation decreases. There are few observations between 60,000 and 120,000 square feet (only Locations 9 and 5, at about 63,000 and 116,000 square feet), and additional data in that range could have influenced the direction of the relationship. Nonetheless, a line of best fit can be fitted with as few as three points, so the data are still sufficient to estimate the impact of store square footage on revenue.

• Zero Mean of the Error Term Conditional on the Independent Variable

The error term (u) reflects the many variables other than square footage that affect grocery store revenue. The R-squared value of the model was found to be 0.68, which shows that only 68% of the variation in grocery store revenue can be attributed to changes in the stores' square footage. The remaining 32% of the variation is attributed to errors and to other factors outside the model that affect revenue, such as distance and security, among others (Sullivan & Verhoosel, 2013). This being the case, the revenue estimate should not be taken too strictly. However, an R-squared of 68% is statistically viable (Gupta & Kapoor, 2019).
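The residuals discussed above (observed y - predicted y) can be computed directly from the Table 1 values. The sketch below refits the line by least squares and lists the residual for each location; the variable names are illustrative only. One caveat: the residuals of a least-squares line with an intercept always sum to zero by construction, so a zero sum is an arithmetic check on the fit and does not by itself confirm the zero-conditional-mean assumption about the unobserved error term u.

```python
# Table 1 values (square footage and revenue in dollars), Locations 1-10.
square_footage = [48720.39, 40778.72, 21654.19, 33344.11, 116006.40,
                  44655.98, 8549.08, 157424.48, 63075.32, 53256.79]
revenue = [23665319.22, 20066838.98, 23508691.46, 11748300.32, 33450105.86,
           18248754.69, 10943196.86, 32934788.04, 16821187.57, 19285241.45]

n = len(square_footage)
mean_x = sum(square_footage) / n
mean_y = sum(revenue) / n

# Least-squares slope and intercept for revenue = b0 + b1 * square_footage.
b1 = (sum((x - mean_x) * (y - mean_y) for x, y in zip(square_footage, revenue))
      / sum((x - mean_x) ** 2 for x in square_footage))
b0 = mean_y - b1 * mean_x

# Residual = observed revenue minus revenue predicted by the fitted line.
residuals = [y - (b0 + b1 * x) for x, y in zip(square_footage, revenue)]

# 1-based location numbers for points above and below the fitted line.
above_line = [i + 1 for i, e in enumerate(residuals) if e > 0]
below_line = [i + 1 for i, e in enumerate(residuals) if e < 0]

for i, e in enumerate(residuals, start=1):
    print(f"Location {i:2d}: residual = {e:>15,.2f}")
print("Above the line:", above_line)
print("Below the line:", below_line)
print(f"Sum of residuals: {sum(residuals):.6f}")
```

With the Table 1 data, four locations (1, 2, 3 and 5) lie above the fitted line and six lie below it, which matches the mix of positive and negative residuals visible in Figure 2.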
In conclusion, the model is a sufficient estimator of revenue. A high correlation between revenue and square footage was observed, and the two variables were positively related.

References

Berenson, M., Levine, D., Szabat, K. A., & Krehbiel, T. C. (2012). Basic business statistics: Concepts and applications. Pearson Higher Education AU.
Cox, D. R. (2018). Applied statistics: Principles and examples. Routledge.
Groebner, D. F., Shannon, P. W., Fry, P. C., & Smith, K. D. (2013). Business statistics. Pearson Education UK.
Gupta, S. C., & Kapoor, V. K. (2019). Fundamentals of applied statistics. Sulthan Chand & Sons.
Larson, R., & Farber, B. (2019). Elementary statistics. Pearson.
McClave, J. T., Benson, P. G., & Sincich, T. (2014). Statistics for business and economics. Boston: Pearson.
Mulholland, H., & Jones, C. R. (2013). Fundamentals of statistics. Springer.
Napier, C., & Maisel, J. W. (2015). Principles and procedures of statistics: A biometric approach. New York: McGraw Hill Book Company.
Sullivan, M., & Verhoosel, J. C. M. (2013). Statistics: Informed decisions using data. New York: Pearson.