This document provides an overview of business modeling and analysis. It covers topics such as descriptive statistics, confidence intervals, hypothesis testing, correlation and regression. The report also includes a conclusion and limitations section. The content is relevant to the subject of business modeling and analysis.
MODELLING
Business Modeling and Analysis
Name of Author
Name of Class
Name of Professor
Name of School
State and City of School
Date
Table of Contents
Executive Summary
Introduction
Descriptive Statistics
Confidence Interval
Hypothesis Testing
Correlation and Regression
Conclusion and Limitations
References
Executive Summary
This report examines how the selected variables behave individually and how variables from different categories relate to one another, as required by the brief. The hypothesis tests show whether the groups being compared differ on the variables tested. The descriptive statistics, frequency distributions and graphs are based on a sample, which is used to represent the whole population from which it was drawn. Together, the results address the business questions that motivated the individual tasks.
Introduction
Countries around the world face several challenges in their efforts to meet the UN Sustainable Development Goals. This created the need for a study that helps explain what these goals are and how the world can be made a better place by 2030. To do this, the importance of the Social Progress Index was studied by comparing countries on their performance in each subcategory of the index (Motesharrei et al. 2016).

The dataset used to compare countries is a Social Progress Index dataset. It has 50 variables (columns) under 12 categories and covers 182 countries. The countries come from five regions: Africa, America, Asia, Europe and Oceania. Of the 12 categories, only 6 are used for the analysis, and each of these six categories contains 4 variables. One variable is picked per category, which brings the number of variables to be analysed, excluding the country variable, to six. The variables picked are: Depth of food deficit, Availability of affordable housing, Adult literacy rate, Life expectancy at 60, Private property rights and Religious tolerance.

The analysis is conducted on these six variables, and the results, graphs and tables are explained so that readers without a statistical background can follow them easily. The report therefore contains sections on descriptive statistics, confidence intervals used to estimate means, hypothesis testing, correlation and regression analysis, and finally a conclusion and limitations section.
Descriptive Statistics
Before running any descriptive analysis, it is important to note that the analysis must be carried out on a sample of at most 100 observations (rows). The rows are chosen using stratified random sampling in Excel, so that rows are picked at random and in the same proportions as they appear in the original dataset (Björk, 2017). The sampling is run in sheet 1 of the Excel workbook (Book1). A column of random numbers, denoted Rand, is added as the last column and is used to rearrange the whole dataset, after which rows are picked proportionally. Some of the rows have empty cells and must be deleted so that they do not affect the results. The deletion brings the number of observations below 100, which is acceptable.

There are two types of variables in the sample dataset: ordinal and interval. The interval variables are Depth of food deficit and Private property rights, because the difference between any two of their values is meaningful (Malik & Hussain, 2018). For the ordinal variables, on the other hand, numbers represent the degree to which something occurs; these variables are Availability of affordable housing, Adult literacy rate, Life expectancy at 60 and Religious tolerance (Kent, 2015). For the ordinal variables, the graphs and frequency tables should be presented in percentages. The figures are as below;
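As an illustration only, the sampling and percentage-frequency steps described above could be reproduced outside Excel roughly as follows. This is a minimal Python sketch under stated assumptions, not the workbook procedure used in the report: the function names, the `continent` and `score` keys, the interval edges and the toy rows are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(rows, key, n_total, seed=0):
    """Draw roughly n_total rows, giving each stratum a share
    proportional to its size in the full dataset."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    strata = defaultdict(list)
    for row in rows:
        strata[row[key]].append(row)
    sample = []
    for members in strata.values():
        # Proportional allocation, with at least one row per stratum.
        k = max(1, round(n_total * len(members) / len(rows)))
        sample.extend(rng.sample(members, min(k, len(members))))
    return sample


def percent_frequency(values, edges):
    """Percentage of values in each half-open interval
    [edges[i], edges[i + 1]); values outside all intervals are ignored."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(edges) - 1):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1
                break
    total = sum(counts)
    return [100 * c / total for c in counts] if total else [0.0] * len(counts)


# Hypothetical toy data: 200 rows in three strata (not the report's dataset).
rows = (
    [{"continent": "A", "score": i % 5} for i in range(60)]
    + [{"continent": "B", "score": i % 5} for i in range(40)]
    + [{"continent": "C", "score": i % 5} for i in range(100)]
)
sample = stratified_sample(rows, "continent", n_total=50)
shares = percent_frequency([r["score"] for r in sample], edges=[0, 2, 4, 6])
```

Because each stratum's allocation is rounded separately, the sample size can differ slightly from `n_total` on real data; rows with missing values would still need to be dropped before or after the draw, as the report describes.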
The above table shows how the values are grouped into intervals, from which a frequency table is created. This is done using pivot tables (Visser, Seelye, Cassady & Collard, 2018). The graph developed for the above percentage frequency table is a histogram; the columns are percentages illustrating the degree of availability of affordable housing by interval. Different graphs are drawn for the different frequency tables, one per variable. For the interval variables, graphs and summary statistics are appropriate because of the different nature of these variables (Lenth, 2016). The focus here is on one variable; the remaining variable is included in the appendix section.

The variable chosen for this section is Private property rights, and its summary statistics are as below. The modal value is 40, meaning that most individuals have relatively high rights to own property. The value of 40 shows the number of large capital assets that an individual is entitled to own. The actual number varies by individual, and the mean is slightly above 35. This shows that, on average, an individual can own up to 35 capital properties. The mean and the median are 35.5 and 32.5 respectively, and the standard deviation is 17.7. The gap between the mean and the median, together with the standard deviation, indicates how widely the data points are spread around the centre of the data. Here the gap is fairly large, showing substantial deviation of the data points from the central value, which is also supported by the large standard deviation (Chambers, 2017). The confidence value used to estimate the mean is 4.440171078; the confidence interval is discussed in a later section. The positive value for skewness shows that the data points are skewed to the right (Celikoglu and Tirnakli, 2015).

The pie chart above shows how many individuals own each number of properties. As noted earlier, the modal number of units that can be owned is 40, and the pie chart supports this. The largest portions of the pie chart belong to 50, 40, 30 and 20 units, while higher numbers of properties take smaller portions. This might be because it is more expensive to own more capital properties. The society from which this dataset is extracted appears to be one in which most people work very hard.
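The summary statistics discussed in this section can be sketched in a few lines of Python. This is an illustrative sketch, not the Excel output used in the report: the function names and the toy values are hypothetical, and the critical value for the confidence half-width is assumed to be supplied by the caller (for example from a t table), because the standard library has no t distribution. The only figures taken from the report are the mean of 3.153409 and the confidence value of 0.137206 quoted in the confidence interval section.

```python
import math
import statistics


def summarize(values):
    """Mean, median, mode, sample standard deviation and sample skewness."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (divides by n - 1)
    # Adjusted Fisher-Pearson sample skewness; requires n >= 3.
    skew = (n / ((n - 1) * (n - 2))) * sum(((v - mean) / sd) ** 3 for v in values)
    return {
        "mean": mean,
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "sd": sd,
        "skewness": skew,
    }


def half_width(sd, n, critical_value):
    """Half-width of a confidence interval for the mean:
    critical value times the standard error."""
    return critical_value * sd / math.sqrt(n)


def ci_bounds(mean, width):
    """Lower and upper confidence limits: mean minus and plus the half-width."""
    return mean - width, mean + width


# Hypothetical toy values (not the report's data): a long right tail.
stats = summarize([10, 20, 20, 30, 70])

# Figures quoted in the report's confidence interval section.
lower, upper = ci_bounds(3.153409, 0.137206)
```

In the toy data the mean (30) sits above the median (20), so the skewness comes out positive, which is the same pattern the report describes for Private property rights.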
Confidence interval
A 95% confidence level is used to estimate the means in both cases. The variables whose means are estimated are drawn from categories 3 and 11: Availability of affordable housing from category 3 and Religious tolerance from category 11. From the snipped image above, the mean of the entire column is 3.153409 and the confidence value is 0.137206. Subtracting the confidence value from the mean gives a lower confidence limit of 3.016203268, and adding it gives an upper confidence limit of 3.290615. These two values give the range within which the mean can be estimated with 95% confidence (Morey et al. 2016). The next variable to be examined is Availability of affordable housing, and its mean and confidence value are as below; Subtracting the confidence value from the mean gives the lower confidence limit and adding it gives the upper confidence limit, which are 0.449533271 and 0.497564 respectively (Burgess & Thompson, 2015).

Hypothesis Testing
Concerns were raised regarding social functions and development across countries in different continents, so hypothesis tests are conducted to compare social functions and development between countries in different continents. The first hypothesis to be tested concerns the difference in the level of access to basic knowledge between American and African countries. To support the test, the data variable was split into two columns, one for African countries and one for American countries. Recall that the variable chosen is the adult literacy rate. After the split, the hypothesis was tested using a two-sample t-test assuming unequal variances (Yang, 2018). The result from the analysis is;
In deciding whether to reject the null hypothesis, two values are used. The first is the t Stat value, which is compared with the t Critical two-tail value since this is a two-tailed test. The t Stat value is 7.626 and the t Critical two-tail value is 1.996. The t Stat is greater than the t Critical two-tail value, and from this we reject the null hypothesis (Park, 2015). This is also supported by the two-tailed P value of 1.13428E-10, which is lower than the alpha of 0.05 (Wong, 2016). Therefore the level of adult literacy differs significantly between American and African countries. The method used is justified because the variances of the two groups are clearly unequal.

The next hypothesis test is on the difference in Personal Rights between Asian and European countries. In this case a two-tailed t-test assuming unequal variances was used, and the results were; The two-tailed P value is less than the alpha of 0.05, so the null hypothesis is rejected, and therefore there is a significant difference in Personal Rights between Asian and European countries. The variances of the two groups differ, which justifies the method used.
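For readers who want to see what the two-sample t-test assuming unequal variances computes, the following is a minimal Python sketch of that test (commonly called Welch's t-test). It is not the Excel tool used in the report: the function names and the toy samples are hypothetical, and the critical value is passed in by the caller because the standard library has no t distribution. The two decision checks at the end use only the t Stat and t Critical values quoted in the report.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """t statistic and Welch-Satterthwaite degrees of freedom for two
    independent samples whose variances are not assumed equal."""
    na, nb = len(sample_a), len(sample_b)
    va = statistics.variance(sample_a)  # sample variances (n - 1)
    vb = statistics.variance(sample_b)
    se_squared = va / na + vb / nb
    t_stat = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se_squared)
    df = se_squared ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t_stat, df


def reject_null(t_stat, t_critical_two_tail):
    """Two-tailed decision rule: reject when |t| exceeds the critical value."""
    return abs(t_stat) > t_critical_two_tail


# Hypothetical toy samples (not the report's data).
t_stat, df = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])

# Decisions for the t Stat / t Critical pairs quoted in the report.
first_test = reject_null(7.626, 1.996)     # literacy, America vs Africa
last_test = reject_null(1.2917, 1.99713)   # health and wellness, Europe vs America
```

The sign of the t statistic depends only on which group is listed first, so the two-tailed rule compares its absolute value with the critical value.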
The last hypothesis to be tested is on the difference in health and wellness between European and American countries. The data were split as before so that the regions being compared are grouped together. The method used in this case is a two-tailed t-test assuming unequal variances. The results are; From the t-test results, the variances are unequal, and therefore the method used to test the hypothesis is justified. The t Stat value is 1.2917 and the t Critical two-tail value is 1.99713. The t Stat is smaller than the t Critical two-tail value, and therefore we fail to reject the null hypothesis. The value of 1.6686 reported with these results exceeds 1 and so cannot be a p-value; it is consistent with the one-tailed critical value of t. The decision therefore rests on the comparison of t Stat with t Critical at an alpha of 0.05.

Correlation and Regression
In the first case, we find the regression relationship between the variable selected from category 1 and the variable selected from category 7 in the sample dataset. Depth of food deficit serves as the independent variable and Life expectancy at 60 serves as the dependent variable. The regression results are;
The scattergram that results from the relationship is;
The R-squared value, which is the coefficient of determination, is the same in the scattergram and in the regression results, at 0.1849. This means that the regression line explains only about 18% of the variability in Life expectancy at 60, which indicates a weak relationship. The correlation coefficient is obtained by taking the square root of the coefficient of determination and giving it the sign of the slope; since the slope is negative, the correlation coefficient is -0.43. The estimated linear regression model is shown in the scatterplot. From the scatterplot the variables are negatively related, and the Significance F value of 0.000390448 is lower than 0.05, so the relationship between the dependent and independent variables is statistically significant (Woodward, 2016).
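The link between the regression line, R-squared and the correlation coefficient can be illustrated with a short Python sketch of ordinary least squares for one predictor. This is an illustration under stated assumptions rather than the Excel regression output: the function name and the toy data are hypothetical, and the Significance F value is not computed because it requires the F distribution, which the standard library does not provide. The last line applies the square-root step to the R-squared of 0.1849 quoted in the report.

```python
import math


def simple_linear_regression(x, y):
    """Least-squares slope and intercept, plus R-squared and the
    correlation coefficient r, for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    # r is the square root of R-squared carrying the sign of the slope.
    r = math.copysign(math.sqrt(r_squared), slope)
    return slope, intercept, r_squared, r


# Hypothetical toy data lying exactly on a falling line (not the report's data).
slope, intercept, r_squared, r = simple_linear_regression([1, 2, 3, 4], [8, 6, 4, 2])

# Square-root step applied to the R-squared quoted in the report,
# with a negative sign because the fitted relationship is negative.
r_report = math.copysign(math.sqrt(0.1849), -1)
```

Taking the square root alone always gives a non-negative number, so the sign has to come from the slope; this is why a negative relationship with an R-squared of 0.1849 corresponds to a correlation of about -0.43.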
In the second case, the relationship is between Adult literacy rate (category 5) as the independent variable and Private property rights (category) as the dependent variable. The regression results are; The scattergram figure is as follows;
The R-squared value is 0.1189, meaning that the regression explains only about 12% of the variability in Private property rights, which is a weak result. The relationship is positive, though. The model of the relationship is shown in the scattergram (Raidou, Gröller and Eisemann, 2019).

Conclusion and limitations
From the findings, it is clear that some pairs of variables are related while others are not. The major limitation of the analysis is that the dataset had rows with empty cells, which had to be discarded; this dropped entire rows containing otherwise valid data points that would have improved the quality of the analysis results.
References
Björk, J., Malmqvist, E., Rylander, L., & Rignell-Hydbom, A. (2017). An efficient sampling strategy for selection of biobank samples using risk scores. Scandinavian Journal of Public Health, 45(17_suppl), 41-44.
Burgess, S., & Thompson, S. G. (2015). Multivariable Mendelian randomization: the use of pleiotropic genetic variants to estimate causal effects. American Journal of Epidemiology, 181(4), 251-260.
Celikoglu, A., & Tirnakli, U. (2015). Comment on “Universal relation between skewness and kurtosis in complex dynamics”. Physical Review E, 92(6), 066801.
Chambers, J. M. (2017). Graphical Methods for Data Analysis. Chapman and Hall/CRC.
Kent, R. A. (2015). Analysing quantitative data: Variable-based and case-based approaches to non-experimental datasets. Sage.
Lenth, R. V. (2016). Least-squares means: the R package lsmeans. Journal of Statistical Software, 69(1), 1-33.
Malik, M. S. I., & Hussain, A. (2018). An analysis of review content and reviewer variables that contribute to review helpfulness. Information Processing & Management, 54(1), 88-104.
Morey, R. D., Hoekstra, R., Rouder, J. N., Lee, M. D., & Wagenmakers, E. J. (2016). The fallacy of placing confidence in confidence intervals. Psychonomic Bulletin & Review, 23(1), 103-123.
Motesharrei, S., Rivas, J., Kalnay, E., Asrar, G. R., Busalacchi, A. J., Cahalan, R. F., ... & Hubacek, K. (2016). Modelling sustainability: population, inequality, consumption, and bidirectional coupling of the Earth and Human Systems. National Science Review, 3(4), 470-494.
Park, H. M. (2015). Hypothesis testing and statistical power of a test.
Raidou, R. G., Gröller, M. E., & Eisemann, M. (2019). Relaxing Dense Scatter Plots with Pixel-Based Mappings. IEEE Transactions on Visualization and Computer Graphics, 25(6), 2205-2216.
Visser, A., Seelye, M., Cassady, S., & Collard, D. (2018). Making Usage Stats Usable: Evaluating Usage Stat Tools and Maximizing Excel/Pivot Tables.
Wong, K. K. K. (2016). Mediation analysis, categorical moderation analysis, and higher-order constructs modelling in Partial Least Squares Structural Equation Modeling (PLS-SEM): A B2B Example using SmartPLS. Marketing Bulletin, 26.
Woodward, P. (2016). The Bayesian analysis made simple: an Excel GUI for WinBUGS. Chapman and Hall/CRC.
Yang, H., Burdick, R. K., Cheng, A., & Montes, R. O. (2018). Statistical considerations for the demonstration of analytical similarity. In Biosimilars (pp. 431-468). Springer, Cham.