Business Data Analysis
Added on 2023/01/18
AI Summary
This document provides an overview of business data analysis, covering topics such as survey methods, sampling methods, variables, histograms, scatter plots, numerical summaries, correlation, confidence intervals, hypothesis testing, and linear regression.
Business Data Analysis
Computer Assignment
Student’s Name
Institution Affiliation
Part 1
Research to investigate the relationship between the number of rooms in a house and the valuation
of the house (its price).
1. Type of survey method the researcher can use
Questionnaire. With this method it is easier to collect data, because a questionnaire is simple to
design and people can respond to the questions in their free time.
2. The sampling method that can be used to select the sample
Stratified random sampling. This helps ensure that genders, social classes, and other groups of
people are represented proportionately in the sample (Thompson, 2012).
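The idea of stratified random sampling can be illustrated with a minimal sketch. It assumes a hypothetical sampling frame in which every household record carries a stratum label; the function name `stratified_sample`, the field `social_class`, and the data are illustrative assumptions, not part of the assignment.

```python
import random
from collections import defaultdict

def stratified_sample(frame, stratum_key, fraction, seed=0):
    """Draw the same fraction of units at random from every stratum."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    strata = defaultdict(list)
    for unit in frame:
        strata[unit[stratum_key]].append(unit)
    sample = []
    for units in strata.values():
        k = max(1, round(len(units) * fraction))  # at least one unit per stratum
        sample.extend(rng.sample(units, k))
    return sample

# Hypothetical sampling frame: 60 low-, 30 middle-, 10 high-income households.
frame = (
    [{"id": i, "social_class": "low"} for i in range(60)]
    + [{"id": 60 + i, "social_class": "middle"} for i in range(30)]
    + [{"id": 90 + i, "social_class": "high"} for i in range(10)]
)

sample = stratified_sample(frame, "social_class", fraction=0.2)

# Count how many sampled units came from each stratum.
counts = {}
for unit in sample:
    counts[unit["social_class"]] = counts.get(unit["social_class"], 0) + 1
print(counts)
```

Because the same fraction is drawn from each stratum, every group appears in the sample in proportion to its share of the frame.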
3. Variables the researcher will consider
Number of rooms in a house
Price of the house
The house price will be the response variable (Y), while the number of rooms will be the
explanatory variable (X).
4. Issues that can be faced
Some targeted people may fail to answer the questionnaire. Others will respond but give
wrong or inaccurate answers, which can lead to wrong results (Singh & Mangat, 2013).
Part 2
5. Histogram for each variable
The following are histograms for Preparation Time and Mark, respectively.
[Figure: Histogram for Preparation Time. Horizontal axis: Preparation Time, in classes 25-34,
35-44, 45-54, 55-64, 65-74, 75-84, 85-94. Vertical axis: Frequency, 0 to 30.]
The histogram above is not bell-shaped, which suggests that Preparation Time is not normally
distributed.
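The class frequencies behind a histogram like this one can be tabulated with a short sketch. It assumes whole-number values and the same class limits as the chart (25-34 up to 85-94); the function name `class_frequencies` and the sample values are hypothetical and are not the assignment's data.

```python
def class_frequencies(values, start=25, width=10, n_classes=7):
    """Count how many integer values fall in each class start..start+width-1."""
    labels = [
        f"{start + i * width}-{start + (i + 1) * width - 1}"
        for i in range(n_classes)
    ]
    freq = dict.fromkeys(labels, 0)
    for v in values:
        idx = (v - start) // width  # index of the class containing v
        if 0 <= idx < n_classes:    # values outside all classes are ignored
            freq[labels[idx]] += 1
    return freq

# Hypothetical preparation times in hours (illustrative only).
times = [25, 31, 38, 44, 47, 52, 58, 61, 64, 66, 70, 73, 77, 85, 90]

# Print a simple text histogram, one row per class.
for label, count in class_frequencies(times).items():
    print(f"{label:>6} | {'#' * count}")
```

A roughly symmetric, single-peaked pattern of bars would be consistent with a bell shape; a skewed or flat pattern would not.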
[Figure: Histogram for Mark. Horizontal axis: Marks, in classes 25-34, 35-44, 45-54, 55-64, 65-74,
75-84, 85-94, 95-104. Vertical axis: Frequency, 0 to 35.]
The histogram above is not bell-shaped, which suggests that Mark is not normally distributed.
6. Plot to explain the relationship between the Preparation Time and Mark
Below is a scatter plot showing the relationship between Preparation Time and Mark. Generally, the
time taken to prepare for an examination influences the number of marks attained, so Mark is
treated as the response variable (Y) and Preparation Time as the explanatory variable (X).
[Figure: Scatter plot "Marks vs Preparation Time". Horizontal axis: Preparation Time, 20 to 100.
Vertical axis: Marks, 0 to 120. Fitted trend line: f(x) = 0.583053973782169 x + 28.984277492772]
7. Numerical summary report: mean, median, range, variance, standard deviation, smallest and
largest values, and the three quartiles, for Preparation Time and Mark
The following table summarizes the results of the computations for the two variables.
                     PREPARATION TIME   MARK
Mean                 63.04              65.74
Median               64                 68
Range                65                 75
Sample Variance      266.36             303.12
Standard Deviation   16.32              17.41
Minimum              25                 25
Maximum              90                 100
1st Quartile         51                 54
2nd Quartile         64                 68
3rd Quartile         25                 25
Count                100                100
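The measures in the table can be reproduced from raw data with Python's standard `statistics` module, as in the sketch below. The data values are hypothetical, and the choice of `method="inclusive"` for the quartiles is an assumption (it should correspond to the interpolation used by Excel's QUARTILE.INC). The sketch also includes a basic consistency check: quartiles must satisfy Q1 ≤ Q2 ≤ Q3, so a reported 3rd quartile of 25 alongside a 1st quartile of 51 or 54 cannot be correct and would need to be recomputed from the data.

```python
import statistics

def summarize(values):
    """Return the summary measures listed in the table for one variable."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "range": max(values) - min(values),
        "variance": statistics.variance(values),  # sample variance (n - 1)
        "stdev": statistics.stdev(values),        # sample standard deviation
        "min": min(values),
        "max": max(values),
        "q1": q1,
        "q2": q2,
        "q3": q3,
        "count": len(values),
    }

def quartiles_ordered(q1, q2, q3):
    """Quartiles of any data set must be non-decreasing."""
    return q1 <= q2 <= q3

# Hypothetical values (illustrative only, not the assignment's data).
values = [25, 40, 51, 60, 64, 70, 78, 85, 90]
summary = summarize(values)
for name, value in summary.items():
    print(f"{name:>8}: {value:.2f}")

# Check applied to the quartiles reported for Preparation Time.
print(quartiles_ordered(51, 64, 25))
```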
8. Numerical summary measure to measure the strength of the linear relationship between the
two variables.
Correlation Between Marks and Preparation Time
                   PREPARATION TIME   MARK
PREPARATION TIME   1.00
MARK               0.55               1.00
From the table above, the correlation coefficient between Preparation Time and Mark is 0.55. The
value is positive and of moderate size, suggesting a moderate positive linear relationship between
the two variables (Healey, 2014).
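As a rough cross-check, the correlation coefficient can be recovered from figures already reported: in simple linear regression, r equals the slope multiplied by the ratio of the standard deviations of X and Y. The sketch below applies this to the trend-line slope (0.5831) and the two standard deviations (16.32 and 17.41). The helper `pearson_r` shows the direct computation from raw data; its name and the tiny example data are illustrative assumptions.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from paired observations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Cross-check using values reported in the document.
slope = 0.5831  # slope of the fitted trend line
sd_x = 16.32    # standard deviation of Preparation Time
sd_y = 17.41    # standard deviation of Mark
r_from_slope = slope * sd_x / sd_y
print(round(r_from_slope, 4))

# Direct computation on hypothetical, perfectly linear data.
print(pearson_r([1, 2, 3], [2, 4, 6]))
```

The value obtained from the slope rounds to 0.55, which agrees with the correlation table.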
9. Construction of a 90% confidence interval estimate for the population average time
spent on preparation
z-Estimate of a Mean
Sample mean                     63.04
Population standard deviation   16.32
Sample size                     100
Confidence level                90%
Confidence Interval Estimate    63.04 ± 2.68
Lower confidence limit          60.36
Upper confidence limit          65.72
Therefore, using the z-estimate, the 90% confidence interval for the population mean of
Preparation Time is (60.36, 65.72).
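The interval can be reproduced with a short sketch using the standard z-interval formula, mean ± z × σ / √n. It treats the sample standard deviation (16.32) as the population standard deviation, as the z-estimate above does; the function name `z_interval` is an illustrative assumption.

```python
import math
from statistics import NormalDist

def z_interval(mean, sigma, n, confidence):
    """Two-sided z confidence interval for a population mean."""
    # Critical value that leaves (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)
    return mean - margin, mean + margin, margin

lower, upper, margin = z_interval(mean=63.04, sigma=16.32, n=100, confidence=0.90)
print(f"margin = {margin:.2f}, interval = ({lower:.2f}, {upper:.2f})")
```

With z of about 1.645 and a standard error of 16.32 / 10 = 1.632, the margin of error is about 2.68, matching the table.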
10. Hypothesis test that the population average time spent on preparation is more than 65
hours using a 5% level of significance.
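Hypotheses:
H0 : μ = 65
H1 : μ > 65
z-Test of a Mean
Sample mean                     63.04     z Stat                -1.20
Population standard deviation   16.32     P(Z<=z) one-tail      0.1149
Sample size                     100       z Critical one-tail   1.6449
Hypothesized mean               65        P(Z<=z) two-tail      0.2298
Alpha                           5%        z Critical two-tail   1.9600
Because the claim being tested is that the mean is more than 65 hours, the test is one-tailed
(upper tail), so the z statistic is compared with the one-tail critical value. Since the
z statistic (-1.20) is less than the one-tail critical value (1.6449), the null hypothesis is not
rejected at the 5% level of significance. There is insufficient evidence that the population mean
of preparation time is more than 65 hours.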
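The upper-tail z-test can be sketched as follows, using the figures from the table. The function name `z_test_upper` is an illustrative assumption. Note that the table's one-tail value 0.1149 is the lower-tail area P(Z <= z); for an upper-tail alternative the p-value is the complementary area, 1 - 0.1149 = 0.8851.

```python
import math
from statistics import NormalDist

def z_test_upper(sample_mean, mu0, sigma, n, alpha):
    """Upper-tail z-test of H0: mu = mu0 against H1: mu > mu0."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    p_value = 1 - NormalDist().cdf(z)          # area to the right of z
    z_critical = NormalDist().inv_cdf(1 - alpha)
    reject = z > z_critical
    return z, p_value, z_critical, reject

z, p_value, z_critical, reject = z_test_upper(
    sample_mean=63.04, mu0=65, sigma=16.32, n=100, alpha=0.05
)
print(f"z = {z:.2f}, p-value = {p_value:.4f}, critical = {z_critical:.4f}")
print("Reject H0" if reject else "Do not reject H0")
```

A sample mean below the hypothesized value can never support an upper-tail alternative, which is why the p-value here is so large.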
11. Estimating a simple linear regression model and presenting the estimated linear
equation.
The tables below summarize the Excel regression output for the data.
SUMMARY OUTPUT
Regression Statistics
Multiple R          0.5466
R Square            0.2987
Adjusted R Square   0.2916
Standard Error      14.6541
Observations        100
ANOVA
             df   SS         MS         F          Significance F
Regression   1    8964.478   8964.478   41.74525   4.04E-09
Residual     98   21044.76   214.7425
Total        99   30009.24
                   Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept          28.9843        5.8745           4.9339   0.0000    17.3265     40.6421
PREPARATION TIME   0.5831         0.0902           6.4611   0.0000    0.4040      0.7621
The equation of the model is
y = 0.5831 x + 28.9843,
where y = Mark and x = Preparation Time.
The slope coefficient of Preparation Time is 0.5831, which implies that when preparation time
increases by one unit (one hour), the predicted Mark increases by 0.5831 on average.
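A minimal sketch of how such a line is estimated and used is given below. `fit_line` applies the ordinary least-squares formulas for one predictor, and `predict_mark` evaluates the estimated equation with the reported coefficients. Both function names, the example inputs, and the choice of 60 hours are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for a single predictor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict_mark(hours, slope=0.5831, intercept=28.9843):
    """Predicted Mark from the estimated equation y = 0.5831 x + 28.9843."""
    return slope * hours + intercept

# Predicted mark for a hypothetical student who prepares for 60 hours.
print(round(predict_mark(60), 2))  # 63.97

# One additional hour changes the prediction by the slope.
print(round(predict_mark(61) - predict_mark(60), 4))

# fit_line on hypothetical, perfectly linear data (y = 2x + 1).
print(fit_line([1, 2, 3], [3, 5, 7]))
```

Predictions are most reliable for preparation times within the observed range of 25 to 90 hours.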
12. Interpretation of the coefficient of determination, R-squared (R2) value.
The coefficient of determination of the above model is 0.2987 (29.87%), which indicates that
29.87% of the variation in Mark (Y) is explained by its linear relationship with Preparation
Time (X) (Yan & Su, 2009). The remaining 70.13% is due to other factors not included in the model.
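The R-squared value, and the related regression statistics, can be recomputed from the sums of squares in the ANOVA table. The sketch below uses the reported regression and residual sums of squares; the variable names are illustrative.

```python
import math

# Values taken from the ANOVA table.
ss_regression = 8964.478
ss_residual = 21044.76
n = 100  # observations
k = 1    # number of explanatory variables

ss_total = ss_regression + ss_residual

# Share of the total variation in Mark explained by the model.
r_squared = ss_regression / ss_total

# Adjusted R-squared penalizes for the number of predictors.
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# Mean squares, F statistic, and standard error of the regression.
ms_regression = ss_regression / k
ms_residual = ss_residual / (n - k - 1)
f_stat = ms_regression / ms_residual
standard_error = math.sqrt(ms_residual)

print(f"R Square          = {r_squared:.4f}")
print(f"Adjusted R Square = {adjusted_r_squared:.4f}")
print(f"F                 = {f_stat:.4f}")
print(f"Standard Error    = {standard_error:.4f}")
```

The results agree with the regression statistics in the Excel output, and R-squared also equals the square of Multiple R (0.5466).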
References
Healey, J. F. (2014). Statistics: A tool for social research. Cengage Learning.
Singh, R., & Mangat, N. S. (2013). Elements of survey sampling (Vol. 15). Springer Science & Business Media.
Thompson, S. K. (2012). Simple random sampling. Sampling, 9-37.
Yan, X., & Su, X. (2009). Linear regression analysis: Theory and computing. World Scientific.