Business Data Analysis: Statistical Analysis and Interpretation


Added on  2020/04/15

Homework Assignment
AI Summary
This document presents a comprehensive business data analysis assignment solution, encompassing various statistical techniques and their applications. The assignment covers topics such as data visualization, including the analysis of fluctuating data and fatality rates across different regions. It delves into hypothesis testing using z-tests and t-tests, exploring concepts like null and alternative hypotheses, test statistics, and degrees of freedom. The solution also addresses stratified random sampling and cluster sampling methods. Furthermore, it demonstrates the application of ANOVA for comparing different groups and simple linear regression models for analyzing relationships between variables, including correlation coefficients, slopes, and y-intercepts. Time series analysis, paired t-tests, and chi-square tests are also included, providing a well-rounded approach to data analysis and interpretation.
Running Head: BUSINESS DATA ANALYSIS
Business Data Analysis
Name of the Student
Name of the University
Author Note
TUTORIAL
Part A
Answer 1
Task 1
Figure 1 shows that the count for Rock Road * Bus Lane Beside Park fluctuates
considerably from day to day.
[Figure 1: Rock Road * Bus Lane Beside Park. Horizontal axis: Days (1/1/2015 to 3/27/2015); vertical axis: Count (0 to 800).]
Task 2
Figure 2
The chart shows that counties such as Cork, Dublin, Limerick and Tipperary
have very high fatality rates compared with other counties.
Task 3
Data on the GDP growth of Singapore were selected. The data were collected
from the World Bank. The line graph in Figure 3 describes the trend of the data.
Singapore's GDP growth fell dramatically in 2008 and 2009 and then recovered
in 2010. From 2011 it started to fall again gradually.
[Figure 3: Real GDP Growth Rate of Singapore. Horizontal axis: Year (2007 to 2016); vertical axis: GDP Growth (-2 to 18).]
Answer 2
Problem 1:
a) The null hypothesis (H0) and the alternate hypothesis (HA) are stated as
follows:
H0: μ ≤ 0.75
HA: μ > 0.75
The test statistic for this test is given by the following formula:
$$t = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$
Here, $\bar{X}$ is the sample mean, μ is the hypothesized mean, σ is the standard
deviation of the sample and n is the sample size. In this study,
Sample mean ($\bar{X}$) = 0.68
Hypothesized mean ( μ) = 0.75
Standard deviation of the sample ( σ )
Sample size (n) = 400
Degrees of freedom (n-1) = 399
The null hypothesis will be rejected if the observed value of the t statistic is greater than the
tabulated value of t with 399 degrees of freedom.
b) In stratified random sampling, the population is divided into strata based on
the variables on which the study will be conducted. A random sample is then
drawn from each stratum. Sampling error is reduced in this case.
In cluster sampling, the whole population is divided into clusters, and
some of those clusters are selected as the sample.
c) Smaller
Greater
d) 1.96
e) -1.29 and 1.29
f) A z-test is used when the sample size is greater than 30 and the population
standard deviation is known. A t-test is used when the sample size is less than 30
or the population standard deviation is unknown.
g) To test whether the proportion wishing to travel to Europe has fallen below 44%,
one sample z-test has to be conducted.
Here, the sample proportion ( p1) = (400/1000) = 0.4
The population proportion (p) = 0.44
Sample size (n) = 100
The standard deviation of the sample (s) = $\sqrt{\frac{p_1 (1 - p_1)}{n}} = \sqrt{\frac{0.4 (1 - 0.4)}{100}} = 0.049$
The null (H0) and the alternate (HA) hypotheses can be given as:
H0: p ≥ 0.44
HA: p < 0.44
The test statistic for the test can be given as:
$$z = \frac{p_1 - p}{s} = \frac{0.4 - 0.44}{0.049} = -0.816$$
For this left-tailed test, the tabulated value of z at the 1% level of significance
is 2.33, which is more than the absolute value of the calculated z statistic (0.816).
Thus, the null hypothesis is accepted: the proportion wishing to travel to Europe
has not fallen below 44%.
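The arithmetic in part g) can be checked with a short script. This is only a sketch: the function name `proportion_z_test` is made up for illustration, it follows the worked answer in using the sample proportion (0.4) and n = 100 in the standard error, and it adds a left-tail p-value that the original answer does not report.

```python
import math
from statistics import NormalDist


def proportion_z_test(p_sample, p_null, n):
    """Return (standard error, z) for a one-sample proportion test.

    The standard error is built from the sample proportion,
    mirroring the worked answer above (an assumption of this sketch).
    """
    se = math.sqrt(p_sample * (1 - p_sample) / n)
    z = (p_sample - p_null) / se
    return se, z


se, z = proportion_z_test(p_sample=0.4, p_null=0.44, n=100)

# Left-tail p-value, because the alternate hypothesis is p < 0.44.
p_value = NormalDist().cdf(z)

print(f"s = {se:.3f}, z = {z:.3f}, p-value = {p_value:.3f}")
print("reject H0 at 1%" if p_value < 0.01 else "do not reject H0 at 1%")
```

Comparing the p-value with 0.01 leads to the same decision as comparing z with the tabulated value: the null hypothesis is not rejected.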
Problem 2:
The null hypothesis (H0) and the alternate hypothesis (HA) are stated as follows:
H0: μ = 0.75
HA: μ ≠ 0.75
The test statistic for this test is given by the following formula:
$$t = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$
Here, $\bar{X}$ is the sample mean, μ is the hypothesized mean, σ is the standard
deviation of the sample and n is the sample size. In this study,
Sample mean ($\bar{X}$) = 0.68
Hypothesized mean ( μ) = 0.75
Standard deviation of the sample ( σ )
Sample size (n) = 400
Degrees of freedom (n-1) = 399
The null hypothesis will be rejected if the tabulated value of the t statistic is less than the
absolute value of the observed t, with a 0.025 level of significance in each of the two tails.
Problem 3:
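$$\begin{aligned}
\text{Confidence Interval} &= \bar{x} \pm z_{\alpha/2} \frac{s}{\sqrt{n}}, \quad \alpha = 0.05 \\
&= 29.8 \pm 1.96 \times \frac{0.41}{\sqrt{30}} \\
&= 29.8 \pm 1.96 \times 0.075 \\
&= 29.8 \pm 0.147 \\
&= (29.65, 29.95)
\end{aligned}$$
There are no values outside this range; thus, the process is in control.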
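The interval in Problem 3 can be reproduced as follows. This is a minimal sketch; the helper name `mean_confidence_interval` is hypothetical, and a 95% confidence level is assumed because the worked answer uses z = 1.96.

```python
import math
from statistics import NormalDist


def mean_confidence_interval(mean, sd, n, confidence=0.95):
    """Two-sided z-based confidence interval for a mean."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sd / math.sqrt(n)
    return mean - margin, mean + margin


low, high = mean_confidence_interval(mean=29.8, sd=0.41, n=30)
print(f"({low:.2f}, {high:.2f})")  # prints (29.65, 29.95)
```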
Part B
Section A
1. 13 packets of beans were sampled.
a) Median = 498
b) Mean = 497.15
c) Variance = 29.64
d) Null Hypothesis: μ <500
Alternate hypothesis: μ ≥500
e) $$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{497.15 - 500}{5.44 / \sqrt{13}} = -1.89$$
f) The critical value of z at α = 0.05, where α is the probability of committing a
type 1 error, is 1.96.
g) The absolute value of the z statistic (1.89) is less than the critical value (1.96).
Thus, the null hypothesis is accepted. The average weight of the packets is less
than 500 grams.
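The figures in parts c) and e) can be cross-checked with a few lines of code. The sketch below assumes the standard deviation 5.44 is the square root of the reported variance (29.64); the function name `z_statistic` is illustrative only.

```python
import math


def z_statistic(sample_mean, hypothesized_mean, sd, n):
    """z = (sample mean - hypothesized mean) / (sd / sqrt(n))."""
    return (sample_mean - hypothesized_mean) / (sd / math.sqrt(n))


# Standard deviation derived from the variance reported in part c).
sd = round(math.sqrt(29.64), 2)

z = z_statistic(sample_mean=497.15, hypothesized_mean=500, sd=sd, n=13)
print(f"sd = {sd}, z = {z:.2f}")

# Compare |z| with the tabulated value used in the answer (1.96).
print("|z| below 1.96:", abs(z) < 1.96)
```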
2. Statistics course is available online and in traditional classroom.
a) Null Hypothesis: $\mu_{online} = \mu_{classroom}$
Alternate Hypothesis: $\mu_{online} \neq \mu_{classroom}$
b) The test statistic for the test can be given as:
$$T = \frac{\bar{Y}_{online} - \bar{Y}_{classroom}}{\sqrt{\frac{s_{online}^2}{N_{online}} + \frac{s_{classroom}^2}{N_{classroom}}}} = \frac{63.3 - 58.33}{\sqrt{\frac{33.57}{10} + \frac{73.75}{9}}} = 1.46$$
The degree of freedom is given by:
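$$v = \frac{\left( \frac{s_{online}^2}{N_{online}} + \frac{s_{classroom}^2}{N_{classroom}} \right)^2}{\frac{\left( s_{online}^2 / N_{online} \right)^2}{N_{online} - 1} + \frac{\left( s_{classroom}^2 / N_{classroom} \right)^2}{N_{classroom} - 1}} = \frac{\left( \frac{33.57}{10} + \frac{73.75}{9} \right)^2}{\frac{\left( 33.57 / 10 \right)^2}{10 - 1} + \frac{\left( 73.75 / 9 \right)^2}{9 - 1}} \approx 14$$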
c) The critical value of t at the 0.05 level of significance (the probability of
committing a type 1 error) with 14 degrees of freedom is 2.145.
d) The test statistic (1.46) is less than the critical value (2.145). Thus,
the null hypothesis is accepted: there is no difference in the exam results at
the end of the course between the two groups.
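The test statistic and degrees of freedom for question 2 can be recomputed from the summary figures used above (means 63.3 and 58.33, variances 33.57 and 73.75, group sizes 10 and 9). The sketch below assumes the unequal-variance (Welch) form that the formulas in part b) follow; the function name `welch_t` is hypothetical.

```python
import math


def welch_t(mean1, var1, n1, mean2, var2, n2):
    """Return (t, degrees of freedom) for an unequal-variance t-test."""
    a = var1 / n1  # variance of the first sample mean
    b = var2 / n2  # variance of the second sample mean
    t = (mean1 - mean2) / math.sqrt(a + b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df


t, df = welch_t(63.3, 33.57, 10, 58.33, 73.75, 9)
print(f"T = {t:.2f}, v = {df:.2f} (rounded: {round(df)})")

# Decision against the tabulated value quoted in part c).
print("reject H0" if abs(t) > 2.145 else "do not reject H0")
```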
Section B
3.
a) The ANOVA table shows a sig. value of 0.000, which is less than
the level of significance. Thus, there is a significant difference in the
noise pollution created by small, medium and large sized cars.
b) The null hypothesis in this case is:
Null Hypothesis: There is no significant difference in the premiums paid per six
months by these households with these three companies
ANOVA
Source of Variation    SS       df    MS      F           P-value     F crit
Between Groups         10206    2     5103    1.097183    0.374622    4.256495
Within Groups          41859    9     4651
Total                  52065    11
The ANOVA table shows that the p-value (0.375) is more than the level of
significance (0.05). Thus, the null hypothesis is accepted.
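The internal consistency of the ANOVA table can be checked from its sums of squares and degrees of freedom. The sketch below reads the table entries as SS = 10206 (between groups) and 41859 (within groups) with 2 and 9 degrees of freedom; the function name `anova_summary` is illustrative only.

```python
def anova_summary(ss_between, df_between, ss_within, df_within):
    """Derive the mean squares, F ratio and totals of a one-way ANOVA table."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f": ms_between / ms_within,
        "ss_total": ss_between + ss_within,
        "df_total": df_between + df_within,
    }


table = anova_summary(ss_between=10206, df_between=2,
                      ss_within=41859, df_within=9)
for name, value in table.items():
    print(f"{name}: {value:.4f}")

# F crit from the table is 4.256495; F below it means H0 is not rejected.
print("F below F crit:", table["f"] < 4.256495)
```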
4.
a) Vsb
b) Bchjs
5.
a) Simple linear regression model will be calculated
i. Correlation coefficient = 0.144
ii. Slope of a line best fit = 0.003
iii. The Y intercept = 3.82
iv. The value of Y when X = 22 is 3.89
The correlation coefficient shows that study hours and grade points
have a very weak positive relationship. With a one-unit increase in study
hours, grade points increase by 0.003. The y-intercept indicates the grade
points in the absence of study hours.
b) A simple linear regression model predicts the value of Y with changes in the
value of X.
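The prediction in part a) iv follows directly from the fitted line Y = 3.82 + 0.003X. A minimal sketch, using the slope, intercept and correlation coefficient reported above (the function name `predict` is hypothetical, and the r-squared line is an addition not computed in the original answer):

```python
def predict(x, slope=0.003, intercept=3.82):
    """Predicted Y from the fitted simple linear regression line."""
    return intercept + slope * x


y_hat = predict(22)
print(f"Y at X = 22: {y_hat:.2f}")  # prints Y at X = 22: 3.89

# Share of the variation in Y explained by X, from r = 0.144.
r = 0.144
print(f"r squared: {r ** 2:.3f}")
```

An r squared of about 0.021 is in line with the very weak relationship described in part a).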
2015-2016
Answer 1
a) Null Hypothesis: There is no significant difference between the salaries of male and
female professors
Alternate Hypothesis: There is a significant difference between the salaries of male and
female professors
b) The test statistic to test the hypothesis is 2.636
c) The critical value for the chosen probability of committing a type 1 error is 2.101
d) The test statistic (2.636) is more than the critical value (2.101). Thus, the null
hypothesis is rejected.
Answer 2
(a) Null Hypothesis: There is no significant difference between the salaries of male and
female professors
Alternate Hypothesis: There is a significant difference between the salaries of male and
female professors
(b) The test statistic to test the hypothesis is 4.049
(c) The critical value for the chosen probability of committing a type 1 error is 2.262
(d) The test statistic (4.049) is more than the critical value (2.262). Thus, the null
hypothesis is rejected.
Answer 3
(a) In case of an ANOVA test, the null and alternate hypotheses can be given as:
Null Hypothesis: There is no significant difference between the rental units at different
time points
Alternate Hypothesis: There is a significant difference between the rental units at
different time points
From the ANOVA table given in the question, it can be seen that the p-value
(Sig.) is more than the 5% level of significance (0.05). Thus, the null hypothesis is
accepted. There is no significant difference in the rental units over different time points.
(b) The dataset contains the test scores from three statistics classes of three different
teachers.
a) Null Hypothesis: There is no significant difference in test scores of three
different classes
Alternate Hypothesis:There are significant differences in test scores of three
different classes
b) The test statistic can be given as 4.84
c) The critical value of the F statistic for the chosen probability of committing a type 1 error is 4.459
d) The critical value of the F statistic (4.459) is less than the observed value of the F statistic (4.84).
Thus, the null hypothesis is rejected.
Answer 4
a) To perform this test, chi-square test of association has been conducted.
i. Null Hypothesis: There is no association between the observed and the expected
values
Alternate Hypothesis: There is association between the observed and the expected
values