Semester 1, 2018 BUS105 Computing Assignment
Added on 2023/06/11
AI Summary
This report covers various statistical techniques used to summarize data, including pivot tables, sample estimates, hypothesis testing, and scatterplots. It includes discussions on the relationship between variables such as age and product preference, and the amount spent on a product. The report also provides insights into the usefulness of these discussions in business. The report is based on the Semester 1, 2018 BUS105 Computing Assignment.
Title: Semester 1, 2018 BUS105 Computing Assignment
Name:
Student number:
Allocated sample: 420
Section 1
The author of the previous report used several techniques to summarize the data. First, the author
introduces what he plans to report. He then explains the dataset by stating what each variable
represents and whether it is categorical or quantitative. For instance, the author classifies “Gender”,
“Are they old? Above or under 40” and “Do they like the product” as categorical variables, while
“How much they would pay for it” is classified as a quantitative variable.
For the quantitative variables, the author uses descriptive summary statistics such as the mean,
median and mode, while for the categorical variables he uses frequency tables and bar graphs.
Section 2
A) Pivot tables that let you investigate the relationship between the variables
“old or young” and “do they like the product? hate or like”
Count of do they like product? (Column Labels: old, young)
Row Labels     old    young    Grand Total
hate           15     10       25
like           55     20       75
Grand Total    70     30       100
Count of do they like product? (as a percentage of each column)
Row Labels     old        young      Grand Total
hate           21.43%     33.33%     25.00%
like           78.57%     66.67%     75.00%
Grand Total    100.00%    100.00%    100.00%
B) Make a simple comment
A larger proportion of young people (33.33%, n = 10) than of old people (21.43%, n = 15) hate
the product, although the majority of both groups like it.
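C) Using your sample, what is the estimate for p1 − p2? In other words, what is the difference
between the sample proportions $\hat{p}_1 - \hat{p}_2$?
Answer
$\hat{p}_1 = 55/70 = 0.7857$ (proportion of old people who like the product)
$\hat{p}_2 = 20/30 = 0.6667$ (proportion of young people who like the product)
$\hat{p}_1 - \hat{p}_2 = 0.7857 - 0.6667 = 0.119$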
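The same estimate can be reproduced directly from the pivot-table counts. The snippet below is only a minimal sketch: the dictionary layout and the helper name `proportion_like` are illustrative assumptions, not part of the assignment, which was done in Excel.

```python
# Counts copied from the Section 2 pivot table (allocated sample 420).
counts = {
    "old": {"hate": 15, "like": 55},
    "young": {"hate": 10, "like": 20},
}


def proportion_like(group):
    """Share of the given age group that answered 'like' (hypothetical helper)."""
    total = sum(counts[group].values())
    return counts[group]["like"] / total


p1_hat = proportion_like("old")    # 55 / 70
p2_hat = proportion_like("young")  # 20 / 30
difference = p1_hat - p2_hat

print(f"p1_hat = {p1_hat:.4f}")
print(f"p2_hat = {p2_hat:.4f}")
print(f"p1_hat - p2_hat = {difference:.3f}")
```

Working from the raw counts rather than the rounded percentages avoids carrying rounding error into later steps.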
Section 3
A) A pivot table that lets you investigate the relationship between the variables
“old or young” and “how much they would pay for the product”
sample collector id: 420
Row Labels     Average of how much would pay?    StdDev of how much would pay?    Count of are they old?
old            2.520                             1.224                            70
young          2.183                             1.405                            30
Grand Total    2.419                             1.283                            100
B) Make a simple comment about the relationship between the variables
On average, old people are willing to pay slightly more for the product (2.520) than young
people (2.183).
C) Using your sample, what is the estimate for μ1 − μ2? In other words, what is the difference
between the sample means $\bar{x}_1 - \bar{x}_2$?
Answer
$\bar{x}_1 = 2.520$ (old)
$\bar{x}_2 = 2.183$ (young)
$\bar{x}_1 - \bar{x}_2 = 2.520 - 2.183 = 0.337$
Section 4
A) Scatterplot
B) Make a simple comment about the relationship between the variables
C) The estimated profit for the casino when there are 1000 bets is
Profit = 0.9386 × 1000 + 3.1543 = 941.7543
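The prediction is a straight-line calculation and can be checked in a few lines. This is a sketch only: the slope and intercept are the values quoted in the answer, and the function name `predicted_profit` is a hypothetical label introduced here.

```python
# Slope and intercept as quoted in the Section 4 C) answer.
SLOPE = 0.9386
INTERCEPT = 3.1543


def predicted_profit(bets):
    """Predicted profit from the fitted line: slope * bets + intercept."""
    return SLOPE * bets + INTERCEPT


profit_1000 = predicted_profit(1000)
print(f"Predicted profit for 1000 bets: {profit_1000:.4f}")
```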
Section 5
A) Using the answer in section 2
Test the claim there is a difference in the proportions, use a 5% level of significance
i) State an appropriate H0 and H1
Solution
H0 : p1 = p2
H1 : p1 ≠ p2
(p1 and p2 are the population proportions of old and young people who like the product)
ii) Find the p-value using only the answers to part (A) and the webpage
http://epitools.ausvet.com.au/content.php?page=z-test-2
Results
                       Sample 1           Sample 2          Difference
Sample proportion      0.7857             0.6667            0.119
95% CI (asymptotic)    0.6896 - 0.8818    0.498 - 0.8354    -0.0662 - 0.3042
z-value                1.3
P-value                0.2079
Interpretation         Not significant, accept null hypothesis that sample proportions are equal
n by pi                n * pi > 5, test ok
iii) State whether or not you reject the H0
Solution
We fail to reject the null hypothesis (H0) since the p-value (0.2079) is greater than 0.05
iv) Give a conclusion in plain English
Solution
There is no statistically significant evidence to conclude that the proportion of old
people who like the product is different from the proportion of young people who like
the product.
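The z-value and p-value reported by the web calculator can be approximated by hand. The sketch below assumes a pooled standard error under H0, which is one common form of the two-proportion z-test; the calculator's exact formula is not stated in the report, so small differences in the last digits are expected. The function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Return (z, two-sided p-value) using a pooled standard error (assumed)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# 55 of 70 old people and 20 of 30 young people like the product.
z, p_value = two_proportion_z_test(55, 70, 20, 30)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

Because the p-value is well above 0.05, the sketch leads to the same decision as the calculator output: fail to reject H0.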
B) Using the answer in section 3
Test the claim that there is a difference between the means using a 5% level of significance
i) State an appropriate H0 and H1
Solution
H0 : μ1=μ2
H1 : μ1 ≠ μ2
ii) Find the p-value using the answers to part (A) and the webpage
https://www.medcalc.org/calc/comparison_of_means.php
Solution
Results
Difference (sample 2 − sample 1)    -0.337
Standard error                      0.279
95% CI                              -0.8914 to 0.2174
t-statistic                         -1.206
DF                                  98
Significance level                  P = 0.2306
iii) State whether or not you reject H0
Solution
We fail to reject the null hypothesis (H0) since the p-value (0.2306) is greater than 0.05
iv) Give a conclusion in plain English
Solution
There is no statistically significant evidence to conclude that the average amount old
people would pay is different from the average amount young people would pay.
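The standard error and t-statistic above can be reproduced from the Section 3 summary statistics. This sketch assumes a pooled-variance (equal variances) two-sample t-test, which is consistent with the reported DF of 98 but is not confirmed by the report; the function name is hypothetical. The p-value is left to the calculator because the Python standard library has no t distribution.

```python
from math import sqrt


def pooled_t_statistic(mean1, sd1, n1, mean2, sd2, n2):
    """Return (difference, standard error, t, df) for mean2 - mean1,
    assuming equal variances (pooled estimate)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    diff = mean2 - mean1
    return diff, se, diff / se, df


# Sample 1 = old (n = 70), sample 2 = young (n = 30).
diff, se, t_stat, df = pooled_t_statistic(2.520, 1.224, 70, 2.183, 1.405, 30)
print(f"difference = {diff:.3f}, SE = {se:.3f}, t = {t_stat:.3f}, df = {df}")
```

The negative sign only reflects the order of subtraction (young minus old); the size of the difference is the 0.337 found in Section 3.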
Section 6
Use the dataset given below; you must use your own sample
https://app.box.com/s/kzc6ivy10gvy4vz6d0pgy0lzh929ivx9
Suppose a business has conducted an opinion poll to find out if their customers support a change
to the business.
a) Use the PivotTable feature in Excel to find appropriate summary statistics for your
sample. You should paste both into Word; you do not need the Excel file.
This pivot table must have the number of people that answer yes and the number of
people that answer no.
Solution
Row Labels     Count of do you support proposed change?
no             90
yes            112
Grand Total    202
a) The sample size n is 202 and the sample proportion is $\hat{p} = 112/202 \approx 0.555$
b) Find a 90% confidence interval for the proportion of people that support the change
$\text{standard error} = \sqrt{\frac{0.555 \times (1 - 0.555)}{202}} = \sqrt{0.001223} = 0.03497$
Using the z distribution, 90% of sample proportions are within 1.645 standard errors of
the population proportion, so the 90% confidence interval for the population proportion is
Lower bound: 0.555 − 1.645 × 0.03497 = 0.4975
Upper bound: 0.555 + 1.645 × 0.03497 = 0.6125
We are 90% confident that the population proportion of people that support the change is
between 0.4975 and 0.6125.
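The interval can be checked with a short calculation. This is a sketch under the same normal-approximation assumption as the worked answer; the function name is illustrative. It uses the unrounded sample proportion 112/202, so the bounds come out marginally lower than the hand calculation, which starts from the rounded 0.555.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence=0.90):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    # Two-sided critical value, about 1.645 for 90% confidence.
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


lower, upper = proportion_confidence_interval(112, 202)
print(f"90% CI: {lower:.4f} to {upper:.4f}")
```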
Section 7
a) Histogram
The histogram below shows the relationship between the variables “Win or loss” and
“goal difference” for the Man United football club.
b) Description of the variables
The variable “win or loss” is a categorical variable because it answers the question “Was it a
win or a loss?” The variable “goal difference” is a quantitative variable because its values are
numbers.
c) Description of the relationship
The amount people would pay for the snack food is between 0 and 6
Larger goal differences are observed for wins than for losses
d) Consider the histogram you found yourself and discussed in parts (a), (b) and (c).
Would the discussion be useful in business? Give a reason for your answer.
Solution
Yes, the discussion would be useful in business, since it helps predict the goal
difference the team is likely to get in a win or a loss, which prepares the manager to
handle each case.
e) Consider the following discussion taken from the sample report you had to read in section
1. Would the discussion be useful in business? Give a reason for your answer.
Solution
The discussions in section 1 are useful since they help summarize a business case. The
summaries report values such as the mean or the median, which help decision makers to
plan well.
Section 8
This section is abstract so you are encouraged to try and roughly understand the following before
attempting the task
https://app.box.com/s/3e8pxh994ixhwj50je849xz1gzxcsen3
a) Using section 2
i) Find the zscore of the estimate in section 2d; note that the average of the estimates is 0.14
with standard deviation 0.088
Solution
Count of do they like product? (as a percentage of each column)
Row Labels     old        young      Grand Total
hate           21.43%     33.33%     25.00%
like           78.57%     66.67%     75.00%
Grand Total    100.00%    100.00%    100.00%
$Z = \frac{x - \mu}{\sigma} = \frac{0.119 - 0.14}{0.088} = -0.238636$
ii) Using part (i), find P(Z<zscore) using www.wolframalpha.com
for example, if the zscore is 0.5, type in
“P(Z<0.5)”
into wolframalpha.com
P(Z < −0.238636) = 0.4057
iii) If there was a list of 1000 estimates ranked from lowest to highest, roughly what rank
do you expect your estimate to have?
Hint: just use the formula
expected rank = P(Z<zscore)*1000
Solution
Expected rank = P(Z<zscore) × 1000
= 0.4057 × 1000 = 405.7 ≈ 406
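Parts (i) to (iii) chain three small steps: standardise the estimate, look up the normal probability, and scale it to a rank. The sketch below reproduces them with the Python standard library in place of wolframalpha.com; the function name `expected_rank` is a hypothetical label, and the approach assumes, as the task does, that the estimates are approximately normally distributed.

```python
from statistics import NormalDist


def expected_rank(estimate, mean, stdev, n_estimates=1000):
    """Return (z-score, P(Z < z), expected rank among n_estimates)."""
    z = (estimate - mean) / stdev
    prob = NormalDist().cdf(z)  # standard normal P(Z < z)
    return z, prob, prob * n_estimates


# Section 2 estimate 0.119; the estimates average 0.14 with stdev 0.088.
z, prob, rank = expected_rank(0.119, 0.14, 0.088)
print(f"z = {z:.6f}, P(Z < z) = {prob:.4f}, expected rank = {rank:.0f}")
```

The same function can be reused for parts b) and c) by passing the corresponding estimate, mean and standard deviation.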
iv) Complete the following table using
https://app.box.com/s/2to195ysj0deo5wawwjp53e9jlt4peqp
                                  Which sample    Rank lowest to highest    Estimate X    Zscore=(X-mean)/stdev
Lowest estimate                   475             1                         -0.14306      -3.19465
Estimate from allocated sample    420             422                       0.11905       -0.2386
Highest estimate                  663             1000                      0.543672      4.570203
b) Using section 3
i) Find the zscore of the estimate in section 3c; note that the average of the estimates is
0.408 with standard deviation 0.26
Solution
sample collector id: 420
Row Labels     Average of how much would pay?    StdDev of how much would pay?    Count of are they old?
old            2.520                             1.224                            70
young          2.183                             1.405                            30
Grand Total    2.419                             1.283                            100
The estimate is $\bar{x}_1 - \bar{x}_2 = 2.520 - 2.183 = 0.337$
So the zscore is
$Z = \frac{x - \mu}{\sigma} = \frac{0.337 - 0.408}{0.26} = -0.27308$
ii) Using part (i), what is P(Z<zscore)? You can find the answer using
www.wolframalpha.com
for example, if the zscore = -1, type in
P(Z<-1)
into wolfram alpha
Solution
P(Z < −0.27308) = 0.3924
iii) If there was a list of 1000 estimates ranked from lowest to highest, what rank do you
think yours would be close to? Hint: just use the formula
expected rank = P(Z<zscore)*1000
Solution
Expected rank = P(Z<zscore) × 1000
= 0.3924 × 1000 = 392.4 ≈ 392
iv) Complete the following table, use
https://app.box.com/s/kiqemn0h0m3d03uygo1dhemvx4e5uf6r
                                  Which sample    Rank lowest to highest    Estimate X    Zscore=(X-mean)/stdev
Lowest estimate                   475             1                         -0.43474      -3.23897
Estimate from allocated sample    420             416                       0.3367        -0.27308
Highest estimate                  663             1000                      1.607576      4.613465
c) Using section 4
i) Find the zscore of the slope estimate in section 4a; note that the average of the estimates is
0.952 with standard deviation 0.237
Solution
$Z = \frac{x - \mu}{\sigma} = \frac{0.9386 - 0.952}{0.237} = -0.05654$
ii) Using part (i), what is P(Z<zscore)? You can find the answer using
www.wolframalpha.com
for example, if the zscore = -1, type in
P(Z<-1)
into wolfram alpha
Solution
P(Z < −0.05654) = 0.4775
iii) If there was a list of 1000 estimates ranked from lowest to highest, what rank do you
think yours would be close to? Hint: just use the formula
expected rank = P(Z<zscore)*1000
Solution
Expected rank = P(Z<zscore) × 1000
= 0.4775 × 1000 = 477.5 ≈ 478
iv) Summary of some of the 1000 estimates; the full list of estimates is available from
https://app.box.com/s/35a0x0hnxcqq2qh6krzua6qp587fke51
                                  Which sample    Rank lowest to highest    Estimate X     Zscore=(X-mean)/stdev
Lowest estimate                   141             1                         -0.00348010    -4.03134
Estimate from allocated sample    420             471                       0.93864267     -0.05654
Highest estimate                  683             1000                      3.878984       3.876998
d) For parts a, b and c, compare the predicted rank for your sample from part iii (using
P(Z<zscore)) to the actual rank in part iv
Solution
Section      Predicted rank    Actual rank
Section 2    406               422
Section 3    392               416
Section 4    478               471
As can be seen, the predicted and actual ranks differ only slightly (by 16, 24 and 7 places
out of 1000), although none of the predicted ranks exactly matches the actual rank.
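The comparison above can be recomputed in one pass to see how close the normal-distribution prediction is in each section. This is an illustrative sketch: the dictionary layout is an assumption, the inputs are the estimates, means, standard deviations and actual ranks quoted in this section, and the predicted ranks are left unrounded, so the gaps differ slightly from the rounded table.

```python
from statistics import NormalDist

# (estimate, mean of estimates, stdev of estimates, actual rank) per section
sections = {
    "Section 2": (0.119, 0.14, 0.088, 422),
    "Section 3": (0.337, 0.408, 0.26, 416),
    "Section 4": (0.9386, 0.952, 0.237, 471),
}

gaps = {}
for name, (estimate, mean, stdev, actual) in sections.items():
    z = (estimate - mean) / stdev
    predicted = NormalDist().cdf(z) * 1000  # expected rank out of 1000
    gaps[name] = actual - predicted
    print(f"{name}: predicted {predicted:.1f}, actual {actual}, gap {gaps[name]:+.1f}")
```

Every gap is small relative to the 1000 ranked estimates, which is what one would expect if the estimates are approximately normally distributed.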
e) Comment on the connection between the following facts
* “Part (d) shows totally different populations with totally different variables have the same
sampling distribution (the normal distribution)”
* “Hypothesis testing uses a sampling distribution; the p-value is a shaded area on the
sampling distribution”
Solution
Yes, the predicted results differed from the actual values because they are based on samples,
which are expected to come from the same population but have only approximately similar
characteristics.