Statistical Analysis Project Assignment

Added on  2022/08/23

Running head: STATISTICAL ANALYSIS
Statistical Analysis
Name of the Student
Name of the University
STUDENT NAME:
STUDENT ID NUMBER:
MAT10251 – Statistical Analysis
Project Part B
Complete the summary table below.
Sample Number (last digit of your student ID number)
Confidence Level
Level of Significance
Value: 25%
PLEASE ENSURE YOU KEEP A COPY OF YOUR PROJECT
Marking and Feedback Sheet Part B

Item                                                               Maximum Marks     Marks
Cover sheet or sample incorrect                                             -2.0
Format incorrect, including name                                            -2.0

Statistical Tasks
Statistical Inference Question 1
  Assumptions & other required steps                                         2.5
  Calculation (Excel output)                                                 2.0
  Conclusion                                                                 1.0
Statistical Inference Question 2
  Assumptions & other required steps                                         3.5
  Calculation (Excel output)                                                 2.0
  Decision and conclusion                                                    2.0
Statistical Inference Question 3
  Assumptions & other required steps                                         4.0
  Calculation (Excel output)                                                 2.0
  Decision and conclusion                                                    2.0
Regression and Correlation
  Assumptions and random variables defined                                   2.0
Simple Linear Model Question 4
  Excel Output and Equation                                                  3.0
  Interpretation of regression coefficients & coefficient of determination   1.5
Multiple Linear Model Question 5
  Excel Output and Equation                                                  4.0
  Interpretation of regression coefficients & coefficient of determination   2.5
Statistical Inference
  Choice of technique and other required steps                               1.0
  Decision and conclusion                                                    2.0
  Best model                                                                 1.0
Total Statistical Tasks                                                     38.0       0.0

Written Answer (Components of a report)
  Question 1                                                                 2.0
  Question 2                                                                 2.0
  Question 3                                                                 2.0
  Questions 4 & 5
    Introduction and discussion of best model                                4.0
  Structure, grammar, spelling and revised Part A content                    2.0
Total Report                                                                12.0       0.0
Table of Contents
Q1
Q2
Q3
Q4
Q5
Appendices
  Appendix to Q1
  Appendix to Q2
  Appendix to Q3
  Appendix to Q4
  Appendix to Q5
References
Q1.
The given data consists of the download speeds recorded in Speed Test 1. From Excel, it was
found that in 90 of the 120 speed tests the speed was higher than 40 Mbps. Therefore, the
sample proportion of tests with a speed of more than 40 Mbps is
$\hat{p} = \frac{90}{120} = 0.75$
A 95 % confidence interval is constructed for the population proportion. The formula for
the confidence interval is
$\hat{p} \pm z \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
Substituting the values z = 1.96, n = 120 and $\hat{p} = 0.75$, we get
$0.75 \pm 1.96 \sqrt{\frac{0.75(1-0.75)}{120}}$
$= 0.75 \pm 0.08$
$= (0.75 - 0.08, \; 0.75 + 0.08)$
$= (0.67, \; 0.83)$
Therefore, with 95 % confidence, the proportion of the time that the download speed is
higher than 40 Mbps lies between 67 % and 83 %.
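The interval above can be reproduced outside Excel. The following is a minimal sketch using only the Python standard library; the function name `proportion_ci` is chosen here for illustration and is not part of the original Excel workflow. It uses the same normal-approximation formula, with the counts 90 and 120 taken from the text.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a proportion."""
    p_hat = successes / n
    # Two-sided critical value, about 1.96 for 95 % confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, margin, (p_hat - margin, p_hat + margin)


# 90 of the 120 speed tests were above 40 Mbps (values from the text).
p_hat, margin, (lower, upper) = proportion_ci(90, 120)
print(f"p_hat = {p_hat:.2f}, margin = {margin:.3f}")
print(f"95% CI = ({lower:.2f}, {upper:.2f})")
```

Rounded to two decimals, the limits agree with the interval (0.67, 0.83) reported above.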
Q2.
The mean evening download speed in Speed Test 1 for the sample is 40.76 Mbps, with a
standard deviation of 4.66 Mbps, based on 48 observations.
A one-sample z test is used to test the hypothesis that the mean download speed exceeds the
advertised speed of 41 Mbps.
$H_0: \mu = 41$
$H_a: \mu > 41$
$z = \frac{40.7625 - 41}{4.6615/\sqrt{48}} = -0.35$
From the z distribution table, $P(Z < -0.35) = 0.3632$, so the p value for the right-tailed
alternative is $P(Z > -0.35) = 1 - 0.3632 = 0.6368$.
As the p value is more than 0.05, the null hypothesis cannot be rejected, and it cannot be
concluded that the mean download speed is greater than 41 Mbps.
The z test was done after confirming that the pre-conditions for the test were met:
The sample size was greater than 30.
The data points were independent of each other.
The sample is large enough for the sampling distribution of the mean to be treated as
approximately normal.
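As a cross-check on the test above, here is a minimal sketch in Python (standard library only). The helper name `one_sample_z_upper` is hypothetical; the inputs are the unrounded mean and standard deviation from the Appendix to Q2. Because the alternative is Ha: μ > 41, the sketch takes the p value as the upper-tail area to the right of the observed z.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_upper(sample_mean, mu0, sd, n):
    """Right-tailed one-sample z test; returns (z, p_value)."""
    standard_error = sd / sqrt(n)
    z = (sample_mean - mu0) / standard_error
    # Upper-tail area, because the alternative is mu > mu0.
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


# Summary statistics from the Appendix to Q2 (Evening DL Speed).
z, p_value = one_sample_z_upper(40.7625, 41, 4.661504871, 48)
print(f"z = {z:.2f}, p value = {p_value:.4f}")
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```

The sample mean lies below 41, so the upper-tail p value is well above 0.05 and the null hypothesis is not rejected.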
Q3.
A two-sample t test is used to check the claim that the average download speed differs
between the two speed tests.
The conditions required for a valid two-sample t test are checked for this case. The data
points are assumed to be randomly sampled and independent of each other, and the sample
sizes (120 observations in each test) are large enough for normality to be assumed. An F
test of the null hypothesis that the two variances are equal gives a one-tail p value of
0.25 (> 0.05), so there is no evidence that the variances differ across the two trials and
the equal-variance form of the t test is used.
The two-sample t test has a two-tail p value of 0.012, which is less than 0.05. Therefore
the null hypothesis can be rejected, and it is concluded that the average download speed is
different for Speed Test 1 Download and Speed Test 2 Download.
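The F ratio and the pooled t statistic can be recomputed from the summary statistics alone. The sketch below assumes the means, variances and counts listed in the Appendix to Q3; the function name `pooled_t` is illustrative. The Python standard library has no t or F distribution, so the decisions compare the statistics with the critical values that Excel reports in the same appendix rather than computing p values.

```python
from math import sqrt


def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Equal-variance two-sample t statistic from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, pooled_var, df


# Summary statistics from the Appendix to Q3.
mean1, var1, n1 = 41.33333333, 25.59501401, 120  # Speed Test 1 Download
mean2, var2, n2 = 42.93025, 22.59074868, 120     # Speed Test 2 Download

# Critical values copied from the Excel output (alpha = 0.05).
F_CRITICAL_ONE_TAIL = 1.353610209
T_CRITICAL_TWO_TAIL = 1.96998153

f_ratio = var1 / var2
t_stat, pooled_var, df = pooled_t(mean1, var1, n1, mean2, var2, n2)

reject_equal_variances = f_ratio > F_CRITICAL_ONE_TAIL
reject_equal_means = abs(t_stat) > T_CRITICAL_TWO_TAIL

print(f"F = {f_ratio:.4f}, reject equal variances: {reject_equal_variances}")
print(f"t = {t_stat:.4f} on {df} df, reject equal means: {reject_equal_means}")
```

The F ratio stays below its critical value while the absolute t statistic exceeds its critical value, which matches the decisions stated above.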
Q4.
A simple linear regression is run to predict the upload speed from the download speed. This
is done using Excel, and it is checked that the assumptions for the regression are met.
The R squared value for the regression is 0.51, that is, about 51 % of the variability in
upload speed is explained by download speed. The slope coefficient has a p value less than
0.05 (4.4E-20), while the intercept is not significantly different from zero (p value
0.88).
The linear equation that models the relation between the download and upload speed is
$y = 0.31x - 0.17$
where y represents the upload speed in Mbps and x represents the download speed in Mbps.
The slope of 0.31 means that for each 1 Mbps increase in download speed, the predicted
upload speed increases by 0.31 Mbps.
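To show how the fitted line and the coefficient of determination are used, here is a small Python sketch. The coefficients and sums of squares are copied from the Appendix to Q4; the function names and the 40 Mbps input are hypothetical and chosen only for illustration.

```python
def r_squared(ss_regression, ss_total):
    """Coefficient of determination from the ANOVA sums of squares."""
    return ss_regression / ss_total


def predict_upload(download_mbps, slope=0.308808, intercept=-0.17098):
    """Predicted upload speed (Mbps) from the fitted simple linear model."""
    return intercept + slope * download_mbps


# Sums of squares from the ANOVA table in the Appendix to Q4.
r2 = r_squared(290.4555, 567.7102)

# Illustrative input only: a download speed of 40 Mbps.
predicted = predict_upload(40.0)

print(f"R squared = {r2:.3f}")
print(f"Predicted upload speed at 40 Mbps download: {predicted:.2f} Mbps")
```

The prediction should only be trusted for download speeds inside the observed range of the sample (22.3 to 45.3 Mbps in the Appendix to Q1).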
Q5.
A multiple linear regression is run to develop a predictive model for upload speed using
download speed and time of day (Evening or not) as predictors. Because Excel cannot work
with a categorical variable directly, a dummy variable is created: Evening is recoded as 1
and not evening is recoded as 0.
The assumptions of multiple linear regression, including normality and independence of the
data, are checked.
From the final model, it is seen that the multiple linear regression has a marginally higher
R squared value (0.513) than the simple linear model (0.512), while the adjusted R squared
falls from 0.507 to 0.504. That is, 51.3 % of the variability of the dependent variable can
be explained by the independent variables.
The final equation can be written as
$y = 0.31x_1 - 0.15x_2 - 0.05$
where $x_1$ represents the download speed, $x_2$ represents the binary variable (evening or
not evening) and y is the upload speed.
However, it must be noted that the Evening variable does not contribute significantly to
the model, as it has a p value of 0.6 (> 0.05).
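The dummy coding and the model comparison can be sketched in Python as follows. The coefficients and R squared values come from the Appendices to Q4 and Q5; the function names, the label "Morning" and the 40 Mbps input are assumptions made for illustration. The adjusted R squared formula used is the standard one, 1 - (1 - R²)(n - 1)/(n - k - 1), with k predictors.

```python
def evening_dummy(time_of_day):
    """Recode time of day: 1 for Evening, 0 for anything else."""
    return 1 if time_of_day.strip().lower() == "evening" else 0


def predict_upload(download_mbps, time_of_day,
                   b0=-0.05295, b1=0.307432, b2=-0.15293):
    """Predicted upload speed (Mbps) from the fitted multiple model."""
    return b0 + b1 * download_mbps + b2 * evening_dummy(time_of_day)


def adjusted_r_squared(r2, n, k):
    """Adjusted R squared for n observations and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Same hypothetical download speed, evening versus not evening.
evening = predict_upload(40.0, "Evening")
daytime = predict_upload(40.0, "Morning")

# R squared values from the two Excel outputs, n = 120.
adj_simple = adjusted_r_squared(0.511626, 120, 1)
adj_multiple = adjusted_r_squared(0.512803, 120, 2)

print(f"Evening effect on prediction: {evening - daytime:.3f} Mbps")
print(f"Adjusted R squared: simple {adj_simple:.3f}, multiple {adj_multiple:.3f}")
```

Adjusted R squared penalises the extra predictor, so it is the more suitable figure when comparing the simple and multiple models.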
Appendices:
Appendix to Q1.
Speed Test 1 Download
Mean 41.33333333
Standard Error 0.461835234
Median 43.7
Mode 44.6
Standard Deviation 5.05915151
Sample Variance 25.59501401
Kurtosis 4.03118794
Skewness -2.097297222
Range 23
Minimum 22.3
Maximum 45.3
Sum 4960
Count 120
[Figure: Histogram of Speed Test 1 Download; bins 20, 30, 40, 50, More; vertical axis 0 to 100]
Appendix to Q2.
Evening DL Speed
Mean 40.7625
Standard Error 0.672830273
Median 41.85
Mode 43.7
Standard Deviation 4.661504871
Sample Variance 21.72962766
Kurtosis 2.754558243
Skewness -1.68713803
Range 19.4
Minimum 25.9
Maximum 45.3
Sum 1956.6
Count 48
Appendix to Q3.
F-Test Two-Sample for Variances
                               Speed Test 1 Download   Speed Test 2 Download
Mean                           41.33333333             42.93025
Variance                       25.59501401             22.59074868
Observations                   120                     120
df                             119                     119
F                              1.132986532
P(F<=f) one-tail               0.248453053
F Critical one-tail            1.353610209
Tests the null hypothesis that the variances across the groups are equal.

t-Test: Two-Sample Assuming Equal Variances
                               Speed Test 1 Download   Speed Test 2 Download
Mean                           41.33333333             42.93025
Variance                       25.59501401             22.59074868
Observations                   120                     120
Pooled Variance                24.09288134
Hypothesized Mean Difference   0
df                             238
t Stat                         -2.520075243
P(T<=t) one-tail               0.006194257
t Critical one-tail            1.651281164
P(T<=t) two-tail               0.012388515
t Critical two-tail            1.96998153
Appendix to Q4.
SUMMARY OUTPUT
Regression Statistics
Multiple R          0.715281
R Square            0.511626
Adjusted R Square   0.507488
Standard Error      1.532846
Observations        120

ANOVA
             df    SS         MS         F          Significance F
Regression   1     290.4555   290.4555   123.6183   4.4E-20
Residual     118   277.2547   2.349616
Total        119   567.7102

                        Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
Intercept               -0.17098       1.156512         -0.14784   0.882718   -2.46119    2.119225    -2.46119      2.119225
Speed Test 1 Download   0.308808       0.027775         11.11838   4.4E-20    0.253807    0.363809    0.253807      0.363809
Appendix to Q5.
Regression Statistics
Multiple R          0.716102
R Square            0.512803
Adjusted R Square   0.504475
Standard Error      1.537527
Observations        120

ANOVA
             df    SS         MS         F          Significance F
Regression   2     291.1233   145.5616   61.57455   5.38E-19
Residual     117   276.5869   2.36399
Total        119   567.7102

                        Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
Intercept               -0.05295       1.181111         -0.04483   0.96432    -2.39208    2.286181    -2.39208      2.286181
Speed Test 1 Download   0.307432       0.027979         10.98782   9.96E-20   0.252021    0.362844    0.252021      0.362844
Evening                 -0.15293       0.287735         -0.53149   0.596086   -0.72277    0.416916    -0.72277      0.416916
References:
Levine, D.M., 2010. Business statistics: A first course. Pearson Education India.
McClave, J.T., Sincich, T. and Sincich, T.T., 2013. A first course in statistics. Pearson.