Statistical Analysis Project: Internet Speed Data for MAT10251, SCU
Added on 2022/08/17
AI Summary
This project presents a statistical analysis of internet speed data, likely sourced from the National Broadband Network (NBN). The analysis begins with descriptive statistics, including histograms, and measures of central tendency (mean, median, mode) and dispersion (standard deviation, skewness, box plots). The study examines download and upload speeds, exploring their relationships through scatter plots and correlation analysis. Hypothesis testing is conducted to assess population proportions and compare average download speeds using t-tests. Furthermore, the project includes regression analysis, both simple and multiple linear regression, to model the relationship between download and upload speeds, evaluating the coefficients of determination (R-squared) and the significance of variables. The conclusion favors a multiple regression model based on a higher correlation coefficient and significant variables.

Running head: STATISTICAL ANALYSIS
Statistical Analysis
Name of the student:
Name of the University:
Author note:

To
Nicola Jayne, nicola.jayne@scu.edu.au
Subject: MAT10251 STATISTICAL ANALYSIS PROJECT
This study examines the internet speed of the National Broadband Network (NBN). In this
study the maximum download speed is 50 Mbps and the maximum upload speed is 20 Mbps. A
sample of 120 observations was selected.
A histogram was drawn for the speed test 1 download values recorded in the evening only.
The X-axis shows the class intervals and the Y-axis shows the frequency. At first glance
the histogram looks roughly symmetric, although the skewness reported for the same data
is negative. The highest frequency, 30, falls in the interval 40 to 45, and the lowest
frequency, 2, falls in the interval 30 to 35.
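As an illustration of how class frequencies of this kind are obtained, the following
minimal Python sketch groups speeds into 5 Mbps class intervals. The sample values and the
helper name class_frequencies are hypothetical and are not taken from the project data.

```python
from collections import Counter


def class_frequencies(speeds, width=5):
    """Count observations per class interval [lower, lower + width)."""
    # Map each speed to the lower bound of its class interval.
    counts = Counter(int(s // width) * width for s in speeds)
    return {(lower, lower + width): counts[lower] for lower in sorted(counts)}


# Hypothetical evening download speeds in Mbps (NOT the project data).
sample_speeds = [31.2, 36.5, 38.0, 41.0, 42.5, 43.9, 44.0, 44.8, 46.1, 47.3]
frequencies = class_frequencies(sample_speeds)

for (lower, upper), count in frequencies.items():
    print(f"{lower} to {upper}: {count}")
```

The tallest bar of the histogram corresponds to the interval with the largest count.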
There are 46 evening observations for speed test 1 download. Their mean is 41.98 Mbps,
the median (second quartile) is 44 Mbps and the mode is also 44 Mbps. Because the mean is
less than the median, the distribution is negatively skewed; the skewness is -1.41. The
standard deviation is 3.76 Mbps.
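The link between a mean below the median and negative skewness can be sketched in Python.
The data below are hypothetical, and the skewness formula is the adjusted Fisher-Pearson
version used by Excel's SKEW function; that the project used this exact formula is an
assumption.

```python
import statistics


def sample_skewness(values):
    """Adjusted Fisher-Pearson sample skewness (the formula behind Excel's SKEW)."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    return n / ((n - 1) * (n - 2)) * sum(((v - mean) / sd) ** 3 for v in values)


# Hypothetical speeds in Mbps with one slow reading (NOT the project data).
sample = [30, 38, 40, 42, 43, 44, 44, 44, 45, 46]

mean = statistics.mean(sample)
median = statistics.median(sample)
mode = statistics.mode(sample)
skew = sample_skewness(sample)

print(mean, median, mode, round(skew, 2))
```

A single low reading pulls the mean below the median and gives a negative skewness, which
is the pattern described for the evening download data.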
The box plot of the difference in download speeds suggests that the data are roughly
symmetric, which is consistent with an approximately normal distribution. The bottom
whisker, the difference between the first quartile and the minimum, is 13.33. Similarly,
the top whisker, the difference between the maximum and the third quartile, is 12.40.
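The whisker lengths as defined here can be computed with a short Python sketch. The data
and the helper name whisker_lengths are hypothetical; statistics.quantiles uses its
default quartile method, which may differ slightly from the method used in a spreadsheet.

```python
import statistics


def whisker_lengths(values):
    """Return (bottom, top) whisker lengths as defined in the text:
    first quartile minus minimum, and maximum minus third quartile."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return q1 - min(values), max(values) - q3


# Hypothetical download-speed differences in Mbps (NOT the project data).
differences = [-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5]
bottom, top = whisker_lengths(differences)

print(bottom, top)
```

Whiskers of similar length, as in the reported 13.33 and 12.40, are what a roughly
symmetric distribution would produce.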
The summary statistics for the difference in download speed are based on 120
observations. The mean difference is -2.20, the median (second quartile) is -1.10 and the
mode is -0.80. The standard deviation is 4.52 and the skewness is -0.80. The range, the
difference between the maximum and minimum values, is 29.
The scatter plot for speed test 1 shows the relationship between the two variables, with
upload speed on the X-axis and download speed on the Y-axis. There is a strong, positive
relationship between the two variables. The coefficient of determination, reported in the
regression analysis as 0.460, is moderate.
The correlation between speed test 1 upload and download is positive and strong, with a
correlation coefficient of 0.7. Correlation itself is symmetric; in the regression
analysis download speed is treated as the independent variable and upload speed as the
dependent variable.
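A minimal Python sketch of the Pearson correlation coefficient, using hypothetical upload
and download values rather than the project data, shows how the coefficient is computed
and that swapping the two variables does not change it. The helper name pearson_r is
illustrative only.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired speeds in Mbps (NOT the project data).
upload = [10, 12, 14, 16, 18]
download = [38, 42, 44, 42, 44]

r = pearson_r(upload, download)
r_squared = r ** 2  # coefficient of determination for a simple linear fit

print(round(r, 4), round(r_squared, 2))
```

In simple linear regression the coefficient of determination is the square of this
coefficient.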
There are 120 observations for speed test 1 download. Of these, 92 have a download speed
of 40 Mbps or faster. The point estimate of the population proportion with a download
speed of at least 40 Mbps is therefore 92/120 = 0.77, that is, about 77% of the
observations.
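The point estimate is simple arithmetic on the two counts given above, sketched here in
Python with illustrative variable names.

```python
# Counts reported in the text for speed test 1 download.
successes = 92          # observations at 40 Mbps or faster
n_observations = 120    # total observations

proportion = successes / n_observations

print(f"{proportion:.2f}")  # 0.77
print(f"{proportion:.0%}")  # 77%
```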
There are 46 evening observations for speed test 1 download. The proportion under the
null hypothesis is 41% and the proportion under the alternative hypothesis is 70%. The
level of significance is 5%. The count of successes, that is, the number of observations
with a download speed of 41 Mbps or greater, is 32, so the sample proportion is 69.57%.
The standard error is reported as 0.5347, and the test statistic for the evening speed
test 1 download is 3.939. The P-value of
this test is 1. Since the P-value is larger than the level of significance (alpha = 0.05),
the null hypothesis is not rejected. The conclusion drawn is that the download speed is
41 Mbps or more.
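The reported test statistic can be reproduced with a one-proportion z-test, sketched below
in Python. The sketch assumes a null proportion of 0.41, 32 successes out of 46, and a
left-tailed alternative; the left tail is an assumption chosen because it matches the
reported P-value of 1. The helper name one_proportion_z is hypothetical. Under these
assumptions the standard error is about 0.0725, which does not match the reported 0.5347,
so the report may define that quantity differently.

```python
import math
from statistics import NormalDist


def one_proportion_z(successes, n, p0):
    """Return (sample proportion, standard error under H0, z statistic)."""
    p_hat = successes / n
    se_null = math.sqrt(p0 * (1 - p0) / n)
    return p_hat, se_null, (p_hat - p0) / se_null


# Counts and null proportion taken from the text.
p_hat, se_null, z = one_proportion_z(successes=32, n=46, p0=0.41)

# ASSUMPTION: left-tailed alternative (true proportion below p0).
p_left = NormalDist().cdf(z)

print(round(p_hat, 4), round(se_null, 4), round(z, 3), round(p_left, 3))
```

With a P-value this close to 1, the null hypothesis is not rejected at the 5% level.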
To test the difference between the average download speeds of test 1 and test 2, a
two-sample t-test was applied. The null hypothesis is that there is no difference between
the average speeds of test 1 and test 2; the alternative hypothesis is that there is a
difference. The test statistic is -3.477 and the P-value is reported as 0.000, that is,
below 0.0005. At the 5% level of significance the P-value is smaller than alpha, so the
null hypothesis is rejected in favour of the alternative. Thus it may be concluded that
the average download speeds of test 1 and test 2 differ.
The least squares regression line for speed test 1 is speed test 1 upload = 0.211 *
speed test 1 download + 3.499. The correlation between speed test 1 download and upload is
strong and positive, with a value of 0.7. The gradient of the simple linear regression is
0.211 and the intercept is 3.499. The coefficient of determination (R-square) is 0.460,
or approximately 0.5. In the Excel sheet these results are shown with the help of a
scatter plot.
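A short Python sketch shows how the fitted line would be used for prediction and how the
reported R-square relates to the correlation of 0.7. It assumes download speed is the
predictor and upload speed the response, as in the multiple regression equation; the
function name predict_upload and the 40 Mbps input are illustrative only.

```python
import math

# Coefficients reported in the text for the simple linear regression.
SLOPE = 0.211
INTERCEPT = 3.499
R_SQUARED = 0.460


def predict_upload(download_mbps):
    """Predicted upload speed (Mbps) for a given download speed (Mbps)."""
    return SLOPE * download_mbps + INTERCEPT


# In simple linear regression |r| = sqrt(R-square); the slope is positive, so r > 0.
implied_r = math.sqrt(R_SQUARED)
predicted = predict_upload(40)  # hypothetical download speed of 40 Mbps

print(round(predicted, 3), round(implied_r, 3))
```

The implied correlation of about 0.68 rounds to the reported 0.7.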
The multiple regression equation is upload speed = 0.284 * evening + 0.208 * speed test 1
download + 3.498. The coefficient of determination (R-square) is 0.466. The coefficients
are 0.208 for speed test 1 download and 0.284 for evening.
The coefficient of determination is 0.466 for the multiple regression and 0.460 for the
simple linear regression. Thus the R-square of the multiple linear regression is larger
than that of the
simple linear regression. Adding a further variable therefore increases the coefficient
of determination; note that R-square never decreases when a predictor is added.
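Because R-square cannot fall when a predictor is added, a common companion measure is the
adjusted R-square, which penalises extra predictors. The report does not give adjusted
values; the Python sketch below computes them from the reported R-square figures under the
assumption that both regressions use all 120 observations.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R-square for n observations and k predictors."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)


# ASSUMPTION: both models were fitted on all 120 observations.
N_OBSERVATIONS = 120

adj_simple = adjusted_r_squared(0.460, N_OBSERVATIONS, k=1)    # download only
adj_multiple = adjusted_r_squared(0.466, N_OBSERVATIONS, k=2)  # download + evening

print(round(adj_simple, 3), round(adj_multiple, 3))
```

Under this assumption the multiple regression keeps a slightly higher value after the
adjustment, although the gap between the two models remains small.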
The P-value for speed test 1 download is 0.000 and the P-value for evening is 0.265, with
alpha taken at 5%. The P-value for speed test 1 download is smaller than alpha, while the
P-value for evening is larger than alpha. Hence it may be concluded that speed test 1
download plays a significant role in the regression model and evening does not.
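The decision rule applied to each coefficient can be written as a small Python sketch
using the P-values reported above; the variable names are illustrative.

```python
ALPHA = 0.05  # 5% level of significance

# P-values reported in the text for the multiple regression coefficients.
p_values = {"speed test 1 download": 0.000, "evening": 0.265}

# A coefficient is treated as significant when its P-value is below alpha.
significant = {name: p < ALPHA for name, p in p_values.items()}

for name, is_significant in significant.items():
    verdict = "significant" if is_significant else "not significant"
    print(f"{name}: {verdict}")
```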
Comparing the simple and multiple regression equations, it may be concluded that the
multiple regression model is the better fit, because its coefficient of determination
(0.466) is larger than that of the simple regression (0.460).