Application of Statistical Tools and Techniques
Added on 2020/11/12
AI Summary
The assignment applies statistical tools and techniques to analyse Crewe Station data on the total number of passengers entering the station over 10 years. The analysis calculates the mean, mode, median, range, and standard deviation of the data. In addition, forecasts are made for years 12 and 15 using a linear regression model. The report aims to show that applying statistical tools and techniques gives more accurate and reliable outcomes.
Numeracy and Data Analysis
Table of Contents
INTRODUCTION
MAIN BODY
1. Arrangement of data in table format
2. Presenting the data in the form of bar and scatter plot charts
3. Calculating and analysing collected data by the medium of descriptive statistics
4. Linear forecasting model for collected data
CONCLUSION
REFERENCES
INTRODUCTION
Collecting data in quantitative terms and analysing it with statistical tools is an effective
method of obtaining reliable and correct results. Data analysis is a systematic process of
inspecting, filtering, transforming and modelling data in order to extract meaningful
information so that better conclusions can be drawn (Wang et al., 2016). The overall effect of
such analysis is to assist decision making. In the present project report, data for Crewe Station
on the total number of passenger entries at the station over 10 years has been collected. The
collected data will then be analysed by calculating the mean, mode, median, range and standard
deviation. Further, a linear forecasting model will be applied to the 10-year data of the station.
MAIN BODY
1. Arrangement of data in table format
The data collected relates to the total number of entries, that is, the total number of
passengers who entered the station in each of the last 10 years. From the table, changes can be
noted: the number of passengers entering the station fell significantly after 2014-2015, and in
2017-2018 the numbers increased again.
Years        Total number of entries
2008-2009    977239
2009-2010    1017273
2010-2011    1123237
2011-2012    1175738
2012-2013    1221857
2013-2014    1255979
2014-2015    1325267
2015-2016    50897
2016-2017    115680
2017-2018    313622
2. Presenting the data in the form of bar and scatter plot charts
Column chart : [Figure: total number of entries (vertical axis, 0 to 1400000) for each year from 2008-2009 to 2017-2018]
Bar chart : [Figure: total number of entries (horizontal axis, 0 to 1400000) for each year from 2008-2009 to 2017-2018]
Interpretation : From the above charts, it can be observed that Crewe Station recorded its
highest number of passenger entries in 2014-2015, when 1,325,267 passengers entered the
station. From 2008-2009 onwards the figures showed an increasing trend, rising year after year.
However, in 2015-2016 the number of passengers fell drastically, as only 50,897 passengers
entered the station. After that, the numbers started increasing again.
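The trend described in the interpretation can be checked with a short calculation of the year-on-year percentage change. The sketch below is illustrative only: the variable names are assumptions, and the figures are the ones listed in the data table.

```python
# Year-on-year percentage change in total entries at Crewe Station.
# Figures are copied from the data table; names are illustrative.
years = [
    "2008-2009", "2009-2010", "2010-2011", "2011-2012", "2012-2013",
    "2013-2014", "2014-2015", "2015-2016", "2016-2017", "2017-2018",
]
entries = [
    977239, 1017273, 1123237, 1175738, 1221857,
    1255979, 1325267, 50897, 115680, 313622,
]

# Percentage change from each year to the next.
changes = [
    (current - previous) / previous * 100
    for previous, current in zip(entries, entries[1:])
]

for label, pct in zip(years[1:], changes):
    print(f"{label}: {pct:+.1f}%")

# Year with the highest entries and year with the steepest fall.
peak_year = years[entries.index(max(entries))]
largest_drop_year = years[1:][changes.index(min(changes))]
print("Peak year:", peak_year)
print("Largest fall:", largest_drop_year)
```

On these figures the peak falls in 2014-2015 and the steepest fall, roughly 96%, is in 2015-2016.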
3. Calculating and analysing collected data by the medium of descriptive statistics
Descriptive statistics : This is a method which summarises data into meaningful
information. The summarised data represents either the whole population (data set) or a sample
drawn from it. It consists of measures of central tendency and measures of dispersion or
variability (Wei et al., 2019). In descriptive analysis, the mean, mode, median, range, variance
and standard deviation are calculated to draw useful information from the collected data. In
the present study, the data of Crewe Station will be analysed using the descriptive analysis method.
Total number of entries
Mean                  857678.9
Median                1070255
Mode                  #N/A
Standard Deviation    496522.595601192
Sample Variance       246534687942.544
Range                 1274370
Minimum               50897
Maximum               1325267
Mean : This is the simple arithmetic average of the values of a data set. It is one of the
measures of central tendency and gives the average value of the whole data set. It is calculated
by summing all the values of a data series and dividing the sum by the total number of
observations. From the descriptive analysis, the mean number of passengers entering Crewe
Station was 857678.9 per year over the 10 years.
Median : The median is the value in a data series which divides the data sample into an
upper and a lower half. It is the middle value of the sorted data (Desagulier, 2017). With 10
observations, it is the average of the fifth and sixth sorted values. From the calculations, the
median number of entries at Crewe Station was 1070255.
Mode : This is a measure of central tendency which represents the value that occurs
most often in the data series. In the present data on the total number of entries at Crewe
Station, there is no mode (the spreadsheet returns #N/A), because no value is repeated in the
sample data.
Range : The range is a measure of variability defined as the difference between the
highest and lowest values in the given data set (Miller, 2018). From the calculation, the range
of the data was 1274370, the difference between the maximum of 1325267 and the minimum
of 50897.
Standard deviation : This is a measure of dispersion which shows how widely the values
of a variable are scattered around the mean. The sample standard deviation of the total number
of passengers entering the station was 496522.59.
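The measures defined above can be reproduced with Python's standard `statistics` module. This is a minimal sketch, not the method used in the report: the variable names are assumptions, and the sample (n - 1) forms of variance and standard deviation are used because they match the reported "Sample Variance".

```python
import statistics

# Total number of entries per year, 2008-2009 to 2017-2018 (from the data table).
entries = [
    977239, 1017273, 1123237, 1175738, 1221857,
    1255979, 1325267, 50897, 115680, 313622,
]

mean = statistics.mean(entries)
median = statistics.median(entries)

# A mode only exists if at least one value repeats; otherwise report None,
# which corresponds to the #N/A shown in the table.
has_repeats = len(set(entries)) < len(entries)
mode = statistics.mode(entries) if has_repeats else None

value_range = max(entries) - min(entries)

# Sample variance and sample standard deviation (divide by n - 1).
sample_variance = statistics.variance(entries)
sample_sd = statistics.stdev(entries)

print("Mean:", mean)
print("Median:", median)
print("Mode:", mode)
print("Range:", value_range)
print("Sample variance:", round(sample_variance, 3))
print("Standard deviation:", round(sample_sd, 2))
```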
4. Linear forecasting model for collected data
Linear forecasting is a forecasting technique concerned with the prediction of future values
with the help of a linear regression equation. The equation is y = mx + c, where x is the year
number and y is the total number of entries.
Calculation of m : In linear regression, m represents the slope of the line, which
can be calculated in the following way, where N is the number of observations.
m = (NΣxy - ΣxΣy) / (NΣx^2 - (Σx)^2)
Calculation of c :
In the linear regression equation, c represents the intercept of the line on the y axis.
c = (Σy - mΣx) / N
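The two formulas can be written as a small function. The sketch below is illustrative only: `fit_line` is a hypothetical helper name, not part of the report, and the four-point example is made-up data lying exactly on y = 2x + 3, so the expected slope and intercept are known in advance.

```python
def fit_line(xs, ys):
    """Return (m, c) for the least-squares line y = m*x + c."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("xs and ys must have the same length, at least 2")

    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    # Note the parentheses: the whole numerator is divided by the whole denominator.
    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        raise ValueError("all x values are identical; slope is undefined")

    m = (n * sum_xy - sum_x * sum_y) / denominator
    c = (sum_y - m * sum_x) / n
    return m, c


# Points on the line y = 2x + 3, so the fit should recover m = 2 and c = 3.
m, c = fit_line([1, 2, 3, 4], [5, 7, 9, 11])
print(m, c)
```

Writing the numerator and denominator as separate bracketed terms avoids the operator-precedence mistakes that are easy to make when the formula is typed on one line.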
Forecasting for 12 and 15 years
Years (x)    Total number of entries (y)    xy          x^2
1            977239                         977239      1
2            1017273                        2034546     4
3            1123237                        3369711     9
4            1175738                        4702952     16
5            1221857                        6109285     25
6            1255979                        7535874     36
7            1325267                        9276869     49
8            50897                          407176      64
9            115680                         1041120     81
10           313622                         3136220     100
Total: 55    8576789                        38590992    385
Substituting the totals from the table:
N = 10
Σx = 55
Σy = 8576789
Σxy = 38590992
Σx^2 = 385
m = (NΣxy - ΣxΣy) / (NΣx^2 - (Σx)^2)
m = (10*38590992 - 55*8576789) / (10*385 - 3025)
m = (385909920 - 471723395) / 825
m ≈ -104016.33
c = (Σy - mΣx) / N
c = (8576789 - (-104016.33)*55) / 10
c ≈ 1429768.73
Forecasting the station usage for year 12
Y = mX + c
X = 12
Y = -104016.33*12 + 1429768.73
Y ≈ 181573
Forecasting the station usage for year 15
Y = mX + c
X = 15
Y = -104016.33*15 + 1429768.73
Y ≈ -130476
A negative number of passengers is not possible, so the linear model cannot be extrapolated
reliably as far as year 15. The steep downward slope is driven by the sharp fall in entries from
2015-2016 onwards.
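The forecast can be recomputed directly from the unscaled table totals (Σy = 8576789, Σxy = 38590992). The following is a sketch under the assumption that x = 1 corresponds to 2008-2009 and x = 10 to 2017-2018, as in the table; the variable and function names are illustrative.

```python
# Total number of entries for x = 1 (2008-2009) to x = 10 (2017-2018).
entries = [
    977239, 1017273, 1123237, 1175738, 1221857,
    1255979, 1325267, 50897, 115680, 313622,
]
xs = list(range(1, len(entries) + 1))

# Column totals, matching the last row of the table.
n = len(entries)
sum_x = sum(xs)
sum_y = sum(entries)
sum_xy = sum(x * y for x, y in zip(xs, entries))
sum_x2 = sum(x * x for x in xs)

# Slope and intercept of the least-squares line y = m*x + c.
m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
c = (sum_y - m * sum_x) / n


def forecast(x):
    """Predicted total entries for year number x under the fitted line."""
    return m * x + c


print("m =", round(m, 2))
print("c =", round(c, 2))
for year_number in (12, 15):
    print("Forecast for year", year_number, "=", round(forecast(year_number), 2))
```

Because the fitted slope is strongly negative, the prediction for year 15 falls below zero, which is a reminder that a straight line fitted to data with an abrupt break should be extrapolated with caution.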
CONCLUSION
From the project report, it can be summarised that applying statistical tools and
techniques to analyse data improves the accuracy and reliability of the outcomes. In the
report, the mean, mode, median, range and standard deviation were calculated for 10 years of
Crewe Station data on the total number of passengers entering the station every year.
Further, the data was forecast for years 12 and 15 by applying a linear regression model.
REFERENCES
Books and Journals
Wang, H. et al., 2016. Towards felicitous decision making: An overview on challenges and trends of Big Data. Information Sciences. 367. pp. 747-765.
Wei, C. et al., 2019. Descriptive Statistics of Questionnaire Data. In Household Energy Consumption in China: 2016 Report (pp. 39-93). Springer, Singapore.
Desagulier, G., 2017. Descriptive Statistics. In Corpus Linguistics and Statistics with R (pp. 139-149). Springer, Cham.
Miller, C., 2018. Overview and Descriptive Statistics.