Descriptive Statistics and Linear Forecasting

Added on  2020/10/22

AI Summary
The provided document is an assignment that delves into descriptive statistics, presenting step-by-step calculations for various statistical measures such as mean, median, mode, range, variance, and standard deviation. Additionally, it employs a linear forecasting model to predict future data points, specifically the 12th and 15th year's train station usage. The analysis is thorough and detailed, showcasing the importance of descriptive statistics in understanding and interpreting numerical data in a business context.

DATA ANALYSIS
INTRODUCTION
Data analysis is the process of inspecting, cleaning, transforming and modelling data
with the objective of discovering useful information, informing conclusions and supporting
decision making. The present report provides a brief discussion of train station usage data for
Caergwrle over the past 10 consecutive years. It presents the data in tabular and visual form
and calculates descriptive statistics. It then applies the linear forecasting model
y = mx + c to forecast usage for the 12th and 15th years.
1. Arranging data in tabular form
Year Station usage (Caergwrle)
1 20
2 20
3 12
4 201
5 208
6 202
7 202
8 202
9 8
10 15
(Source: gov.uk, 2019)
2. Presenting data in column and line chart
Line chart
Column chart
3. Calculating and giving steps for descriptive statistics
Steps for mean
The mean is the simple arithmetic average of a set of numbers and is a measure
of central tendency. In the first step, all the numbers are added together, and after that the
observations are counted to give n. The sum of the numbers (1090) is then divided by n (10),
which gives the mean of the station usage data set as 109.
Particulars Amount
mean 109
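The steps for the mean can be sketched in a few lines of Python. This is only an illustration: the language and the variable names (`usage`, `total`, `n`) are assumptions made here, not part of the original report; the figures are copied from the table in section 1.

```python
# Station usage for Caergwrle, years 1 to 10 (from the table in section 1).
usage = [20, 20, 12, 201, 208, 202, 202, 202, 8, 15]

total = sum(usage)   # step 1: add up every observation
n = len(usage)       # step 2: count the observations
mean = total / n     # step 3: divide the sum by the count

print(total, n, mean)  # 1090 10 109.0
```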
Steps for median
The median is the middle number, found by ordering the data points and
choosing the middle one. The detailed steps for calculating the median are stated below:
Order every number in ascending order.
Find the middle value of the Caergwrle station usage data set; before doing so,
determine whether the count of data points is even or odd.
Here the count is even (10), so there are two middle values (20 and 201). These are
added together and the sum is divided by 2, which gives the final value of the
median.
Particulars Amount
Median 110.5
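The median steps above, including the even/odd check, might look as follows in Python. The names `ordered` and `mid` are illustrative choices made here, not taken from the report.

```python
# Station usage for Caergwrle, years 1 to 10 (from the table in section 1).
usage = [20, 20, 12, 201, 208, 202, 202, 202, 8, 15]

ordered = sorted(usage)   # step 1: ascending order
n = len(ordered)
mid = n // 2

if n % 2 == 0:
    # Even count: average the two middle values.
    median = (ordered[mid - 1] + ordered[mid]) / 2
else:
    # Odd count: the single middle value is the median.
    median = ordered[mid]

print(ordered)  # [8, 12, 15, 20, 20, 201, 202, 202, 202, 208]
print(median)   # 110.5
```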
Steps for mode
The mode is the value that is most likely to be sampled; it expresses, in a single number,
significant information about a random variable or a population. It is the number which
appears most frequently, and a data set may have more than one mode. Its steps are stated
below:
List all the numbers of the data set and order them in ascending order.
Count the number of times that each number appears.
Identify the value that occurs most often (202, which appears three times); this is
the mode of the data set.
Particulars Amount
Mode 202
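Because a data set can have more than one mode, a sketch of the counting steps is easiest to write so that it returns a list. The version below uses `collections.Counter` from the Python standard library; the variable names are assumptions made for illustration.

```python
from collections import Counter

# Station usage for Caergwrle, years 1 to 10 (from the table in section 1).
usage = [20, 20, 12, 201, 208, 202, 202, 202, 8, 15]

counts = Counter(usage)          # how many times each value appears
highest = max(counts.values())   # the largest frequency
# Every value that reaches the largest frequency is a mode.
modes = sorted(value for value, count in counts.items() if count == highest)

print(highest)  # 3
print(modes)    # [202]
```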
Steps for range
The range is the extent of variation between the upper and lower limits of a data set on a
specific scale. (In mathematics, the term is also used for the set of output values of a
function.) Its steps are stated below:
Arrange the numbers in ascending order to get a better sense of the data being worked
with.
Determine the smallest (8) and largest (208) numbers in the data set and subtract the
smallest from the largest; the result is the range.
Particulars Amount
Range 200
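As a quick check of the range, assuming the same list of figures as in the table in section 1 (the names `smallest`, `largest` and `data_range` are illustrative):

```python
# Station usage for Caergwrle, years 1 to 10 (from the table in section 1).
usage = [20, 20, 12, 201, 208, 202, 202, 202, 8, 15]

smallest = min(usage)
largest = max(usage)
data_range = largest - smallest   # largest value minus smallest value

print(smallest, largest, data_range)  # 8 208 200
```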
Steps for standard deviation
The standard deviation is a measure of how far the numbers are spread out from the mean
(average) or expected value. It measures the dispersion of a data set relative to its
mean. Its steps are stated below:
Work out the mean, which is the simple average of the numbers.
Subtract the mean from every value in the data set and square each difference.
Add up the squared differences and divide the total by (n-1) for sample data; this gives
the sample variance.
Take the square root of the variance to obtain the standard deviation.
For the data above, the standard deviation is 99.16. This is large relative to the mean
of 109, so the values are widely spread around the mean rather than clustered close to it.
Particulars Amount
Standard deviation 99.16
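The calculation can be followed step by step in a short Python sketch. It assumes the sample formula with (n-1) in the denominator, as stated in the steps; the variable names are illustrative.

```python
import math

# Station usage for Caergwrle, years 1 to 10 (from the table in section 1).
usage = [20, 20, 12, 201, 208, 202, 202, 202, 8, 15]

n = len(usage)
mean = sum(usage) / n                                  # 109.0
squared_diffs = [(x - mean) ** 2 for x in usage]       # squared deviation of each value
sample_variance = sum(squared_diffs) / (n - 1)         # divide by n-1 for sample data
std_dev = math.sqrt(sample_variance)                   # square root of the variance

print(sum(squared_diffs))   # 88500.0
print(round(std_dev, 2))    # 99.16
```

The sum of squared differences is 88500, so the sample variance is 88500 / 9, roughly 9833.33, and its square root is the 99.16 reported in the table.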
4. Application of linear forecasting model
Year (x) Station Caergwrle (y) XY X^2
1 20 20 1
2 20 40 4
3 12 36 9
4 201 804 16
5 208 1040 25
6 202 1212 36
7 202 1414 49
8 202 1616 64
9 8 72 81
10 15 150 100
Total 55 1090 6404 385
Steps for m value
Particulars Details
m (NΣxy – Σx Σy) / (NΣx^2 – (Σx)^2)
((10 * 6404) – (55 * 1090)) / ((10 * 385) – (55)^2)
(64040 – 59950) / (3850 – 3025)
4090 / 825
4.96
The above table gives the slope, denoted by m, which is the rate of change. In simple
terms, it tells how much the y value changes for each one-unit increase in x: on the fitted
line, station usage rises by about 4.96 per year.
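The column totals and the slope can be reproduced with the following Python sketch. It uses the least-squares slope formula from the table above; the names `years`, `sum_xy` and so on are assumptions made for illustration.

```python
# Years (x) and station usage (y) for Caergwrle, from the table above.
years = list(range(1, 11))
usage = [20, 20, 12, 201, 208, 202, 202, 202, 8, 15]

n = len(years)
sum_x = sum(years)                                   # 55
sum_y = sum(usage)                                   # 1090
sum_xy = sum(x * y for x, y in zip(years, usage))    # 6404
sum_x2 = sum(x * x for x in years)                   # 385

# Slope: (N*Sxy - Sx*Sy) / (N*Sx^2 - (Sx)^2)
m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)

print(sum_x, sum_y, sum_xy, sum_x2)  # 55 1090 6404 385
print(round(m, 2))                   # 4.96
```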
Steps for c value
Particulars Details
c (Σy – m Σx) / N
(1090 – (4.96 * 55)) / 10
81.73 (using the unrounded slope 4.9576; the rounded value 4.96 gives 81.72)
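A short Python sketch shows how the intercept depends on whether the slope is rounded first. The totals 1090, 55 and N = 10 come from the table in this section, and the slope is taken as 4090 / 825 from the slope calculation; the variable names are illustrative.

```python
# Totals from the table in this section.
n = 10
sum_x = 55
sum_y = 1090

m_exact = 4090 / 825           # unrounded slope, about 4.9576
m_rounded = round(m_exact, 2)  # 4.96, as printed in the report

# Intercept: (Sy - m*Sx) / N
c_exact = (sum_y - m_exact * sum_x) / n
c_from_rounded = (sum_y - m_rounded * sum_x) / n

print(round(c_exact, 2))         # 81.73
print(round(c_from_rounded, 2))  # 81.72
```

The reported intercept of 81.73 therefore corresponds to carrying the slope forward at full precision.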
Forecasting for year 12 and 15
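Forecast of 12th year
Y = mX + C
Y = 4.96 (X) + 81.73
X = 12
Y = 4.96 (12) + 81.73
Y = 141.22 (from the unrounded m and c; the rounded coefficients give 141.25)
Forecast of 15th year
Y = mX + C
Y = 4.96 (X) + 81.73
X = 15
Y = 4.96 (15) + 81.73
Y = 156.10 (from the unrounded m and c; the rounded coefficients give 156.13)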
Applying the linear forecasting model, train station usage is predicted to be 141.22 in
the 12th year and 156.10 in the 15th year.
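The two forecasts can be reproduced with a small helper function. This is a sketch only: the function name `forecast` is hypothetical, and the coefficients are rebuilt from the totals in the regression table (Σx = 55, Σy = 1090, Σxy = 6404, Σx^2 = 385, N = 10) at full precision rather than from the rounded 4.96 and 81.73.

```python
# Coefficients of the fitted line y = m*x + c, from the regression table totals.
m = (10 * 6404 - 55 * 1090) / (10 * 385 - 55 ** 2)   # about 4.9576
c = (1090 - m * 55) / 10                             # about 81.7333


def forecast(year: int) -> float:
    """Predicted station usage for the given year on the fitted line."""
    return m * year + c


for year in (12, 15):
    print(f"Year {year}: {forecast(year):.2f}")
# Year 12: 141.22
# Year 15: 156.10
```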
CONCLUSION
From the above report it can be concluded that statistics play a very important role in
business decision making and in analysing numbers in a systematic manner. The report has shown
the importance of descriptive statistics, with detailed steps, as they summarise raw data that
is otherwise very difficult to visualise. They allow data to be shown in a meaningful order and
make interpretation simpler. Finally, the report applied the linear forecasting model, which
forecast Caergwrle station usage for the 12th and 15th years as 141.22 and 156.10 respectively.
REFERENCES
Online
gov.uk, 2019. Train Station usage. [Online]. Available through
<https://data.london.gov.uk/dataset/train-station-usage>.