Numeracy and Data Analysis: Forecasting and Statistical Analysis

Added on 2023/01/11

Contents
INTRODUCTION
TASK
    Arranged data in table format:
    Presentation of chosen data through multiple graphs:
    Calculation and discussion on following aspects:
    Linear forecasting model which is y = mx + c in order to do below mentioned calculations:
CONCLUSION
REFERENCES

INTRODUCTION
Data analysis is a comprehensive approach that consists of collecting, interpreting and
presenting data in a systematic manner, with the aim of supporting key decisions and the choice
of the best course of action. It also involves forecasting through the application of one or more
statistical tools and techniques (Washington, Karlaftis, Mannering and Anastasopoulos, 2020).
This study focuses on key elements of data analysis and on the use of statistical methods to
evaluate data. For this purpose, data on ten bill payments were selected, and the expenses for
day 12 and day 14 were forecast from them.
TASK
Arranged data in table format:
Date     Expense bill                 Amount (in '00 pounds)
01-Apr   Taxi Expenses bill           18
03-Apr   Utility bill                 30
04-Apr   Mobile data bill             10
07-Apr   Food bill                    32
08-Apr   Building Maintenance bill    20
11-Apr   Garden Maintenance bill      18
15-Apr   Gas Bill                     10
18-Apr   Water Bill                   15
20-Apr   Transportation Bill          19
22-Apr   Food bill                    10
Presentation of chosen data through multiple graphs:
Line Chart: A line chart is a type of graph that shows data as a series of points, called
markers, connected by straight line segments. Line charts can be used to track changes over both
short and long time frames. They are preferable to bar charts when the changes are small
(Greenacre, 2019).

Column Chart: Column charts are useful for showing how data change over a period of time or for
comparing items. Categories are usually placed along the horizontal axis and values along the
vertical axis.
As displayed in the graphs above, the food bill of 07-Apr is the largest expense (32, i.e.
3,200 pounds), whereas the mobile data bill, the gas bill and the food bill of 22-Apr are the
lowest, at 10 (i.e. 1,000 pounds) each.

Calculation and discussion on following aspects:
Mean: The mean, often called the average, is the most widely used measure of the centre of a
data set. It is the sum of all values divided by the number of values. The mean of the chosen
data is calculated below:
Mean = ∑x / n = 182 / 10
= 18.2
Here, ∑x = total of all bill amounts
n = number of bills
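The mean can be checked with a few lines of Python. This is a minimal sketch, assuming the ten amounts from the data table are held in a list named amounts (a name chosen here only for illustration), in units of '00 pounds.

```python
from statistics import mean

# Bill amounts in '00 pounds, in the order listed in the data table.
# The variable name `amounts` is an illustrative assumption.
amounts = [18, 30, 10, 32, 20, 18, 10, 15, 19, 10]

total = sum(amounts)       # sum of all bill amounts
n = len(amounts)           # number of bills
average = mean(amounts)    # equivalent to total / n

print(total, n, average)
```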
Median: The median is another measure of the centre of a numerical data set. It works much like
the median of an interstate highway: it sits in the middle, with an equal number of values on
each side of it (McShane et al., 2019). The data must first be arranged in ascending order:
10, 10, 10, 15, 18, 18, 19, 20, 30, 32. The median is then found as follows:
Median = {(n + 1) ÷ 2}th value = (10 + 1) / 2 th value, i.e. the 5.5th value
So the median lies between the 5th and 6th values of the ordered data
Thus, median = (5th value + 6th value) ÷ 2
= (18 + 18) / 2 = 36 / 2 = 18
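The median depends on the order of the values, so the list has to be sorted before the middle pair is picked. The following is a minimal sketch of that step, assuming the amounts are stored in a list named amounts (an illustrative name); it compares the hand calculation with Python's statistics.median.

```python
from statistics import median

# Bill amounts in '00 pounds, in the order listed in the data table.
amounts = [18, 30, 10, 32, 20, 18, 10, 15, 19, 10]

ordered = sorted(amounts)   # the median is defined on ordered data
n = len(ordered)

# With an even number of values, average the two middle values
# (the 5th and 6th values when n = 10, i.e. indices 4 and 5).
middle_pair = (ordered[n // 2 - 1], ordered[n // 2])
median_by_hand = sum(middle_pair) / 2

print(ordered)
print(middle_pair, median_by_hand, median(amounts))
```

Taking the 5th and 6th values from the unsorted table instead would give a different and incorrect result, which is why the sort comes first.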
Mode: The mode is the most frequently occurring value in a data set. The mode of the chosen
data, the 10 recent bills, is found as follows:
The amount with the highest frequency in the selected data is 10, which occurs 3 times.
Therefore, Mode = 10
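The frequency count behind the mode can be sketched with collections.Counter. The list name amounts is an assumption made for illustration.

```python
from collections import Counter

# Bill amounts in '00 pounds, in the order listed in the data table.
amounts = [18, 30, 10, 32, 20, 18, 10, 15, 19, 10]

counts = Counter(amounts)                           # value -> number of occurrences
mode_value, mode_count = counts.most_common(1)[0]   # most frequent value and its count

print(mode_value, mode_count)
```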
Range: The range describes the spread between the extreme values of a data set. It is the
difference between the maximum and minimum values. For the chosen data:
Maximum value = 32
Minimum value = 10
So, Range = 32 - 10 = 22
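As a quick check, the range can be computed directly from the largest and smallest amounts. This is a small sketch with illustrative variable names.

```python
# Bill amounts in '00 pounds, in the order listed in the data table.
amounts = [18, 30, 10, 32, 20, 18, 10, 15, 19, 10]

largest = max(amounts)
smallest = min(amounts)
data_range = largest - smallest   # range = maximum - minimum

print(largest, smallest, data_range)
```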
Standard Deviation: Standard deviation is perhaps the most widely used measure of dispersion in
statistics. It indicates how tightly the data are clustered around the mean: the more tightly
clustered the data, the smaller the standard deviation. It can be thought of as a kind of average
distance of the values from the mean, which helps in interpreting a data set (Lee and Huh,
2019). The step-by-step calculation of the standard deviation of the chosen bill data is shown
below:
Date     Expense bill                 Amount (in '00 GBP) (x)   x - x̄    (x - x̄)^2
01-Apr   Taxi Expenses bill           18                        -0.2     0.04
03-Apr   Utility bill                 30                        11.8     139.24
04-Apr   Mobile data bill             10                        -8.2     67.24
07-Apr   Food bill                    32                        13.8     190.44
08-Apr   Building Maintenance bill    20                        1.8      3.24
11-Apr   Garden Maintenance bill      18                        -0.2     0.04
15-Apr   Gas Bill                     10                        -8.2     67.24
18-Apr   Water Bill                   15                        -3.2     10.24
20-Apr   Transportation Bill          19                        0.8      0.64
22-Apr   Food bill                    10                        -8.2     67.24
x̄ = ∑x / n = 182 / 10 = 18.2
∑(x - x̄)^2 = 545.6
Thus,
Variance = ∑(x - x̄)^2 / n = 545.6 / 10 = 54.56
Standard Deviation = √[∑(x - x̄)^2 / n] = √54.56 ≈ 7.39
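The tabular steps can be reproduced in Python as a check. This is a minimal sketch that assumes the population form of the formula used here (division by n); a sample standard deviation would divide by n - 1 instead, which corresponds to statistics.stdev rather than statistics.pstdev. Variable names are illustrative.

```python
from math import sqrt
from statistics import pstdev

# Bill amounts in '00 pounds, in the order listed in the data table.
amounts = [18, 30, 10, 32, 20, 18, 10, 15, 19, 10]

n = len(amounts)
x_bar = sum(amounts) / n                             # mean of the amounts
squared_devs = [(x - x_bar) ** 2 for x in amounts]   # (x - mean)^2 for each bill
sum_sq = sum(squared_devs)                           # sum of squared deviations
variance = sum_sq / n                                # population variance
std_dev = sqrt(variance)                             # population standard deviation

print(round(sum_sq, 2), round(variance, 2), round(std_dev, 2))
print(round(pstdev(amounts), 2))                     # library cross-check
```

Taking the square root in the last step matters: the quotient of the sum of squares and n is the variance, not the standard deviation.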
Linear forecasting model which is y = mx + c in order to do below mentioned calculations:
Calculation of value m:
Day (x)   Expense bill amount (y)   xy     x^2
1         18                        18     1
2         30                        60     4
3         10                        30     9
4         32                        128    16
5         20                        100    25
6         18                        108    36
7         10                        70     49
8         15                        120    64
9         19                        171    81
10        10                        100    100
Total 55  182                       905    385
∑x = 55, ∑y = 182, ∑xy = 905, ∑x^2 = 385
Based on the above totals, "m" is calculated as follows:
m = [n(∑xy) - (∑x)(∑y)] / [n(∑x^2) - (∑x)^2]
m = (10 * 905 - 55 * 182) / (10 * 385 - 55 * 55)
m = (9050 - 10010) / (3850 - 3025)
m = -960 / 825
m ≈ -1.16
Computation of the value of "c":
c = (∑y / n) - m(∑x / n)
c = (182 / 10) - (-1.16) * (55 / 10)
c = 18.2 + 6.38
c = 24.58
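The slope and intercept can be recomputed from the raw data with the same least-squares formulas. The sketch below assumes, as in the table, that x is simply the position of each bill (1 to 10) rather than the calendar date; variable names are illustrative.

```python
# Bill amounts (y) in '00 pounds, in the order listed in the data table.
amounts = [18, 30, 10, 32, 20, 18, 10, 15, 19, 10]
days = list(range(1, len(amounts) + 1))   # x = 1, 2, ..., 10

n = len(amounts)
sum_x = sum(days)
sum_y = sum(amounts)
sum_xy = sum(x * y for x, y in zip(days, amounts))
sum_x2 = sum(x * x for x in days)

# Least-squares slope and intercept of y = mx + c
m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
c = sum_y / n - m * (sum_x / n)

print(sum_x, sum_y, sum_xy, sum_x2)
print(round(m, 2), round(c, 2))
```

With the unrounded slope the intercept comes out at about 24.60. The hand calculation rounds m to -1.16 before computing c, which gives 24.58; the small gap is a rounding effect only.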
Applying the calculated values of m and c, the expenses for day 12 and day 14 are forecast as
follows (in '00 pounds):
Forecast for day 12:
y = mx + c
y = -1.16 * 12 + 24.58
y = 10.66
Forecast for day 14:
y = mx + c
y = -1.16 * 14 + 24.58
y = 8.34
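The forecasting step is a direct substitution into y = mx + c. The sketch below wraps it in a small helper named forecast (a hypothetical function introduced here for illustration) and uses the rounded values m = -1.16 and c = 24.58 from the calculation above as defaults.

```python
def forecast(day, m=-1.16, c=24.58):
    """Return the forecast expense (in '00 pounds) for a given day number."""
    return m * day + c


day_12 = forecast(12)
day_14 = forecast(14)

print(round(day_12, 2), round(day_14, 2))
```

Because the slope is negative, the fitted line predicts lower expenses for later days. Both forecasts lie beyond the ten observed points, so they are extrapolations and should be read with caution.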
CONCLUSION
The above study shows that data analysis and forecasting are critical for understanding the key
characteristics of a chosen data set. In practice, these techniques can be applied with the aid
of computational tools to forecast future values such as expenses.

REFERENCES
Books and Journals:
Washington, S., Karlaftis, M.G., Mannering, F. and Anastasopoulos, P., 2020. Statistical and
econometric methods for transportation data analysis. CRC Press.
Greenacre, M., 2019. Variable selection in compositional data analysis using pairwise
logratios. Mathematical Geosciences, 51(5), pp.649-682.
McShane, B.B. et al., 2019. Abandon statistical significance. The American
Statistician, 73(sup1), pp.235-245.
Lee, S. and Huh, J.H., 2019. An effective security measures for nuclear power plant using big
data analysis approach. The Journal of Supercomputing, 75(8), pp.4267-4294.