Data Analysis and Forecasting of Chicago Weather: A 10-Day Study


Added on  2025/05/03

DATA ANALYSIS AND FORECASTING
Historical weather data was collected for Chicago, Illinois, United States of America, for 10
consecutive days from 26 November 2018 to 5 December 2018. Two parameters, temperature
and humidity, were recorded and tabulated in Excel. The data was gathered from an online
source. Temperature is given in degrees Celsius (°C) and humidity as a percentage (%).
The gathered data is shown below:
Date         Temperature (°C)   Humidity (%)
26-Nov-18     8                 57
27-Nov-18    12                 65
28-Nov-18    14                 52
29-Nov-18     8                 53
30-Nov-18    10                 54
1-Dec-18      9                 55
2-Dec-18     10                 57
3-Dec-18     11                 59
4-Dec-18     15                 68
5-Dec-18      9                 51
This data can be visually expressed in the form of graphs or charts.
[Figure: Bar chart of humidity (%) and temperature (°C) for each day from 26-Nov to 5-Dec; value axis from 0 to 80.]
The bar chart above shows temperature and humidity side by side for each date, plotted on a
common scale.
[Figure: Clustered column and line chart of temperature (°C) and humidity (%) for each day from 26-Nov-18 to 5-Dec-18; value axis from 0 to 80.]
The chart above is a clustered column-line chart, a combination of a column chart and a
line chart. Temperature is drawn as a column for every date, and the humidity trend is drawn
as a line across the same dates.
The mean is the average of all the numbers in a data set. It is calculated by adding all the
numbers in the set and then dividing the sum by the count of numbers (Anna Foard, 2019).
$$\text{Mean} = \frac{\text{sum of all numbers}}{\text{count of numbers}}$$
Mean of temperature can be calculated as:
$$\text{Mean} = \frac{8 + 12 + 14 + 8 + 10 + 9 + 10 + 11 + 15 + 9}{10} = \frac{106}{10} = 10.6$$
So, the mean temperature for the ten days is 10.6°C.
Mean of humidity can be calculated as:
$$\text{Mean} = \frac{57 + 65 + 52 + 53 + 54 + 55 + 57 + 59 + 68 + 51}{10} = \frac{571}{10} = 57.1$$
So, the mean humidity for the ten days is 57.1%.
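The two means can be cross-checked with a few lines of Python. This is only a minimal sketch: the list names `temperature` and `humidity` are chosen here for illustration, and the values are typed in from the data table rather than read from the original Excel sheet.

```python
from statistics import mean

# Daily readings from the data table, 26-Nov-18 to 5-Dec-18.
temperature = [8, 12, 14, 8, 10, 9, 10, 11, 15, 9]    # °C
humidity = [57, 65, 52, 53, 54, 55, 57, 59, 68, 51]   # %

# Mean = sum of the values divided by the count of values.
mean_temperature = sum(temperature) / len(temperature)
mean_humidity = sum(humidity) / len(humidity)

print(mean_temperature)  # 10.6
print(mean_humidity)     # 57.1

# The standard library offers the same calculation as statistics.mean.
print(mean(temperature), mean(humidity))
```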
The median is the middle number of a set of numbers. To find it, the numbers are first
arranged in order of size. If the count of numbers is even, the average of the two middle
numbers is taken as the median (Anna Foard, 2019).
Median of temperature data can be calculated as:
Sequence: 8, 8, 9, 9, 10, 10, 11, 12, 14, 15
Total number of observations here is 10, so the two middle numbers are the 5th and 6th values, 10 and 10.
$$\text{Median} = \frac{10 + 10}{2} = 10$$
So, median of temperature data is 10°C.
Median of humidity data can be calculated as:
Sequence: 51, 52, 53, 54, 55, 57, 57, 59, 65, 68
Total number of observations here is 10, so the two middle numbers are the 5th and 6th values, 55 and 57.
$$\text{Median} = \frac{55 + 57}{2} = 56$$
So, median of humidity data is 56%.
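The sort-then-average procedure described above can be sketched in Python as follows. The variable names are illustrative assumptions, and the values are copied from the data table.

```python
from statistics import median

temperature = [8, 12, 14, 8, 10, 9, 10, 11, 15, 9]    # °C
humidity = [57, 65, 52, 53, 54, 55, 57, 59, 68, 51]   # %

# Step 1: arrange the values in order of size.
sorted_temperature = sorted(temperature)
n = len(sorted_temperature)

# Step 2: n is even (10), so take the 5th and 6th values and average them.
middle_pair = sorted_temperature[n // 2 - 1 : n // 2 + 1]
median_temperature = sum(middle_pair) / 2

print(sorted_temperature)   # [8, 8, 9, 9, 10, 10, 11, 12, 14, 15]
print(middle_pair)          # [10, 10]
print(median_temperature)   # 10.0

# statistics.median applies the same rule for an even count.
median_humidity = median(humidity)
print(median_humidity)      # 56.0
```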
The mode of a set of numbers is the number that occurs most often in the set (Anna Foard,
2019).
Mode of temperature data can be calculated as:
Temperature (°C) Frequency
8 2
9 2
10 2
11 1
12 1
14 1
15 1
Here 8, 9 and 10 each appear twice, which is the highest frequency, so this data set is multimodal.
The modes of the temperature data are 8, 9 and 10.
Mode of humidity data can be calculated as:
Humidity (%) Frequency
51 1
52 1
53 1
54 1
55 1
57 2
59 1
65 1
68 1
Here 57 appeared two times, which is the most.
Mode of humidity data is 57.
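The frequency tables above can be rebuilt with `collections.Counter`, which also makes the multimodal temperature case easy to see. This is a sketch under the assumption that the readings are held in plain Python lists; `statistics.multimode` requires Python 3.8 or newer.

```python
from collections import Counter
from statistics import multimode

temperature = [8, 12, 14, 8, 10, 9, 10, 11, 15, 9]    # °C
humidity = [57, 65, 52, 53, 54, 55, 57, 59, 68, 51]   # %

# Count how many times each temperature occurs (the frequency table).
frequency = Counter(temperature)
highest = max(frequency.values())

# Every value that reaches the highest frequency is a mode.
temperature_modes = sorted(v for v, count in frequency.items() if count == highest)
print(temperature_modes)    # [8, 9, 10]

# multimode returns all modes, so it also handles the single-mode case.
humidity_modes = multimode(humidity)
print(humidity_modes)       # [57]
```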
The range of a set of numbers is the difference between the highest and the lowest number in
the set (John Clark, 2018).
Range of the temperature data can be calculated as:
Highest number = 15
Lowest number = 8
Range = 15 – 8 = 7
So, range of temperature data is 7.
Range of the humidity data can be calculated as:
Highest number = 68
Lowest number = 51
Range = 68 – 51 = 17
So, range of humidity data is 17.
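As a short illustration, the same two ranges follow directly from Python's built-in `max` and `min`. The list names are assumptions made for this sketch.

```python
temperature = [8, 12, 14, 8, 10, 9, 10, 11, 15, 9]    # °C
humidity = [57, 65, 52, 53, 54, 55, 57, 59, 68, 51]   # %

# Range = highest value minus lowest value.
temperature_range = max(temperature) - min(temperature)
humidity_range = max(humidity) - min(humidity)

print(temperature_range)  # 7
print(humidity_range)     # 17
```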
The standard deviation of a set of numbers measures how much the numbers in the set differ
from the mean of the set (John Clark, 2018).
$$\text{Standard deviation} = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
where $x$ is each number in the set, $\bar{x}$ (x bar) is the mean, and $n$ is the total count of numbers.
Standard deviation of temperature data can be calculated as:
x      x - x bar   (x - x bar)^2
8      -2.6         6.76
12      1.4         1.96
14      3.4        11.56
8      -2.6         6.76
10     -0.6         0.36
9      -1.6         2.56
10     -0.6         0.36
11      0.4         0.16
15      4.4        19.36
9      -1.6         2.56
x bar = 10.6       Total = 52.4
$$\text{Standard deviation} = \sqrt{\frac{52.4}{9}} = 2.4129$$
So, standard deviation of temperature data is 2.4129.
Standard deviation of humidity data can be calculated as:
x      x - x bar   (x - x bar)^2
57     -0.1          0.01
65      7.9         62.41
52     -5.1         26.01
53     -4.1         16.81
54     -3.1          9.61
55     -2.1          4.41
57     -0.1          0.01
59      1.9          3.61
68     10.9        118.81
51     -6.1         37.21
x bar = 57.1       Total = 278.9
$$\text{Standard deviation} = \sqrt{\frac{278.9}{9}} = 5.5668$$
So, standard deviation of humidity data is 5.5668.
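The tabulated steps (deviation from the mean, squared deviation, division by n - 1, square root) can be sketched as a small Python function. The name `sample_std` is a hypothetical helper introduced here for illustration; the standard library's `statistics.stdev` uses the same n - 1 divisor and serves as a cross-check.

```python
from math import sqrt
from statistics import stdev

temperature = [8, 12, 14, 8, 10, 9, 10, 11, 15, 9]    # °C
humidity = [57, 65, 52, 53, 54, 55, 57, 59, 68, 51]   # %


def sample_std(values):
    """Hypothetical helper: standard deviation with the n - 1 divisor."""
    n = len(values)
    x_bar = sum(values) / n
    # Sum of squared deviations from the mean (last column of the tables).
    squared_deviations = sum((x - x_bar) ** 2 for x in values)
    return sqrt(squared_deviations / (n - 1))


print(round(sample_std(temperature), 4))  # 2.4129
print(round(sample_std(humidity), 4))     # 5.5668

# Cross-check against the standard library implementation.
print(round(stdev(temperature), 4), round(stdev(humidity), 4))
```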
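Now, a linear equation can be written as
$$y = mx + c$$
where the constants $m$ (slope) and $c$ (intercept) are given by
$$m = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}, \qquad c = \bar{y} - m\bar{x}$$
(Madhu Sanjeevi, 2017).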
Now let the day number be x (day 1 is 26-Nov-18) and the temperature be y.
x    y    x - x bar   y - y bar   (x - x bar)(y - y bar)   (x - x bar)^2
1    8    -4.5        -2.6         11.70                   20.25
2    12   -3.5         1.4         -4.90                   12.25
3    14   -2.5         3.4         -8.50                    6.25
4    8    -1.5        -2.6          3.90                    2.25
5    10   -0.5        -0.6          0.30                    0.25
6    9     0.5        -1.6         -0.80                    0.25
7    10    1.5        -0.6         -0.90                    2.25
8    11    2.5         0.4          1.00                    6.25
9    15    3.5         4.4         15.40                   12.25
10   9     4.5        -1.6         -7.20                   20.25
x bar = 5.5, y bar = 10.6          Total = 10.00           Total = 82.50
So,
$$m = \frac{10}{82.5} = 0.1212$$
and
$$c = 10.6 - (0.1212 \times 5.5) = 9.9333$$
Putting both constants in the linear equation,
y = 0.1212x + 9.9333
Using this equation, we can forecast the temperature on future days.
For day 15,
put x = 15 in the linear equation:
y = (0.1212 × 15) + 9.9333 = 11.75
So, the forecast temperature on day 15 is 11.75°C.
For day 23,
put x = 23 in the linear equation:
y = (0.1212 × 23) + 9.9333 = 12.72
So, the forecast temperature on day 23 is 12.72°C.
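The slope, intercept and the two temperature forecasts can be reproduced with a short Python sketch of the least-squares formulas for m and c. The function name `fit_line` is a hypothetical helper, not something from the original workbook, and the day numbers 1 to 10 follow the table above.

```python
def fit_line(xs, ys):
    """Hypothetical helper: least-squares slope m and intercept c."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator and denominator of the slope formula.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    m = s_xy / s_xx
    c = y_bar - m * x_bar
    return m, c


days = list(range(1, 11))                              # day 1 to day 10
temperature = [8, 12, 14, 8, 10, 9, 10, 11, 15, 9]     # °C

m, c = fit_line(days, temperature)
print(round(m, 4), round(c, 4))   # 0.1212 9.9333

# Forecast by substituting the day number into y = mx + c.
forecast_day_15 = m * 15 + c
forecast_day_23 = m * 23 + c
print(round(forecast_day_15, 2))  # 11.75
print(round(forecast_day_23, 2))  # 12.72
```

Because the line is fitted to only ten points, forecasts far beyond day 10 extend the short-term trend and should be read as rough estimates.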
Similarly, to forecast humidity, let the day number be x and the humidity be y.
x    y    x - x bar   y - y bar   (x - x bar)(y - y bar)   (x - x bar)^2
1    57   -4.5        -0.1          0.45                   20.25
2    65   -3.5         7.9        -27.65                   12.25
3    52   -2.5        -5.1         12.75                    6.25
4    53   -1.5        -4.1          6.15                    2.25
5    54   -0.5        -3.1          1.55                    0.25
6    55    0.5        -2.1         -1.05                    0.25
7    57    1.5        -0.1         -0.15                    2.25
8    59    2.5         1.9          4.75                    6.25
9    68    3.5        10.9         38.15                   12.25
10   51    4.5        -6.1        -27.45                   20.25
x bar = 5.5, y bar = 57.1          Total = 7.50            Total = 82.50
So,
$$m = \frac{7.5}{82.5} = 0.0909$$
and
$$c = 57.1 - (0.0909 \times 5.5) = 56.6$$
Putting both constants in the linear equation,
y = 0.0909x + 56.6
Using this equation, we can forecast the humidity on future days.
For day 15,
put x = 15 in the linear equation:
y = (0.0909 × 15) + 56.6 = 57.96
So, the forecast humidity on day 15 is 57.96%.
For day 23,
put x = 23 in the linear equation:
y = (0.0909 × 23) + 56.6 = 58.69
So, the forecast humidity on day 23 is 58.69%.
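To tie the day numbers back to the calendar, the humidity forecast can be sketched in Python together with a day-to-date conversion. The helpers `day_to_date` and `forecast_humidity` are hypothetical names introduced for this example, and the sketch assumes that day 1 corresponds to the first reading, 26-Nov-18.

```python
from datetime import date, timedelta

humidity = [57, 65, 52, 53, 54, 55, 57, 59, 68, 51]   # %, days 1 to 10
days = list(range(1, len(humidity) + 1))

# Least-squares slope and intercept for humidity against day number.
x_bar = sum(days) / len(days)
y_bar = sum(humidity) / len(humidity)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(days, humidity))
s_xx = sum((x - x_bar) ** 2 for x in days)
m = s_xy / s_xx
c = y_bar - m * x_bar

FIRST_DAY = date(2018, 11, 26)  # assumption: day 1 is the first reading


def day_to_date(day_number):
    """Hypothetical helper: calendar date for a 1-based day number."""
    return FIRST_DAY + timedelta(days=day_number - 1)


def forecast_humidity(day_number):
    """Hypothetical helper: fitted line y = mx + c at the given day."""
    return m * day_number + c


for day_number in (15, 23):
    print(day_to_date(day_number), round(forecast_humidity(day_number), 2))
# 2018-12-10 57.96
# 2018-12-18 58.69
```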
REFERENCES:
World Weather Online, 2019, Chicago Historical Weather, viewed 16th May 2019, https://www.worldweatheronline.com/chicago-weather-history/illinois/us.aspx
Medium, 2019, Chapter 1: Complete Linear Regression with Math, viewed 16th May 2019, https://medium.com/deep-math-machine-learning-ai/chapter-1-complete-linear-regression-with-math-25b2639dde23
Clark, J, 2019, Range and Standard Deviation - Magoosh Statistics Blog, viewed 16th May 2019, https://magoosh.com/statistics/range-standard-deviation
The Stats Ninja, 2019, Mean, Median, and Mode: How Visualizations Help Find What’s “Typical”, viewed 16th May 2019, https://thestatsninja.com/2019/01/05/how-to-measure-typical
Corporate Finance Institute, 2019, FORECAST.LINEAR Function - Formula, Examples, How to Use, viewed 16th May 2019, https://corporatefinanceinstitute.com/resources/excel/functions/forecast-linear-function