
Predicting Temperature using Linear Regression Model

   

Added on  2022-12-27

Predicting Temperature using linear regression model
Research Methodology
Student Name:
Instructor Name:
Course Number:
5th May 2019

ABSTRACT
The main goal of this study was to develop a single-point temperature forecast model using
Multiple Linear Regression (MLR). Time series data spanning April 2006 to September 2016
were used, comprising 96,453 observations. The results showed that all five factors considered
had a significant effect on the earth's surface temperature. Three of the factors (humidity, wind
speed and pressure) had a negative relationship with the dependent variable, temperature. The
other two factors (wind bearing and visibility) had a positive relationship with temperature.
INTRODUCTION
Our day-to-day lives, and those of other living organisms, are greatly influenced by weather
and climate. Temperature in particular has a significant effect on our lives. Thousands of lives
are lost all over the world every year, especially during summer. For example, more than five
hundred thousand chickens died in Georgia alone within a two-day span at the peak of the
summer heat (Donald, 2011).
Timely and accurate temperature estimates are necessary for taking prudent steps
(Christoph, et al., 2009). Predicting precisely what the atmosphere will do in the coming days is
challenging because the atmospheric environment is dynamic and influences what is observed
at the earth's surface (Shengpan, et al., 2012). The goal of this investigation is to create a
single-point temperature forecast model using Multiple Linear Regression (MLR).
The academic community has recommended numerous linear and non-linear techniques for
predicting temperature. MLR has nevertheless remained a common choice because linear models
frequently produce better estimates than non-linear models, even when the given data are
non-linear (Chatfield, 2009), and because statistical methods require little computation time to
make a prediction (Dhawal & Mishra, 2016).
METHODOLOGY
Time series data were collected to answer the research question. Multiple linear regression
(MLR) was used to predict temperature from five factors: humidity, wind speed, wind bearing,
visibility and pressure.
Other statistical measures performed include the Pearson correlation test between the variables.
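As a rough illustration of the Pearson correlation mentioned above, the sketch below computes r between temperature and humidity for the first ten observations listed in Table 1. The function name `pearson_r` is an illustrative choice rather than anything from the study, and ten consecutive hourly readings are far too few to characterise the full data set of 96,453 cases.

```python
import math

# First ten hourly observations from Table 1.
temperature = [9.472222, 9.355556, 9.377778, 8.288889, 8.755556,
               9.222222, 7.733333, 8.772222, 10.82222, 13.77222]
humidity = [0.89, 0.86, 0.89, 0.83, 0.83, 0.85, 0.95, 0.89, 0.82, 0.72]


def pearson_r(x, y):
    """Pearson correlation: co-deviation divided by the product of the spreads."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


r = pearson_r(temperature, humidity)
print(f"r(temperature, humidity) = {r:.3f}")
```

On this small sample r is negative, which agrees in sign with the inverse relationship between temperature and humidity reported in the abstract.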
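The following regression model was estimated:
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \varepsilon$$
where the variables are as follows:
$y$ = temperature (°C), $x_1$ = humidity, $x_2$ = wind speed (km/h), $x_3$ = wind bearing (degrees),
$x_4$ = visibility (km), $x_5$ = pressure (millibars);
$\beta_0$ = intercept coefficient, $\beta_1$ = coefficient for humidity, $\beta_2$ = coefficient for wind speed,
$\beta_3$ = coefficient for wind bearing, $\beta_4$ = coefficient for visibility, $\beta_5$ = coefficient for pressure,
and $\varepsilon$ = error term.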
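To make the estimation step concrete, here is a minimal sketch of fitting the five-predictor model by ordinary least squares through the normal equations (X'X)β = X'y, using only the Python standard library. Everything numeric in it is an assumption for illustration: the coefficients in `true_beta` are hypothetical (only their signs follow the abstract), the predictor ranges are invented, and the data are synthetic and noise-free. The helper names `solve_linear_system` and `fit_mlr` are likewise illustrative. The study does not say which software was used, and a statistics package would normally do this work.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_mlr(rows, y):
    """Least-squares estimates from the normal equations (X'X) beta = X'y."""
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 gives the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# HYPOTHETICAL coefficients beta_0 .. beta_5, chosen only for this demo.
true_beta = [40.0, -30.0, -0.2, 0.004, 0.4, -0.01]

rng = random.Random(0)
rows = [
    (rng.uniform(0.2, 1.0),       # x1: humidity (assumed range)
     rng.uniform(0.0, 40.0),      # x2: wind speed, km/h (assumed range)
     rng.uniform(0.0, 360.0),     # x3: wind bearing, degrees
     rng.uniform(0.0, 16.0),      # x4: visibility, km (assumed range)
     rng.uniform(990.0, 1040.0))  # x5: pressure, millibars (assumed range)
    for _ in range(200)
]
# Noise-free synthetic response, so the fit should recover true_beta closely.
y = [true_beta[0] + sum(b * x for b, x in zip(true_beta[1:], row)) for row in rows]

beta_hat = fit_mlr(rows, y)
print([round(b, 4) for b in beta_hat])
```

Because the intercept column and the pressure column are nearly collinear (pressure varies little around 1,000 millibars), the normal equations are poorly conditioned; centring the predictors or using a QR-based solver would be a more robust choice on real data.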
DATA
Time series data were used to predict temperature in this study. The data span April 2006 to
September 2016, with a total of 96,453 data points. Table 1 below presents the first 10 cases of
the data.
Table 1: Data

Formatted Date                 Temperature (C)  Humidity  Wind Speed (km/h)  Wind Bearing (degrees)  Visibility (km)  Pressure (millibars)
2006-04-01 00:00:00.000 +0200  9.472222         0.89      14.1197            251                     15.8263          1015.13
2006-04-01 01:00:00.000 +0200  9.355556         0.86      14.2646            259                     15.8263          1015.63
2006-04-01 02:00:00.000 +0200  9.377778         0.89      3.9284             204                     14.9569          1015.94
2006-04-01 03:00:00.000 +0200  8.288889         0.83      14.1036            269                     15.8263          1016.41
2006-04-01 04:00:00.000 +0200  8.755556         0.83      11.0446            259                     15.8263          1016.51
2006-04-01 05:00:00.000 +0200  9.222222         0.85      13.9587            258                     14.9569          1016.66
2006-04-01 06:00:00.000 +0200  7.733333         0.95      12.3648            259                     9.982            1016.72
2006-04-01 07:00:00.000 +0200  8.772222         0.89      14.1519            260                     9.982            1016.84
2006-04-01 08:00:00.000 +0200  10.82222         0.82      11.3183            259                     9.982            1017.37
2006-04-01 09:00:00.000 +0200  13.77222         0.72      12.5258            279                     9.982            1017.22
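For readers who want to work with records like those in Table 1, the sketch below parses one observation into a timezone-aware timestamp and numeric fields. The comma-separated layout, the function name `parse_record` and the dictionary keys are assumptions made for illustration; the source does not state how the data file is stored.

```python
from datetime import datetime, timezone

# ASSUMED layout: one comma-separated record per observation, with fields in
# the column order of Table 1. The first row of Table 1 is used as the example.
line = "2006-04-01 00:00:00.000 +0200,9.472222,0.89,14.1197,251,15.8263,1015.13"


def parse_record(text):
    """Split one record into a timezone-aware timestamp and numeric fields."""
    stamp, *values = text.split(",")
    # %f reads the fractional seconds and %z the +0200 UTC offset.
    when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f %z")
    temp, humidity, wind_speed, bearing, visibility, pressure = map(float, values)
    return {
        "timestamp": when,
        "temperature_c": temp,
        "humidity": humidity,
        "wind_speed_kmh": wind_speed,
        "wind_bearing_deg": bearing,
        "visibility_km": visibility,
        "pressure_mb": pressure,
    }


record = parse_record(line)
# Converting to UTC gives a common time base for hourly observations.
print(record["timestamp"].astimezone(timezone.utc), record["temperature_c"])
```

Keeping the UTC offset attached to each timestamp matters for an hourly series, since local clock changes would otherwise produce repeated or missing hours.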
STATISTICAL DATA ANALYSIS:
Descriptive Statistics
Table 2 below presents the descriptive statistics for the six variables (including the dependent
variable, temperature). The average temperature is 11.93 °C, with a standard deviation of 9.55
and a median of 12.00. The maximum and minimum temperatures are 39.91 and -21.82
respectively. The skewness of temperature is 0.094, a value very close to zero, which suggests
that its distribution is approximately symmetric and close to normal. The average humidity was
0.735 (SD = 0.195), with a median of 0.780. The skewness of humidity was -0.716, which shows
that humidity is negatively skewed.
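The descriptive measures discussed above can be sketched as follows, using the ten temperature readings from Table 1 rather than the full data set, so the numbers will not match Table 2. Skewness is computed here as the moment coefficient m3 / m2^(3/2); this is an assumption, since the source does not say which formula its software used, and many packages report a bias-adjusted variant instead. The function name `moment_skewness` is illustrative.

```python
import statistics

# First ten hourly temperature readings (C) from Table 1.
temperature = [9.472222, 9.355556, 9.377778, 8.288889, 8.755556,
               9.222222, 7.733333, 8.772222, 10.82222, 13.77222]


def moment_skewness(values):
    """Moment coefficient of skewness: third central moment over m2 ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


mean = statistics.mean(temperature)
median = statistics.median(temperature)
sd = statistics.stdev(temperature)  # sample SD (n - 1 denominator)
skew = moment_skewness(temperature)

print(f"mean={mean:.3f} median={median:.3f} sd={sd:.3f} skewness={skew:.3f}")
```

A skewness near zero indicates a roughly symmetric distribution, a negative value indicates a longer tail toward low values, and a positive value a longer tail toward high values. The ten-reading sample here is positively skewed because of the single warm reading at 09:00.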