This report provides an overview of data analysis techniques and focuses on the forecasting of future humidity levels. It includes data presented in table and chart formats, calculation of descriptive statistical tools, and the use of a linear forecasting model. The report concludes with predictions for the 15th and 20th day.
Data Analysis and Forecasting
Contents
INTRODUCTION
MAIN BODY
1. Data in a table format
2. Data in a Chart format
3. Calculation of descriptive statistical tools
4. Using the linear forecasting model
CONCLUSION
REFERENCES
INTRODUCTION
Data analysis is the process of inspecting and transforming information into a more understandable, summarised form. Data analysis techniques are methods that help an investigator analyse data in a way that supports decision making (Wang and Sun, 2015). The main aim of this report is to develop an understanding of forecasting as a data analysis technique so that future humidity can be predicted. In this report, the humidity level of Arad, a city in Romania, is first obtained from a reliable source, then presented using a table and charts, then summarised with descriptive statistics, and lastly used to predict the humidity level on the 15th and 20th day.
MAIN BODY
1. Data in a table format
Arranging data in a table means presenting it in rows and columns so that it is easier to understand. The humidity level for ten days, recorded at 09:00 AM daily, is presented below (Source: Humidity level in Arad, Romania, 2019).

Date         Humidity level
29-Dec-19    65%
30-Dec-19    71%
31-Dec-19    69%
01-Jan-20    72%
02-Jan-20    89%
03-Jan-20    86%
04-Jan-20    69%
05-Jan-20    86%
06-Jan-20    79%
07-Jan-20    73%

2. Data in a Chart format
Line Chart
Pie chart
3. Calculation of descriptive statistical tools
Mean
The mean is the statistical average of the values in a data set, computed by dividing the sum of the values by the total number of values. The mean of the ten days' humidity data is calculated below:
Formula: Mean = Σx / n = 759% / 10 = 75.9%, or approximately 76%
The above computation shows that the value representing the central tendency of the data set is 75.9%, or approximately 76%. This is the average around which the daily humidity readings are distributed; on its own it does not indicate how widely the readings are dispersed.
Median
The median is a statistical measure that represents the middle value of a data set. It splits the ordered data set into two equal halves (Voukantsis et al., 2011). Its position is found by adding 1 to the total number of observations and dividing by 2.
Formula: Median position = (n + 1) / 2 = (10 + 1) / 2 = 5.5th item
Median = (5th item + 6th item) / 2 = (72% + 73%) / 2 = 72.5%
To calculate the median, the data set is first arranged in ascending order, and the 5.5th item is then taken as the average of the 5th and 6th values, as shown in the working note below.
Working note:

Rank    Humidity level
1       65%
2       69%
3       69%
4       71%
5       72%
6       73%
7       79%
8       86%
9       86%
10      89%

Mode
The mode is the most frequently occurring value in the data set. In the humidity data, the values 69% and 86% each occur twice, and every other value occurs once. Hence the data set is bimodal, with modes of 69% and 86%.
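The three measures of central tendency can be cross-checked with a minimal Python sketch using the standard `statistics` module. It assumes the ten readings from the table in section 1; the variable names are illustrative and not part of the report. Note that `median` averages the two middle values when the count is even, and `multimode` returns every value tied for the highest frequency.

```python
from statistics import mean, median, multimode

# Humidity readings (%) for 29-Dec-19 to 07-Jan-20, copied from the table in section 1.
humidity = [65, 71, 69, 72, 89, 86, 69, 86, 79, 73]

avg = mean(humidity)         # sum of the values divided by their count
mid = median(humidity)       # average of the 5th and 6th values after sorting
modes = multimode(humidity)  # all values that share the highest frequency

print("Mean:", avg)
print("Median:", mid)
print("Mode(s):", modes)
```

Using `multimode` rather than `mode` is a deliberate choice here, since `mode` reports only one value when several are tied.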
Range
In numeracy and data analysis, the range is the difference between the largest and smallest values in a data set. It is often used as a simple measure of statistical dispersion.
Formula: Range = Maximum value - Minimum value = 89% - 65% = 24%
The calculation shows that the minimum value in the humidity data set is 65% and the maximum is 89%. The difference between the two, 24 percentage points, is the range of this data set.
Standard deviation
Standard deviation is a statistical measure of the variability in a data set, that is, how far the values are dispersed from the mean. The higher the dispersion of the values, the higher the standard deviation (Patterson et al., 2011).
Formula: Standard deviation = √(variance)
Variance = Σ(x - mean)² / N
= Σx² / N - (mean)²
= 58255 / 10 - (75.9)²
= 5825.5 - 5760.81
= 64.69
Standard deviation = √64.69 ≈ 8.04
The calculation shows that the humidity readings deviate from the mean of 75.9% by about 8.04 percentage points, which indicates a moderate level of dispersion across the ten days.
4. Using the linear forecasting model
(i). Calculation of m value

Date         Day (X)    Humidity data (%) (Y)    X * Y    X ^ 2
29-Dec-19    1          65                       65       1
30-Dec-19    2          71                       142      4
31-Dec-19    3          69                       207      9
01-Jan-20    4          72                       288      16
02-Jan-20    5          89                       445      25
03-Jan-20    6          86                       516      36
04-Jan-20    7          69                       483      49
05-Jan-20    8          86                       688      64
06-Jan-20    9          79                       711      81
07-Jan-20    10         73                       730      100
Total        55         759                      4275     385

Particulars    Details
m              = (NΣxy - Σx Σy) / (NΣx ^ 2 - (Σx) ^ 2)
               = ((10 * 4275) - (55 * 759)) / ((10 * 385) - (55) ^ 2)
               = (42750 - 41745) / (3850 - 3025)
               = 1005 / 825
               = 1.21 (1.218 truncated to two decimal places)

(ii). Calculation of c value

Particulars    Details
c              = (Σy - mΣx) / N
               = (759 - (1.21 * 55)) / 10
               = (759 - 66.55) / 10
               = 69.24

(iii). Forecasting humidity level for the 15th day and the 20th day
Forecast of 15th day
y = mx + c
y = 1.21 (x) + 69.24
x = 15
y = 1.21 (15) + 69.24 = 87.39
Forecast of 20th day
y = mx + c
y = 1.21 (x) + 69.24
x = 20
y = 1.21 (20) + 69.24 = 93.44
The linear equation y = mx + c is used above to predict the future humidity level of Arad, Romania. The value of "m" is the slope of the line, that is, the average change in humidity level per day, whereas the value of "c" is the intercept, the humidity level the line gives at day 0 (Montgomery, Jennings and Kulahci, 2015). These values are then used to predict y for two values of x, namely day 15 and day 20. From the above analysis, the forecast humidity level is 87.39% for the 15th day and 93.44% for the 20th day.
CONCLUSION
From the above report, it can be concluded that data analysis is a complex process that includes the sub-tasks of identifying, classifying, transforming and even predicting data.
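As a cross-check on section 4, the least-squares fit and both forecasts can be reproduced with a minimal Python sketch. It assumes day numbers 1 to 10 as X and the ten humidity readings as Y; the function names `fit_line` and `forecast` are illustrative and do not come from the report. Because the code keeps m unrounded (about 1.218), its forecasts come out slightly higher, at roughly 87.47% and 93.56%, than the hand-calculated 87.39% and 93.44%, which use m = 1.21.

```python
# Day numbers (X) and humidity readings in % (Y) from the table in section 4.
days = list(range(1, 11))
humidity = [65, 71, 69, 72, 89, 86, 69, 86, 79, 73]


def fit_line(x, y):
    """Return slope m and intercept c of the least-squares line y = m*x + c."""
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    # Same formulas as in the report: m = (NΣxy - ΣxΣy) / (NΣx² - (Σx)²)
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # c = (Σy - mΣx) / N
    c = (sum_y - m * sum_x) / n
    return m, c


def forecast(m, c, day):
    """Evaluate the fitted line at the given day number."""
    return m * day + c


m, c = fit_line(days, humidity)
day15 = forecast(m, c, 15)
day20 = forecast(m, c, 20)

print(f"m = {m:.4f}, c = {c:.4f}")
print(f"Day 15 forecast: {day15:.2f}%")
print(f"Day 20 forecast: {day20:.2f}%")
```

A straight line extrapolated this far beyond ten observations is only a rough guide, so the printed values are best read as indicative rather than precise.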
REFERENCES
Books and Journals
Wang, D. and Sun, Z., 2015. Big data analysis and parallel load forecasting of electric power user side. Proceedings of the CSEE. 35(3). pp.527-537.
Voukantsis, D. et al., 2011. Intercomparison of air quality data using principal component analysis, and forecasting of PM10 and PM2.5 concentrations using artificial neural networks, in Thessaloniki and Helsinki. Science of the Total Environment. 409(7). pp.1266-1276.
Patterson, K. et al., 2011. Multivariate singular spectrum analysis for forecasting revisions to real-time data. Journal of Applied Statistics. 38(10). pp.2183-2211.
Montgomery, D. C., Jennings, C. L. and Kulahci, M., 2015. Introduction to time series analysis and forecasting. John Wiley & Sons.
Online
Humidity level in Arad, Romania. 2019. [Online]. Available through: <https://www.worldweatheronline.com/arad-weather-history/arad/ro.aspx>