Data Analysis Report: Analysis of London Humidity Data 2019

Added on 2023/01/16

AI Summary

This report presents a comprehensive analysis of London's humidity data from December 22nd to 31st, 2019. It begins with a tabular presentation of the data and proceeds to visualize the data using both bar graphs and line charts, facilitating easy interpretation. The report then delves into various data analysis techniques, including the calculation of mean, median, mode, range, and standard deviation to provide a statistical overview of the data. Furthermore, the assignment employs a linear forecasting model to determine the values of 'm' and 'c', enabling predictions based on the historical humidity data. The report concludes with a summary of the findings and provides references to relevant literature.

DATA ANALYSIS TECHNIQUES

Table of Contents
INTRODUCTION
TASK
1. Presentation of data in table
2. Graphical presentation of data
3. Calculation of final results with help of different techniques
4. Calculating values of m, c
CONCLUSION
REFERENCES

INTRODUCTION
Data analysis can be defined as a systematic process of collecting and interpreting
financial data using a wide variety of techniques (Jobson, 2012). It draws on different kinds of
statistics and diagrams to present validated data clearly. This report is based on
humidity figures for London over ten consecutive days. Numeracy and data analysis support the
evaluation of information, which allows a company to carry out its business processes and
activities.
In this project, various data analysis techniques are used to calculate summary values for the
provided data. In addition, a linear forecasting model is used to determine the values of m
and c.
TASK
1. Presentation of data in table.
The table below shows the humidity in London from 22nd December to 31st December
2019.
Day (December 2019)    Humidity (%)
22nd                   96.00
23rd                   91.00
24th                   92.00
25th                   80.00
26th                   98.00
27th                   89.00
28th                   99.00
29th                   86.00
30th                   100.00
31st                   100.00

2. Graphical presentation of data.
Bar graph: This displays the data as bars whose heights differ according to the values they
represent. The relevant bar graph is as follows:
[Figure: bar graph of humidity (Column B) for 22nd to 31st December; vertical axis from 0 to 1.2]
Line chart: This kind of graph draws a line through the values of the data set, which
helps the user to analyse the information easily (Ropodi, Panagou and Nychas, 2016).
[Figure: line chart of humidity (Column B) for 22nd to 31st December, with data labels 96%, 91%,
92%, 80%, 98%, 89%, 99%, 86%, 100% and 100%; vertical axis from 0 to 1.2]

3. Calculation of final results with help of different techniques.
Day        Humidity (%)
22nd       96.00
23rd       91.00
24th       92.00
25th       80.00
26th       98.00
27th       89.00
28th       99.00
29th       86.00
30th       100.00
31st       100.00
∑X         931
Mean       93.1
Median     94
Mode       100
Range      20
Maximum    100
Minimum    80
Mean: This is the average of the data set, calculated by adding all observations and
dividing by the total number of observations. The calculation is as follows:
Mean = ∑X / N
N = 10
∑X = 931
Mean = 931 / 10
= 93.1
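
The mean calculation can be checked with a few lines of Python. This is only a minimal sketch: the list name humidity and the other variable names are illustrative assumptions, and the values are copied from the table in section 1.

```python
# Humidity readings (%) for 22nd-31st December 2019, copied from the table.
humidity = [96, 91, 92, 80, 98, 89, 99, 86, 100, 100]

total = sum(humidity)   # sum of all observations (∑X)
n = len(humidity)       # number of observations (N)
mean = total / n        # arithmetic mean

print(total, n, mean)   # 931 10 93.1
```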

Median: This is the middle value of a given set of data once it has been arranged in order
(DiStefano and Morgan, 2014). If the series has an even number of observations, the median is
calculated with the formula {(N/2)th item + (N/2 + 1)th item} / 2, whereas if the number of
observations is odd, the median is the ((N + 1) / 2)th item.
Values in ascending order: 80, 86, 89, 91, 92, 96, 98, 99, 100, 100
= {(10/2)th item + (10/2 + 1)th item} / 2
= (5th item + 6th item) / 2
= (92 + 96) / 2
= 94
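
The even/odd rule for the median can be sketched in Python as follows. The variable names are assumptions made for illustration; the key step is sorting the readings before picking the middle items, and the result is compared with the standard library's statistics.median as a cross-check.

```python
import statistics

# Humidity readings (%) in date order, copied from the table.
humidity = [96, 91, 92, 80, 98, 89, 99, 86, 100, 100]

ordered = sorted(humidity)   # the median is defined on ordered data
n = len(ordered)

if n % 2 == 0:
    # Even count: average the (N/2)th and (N/2 + 1)th items (1-based).
    median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2
else:
    # Odd count: take the ((N + 1) / 2)th item (1-based).
    median = ordered[n // 2]

print(ordered)   # [80, 86, 89, 91, 92, 96, 98, 99, 100, 100]
print(median)    # 94.0

# Cross-check against the standard library.
assert median == statistics.median(humidity)
```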
Mode: This technique is used to determine the value that occurs most often in a given data
series. In the humidity figures the modal value is 100, which is the only value that occurs
more than once.
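
One way to confirm the mode is to count how often each reading occurs. The sketch below uses collections.Counter from the Python standard library; the names counts, top and modes are illustrative assumptions, and the code returns every value tied for the highest count in case a data set has more than one mode.

```python
from collections import Counter

# Humidity readings (%) copied from the table.
humidity = [96, 91, 92, 80, 98, 89, 99, 86, 100, 100]

counts = Counter(humidity)    # value -> number of occurrences
top = max(counts.values())    # highest frequency found
modes = sorted(value for value, count in counts.items() if count == top)

print(modes, top)   # [100] 2
```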
Range: The range is the value obtained by subtracting the lowest value of a data series
from the highest value (Soofi and Cao, 2012). The formula to calculate the range is
Range = Maximum − Minimum
= 100 − 80
= 20
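
The range follows directly from the largest and smallest readings. A short sketch, with variable names chosen only for illustration:

```python
# Humidity readings (%) copied from the table.
humidity = [96, 91, 92, 80, 98, 89, 99, 86, 100, 100]

highest = max(humidity)
lowest = min(humidity)
value_range = highest - lowest   # range = maximum - minimum

print(highest, lowest, value_range)   # 100 80 20
```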
Standard deviation: This is a statistical measure used to evaluate the amount of
variability or spread in a data set (Cao, 2012). A lower standard deviation indicates that the
values lie close to the mean of the series, whereas a higher standard deviation indicates that
the values are spread over a broader range.
Day     Humidity x (%)    x − m     (x − m)²
22nd    96.00             2.9       8.41
23rd    91.00             -2.1      4.41
24th    92.00             -1.1      1.21
25th    80.00             -13.1     171.61
26th    98.00             4.9       24.01
27th    89.00             -4.1      16.81
28th    99.00             5.9       34.81
29th    86.00             -7.1      50.41
30th    100.00            6.9       47.61
31st    100.00            6.9       47.61
∑X      931
∑(x − m)² = 406.9
Variance = ∑(x − mean)² / N
= 406.9 / 10
= 40.69
Std. dev = √(variance)
= √40.69
= 6.38
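
The variance and standard deviation steps can be reproduced with the sketch below. It assumes the population form used in the calculation here (dividing by N rather than N − 1); the variable names are illustrative, and statistics.pstdev from the standard library is used only as a cross-check.

```python
import math
import statistics

# Humidity readings (%) copied from the table.
humidity = [96, 91, 92, 80, 98, 89, 99, 86, 100, 100]

n = len(humidity)
mean = sum(humidity) / n

# Squared deviation of each reading from the mean: (x - m)^2
squared_devs = [(x - mean) ** 2 for x in humidity]

variance = sum(squared_devs) / n   # population variance (divide by N)
std_dev = math.sqrt(variance)

print(round(sum(squared_devs), 2), round(variance, 2), round(std_dev, 2))
# 406.9 40.69 6.38

# statistics.pstdev also divides by N, so it should agree.
assert abs(std_dev - statistics.pstdev(humidity)) < 1e-9
```

If the ten days were treated as a sample rather than the whole population, the divisor would be N − 1 and the standard deviation would come out slightly larger (about 6.72, via statistics.stdev).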
4. Calculating values of m, c.
Day (X)    Humidity in % (Y)    X²         XY
1          96                   1          96
2          91                   4          182
3          92                   9          276
4          80                   16         320
5          98                   25         490
6          89                   36         534
7          99                   49         693
8          86                   64         688
9          100                  81         900
10         100                  100        1000
∑X = 55    ∑Y = 931             ∑X² = 385  ∑XY = 5179
1. Calculation to determine the value of m:
m = (N × ∑xy − ∑x × ∑y) / (N × ∑x² − (∑x)²)
= (10 × 5179 − 55 × 931) / (10 × 385 − 55²)
= (51790 − 51205) / (3850 − 3025)
= 585 / 825
= 0.71
2. Calculation to determine the value of c:
c = (∑y − m × ∑x) / N
= (931 − 0.71 × 55) / 10
= (931 − 39.05) / 10
= 891.95 / 10
≈ 89.20
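
As a cross-check of the least-squares calculation, the following minimal Python sketch rebuilds the sums from the table and applies m = (N∑xy − ∑x∑y) / (N∑x² − (∑x)²) and c = (∑y − m∑x) / N. The names days, humidity and forecast are illustrative assumptions and not part of the original calculation; forecast simply evaluates y = m × x + c for a later day number.

```python
# Day numbers 1-10 stand for 22nd-31st December; readings copied from the table.
days = list(range(1, 11))
humidity = [96, 91, 92, 80, 98, 89, 99, 86, 100, 100]

n = len(days)
sum_x = sum(days)                                     # 55
sum_y = sum(humidity)                                 # 931
sum_x2 = sum(x * x for x in days)                     # 385
sum_xy = sum(x * y for x, y in zip(days, humidity))   # 5179

# Least-squares slope and intercept of the line y = m * x + c.
m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
c = (sum_y - m * sum_x) / n


def forecast(day):
    """Hypothetical helper: predicted humidity (%) for a given day number."""
    return m * day + c


print(round(m, 2), round(c, 2))   # 0.71 89.2
print(round(forecast(11), 2))     # 97.0
```

Relative humidity is normally capped at 100%, and the fitted line exceeds that value from day 16 onward, so a straight-line forecast of this kind is only plausible for a few days beyond the observed period.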

CONCLUSION
This report concludes that numeracy means the ability to reason with and apply common
numerical principles. Data analysis allows people to understand statistics and provides data
that helps refine a company's decision-making strategies. Different data analysis techniques,
such as the mean, mode, median and standard deviation, support better decisions.
A linear equation is used to forecast the humidity of London on future days.

REFERENCES
Books and Journals:
Jobson, J. D., 2012. Applied multivariate data analysis: regression and experimental design.
Springer Science & Business Media.
Ropodi, A. I., Panagou, E. Z. and Nychas, G. J., 2016. Data mining derived from food analyses
using non-invasive/non-destructive analytical techniques; determination of food
authenticity, quality & safety in tandem with computer science disciplines. Trends in
food science & technology, 50, pp.11-25.
DiStefano, C. and Morgan, G. B., 2014. A comparison of diagonal weighted least squares robust
estimation techniques for ordinal data. Structural Equation Modeling: A
Multidisciplinary Journal, 21(3), pp.425-438.
Soofi, A. S. and Cao, L. eds., 2012. Modelling and forecasting financial data: techniques of
nonlinear dynamics (Vol. 2). Springer Science & Business Media.