Data Analysis and Visualisation

Added on  2023/01/03

AI Summary
This report discusses data visualisation and analysis techniques, different types of data, software for statistical analysis, and problems with visualisation. It provides insights into data analysis and interpretation.

Data Analysis and
Visualisation-2
Table of Contents
INTRODUCTION
MAIN BODY
CONCLUSION
REFERENCES
INTRODUCTION
Data visualisation is the process of presenting data graphically, through charts and tables, so
that it can be analysed and interpreted. It makes the data easier to understand and helps to
identify trends and patterns before further analysis is carried out. This report describes data
on 50 Covid-19 infected patients among BAME and white people (Ahmed and Lugovic, 2019). It also
discusses the various types of data and the methods for displaying them graphically. Finally,
problems related to visualisation are evaluated.
MAIN BODY
Various types of data are available for analysis and interpretation. Different procedures can
be applied to them, such as frequency tables, descriptive statistics and the normal curve, which
allow the data to be summarised and the outcomes analysed. Different types of graph are also
available, and the type of graph chosen depends on the data available.
Software for statistical analysis and data visualisation
Various software packages are available for statistical analysis. The choice of software
depends on the researcher's preference and the type of data. The main packages are described
below:
SPSS – it is one of the most commonly used packages for statistical analysis and data
visualisation. Many statistical tests can be applied to the data in it, which makes it easier to
analyse data and write up findings. Graphs and tables are generated automatically as part of the
output.
Tableau – this is also software for data visualisation. It helps in understanding data and
obtaining relevant outcomes, and it creates a wide range of visualisations to present data
properly (Al-Saqaf, 2016).
Datawrapper – it is also software for creating graphs and charts. However, it is used for
producing charts rather than for statistical analysis. It is an online tool that generates
results from the data supplied to it.
Different types of data
Data types are an important concept in statistics. Data must be understood properly so that
assumptions are made correctly and do not have to change later. Besides, variables being
compared need to be of the same data type in order to obtain valid results. Furthermore, a good
understanding of the data enables exploratory data analysis, on the basis of which the right
visualisation is chosen. The different types of data are defined below:
Categorical data – it is a type of data that represents characteristics, such as gender, age
group or income band. Categories may be coded with numbers such as 0 and 1, but these numbers
have no mathematical meaning.
Nominal data – it represents discrete values and is used for labelling variables. It does not
contain any quantitative value, and its meaning does not change when the order of the values is
changed.
Ordinal data – it represents discrete and ordered units. Thus, it is the same as nominal data
except that the order of the values matters.
Numerical data
Discrete data – in this type of data the values are distinct and separate. The data is counted
rather than measured, and it can be categorised into classes (Bertoni et al., 2020).
Continuous data – this type of data represents measurements. It cannot be counted but it can be
measured.
Interval data – data in which the difference between adjacent values or units is the same. It
contains numeric values that are ordered, and the exact difference between values is known. For
example, temperature = -5, -10, -15, etc. The main issue with this data is that it does not have
a true zero. Due to that, ratios between values are not meaningful, although differences can be
compared.
Ratio data – in this the units also have the same difference between them. It is the same as
interval data except that it has an absolute zero, so ratios are meaningful. For example, 5, 10,
15, etc.
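The distinction between these data types can be illustrated with a minimal Python sketch. The severity labels and temperature values are assumptions chosen for illustration and are not part of the report's dataset. It shows that ordinal data needs an explicit order to sort correctly, and that a ratio taken on an interval scale (Celsius) changes when the same readings are moved to a scale with an absolute zero (Kelvin).

```python
# Illustrative values only: the labels and temperatures are assumptions,
# not data from the report.
SEVERITY_ORDER = ["mild", "moderate", "severe"]  # ordinal: the order matters


def sort_ordinal(values, order):
    """Sort labels by their position in an explicit ordering."""
    rank = {label: position for position, label in enumerate(order)}
    return sorted(values, key=rank.__getitem__)


def celsius_to_kelvin(celsius):
    """Convert an interval-scale reading to a ratio scale with a true zero."""
    return celsius + 273.15


ordered = sort_ordinal(["severe", "mild", "moderate", "mild"], SEVERITY_ORDER)

# Interval data: differences are meaningful.
interval_difference = 20.0 - 10.0

# A ratio on the Celsius scale depends on where zero was placed, so
# "20 C is twice as hot as 10 C" does not hold on the Kelvin scale.
celsius_ratio = 20.0 / 10.0
kelvin_ratio = celsius_to_kelvin(20.0) / celsius_to_kelvin(10.0)

print(ordered)
print(interval_difference, celsius_ratio, round(kelvin_ratio, 4))
```

The Celsius ratio and the Kelvin ratio disagree, which is why ratios are only interpreted for ratio data.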
Display methods for differing data type
There are many methods by which data is visualised, which make it easy to find trends and
patterns in it. However, it is essential to choose a method appropriate to the data type so that
correct results are obtained (Boddy et al., 2017). The various methods are defined as follows:
Frequency – it is the number of times a value occurs within the dataset in a particular time
period.
Proportion – it is calculated by dividing the frequency of a value by the total number of
observations.
Percentage – it is the proportion multiplied by 100, showing each data element's share of the
total frequency. A pie chart or bar chart can be used to visualise it effectively.
Graphs – data is represented in graphs and charts. They can be of different types, such as pie
charts.
Tables – data is presented in rows and columns, which makes it easier to read and understand.
Histogram – it is a graphical display of data using bars of different heights. It is similar to
a bar chart, but the values are grouped into ranges, and the height of each bar shows how many
units fall into that range (Coudriau, Lahmadi and Francois, 2016).
Boxplots – it is a plot that summarises data measured on an interval scale. It is used to show
the shape of the distribution, its central value and its variation.
Scatterplots – it is a mathematical diagram that uses Cartesian coordinates to display the
values of two variables.
Bar chart – it consists of rectangular bars whose length varies with the value they show; a
longer bar shows a bigger number. The X axis shows the category, whereas the Y axis shows the
discrete value.
Parallel coordinate plots – it enables features of various series to be compared across a set
of numeric variables. Here, each axis represents a variable and has its own scale.
Maps – data can be represented on a map on which different areas are highlighted with colours.
This usually shows locations, states, regions, etc. Colour can also be used to show the value of
a metric in each area.
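The frequency, proportion and percentage measures, and the grouping into ranges that a histogram relies on, can be sketched in Python as follows. This is a minimal illustration using the standard library only. The sample values and the function names `frequency_table` and `histogram_bins` are hypothetical and are not taken from the report's dataset or from SPSS.

```python
from collections import Counter


def frequency_table(values):
    """Return frequency, proportion and percentage for each distinct value."""
    counts = Counter(values)
    total = len(values)
    return {
        value: {
            "frequency": n,
            "proportion": n / total,
            "percentage": 100 * n / total,
        }
        for value, n in sorted(counts.items())
    }


def histogram_bins(values, start, width):
    """Count how many values fall into each range [low, low + width)."""
    counts = Counter((v - start) // width for v in values)
    return {
        (start + i * width, start + (i + 1) * width): n
        for i, n in sorted(counts.items())
    }


# Hypothetical sample, not taken from the report's dataset.
groups = ["white", "bame", "white", "white"]
table = frequency_table(groups)

readings = [20, 22, 25, 30, 33, 35, 40]
bins = histogram_bins(readings, start=20, width=10)

# A text histogram: the length of each bar is the count in that range.
for (low, high), n in bins.items():
    print(f"{low}-{high}: {'#' * n}")
```

A plotting library would normally draw the bars, but the counting step is the same whichever tool is used.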
It is important to extract data properly so that meaning can be identified from it. Arranging
the data makes it easier to analyse and to obtain results. There are certain methods and
techniques by which extraction of data is done, depending on the nature and type of the data. In
one technique the data is classified into groups and then sorted, which segregates complex data
effectively. Distortion, however, can also arise in this process, giving false outcomes and
statistics (Hiriyannaiah et al., 2018).
Identifying problems with visualisation
Several problems are faced in visualisation. Each problem needs to be identified so that it can
be solved by taking proper measures, because an unresolved issue may affect the outcomes. The
main problems are defined below:
Oversimplification of data – it is a common problem, since simplifying data is a complex task.
If the data points are not defined properly, the visualisation can lead to unfounded conclusions
(Katina, Vittert and Bowman, 2020).
Human limitation of algorithms – algorithms are made by humans, so a flawed algorithm reduces
the quality of the outcomes. Moreover, most algorithms are designed at a national scale, so they
do not fit all datasets and do not address the needs of individuals.
Overreliance on visuals – there is heavy reliance on data visuals to obtain outcomes.
Conclusions drawn from visuals alone may be false and may not apply to the underlying data, so
conclusions also need to be checked in a practical way against the data itself.
Inevitability of visualisation – many data models are available for analysing data, so a
company may develop a product before visualising the data. This can add to overreliance on
visuals and to the effect of human error in developing algorithms (Leal et al., 2016).
A hypothesis needs to be developed in order to examine the relationship between the ratio of
Covid-19 infection in white and BAME people. This enables relevant outcomes to be obtained and
the significance value (P) to be tested. The hypothesis is formed as below:
Hypothesis – is there any difference in the ratio of Covid-19 infection between white and BAME
patients in the UK?
Frequency table
Statistics
                      whitepatient   bame
N         Valid       50             50
          Missing     0              0
Mean                  36.1800        36.7400
Median                36.0000        34.5000
Mode                  20.00 (a)      50.00
Std. Deviation        9.41924        9.31711
Variance              88.722         86.809
(a) Multiple modes exist. The smallest value is shown.
Interpretation – From the above table it can be identified that the mean for white patients is
36.18 and the median is 36. The mode is 20, which is the smallest of several modes, and the SD
is 9.42. For BAME patients the mean is 36.74 and the median is 34.5. Similarly, the mode is 50
and the SD is 9.32.
Frequency Table
whitepatient
Valid     Frequency   Percent   Valid Percent   Cumulative Percent
20.00     4           8.0       8.0             8.0
22.00     2           4.0       4.0             12.0
25.00     2           4.0       4.0             16.0
26.00     1           2.0       2.0             18.0
27.00     2           4.0       4.0             22.0
28.00     1           2.0       2.0             24.0
29.00     1           2.0       2.0             26.0
30.00     2           4.0       4.0             30.0
32.00     1           2.0       2.0             32.0
33.00     4           8.0       8.0             40.0
34.00     2           4.0       4.0             44.0
35.00     3           6.0       6.0             50.0
37.00     2           4.0       4.0             54.0
38.00     2           4.0       4.0             58.0
40.00     4           8.0       8.0             66.0
42.00     1           2.0       2.0             68.0
43.00     4           8.0       8.0             76.0
44.00     3           6.0       6.0             82.0
45.00     1           2.0       2.0             84.0
48.00     2           4.0       4.0             88.0
49.00     1           2.0       2.0             90.0
50.00     4           8.0       8.0             98.0
55.00     1           2.0       2.0             100.0
Total     50          100.0     100.0
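The Percent and Cumulative Percent columns of the whitepatient table can be derived from the Frequency column alone. The sketch below is a minimal Python illustration and not SPSS's own procedure. The counts are copied from the table above, and the names `WHITE_COUNTS` and `cumulative_percent` are hypothetical.

```python
# Frequencies copied from the whitepatient table (value: count).
WHITE_COUNTS = {
    20: 4, 22: 2, 25: 2, 26: 1, 27: 2, 28: 1, 29: 1, 30: 2,
    32: 1, 33: 4, 34: 2, 35: 3, 37: 2, 38: 2, 40: 4, 42: 1,
    43: 4, 44: 3, 45: 1, 48: 2, 49: 1, 50: 4, 55: 1,
}


def cumulative_percent(counts):
    """Return rows of (value, frequency, percent, cumulative percent)."""
    total = sum(counts.values())
    running = 0
    rows = []
    for value in sorted(counts):
        running += counts[value]
        rows.append(
            (value, counts[value], 100 * counts[value] / total, 100 * running / total)
        )
    return rows


rows = cumulative_percent(WHITE_COUNTS)
for value, frequency, percent, cumulative in rows:
    print(f"{value:>5} {frequency:>3} {percent:>6.1f} {cumulative:>6.1f}")
```

Because there are no missing cases, Percent and Valid Percent are identical, so only one of them is computed here.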
bame
Valid     Frequency   Percent   Valid Percent   Cumulative Percent
20.00     2           4.0       4.0             4.0
22.00     1           2.0       2.0             6.0
23.00     1           2.0       2.0             8.0
24.00     1           2.0       2.0             10.0
26.00     1           2.0       2.0             12.0
27.00     2           4.0       4.0             16.0
28.00     2           4.0       4.0             20.0
29.00     2           4.0       4.0             24.0
30.00     6           12.0      12.0            36.0
31.00     1           2.0       2.0             38.0
32.00     3           6.0       6.0             44.0
33.00     1           2.0       2.0             46.0
34.00     2           4.0       4.0             50.0
35.00     1           2.0       2.0             52.0
39.00     1           2.0       2.0             54.0
40.00     3           6.0       6.0             60.0
41.00     1           2.0       2.0             62.0
43.00     1           2.0       2.0             64.0
44.00     4           8.0       8.0             72.0
45.00     3           6.0       6.0             78.0
46.00     1           2.0       2.0             80.0
47.00     3           6.0       6.0             86.0
50.00     7           14.0      14.0            100.0
Total     50          100.0     100.0
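The descriptive statistics reported in the Statistics table can be cross-checked against the two frequency tables. The following is a minimal sketch using Python's standard `statistics` module, not the SPSS procedure itself. The counts are copied from the tables above, and the helper names `expand` and `describe` are hypothetical.

```python
import statistics

# Frequencies copied from the two frequency tables (value: count).
WHITE = {
    20: 4, 22: 2, 25: 2, 26: 1, 27: 2, 28: 1, 29: 1, 30: 2,
    32: 1, 33: 4, 34: 2, 35: 3, 37: 2, 38: 2, 40: 4, 42: 1,
    43: 4, 44: 3, 45: 1, 48: 2, 49: 1, 50: 4, 55: 1,
}
BAME = {
    20: 2, 22: 1, 23: 1, 24: 1, 26: 1, 27: 2, 28: 2, 29: 2,
    30: 6, 31: 1, 32: 3, 33: 1, 34: 2, 35: 1, 39: 1, 40: 3,
    41: 1, 43: 1, 44: 4, 45: 3, 46: 1, 47: 3, 50: 7,
}


def expand(counts):
    """Turn a value-to-frequency mapping back into a sorted list of values."""
    return [value for value, n in sorted(counts.items()) for _ in range(n)]


def describe(counts):
    """Summary statistics using the sample (n - 1) variance."""
    data = expand(counts)
    return {
        "n": len(data),
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "modes": statistics.multimode(data),  # all modes, smallest first
        "sd": statistics.stdev(data),
        "variance": statistics.variance(data),
    }


white = describe(WHITE)
bame = describe(BAME)
for name, summary in (("whitepatient", white), ("bame", bame)):
    print(name, summary)
```

For whitepatient, `multimode` returns five tied values (20, 33, 40, 43 and 50), which matches the table's footnote that multiple modes exist and only the smallest is shown.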
Correlation
Descriptive Statistics
               Mean      Std. Deviation   N
whitepatient   36.1800   9.41924          50
bame           36.7400   9.31711          50
Correlations
                                     whitepatient   bame
whitepatient   Pearson Correlation   1              .072
               Sig. (2-tailed)                      .621
               N                     50             50
bame           Pearson Correlation   .072           1
               Sig. (2-tailed)       .621
               N                     50             50
Interpretation – The above table shows a Pearson correlation of .072 with a significance value
of P = .621, which is more than 0.05. Thus, the null hypothesis cannot be rejected. That means
the data show no statistically significant relationship between the white and BAME patient
values. A correlation measures association between the two series rather than a difference
between them; the means of the two groups (36.18 and 36.74) are also very close.
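The significance value for the correlation can be checked approximately from the reported r and N alone. The sketch below assumes the usual test for a Pearson correlation, in which t = r * sqrt((N - 2) / (1 - r^2)) follows a Student's t distribution with N - 2 degrees of freedom. It integrates the t density numerically with Simpson's rule, because the Python standard library has no t distribution. The function names are hypothetical, and because r is rounded to .072 the result only approximates the .621 in the table.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(r, n, steps=2000):
    """Approximate t statistic and two-tailed p-value for a Pearson r.

    Assumes -1 < r < 1 and an even number of Simpson steps.
    """
    df = n - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    h = t / steps
    # Simpson's rule for the area under the density between 0 and t.
    area = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area *= h / 3
    return t, 1 - 2 * area


t_stat, p_value = two_tailed_p(r=0.072, n=50)
reject_null = p_value < 0.05

print(f"t = {t_stat:.3f}, p = {p_value:.3f}, reject null: {reject_null}")
```

The p-value comes out at roughly 0.62, well above 0.05, so the decision not to reject the null hypothesis follows directly from the reported correlation.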
However, many other factors that may have affected the results would need to be included in the
dataset. This would help in finding out the relationship between the two patient groups and how
the infection ratio varies. The dataset contains no other features, because only data on
infection rate was included in it.
CONCLUSION
It has been concluded that different types of software are available for data visualisation,
such as SPSS, Tableau and Datawrapper. Besides that, there are various types of data, such as
nominal and ordinal, and various methods for displaying them, such as histograms, maps and
boxplots. The problems with visualisation include oversimplification of data and overreliance on
visuals.