# Analysis of Infection Rate of COVID-19 in White and BAME Patients

This report analyzes the infection rate of COVID-19 in white and BAME patients in the UK. It includes a detailed data set and hypothesis testing using statistical tests.

ACCB4002 Data Analysis and Visualization

Contents

INTRODUCTION

MAIN BODY

Analysis

Practical application and deployment

CONCLUSION

REFERENCES

INTRODUCTION

This report analyzes how the infection rate of COVID-19 varies between white people and BAME people in the UK. The data is based on 50 different white and BAME patients. The report describes the given data set in detail in order to determine which type of patient is more affected by COVID-19. To do so, several hypotheses are formulated and tested using suitable statistical tests.

MAIN BODY

Analysis

Description of data set.

The given data set covers 50 people from different age groups, aged between 20 and 50 years. It records the infection rate of white and BAME (Black, Asian and Minority Ethnic) people. The objective of the data set is to assess the ratio of the infection rates of the two groups. A descriptive analysis of the given data follows.

Descriptive analytics is the analysis of historical data to help explain changes that have happened in an organization (Borrill, Ramirez-Gonzalez and Uauy, 2016). It uses a set of historical data to draw comparisons.

Descriptive statistics:

Statistics

| Statistic | White patient | BAME |
|---|---|---|
| N (valid) | 50 | 50 |
| N (missing) | 0 | 0 |
| Mean | 36.180 | 36.740 |
| Median | 36.000 | 34.500 |

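White patient

| Value | Frequency | Percent | Valid Percent | Cumulative Percent |
|---|---|---|---|---|
| 20.0 | 4 | 8.0 | 8.0 | 8.0 |
| 22.0 | 2 | 4.0 | 4.0 | 12.0 |
| 25.0 | 2 | 4.0 | 4.0 | 16.0 |
| 26.0 | 1 | 2.0 | 2.0 | 18.0 |
| 27.0 | 2 | 4.0 | 4.0 | 22.0 |
| 28.0 | 1 | 2.0 | 2.0 | 24.0 |
| 29.0 | 1 | 2.0 | 2.0 | 26.0 |
| 30.0 | 2 | 4.0 | 4.0 | 30.0 |
| 32.0 | 1 | 2.0 | 2.0 | 32.0 |
| 33.0 | 4 | 8.0 | 8.0 | 40.0 |
| 34.0 | 2 | 4.0 | 4.0 | 44.0 |
| 35.0 | 3 | 6.0 | 6.0 | 50.0 |
| 37.0 | 2 | 4.0 | 4.0 | 54.0 |
| 38.0 | 2 | 4.0 | 4.0 | 58.0 |
| 40.0 | 4 | 8.0 | 8.0 | 66.0 |
| 42.0 | 1 | 2.0 | 2.0 | 68.0 |
| 43.0 | 4 | 8.0 | 8.0 | 76.0 |
| 44.0 | 3 | 6.0 | 6.0 | 82.0 |
| 45.0 | 1 | 2.0 | 2.0 | 84.0 |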
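The percent and cumulative percent columns of the frequency table can be recomputed from the frequencies alone. The following is a minimal Python sketch, assuming that the listed rows are copied correctly from the table and that N = 50 as stated in the descriptive statistics; the variable names are illustrative and are not part of the original analysis. It also shows how the reported median for white patients can be checked from the listed rows.

```python
# Frequencies copied from the "White patient" table (value -> count).
# Only the rows shown in the report are included; N = 50 is taken from
# the descriptive statistics table.
frequencies = {
    20: 4, 22: 2, 25: 2, 26: 1, 27: 2, 28: 1, 29: 1, 30: 2, 32: 1, 33: 4,
    34: 2, 35: 3, 37: 2, 38: 2, 40: 4, 42: 1, 43: 4, 44: 3, 45: 1,
}
N = 50

rows = []
running = 0
for value, count in sorted(frequencies.items()):
    running += count
    percent = 100 * count / N          # share of all 50 observations
    cumulative = 100 * running / N     # running share up to this value
    rows.append((value, count, percent, cumulative))
    print(f"{value:>5.1f} {count:>3d} {percent:>5.1f} {cumulative:>6.1f}")

# The median of 50 sorted values is the mean of the 25th and 26th values,
# both of which fall inside the rows listed above.
ordered = [v for v, c in sorted(frequencies.items()) for _ in range(c)]
median = (ordered[N // 2 - 1] + ordered[N // 2]) / 2
print("median:", median)
```

Because the median depends only on the 25th and 26th ordered values, it can be checked even though the listed rows stop at a cumulative percent of 84.0.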

Development of appropriate hypothesis:

Hypothesis: A research hypothesis is a specific, direct, and testable statement about the possible outcome of a research study, based on a particular property of a population, such as assumed differences between groups on a single variable or correlations between variables (Hammersley, 2016).

Hypothesis one:

H0: There is a significant difference between the infection rates of white patients and BAME patients.

H1: There is no significant difference between the infection rates of white patients and BAME patients.

Hypothesis two:

H0: There is a relation between the infection rates of white patients and BAME patients.

H1: There is no relation between the infection rates of white patients and BAME patients.

Techniques to perform the analysis:

There is a range of methods and techniques for performing the analysis. The ANOVA test is one useful method for measuring a significant difference in a particular data set. Apart from this, correlation analysis is an essential method for assessing the relation between the given data for white patients and BAME patients.
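The report does not include an ANOVA output table, so the following is only an illustrative sketch of how a one-way ANOVA F statistic could be formed for two groups from summary statistics. It assumes the group sizes, means, and standard deviations reported in the report's statistics tables, and the variable names are hypothetical.

```python
# Summary statistics as reported in the report's tables.
groups = {
    "White patient": {"n": 50, "mean": 36.180, "sd": 9.4192},
    "BAME": {"n": 50, "mean": 36.740, "sd": 9.3171},
}

total_n = sum(g["n"] for g in groups.values())
grand_mean = sum(g["n"] * g["mean"] for g in groups.values()) / total_n

# Between-group and within-group sums of squares.
ss_between = sum(g["n"] * (g["mean"] - grand_mean) ** 2 for g in groups.values())
ss_within = sum((g["n"] - 1) * g["sd"] ** 2 for g in groups.values())

df_between = len(groups) - 1
df_within = total_n - len(groups)

# F is the ratio of the between-group to the within-group mean square.
f_statistic = (ss_between / df_between) / (ss_within / df_within)
print(f"F({df_between}, {df_within}) = {f_statistic:.4f}")
```

In general, an F statistic well below 1 indicates that the variation between the group means is small relative to the variation within the groups.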

One-sample t-test: A one-sample t-test is used to check whether the mean of the data obtained is considerably different from a hypothesized value. Each hypothesis makes a statement about how the population mean μ is related to a hypothesized value M. A t-test is a type of inferential statistic that is used to assess whether there is a substantial difference between the means of groups that may be related in certain characteristics (McCormick and Salcedo, 2017). It is typically used when the data are assumed to be approximately normally distributed with unknown variance, for example a sample recorded from 100 rolls of the dice. A t-test is used as a hypothesis testing tool, which allows an assumption about a population to be tested.
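For reference, the one-sample t statistic described above is usually written as follows. The symbols for the sample mean, sample standard deviation, and sample size (x̄, s, and n) are standard notation assumed here; only μ and M appear in the text.

```latex
t = \frac{\bar{x} - M}{s / \sqrt{n}}, \qquad \text{df} = n - 1
```

The denominator is the standard error of the mean, so t measures how many standard errors the sample mean lies from the test value M.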

Correlation: In statistics, correlation or dependence is any statistical association, whether causal or not, between two or more variables or multivariate data. In the broadest sense, correlation is any statistical relationship, although it generally refers to the extent to which a set of variables is linearly related (Zuo, Carranza and Wang, 2016). Correlations are useful because they can indicate a predictive relationship that can be exploited in practice. For instance, given the correlation between electricity demand and weather, an electricity utility could generate less power on a mild day. There is a causal relationship in this case, since severe weather induces individuals to use more electricity.

At the level of 0.01:

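One-Sample Statistics

| Group | Measure | Statistic | Bootstrap (a): Bias | Bootstrap (a): Std. Error | Bootstrap (a): 90% CI Lower | Bootstrap (a): 90% CI Upper |
|---|---|---|---|---|---|---|
| White patient | N | 50 | | | | |
| | Mean | 36.180 | .029 | 1.322 | 34.041 | 38.499 |
| | Std. Deviation | 9.4192 | -.1412 | .7189 | 8.1234 | 10.4846 |
| | Std. Error Mean | 1.3321 | | | | |
| BAME | N | 50 | | | | |
| | Mean | 36.740 | -.016 | 1.314 | 34.561 | 38.820 |
| | Std. Deviation | 9.3171 | -.1231 | .5529 | 8.2384 | 10.0995 |
| | Std. Error Mean | 1.3176 | | | | |

a. Unless otherwise noted, bootstrap results are based on 1000 bootstrap samples.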

One-Sample Test (Test Value = 0)

| Group | t | df | Sig. (2-tailed) | Mean Difference | 99.9% CI of the Difference: Lower | 99.9% CI of the Difference: Upper |
|---|---|---|---|---|---|---|
| White patient | 27.161 | 49 | .000 | 36.1800 | 31.517 | 40.843 |
| BAME | 27.883 | 49 | .000 | 36.7400 | 32.128 | 41.352 |
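The t values in the One-Sample Test output can be reproduced by hand from the reported means and standard deviations. The sketch below assumes the summary values are copied correctly from the One-Sample Statistics output and uses the test value of 0 shown in the test output; the dictionary layout and names are illustrative only.

```python
import math

# Values copied from the One-Sample Statistics output; the test value of 0
# is the one shown in the One-Sample Test output.
samples = {
    "White patient": {"n": 50, "mean": 36.180, "sd": 9.4192},
    "BAME": {"n": 50, "mean": 36.740, "sd": 9.3171},
}
TEST_VALUE = 0.0

results = {}
for name, s in samples.items():
    std_error = s["sd"] / math.sqrt(s["n"])        # standard error of the mean
    t_stat = (s["mean"] - TEST_VALUE) / std_error  # one-sample t statistic
    results[name] = (std_error, t_stat, s["n"] - 1)
    print(f"{name}: SE = {std_error:.4f}, t = {t_stat:.3f}, df = {s['n'] - 1}")
```

Because the test value is 0, each t statistic here measures how far a group's own mean lies from zero in units of its standard error.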

Testing of hypothesis two:

Correlation analysis-

Correlations

| | | White patient | BAME |
|---|---|---|---|
| White patient | Pearson Correlation | 1 | .072 |
| | Sig. (2-tailed) | | .621 |
| | N | 50 | 50 |
| BAME | Pearson Correlation | .072 | 1 |
| | Sig. (2-tailed) | .621 | |
| | N | 50 | 50 |
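The significance value of a Pearson correlation follows from r and N alone, through a t statistic with N - 2 degrees of freedom. The following is a minimal sketch using only the Python standard library, assuming r = .072 and N = 50 as reported; the helper functions `t_pdf` and `two_tailed_p` are hypothetical and integrate the t density numerically rather than calling a statistics package.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t_stat, df, steps=2000):
    """Two-tailed p-value, integrating the density with Simpson's rule."""
    upper = abs(t_stat)
    h = upper / steps                  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3       # P(0 <= T <= |t|)
    return 1 - 2 * central_area

r, n = 0.072, 50                       # values from the Correlations output
df = n - 2
t_stat = r * math.sqrt(df) / math.sqrt(1 - r * r)
p_value = two_tailed_p(t_stat, df)
print(f"t = {t_stat:.3f}, df = {df}, p = {p_value:.3f}")
```

Since r is reported to only three decimal places, the recomputed p-value may differ slightly from the reported .621.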

Interpretation of results.

Descriptive analysis: From the descriptive analysis of the data set above, it can be inferred that the mean for white patients is 36.18 and the standard deviation is 9.42. Thus, it can be stated that there is no relation between the infection rates of white patients and BAME patients. This is because, if the mean and the standard deviation differ from each other, it can be inferred that the data sets are not close to each other.

Hypothesis one:

At the level of 0.10: On the basis of the one-sample t-test performed above, the significance value (p) is reported as .000, which is lower than 0.05 and therefore also lower than 0.10. Thus, it can be interpreted that the null hypothesis is true and that there is a significant difference between the infection rates of white patients and BAME patients at the level of 0.10.

At the level of 0.01: Similarly, at the level of 0.01 the value of p is the same, below 0.001, which is less than 0.05 and also less than 0.01. Hence, it can be assessed that the null hypothesis is correct and

REFERENCES

Borrill, P., Ramirez-Gonzalez, R. and Uauy, C., 2016. expVIP: a customizable RNA-seq data analysis and visualization platform. Plant Physiology, 170(4), pp.2172-2186.

Hammersley, A.P., 2016. FIT2D: a multi-purpose data reduction, analysis and visualization program. Journal of Applied Crystallography, 49(2), pp.646-652.

McCormick, K. and Salcedo, J., 2017. SPSS statistics for data analysis and visualization. John Wiley & Sons.

Zuo, R., Carranza, E.J.M. and Wang, J., 2016. Spatial analysis and visualization of exploration geochemical data. Earth-Science Reviews, 158, pp.9-18.

Nelson, J.W., Sklenar, J., Barnes, A.P. and Minnier, J., 2017. The START App: a web-based RNAseq analysis and visualization resource. Bioinformatics, 33(3), pp.447-449.

