Statistics - Study on Statistical Classifications and Analysis


Added on 2023/04/25

AI Summary

This study develops statistical classifications for a given data set. It covers categorical and numerical data, descriptive statistics, a probability tree, misconceptions in the given figures, and bias in data collection. The study is relevant for students taking statistics courses at college or university.


Running head: Statistics
Statistics
Name of the course
Name of Student
Course ID:

Table of Contents
Introduction
1a) Variables
2) Probability tree
3) Misconceptions of the figures given
4) Biasedness of the data

Introduction
This study develops statistical classifications for the given data. It separates the variables
into categorical and numerical data and uses statistical analysis to classify them. Examining
how the categorical and numerical data are distributed gives a better understanding of the
factors associated with the incidence of mercury in the hair of fishermen living in Kuwait.
The descriptive statistics, including the mean, median and mode, summarise the data and
support the statistical modelling. Classifying the variables correctly also helps to give
appropriate weight to each variable that has been taken into consideration.
1a) Variables
Category of the variables of the model
A continuous numerical variable can take any value within a range, so the number of possible
values is infinite. Such variables are important in modelling because they can be used
directly in a regression. Identifying the category of each variable is therefore the first
step in developing a model. In the given data, fisherman.xlsx, age, height and weight are
numerical variables because they are measured on a numeric scale. They are continuous in the
sense that their values can contain decimal places (Bost et al. 2015). On the other hand,
residence time in full years and number of fish meals per week are numerical discrete
variables, because their values are whole numbers that can be counted. The number of years
spent in residence will always be a finite whole number. A categorical variable records which
group a response belongs to; a binary categorical variable has two possible responses,
expressed as either yes or no, or as 0 or 1 (Breiman, 2017). An ordinal categorical variable
has categories with a natural order, whereas a variable coded only as 0 or 1 is usually
treated as nominal.
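The classification described above can be sketched as a rough rule of thumb in Python. This is only an illustration: the function name `classify`, the column name `fisherman` and all sample values are hypothetical and are not taken from fisherman.xlsx.

```python
def classify(values):
    """Guess a variable type from its values (a heuristic, not a definition).

    Knowing how a variable was measured always takes priority over this guess;
    for example, weights that happen to be recorded as whole numbers would be
    reported here as discrete.
    """
    if set(values) <= {0, 1}:
        return "categorical (binary)"
    if all(float(v).is_integer() for v in values):
        return "numerical discrete"
    return "numerical continuous"


# Hypothetical sample values, for illustration only.
columns = {
    "weight": [73.2, 70.0, 68.5],
    "residence_years": [3, 12, 7],
    "fish_meals_per_week": [7, 14, 2],
    "fisherman": [0, 1, 1],
}

for name, values in columns.items():
    print(name, "->", classify(values))
```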
b) Descriptive statistics
Statistic                   weight
Mean                        73.2
Standard Error              0.57
Median                      73
Mode                        70
Standard Deviation          6.67
Sample Variance             44.54
Kurtosis                    -0.17
Skewness                    0.35
Range                       33
Minimum                     59
Maximum                     92
Sum                         9876
Count                       135
Confidence Level (95.0%)    1.14
Table 1: Descriptive statistics of the variable weight
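Several entries in Table 1 are related by simple formulas, which gives a quick consistency check. The sketch below uses only the values printed in the table; the variable names are illustrative.

```python
import math

# Values copied from Table 1 (weight).
total, count = 9876, 135
std_dev = 6.67
minimum, maximum = 59, 92

mean = total / count                    # mean = sum / count
std_error = std_dev / math.sqrt(count)  # standard error = s / sqrt(n)
value_range = maximum - minimum         # range = maximum - minimum
variance = std_dev ** 2                 # sample variance = s squared

print(round(mean, 1), round(std_error, 2), value_range)  # prints 73.2 0.57 33
# The recomputed variance is close to the tabulated 44.54; the small gap
# comes from squaring a standard deviation already rounded to two decimals.
print(round(variance, 2))
```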
Table 1 gives the descriptive statistics of the variable weight. The mean weight is 73.2
and the median is 73. The standard deviation is about 6.67. The lower quartile, upper
quartile and IQR for the variable weight are given in the following table.
Q1 68
Q2 73
Q3 77
IQR 9
Table 2: Upper, lower and IQR for weight
In the above table, the upper quartile Q3 is 77. The second quartile Q2 is the median, which
separates the data set into two halves. Q3 separates the data in a 3:1 ratio, so about 75% of
the fishermen have a weight of 77 or less. Q1 separates the data in a 1:3 ratio, so it marks
the lowest quarter of the data set. The IQR, or inter-quartile range, is the difference
between the upper and lower quartiles, here 77 - 68 = 9.
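The quartile arithmetic can be sketched with Python's standard `statistics` module. The values of Q1, Q2 and Q3 come from Table 2; the nine-value sample is hypothetical and was chosen only so that its quartiles match the table, since the raw data is not reproduced here.

```python
import statistics

# Quartiles reported in Table 2.
q1, q2, q3 = 68, 73, 77
iqr = q3 - q1  # inter-quartile range = Q3 - Q1

# Hypothetical sample, NOT the fisherman data.
sample = [60, 64, 68, 70, 73, 75, 77, 80, 92]
s_q1, s_q2, s_q3 = statistics.quantiles(sample, n=4, method="inclusive")

# Roughly three quarters of the observations lie at or below Q3.
share_at_or_below_q3 = sum(x <= s_q3 for x in sample) / len(sample)

print(iqr, (s_q1, s_q2, s_q3))
```

Different quantile conventions (for example `method="exclusive"`) can give slightly different cut points on small samples, so a spreadsheet result may not match exactly.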
c)
In the above data set, the 80th percentile is 78. This means that about 80% of the
observed values lie at or below 78 and about 20% lie above it. Calculating a percentile
therefore shows what share of the data lies above a given value.
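A percentile can be read off in the same way as a quartile. The sketch below is an illustration under stated assumptions: the eleven-value sample is hypothetical, chosen so that its 80th percentile is 78, and it uses the "inclusive" convention of `statistics.quantiles`.

```python
import statistics

# Hypothetical sample, NOT the fisherman data.
sample = [59, 63, 66, 68, 70, 73, 75, 77, 78, 85, 92]

# Splitting into 5 equal parts gives the 20th, 40th, 60th and 80th percentiles.
cuts = statistics.quantiles(sample, n=5, method="inclusive")
p80 = cuts[3]

# Share of observations strictly above the 80th percentile (about 20%).
share_above = sum(x > p80 for x in sample) / len(sample)

print(p80, round(share_above, 2))
```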

2) Probability tree
a)
A 3444/5000
B 663/5000
C 300/503
D 512/663
E 24/63
F 160/327
Table 3: Probability tree
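In a probability tree, the probability of reaching the end of a branch is the product of the probabilities along it. The sketch below assumes that B is the first-level branch for Maori ethnicity and that D is the second-level branch for one language spoken given Maori; this pairing is inferred from the fractions themselves and is not stated explicitly in the source.

```python
from fractions import Fraction

# Assumed reading of the tree: B = P(Maori), D = P(one language | Maori).
p_b = Fraction(663, 5000)
p_d_given_b = Fraction(512, 663)

# Multiplication rule: P(Maori and one language) = P(Maori) * P(one | Maori).
p_joint = p_b * p_d_given_b

# The two branches leaving B are mutually exclusive and sum to 1.
p_other_branch = 1 - p_d_given_b

print(p_joint, p_other_branch)
```

`Fraction` keeps the arithmetic exact and reduces results automatically, so the joint probability is shown in lowest terms rather than as 512/5000.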
The probability tree shows the probability attached to each branch, so the study can see
where the outcomes lie (Gómez et al. 2016). It is important because it helps to identify
mutually exclusive events and to determine the probabilities on the other branches of the
tree. Through the development of the probability tree, it is important for the whole study
to identify
b) Probability (ethnicity is Pasifika)
Ethnicity       One language    Two or more languages    Row Total
Asian           203             300                      503
European        3165            279                      3444
Maori           512             151                      663
MEAA            24              39                       63
Pasifika        167             160                      327
Column Total    4071            929                      5000
i) P (Ethnicity is Pasifika) is 327/5000. The ratio 167/327 is the conditional probability
P (One language spoken | Ethnicity is Pasifika).
ii) P (Two or more languages spoken | Ethnicity is Maori) is 151/663
iii) P (Ethnicity is Maori | Two or more languages spoken) is 151/929
iv) P (One language spoken) is 4071/5000
v) P (Two or more languages spoken) is 929/5000
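The marginal and conditional probabilities can be recomputed directly from the two-way table of counts. This is a minimal sketch; the dictionary layout and variable names are illustrative choices, while the counts are copied from the table.

```python
from fractions import Fraction

# Counts from the table: ethnicity -> (one language, two or more languages).
counts = {
    "Asian": (203, 300),
    "European": (3165, 279),
    "Maori": (512, 151),
    "MEAA": (24, 39),
    "Pasifika": (167, 160),
}

total = sum(one + many for one, many in counts.values())  # grand total
col_one = sum(one for one, _ in counts.values())          # column total, one language
col_many = sum(many for _, many in counts.values())       # column total, two or more


def row_total(ethnicity):
    """Number of people of the given ethnicity."""
    return sum(counts[ethnicity])


# Marginal probabilities divide by the grand total.
p_pasifika = Fraction(row_total("Pasifika"), total)
p_one = Fraction(col_one, total)
p_many = Fraction(col_many, total)

# Conditional probabilities divide by the total of the conditioning row or column.
p_many_given_maori = Fraction(counts["Maori"][1], row_total("Maori"))
p_maori_given_many = Fraction(counts["Maori"][1], col_many)
p_one_given_pasifika = Fraction(counts["Pasifika"][0], row_total("Pasifika"))

print(p_pasifika, p_many_given_maori, p_maori_given_many, p_one, p_many)
```

The denominator is what distinguishes the cases: the same count of 151 gives 151/663 when conditioning on Maori ethnicity and 151/929 when conditioning on two or more languages spoken.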

3) Misconceptions of the figures given
a) In the above diagram, the trend line drawn over the time series data is not appropriate.
In the early part of the series there is a large gap between the trend line and the Celsius
data, so the line does not show the true ups and downs of the temperature (Haque et al.
2016). This matters because a properly fitted trend line makes it possible to predict future
values of the data, which in turn supports long-run policies based on the Celsius data. With
a smooth trend, forming an economic model also becomes easier.
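One common way to obtain a smooth trend is a moving average. The sketch below is a minimal illustration and not the method used for the original figure; the function name, the window size and the temperature readings are all hypothetical.

```python
def moving_average(values, window=3):
    """Return the simple moving average of `values`.

    Positions at the ends, where a full window does not fit, are dropped,
    so the result has len(values) - window + 1 entries.
    """
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical yearly temperatures in degrees Celsius (not from the source figure).
temperatures = [14.0, 14.6, 13.9, 14.8, 15.1, 14.7, 15.4]
trend = moving_average(temperatures, window=3)
print([round(t, 2) for t in trend])
```

A wider window gives a smoother line but hides more of the short-run ups and downs, which is the trade-off the paragraph above describes.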
b) The above picture is not suitable for predicting the average height of men in the world,
because a man's height depends heavily on the geographical region and the nature of the
place where he lives. A sample taken in one particular place will therefore give biased
results for the world as a whole. To improve the study, samples should be taken across
different places and the sample size should be increased, so that the study can report the
correct figure.
c) The above data is cross-sectional data. To compare the number of vehicles stolen over
time, the time period or year under consideration should be placed on the horizontal axis.
A better-designed study would distinguish clearly between cross-sectional and time series
data.
4) Biasedness of the data
This method of data collection will be of no use, because the respondents asked in this
study travel by car and will not be able to appreciate the purpose of building the bus stop.
For the study to be meaningful, it is important to ask the daily commuters who travel by
bus.

Reference list
Bost, R., Popa, R.A., Tu, S. and Goldwasser, S., 2015, February. Machine learning
classification over encrypted data. In NDSS (Vol. 4324, p. 4325).
Breiman, L., 2017. Classification and regression trees. Routledge.
Gómez, C., White, J.C. and Wulder, M.A., 2016. Optical remotely sensed time series data for
land cover classification: A review. ISPRS Journal of Photogrammetry and Remote
Sensing, 116, pp.55-72.
Haque, A., Khan, L. and Baron, M., 2016, February. SAND: Semi-supervised adaptive novel
class detection and classification over data stream. In Thirtieth AAAI Conference on
Artificial Intelligence.
Qi, C.R., Su, H., Nießner, M., Dai, A., Yan, M. and Guibas, L.J., 2016. Volumetric and
multi-view CNNs for object classification on 3D data. In Proceedings of the IEEE Conference
on Computer Vision and Pattern Recognition (pp. 5648-5656).
Salamon, J. and Bello, J.P., 2017. Deep convolutional neural networks and data augmentation
for environmental sound classification. IEEE Signal Processing Letters, 24(3), pp.279-283.
Tennant, M., Stahl, F., Rana, O. and Gomes, J.B., 2017. Scalable real-time classification of
data streams with concept drift. Future Generation Computer Systems, 75, pp.187-199.
Wong, S.C., Gatt, A., Stamatescu, V. and McDonnell, M.D., 2016, November. Understanding
data augmentation for classification: when to warp?. In 2016 international conference on
digital image computing: techniques and applications (DICTA) (pp. 1-6). IEEE.
