Statistics - Study on Statistical Classifications and Analysis
Added on 2023/04/25
AI Summary
This study highlights the development of statistical classifications on the given amount of data. It covers categorical and numerical data, descriptive statistics, probability tree, misconceptions of figures given, and biasedness of data. The study is relevant for students pursuing statistics courses in college or university.
Running head: Statistics
Statistics
Name of the course
Name of Student
Course ID:
Table of Contents
Introduction
1a) Variables
2) Probability tree
3) Misconceptions of the figures given
4) Biasedness of the data
Introduction

This study classifies the variables in the given data set and summarises them statistically. Sorting the variables into categorical and numerical types determines which statistical methods are appropriate for each of them. Examining how the categorical and numerical variables are distributed helps to explain the factors associated with the incidence of mercury in the hair of fishermen living in Kuwait. Descriptive statistics such as the mean, median and mode summarise each variable and provide a basis for statistical modelling, including the choice of suitable weights for the variables taken into consideration.

1a) Variables

Category of the variables of the model

A continuous numerical variable can take any value within a range, so the number of possible values is infinite. Such variables are important in modelling because they can enter a regression directly. Identifying the category of each variable is therefore a first step in choosing a model. In the given data file fisherman.xlsx, age, height and weight are numerical variables because they are measured. They are continuous in the sense that their values can contain decimal places (Bost et al. 2015). On the other hand, residence time in full years and the number of fish meals per week are discrete numerical variables, because their values are counted in whole numbers. The number of years spent in residence will always be a finite whole number.
A categorical variable records which group a response belongs to. In this data set the responses are binary: they are expressed as either yes or no and coded as 0 or 1 (Breiman, 2017). A categorical ordinal variable has categories with a natural order; for a variable coded 0 or 1, the analysis counts the responses in each of the two categories.

b) Descriptive statistics

Statistic                   weight
Mean                        73.2
Standard Error              0.57
Median                      73
Mode                        70
Standard Deviation          6.67
Sample Variance             44.54
Kurtosis                    -0.17
Skewness                    0.35
Range                       33
Minimum                     59
Maximum                     92
Sum                         9876
Count                       135
Confidence Level (95.0%)    1.14

Table 1: Descriptive statistics of the variable weight

Table 1 gives the descriptive statistics of the variable weight. The mean weight is 73.2 and the median is 73. The standard deviation is about 6.67. The lower quartile, upper quartile and IQR for weight are given in the following table.

Q1     68
Q2     73
Q3     77
IQR    9

Table 2: Upper, lower and IQR for weight

The upper quartile, Q3, is 77. The second quartile, Q2, is the median and separates the data set into two equal halves. Q3 separates the ordered data in a 3:1 ratio: about 75% of the fishermen's weights lie at or below 77. Q1 separates the data in a 1:3 ratio and marks off the lowest quarter of the data set. The IQR, or inter-quartile range, is the difference between the upper and lower quartiles, here 77 - 68 = 9.

c) In the above data set, the value of the 80th percentile is 78. This means that about 20% of the values lie above 78 and about 80% lie at or below it. Calculating the percentile therefore shows what share of the data lies above a given value.
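The figures in Table 1 and Table 2 can be cross-checked with a few lines of Python. The sketch below is illustrative only: it works from the reported summary values rather than the raw data in fisherman.xlsx, the variable names are hypothetical, and the 1.96 multiplier is a normal-approximation assumption, so the resulting margin is close to, but not identical with, the reported confidence level of 1.14.

```python
import math

# Summary values as reported in Table 1 and Table 2 for the variable weight.
count = 135
total = 9876
std_dev = 6.67
q1, q3 = 68, 77

mean = total / count                     # sum divided by count
std_error = std_dev / math.sqrt(count)   # standard error of the mean
iqr = q3 - q1                            # inter-quartile range

# Assumption: normal approximation (z = 1.96) for the 95% margin.
# A t-based margin, as spreadsheets usually report, is slightly larger.
margin_95 = 1.96 * std_error

print(f"mean={mean:.1f}  SE={std_error:.2f}  IQR={iqr}  margin={margin_95:.2f}")
```

To rounding, the mean, standard error and IQR computed this way agree with the tables, which suggests the reported values are internally consistent.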
2) Probability tree

a)

Label    Probability
A        3444/5000
B        663/5000
C        300/503
D        512/663
E        24/63
F        160/327

Table 3: Probability tree

The probability tree shows the probability attached to each branch, so that the study can locate where the values lie (Gómez et al. 2016). It is important because it identifies the probabilities of mutually exclusive events and of the further branches that follow from them. Each value in Table 3 matches a ratio from the table in part b); for example, C = 300/503 is the share of Asian respondents who speak two or more languages.

b)

Ethnicity       One     Two or more    Row Total
Asian           203     300            503
European        3165    279            3444
Maori           512     151            663
MEAA            24      39             63
Pasifika        167     160            327
Column Total    4071    929            5000

i) P (One language spoken | Ethnicity is Pasifika) is 167/327. The unconditional P (Ethnicity is Pasifika) would be 327/5000.
ii) P (Two or more languages spoken | Ethnicity is Maori) is 151/663
iii) P (Ethnicity is Maori | Two or more languages spoken) is 151/929
iv) P (One language spoken) is 4071/5000
v) P (Two or more languages spoken) is 929/5000
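The probabilities in part b) can be reproduced directly from the counts in the table. The following is a minimal sketch, assuming the table is entered by hand as a dictionary; the variable names are hypothetical and exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction

# Counts from the table in part b): (one language, two or more languages).
counts = {
    "Asian": (203, 300),
    "European": (3165, 279),
    "Maori": (512, 151),
    "MEAA": (24, 39),
    "Pasifika": (167, 160),
}

# Row totals per ethnicity, column totals per language group, grand total.
row_total = {eth: one + two for eth, (one, two) in counts.items()}
col_one = sum(one for one, _ in counts.values())
col_two = sum(two for _, two in counts.values())
grand_total = col_one + col_two

# Conditional probabilities divide a cell by its row or column total.
p_one_given_pasifika = Fraction(counts["Pasifika"][0], row_total["Pasifika"])
p_two_given_maori = Fraction(counts["Maori"][1], row_total["Maori"])
p_maori_given_two = Fraction(counts["Maori"][1], col_two)

# Marginal probabilities divide a column total by the grand total.
p_one = Fraction(col_one, grand_total)
p_two = Fraction(col_two, grand_total)

print(p_one_given_pasifika, p_two_given_maori, p_maori_given_two, p_one, p_two)
```

The denominator shows which event is being conditioned on: a row total conditions on ethnicity, while a column total conditions on the number of languages spoken.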
3) Misconceptions of the figures given

a) In the above diagram, the trend line drawn over the time series data does not represent the data well. In the initial phase there is a large gap in the Celsius data, so the trend line does not show the true ups and downs of the temperature (Haque et al. 2016). This matters because a properly fitted trend line makes it possible to predict future values, which in turn supports long-run policies based on the temperature data. Incorporating a smooth trend would also make it easier to build an economic model.

b) The above picture is not suitable for estimating the average height of men in the world, because a man's height depends on geographical region and on the nature of the place. To avoid biased results, the sample should be drawn from the particular place that the estimate is meant to describe. The study would also be improved by increasing the sample size, so that it reflects the correct figure.

c) The above data is cross-sectional. To compare the number of vehicles stolen over time, the time or year under consideration must be placed on the horizontal axis. A better design of the study would distinguish clearly between cross-sectional and time series data.

4) Biasedness of the data

This method of data collection is of little use, since the respondents asked in this study travel by car and will not be able to judge the purpose of building the bus stop. To make the study meaningful, it is important to ask the commuters who travel by bus every day.
Reference list

Bost, R., Popa, R.A., Tu, S. and Goldwasser, S., 2015, February. Machine learning classification over encrypted data. In NDSS (Vol. 4324, p. 4325).
Breiman, L., 2017. Classification and regression trees. Routledge.
Gómez, C., White, J.C. and Wulder, M.A., 2016. Optical remotely sensed time series data for land cover classification: A review. ISPRS Journal of Photogrammetry and Remote Sensing, 116, pp.55-72.
Haque, A., Khan, L. and Baron, M., 2016, February. Sand: Semi-supervised adaptive novel class detection and classification over data stream. In Thirtieth AAAI Conference on Artificial Intelligence.
Qi, C.R., Su, H., Nießner, M., Dai, A., Yan, M. and Guibas, L.J., 2016. Volumetric and multi-view cnns for object classification on 3d data. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 5648-5656).
Salamon, J. and Bello, J.P., 2017. Deep convolutional neural networks and data augmentation for environmental sound classification. IEEE Signal Processing Letters, 24(3), pp.279-283.
Tennant, M., Stahl, F., Rana, O. and Gomes, J.B., 2017. Scalable real-time classification of data streams with concept drift. Future Generation Computer Systems, 75, pp.187-199.
Wong, S.C., Gatt, A., Stamatescu, V. and McDonnell, M.D., 2016, November. Understanding data augmentation for classification: when to warp?. In 2016 international conference on digital image computing: techniques and applications (DICTA) (pp. 1-6). IEEE.