Project 2: Statistical Analysis of Crime Data - Fall Semester 2024


Added on  2022/08/13

Homework Assignment
AI Summary
This project applies the statistical concepts of level of measurement, central tendency, and dispersion to provided datasets. The student determines the level of measurement for various variables, calculates measures of central tendency (mean, median, mode) and dispersion (standard deviation), and interprets the results. The project includes frequency distributions with interpretations and explores the normal curve, skewness, and the shape of data distributions. The interpretations cover the homogeneity of groups, score spreads, and distribution shape, and are supported by statistical evidence and output from data analysis.
Project 2: Level of Measurement, Central Tendency & Dispersion
[All items are 1 point unless otherwise noted.]
A. Level of Measurement:
1. At what level are the following variables measured?
a. How many times in the past month have you been drunk?
_________ times
b. How many of your friends have used marijuana or hashish?
1. none of them
2. very few of them
3. some of them
4. most of them
5. all of them
c. What is your employment status?
1. unemployed 2. employed
2. Choose any variable that is NOT a dichotomy (or a duplicate of those above) from any of the
data sets on the class website. Request a frequency distribution and the central tendency
statistics. What is the level of measurement of the variable? Copy and paste the appropriate
output with your interpretation of the results from the central tendency measures. (2 points)
Using the Race variable from the “crime seriousness dataset”:
The variable has three categories (White, African American, and Hispanic) with no inherent order, so its level of measurement is nominal.
Measures of Central Tendency
Statistics: Race -- 3 category
N (Valid)      985
N (Missing)      0
Mean          1.25
Median        1.00
Mode             1
Level of measurement for the variables in question 1:
a. Ratio scale
b. Ordinal scale
c. Nominal scale
Frequency distribution: Race -- 3 category
                    Frequency   Percent   Valid Percent   Cumulative Percent
White                     843      85.6            85.6                 85.6
African American           42       4.3             4.3                 89.8
Hispanic                  100      10.2            10.2                100.0
Total                     985     100.0           100.0
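As the table shows, White respondents (843) make up the largest share (85.6%) of the crime seriousness sample. Because Race is nominal, the mode (1, White) is the only meaningful measure of central tendency; the mean (1.25) and median (1.00) of the category codes have no substantive interpretation.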
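The percentages in the frequency table can be reproduced from the raw counts. The following is a minimal sketch in Python, assuming only the three counts shown in the output; the variable names are illustrative and not part of the original assignment.

```python
from itertools import accumulate

# Counts copied from the Race frequency table above.
counts = {"White": 843, "African American": 42, "Hispanic": 100}

total = sum(counts.values())
running = list(accumulate(counts.values()))

# Percent and cumulative percent, rounded to one decimal as in the output.
percent = {k: round(100 * v / total, 1) for k, v in counts.items()}
cumulative = {k: round(100 * c / total, 1) for k, c in zip(counts, running)}

# For a nominal variable, the mode is simply the category with the largest count.
modal_category = max(counts, key=counts.get)

for k in counts:
    print(f"{k:<18}{counts[k]:>5}{percent[k]:>8}{cumulative[k]:>8}")
print("Mode:", modal_category)
```

Only the counts and the modal category carry meaning here; the numeric codes attached to the categories are arbitrary labels.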
B. Measures of Central Tendency
1. What is the appropriate measure of central tendency for each of the three variables in 1A
above?
a. Mean
b. Median
c. Mode
2. What is the mode, mean, and median for the following data?
3 3 1 1 0 1 0 2 1 0 8 4 3 1 2 1 9 0 1 7 0 7 2 0 1
Mode is: 1   Median is: 1   Mean is: 2.32
Mode = the most frequent value = 1 (it occurs 8 times)
Median = the middle (13th) value of the 25 scores in ascending order = 1
0 0 0 0 0 0 1 1 1 1 1 1 1 1 2 2 2 3 3 3 4 7 7 8 9
Mean = Σx / N = 58 / 25 = 2.32
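As a quick check of the hand calculation, the same three measures can be computed with Python's standard `statistics` module. This is a minimal sketch; the list simply re-enters the 25 scores given in the question.

```python
from statistics import mean, median, mode

# The 25 scores from question B.2, in the order given.
scores = [3, 3, 1, 1, 0, 1, 0, 2, 1, 0, 8, 4, 3,
          1, 2, 1, 9, 0, 1, 7, 0, 7, 2, 0, 1]

n = len(scores)          # number of scores (N)
total = sum(scores)      # sum of scores (Σx)

print("N =", n, " Σx =", total)
print("Mode   =", mode(scores))
print("Median =", median(scores))
print("Mean   =", mean(scores))
```

The mean (2.32) is pulled above the median and mode (both 1) by the few high scores (7, 7, 8, 9).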
C. Measures of Dispersion
1. If the standard deviation (s) of one group is 3.8 and for a second group s = 2.3, which group is
more homogeneous?
The second group is more homogeneous: its standard deviation (2.3) is smaller than the first group’s (3.8), so its scores cluster more tightly around the mean.
2. With s = 2.5 and mean = 7, what is the 1 s score spread?
Spread = μ ± 1s = 7 ± 2.5 = 4.5, 9.5
Therefore, the 1 s score spread runs from 4.5 to 9.5.
3. With s = 3.5 and mean = 28, what is the 2 s score spread?
Spread = μ ± 2s = 28 ± 2(3.5) = 28 ± 7 = 21, 35
Therefore, the 2 s score spread runs from 21 to 35.
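The two spread calculations above follow the same pattern: mean minus and plus k standard deviations. A minimal sketch, assuming a hypothetical helper named `score_spread` (not part of the assignment):

```python
def score_spread(mean, s, k=1):
    """Return (lower, upper) bounds for mean ± k standard deviations."""
    return (mean - k * s, mean + k * s)

# Question C.2: s = 2.5, mean = 7, 1 s spread.
print(score_spread(7, 2.5, k=1))

# Question C.3: s = 3.5, mean = 28, 2 s spread.
print(score_spread(28, 3.5, k=2))
```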
4. Choose any continuous variable (from the class data sets) and print the frequency distribution
with appropriate dispersion and central tendency statistics. Copy and paste the appropriate output
with your interpretation of the 2 s score spread for your variable. (2 points)
Using the Age variable retrieved from the “crime seriousness dataset”
Descriptive Statistics
                          N   Minimum   Maximum    Mean   Std. Deviation
Age of the respondent   978        17        88   42.84           14.953
Valid N (listwise)      978

Statistics: Age of the respondent
N (Valid)      978
N (Missing)      7
Mean         42.84
Median       40.00
Mode            30
The table above indicates that the respondents recorded a mean age of 42.84 years with a
standard deviation of 14.95 years. Moreover, the respondents reported a mode and median of 30
and 40 years respectively.
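Spread = μ ± 2s = 42.84 ± 2(14.95) = 42.84 ± 29.9 = 12.94, 72.74
Therefore, the 2 s age spread runs from 12.94 to 72.74 years. If age were approximately normally distributed, about 95% of respondents would fall within this range; note that the lower bound lies below the observed minimum age of 17.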
D. Normal Curve Interpretation
1. Examine the frequency distribution below. What can you say about inmate uncooperativeness?
Answer the two questions below. (2 points)
Variable: Level of uncooperativeness among inmates
UNCOOPERATIVENESS
                             Frequency   Percent   Valid Percent   Cumulative Percent
Valid     Not present              185      67.8            88.5                 88.5
          Very mild                 13       4.8             6.2                 94.7
          Mild                       7       2.6             3.3                 98.1
          Moderate                   2        .7             1.0                 99.0
          Severe                     1        .4              .5                 99.5
          Extremely Severe           1        .4              .5                100.0
          Total                    209      76.6           100.0
Missing   Blank                     64      23.4
Total                              273     100.0

Statistics: UNCOOPERATIVENESS
N (Valid)                   209
N (Missing)                  64
Mean                       1.21
Median                     1.00
Mode                          1
Std. Deviation             .730
Skewness                  4.900
Std. Error of Skewness     .168
Kurtosis                 29.219
Std. Error of Kurtosis     .335
a. Describe inmates’ level of uncooperativeness based on the data.
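As shown in the table above, 209 of the 273 inmates have valid data; the remaining 64 are missing. Of the 209 valid cases, 185 (88.5%) showed no uncooperativeness, and only 4 (about 2%) were rated moderate or higher. The median and mode are both 1 (“Not present”), so the inmates were overwhelmingly cooperative.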
b. Is level of uncooperativeness normally distributed? If not, how is the data shaped? Cite
evidence from your analysis to support your conclusion in your answer.
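The level of uncooperativeness is not normally distributed; it is strongly skewed to the right (positively skewed). The skewness of 4.900 is far from the 0 expected of a normal curve and is many times its standard error (.168), the kurtosis of 29.219 indicates a sharply peaked distribution, and the mean (1.21) exceeds the median (1.00), with 88.5% of valid cases in the lowest category.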
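One common way to judge whether a skewness value is large is to divide it by its standard error. The sketch below uses the two figures from the Statistics output above; the cutoff of 2 is a widely used rule of thumb and an assumption here, not something stated in the assignment, and `skew_ratio` is a hypothetical helper name.

```python
def skew_ratio(skewness, std_error):
    """Skewness divided by its standard error (a rough z-like statistic)."""
    return skewness / std_error

# Values from the UNCOOPERATIVENESS output: skewness 4.900, std. error .168.
ratio = skew_ratio(4.900, 0.168)
print(round(ratio, 2))

# Rule of thumb (assumed): |ratio| > 2 suggests a real departure from symmetry.
if abs(ratio) > 2:
    print("Distribution departs from symmetry; sign gives the direction.")
else:
    print("Skewness is within the range expected by chance.")
```

A positive ratio points to a right (positive) skew, a negative ratio to a left (negative) skew.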
2. Choose any ordinal or higher variable with at least 10 values. Determine if it is normally
distributed. Cite statistical evidence to support your interpretation. Include the appropriate output
with your answer. (2 points)
Using the q8a variable (How serious is a home burglary of a color TV?) from the “crime seriousness dataset”:
Statistics: How serious is a home burglary of a color tv?
N (Valid)                   802
N (Missing)                 183
Mean                       6.69
Median                     7.00
Mode                          7
Skewness                  -.519
Std. Error of Skewness     .086
Kurtosis                  -.040
Std. Error of Kurtosis     .172

How serious is a home burglary of a color tv?
                               Frequency   Percent   Valid Percent   Cumulative Percent
Valid     Not serious at all           5        .5              .6                   .6
          2                            3        .3              .4                  1.0
          3                           20       2.0             2.5                  3.5
          4                           43       4.4             5.4                  8.9
          5                          136      13.8            17.0                 25.8
          6                          121      12.3            15.1                 40.9
          7                          201      20.4            25.1                 66.0
          8                          148      15.0            18.5                 84.4
          Extremely serious          125      12.7            15.6                100.0
          Total                      802      81.4           100.0
Missing   -9                         183      18.6
Total                                985     100.0
In a normal distribution, the tails on either side of the curve are the same size. When a distribution is skewed to the left, the tail on the curve’s left-hand side is longer than the tail on the right-hand side and the mean is less than the mode; this is known as negative skewness (Narkhede, 2018). In the output above, the mean (6.69) is less than the median (7.00) and the mode (7), and the skewness is -.519 with a standard error of .086, so the variable is not normally distributed; it is left skewed. The kurtosis (-.040) is close to zero, so the departure from normality lies in the asymmetry rather than in the peakedness.
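The summary statistics can be recomputed directly from the frequency table as a check on the interpretation. This is a minimal sketch under two assumptions: the endpoint labels “Not serious at all” and “Extremely serious” are coded 1 and 9, and the reported skewness is the sample-adjusted (Fisher-Pearson) coefficient. Neither is stated in the output itself.

```python
from math import sqrt

# Valid frequencies for q8a; codes 1 and 9 for the labeled endpoints are assumed.
freq = {1: 5, 2: 3, 3: 20, 4: 43, 5: 136, 6: 121, 7: 201, 8: 148, 9: 125}

n = sum(freq.values())
mean = sum(x * f for x, f in freq.items()) / n
mode = max(freq, key=freq.get)

# Median: expand the table into sorted scores and average the two middle ones.
values = sorted(x for x, f in freq.items() for _ in range(f))
median = (values[(n - 1) // 2] + values[n // 2]) / 2

# Second and third central moments, then the sample-adjusted skewness.
m2 = sum(f * (x - mean) ** 2 for x, f in freq.items()) / n
m3 = sum(f * (x - mean) ** 3 for x, f in freq.items()) / n
g1 = m3 / m2 ** 1.5
skew = g1 * sqrt(n * (n - 1)) / (n - 2)

print(n, round(mean, 2), median, mode, round(skew, 3))
```

A mean below the median together with a negative skewness coefficient is the pattern expected for a left-skewed variable.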
References
Narkhede, S. (2018, June 6). Understanding Descriptive Statistics. Retrieved from Towards Data Science: https://towardsdatascience.com/understanding-descriptive-statistics-c9c2b0641291