STAT 193: Statistics in Practice Project Assignment

   

Added on  2023-06-04

STAT 193 Statistics in Practice
Project Assignment
Question 1: The dataset
o Gender is a nominal, categorical variable. Its two categories, Male and Female, have no
natural order.
o Age is a continuous, numerical variable. The minimum value is 15 years and the
maximum value is 45 years.
o Ethnicity is a nominal, categorical variable. The possible values the variable takes are
European, Pacific, Maori, or Other.
o Marital is a nominal, categorical variable. The possible values the variable takes are
Married, Never, Previously or Other.
o Qualification is a nominal, categorical variable. The possible values the variable takes
are Degree, School, Vocational or None.
o PostSchool is a nominal, categorical variable. Its two categories, Yes and No, have no
natural order.
o Hours is a continuous, numerical variable. The minimum value is 2 hours and the maximum
value is 70 hours.
o Income is a discrete, numerical variable. The minimum value is 11 and the maximum
value is 1789.
Question 2: Weekly Income
(a) Histogram:

The peak of this histogram lies to the left and its long tail extends to the right. Therefore,
the distribution of weekly income is right-skewed (positively skewed). The vast majority of
New Zealanders aged 15-45 in the sample earn a low weekly income, with very few earning a
high weekly income. There also appear to be outliers at the far right of the distribution.
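The right skew read off the histogram can also be checked numerically. The sketch below is only an illustration: the income values are hypothetical (they are not the assignment data), and the formula is the adjusted sample skewness, which is the quantity Excel's SKEW() function reports. A positive result, together with a mean above the median, is consistent with a right-skewed distribution.

```python
from statistics import mean, median, stdev

def sample_skewness(values):
    """Adjusted sample skewness: n/((n-1)(n-2)) * sum(((x - mean)/s)^3)."""
    n = len(values)
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 denominator)
    return (n / ((n - 1) * (n - 2))) * sum(((x - m) / s) ** 3 for x in values)

# HYPOTHETICAL weekly incomes, invented purely to illustrate the calculation
incomes = [120, 180, 250, 310, 400, 450, 520, 610, 900, 1700]

skew = sample_skewness(incomes)
print("mean:", mean(incomes))
print("median:", median(incomes))
print("skewness:", round(skew, 2))
print("right-skewed:", skew > 0 and mean(incomes) > median(incomes))
```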
(b) The point estimate of the mean weekly income of the population of New Zealanders aged
15-45 is the sample mean of weekly income: x̄ = 547.04. The value was obtained from Excel
using the function AVERAGE().
The sample standard deviation is s = 337.57.
The interval estimate of the mean weekly income of the population of New Zealanders aged
15-45 is the confidence interval at the 99% confidence level: x̄ ± 61.48 = [485.59, 608.52].
That is, we are 99% confident that 485.59 < μ < 608.52.

Since the sample size is greater than 30, the sampling distribution of the sample mean was
approximated with a normal distribution. In Excel, the margin of error was obtained with
the function
CONFIDENCE.NORM(alpha, standard deviation, sample size)
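The margin of error that CONFIDENCE.NORM returns is z × s / √n, where z is the standard normal quantile at 1 − alpha/2. The sketch below mirrors that calculation in Python. The sample size is not stated in the text, so n = 200 is an assumption, chosen only because it reproduces a margin of about 61.48; the function name confidence_norm is likewise just a label for this illustration. Small differences from the reported bounds can arise from rounding of the inputs.

```python
from math import sqrt
from statistics import NormalDist

def confidence_norm(alpha, standard_dev, size):
    """Margin of error for a normal-based confidence interval for a mean."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    return z * standard_dev / sqrt(size)

x_bar = 547.04   # sample mean reported in the text
s = 337.57       # sample standard deviation reported in the text
n = 200          # ASSUMPTION: sample size is not given in the text

margin = confidence_norm(0.01, s, n)  # alpha = 0.01 for 99% confidence
lower, upper = x_bar - margin, x_bar + margin

print("margin of error:", round(margin, 2))
print("99% interval:", (round(lower, 2), round(upper, 2)))
```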
(c) The sample mean is approximately normally distributed, that is, x̄ ~ N(μ, σ²/n),
where μ is the population mean, σ² is the population variance and n is the sample size.
The normal approximation applies here because the sample size is large (n > 30), even
though the weekly incomes themselves are right-skewed. Moreover, the normal distribution
approximates many natural phenomena well. In a nutshell, the sample mean is calculated from
independent, identically distributed random variables with a finite variance. Accordingly,
by the central limit theorem, the sample mean has an approximately normal distribution
regardless of the distribution of the population.
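A small simulation can make the central limit theorem argument concrete. The sketch below is an illustration under stated assumptions, not part of the assignment: it draws repeated samples from a right-skewed exponential population with a hypothetical mean of 500, using a hypothetical sample size of 200. The sample means should centre on the population mean, with a spread close to σ/√n.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(193)  # fixed seed so the sketch is repeatable

POP_MEAN = 500.0      # HYPOTHETICAL population mean (not from the dataset)
SAMPLE_SIZE = 200     # HYPOTHETICAL sample size
REPLICATIONS = 2000   # number of simulated samples

def draw_sample_mean(n):
    """Mean of n draws from a right-skewed exponential population."""
    # expovariate takes the rate, which is 1 / mean
    return mean(random.expovariate(1 / POP_MEAN) for _ in range(n))

sample_means = [draw_sample_mean(SAMPLE_SIZE) for _ in range(REPLICATIONS)]

centre = mean(sample_means)
spread = stdev(sample_means)
# For an exponential population the standard deviation equals the mean,
# so the theoretical standard error is POP_MEAN / sqrt(n).
theoretical_spread = POP_MEAN / sqrt(SAMPLE_SIZE)

print("mean of sample means:", round(centre, 1))
print("sd of sample means:", round(spread, 1))
print("sigma / sqrt(n):", round(theoretical_spread, 1))
```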
(d) Hypothesis testing
Let μ be the average weekly income of New Zealanders.
We formulate the hypothesis test as:
H0: μ = $986
Ha: μ ≠ $986
This is a two-tailed test. We use the z-test to calculate the test statistic.
The significance level is α = 0.05. Hence, the critical values are z = ±1.96
(in Excel, =NORM.S.INV(0.025) gives the lower critical value, -1.96).
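The critical values and the two-tailed decision rule can be sketched in Python as follows. The function names are illustrative choices, and the z formula shown is the standard one-sample form (x̄ − μ0) / (s / √n). No result is computed for the assignment data here, since the sample size is not stated in the text.

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_critical_value(alpha):
    """Positive critical value z such that P(|Z| > z) = alpha."""
    return NormalDist().inv_cdf(1 - alpha / 2)

def z_statistic(sample_mean, hypothesised_mean, sample_sd, n):
    """Standard one-sample z statistic: (x_bar - mu_0) / (s / sqrt(n))."""
    return (sample_mean - hypothesised_mean) / (sample_sd / sqrt(n))

def reject_null(z, alpha=0.05):
    """Two-tailed rule: reject H0 when |z| exceeds the critical value."""
    return abs(z) > two_tailed_critical_value(alpha)

# Lower critical value, the counterpart of Excel's =NORM.S.INV(0.025)
print(round(NormalDist().inv_cdf(0.025), 2))   # -1.96
print(round(two_tailed_critical_value(0.05), 2))  # 1.96
```

With this rule, any test statistic outside the interval from -1.96 to 1.96 leads to rejecting H0 at the 5% significance level.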
We calculate the test statistic as: