Statistical Analysis of Drinking Water: Hypothesis Testing

Added on 2019/09/16

Homework Assignment
AI Summary
This assignment focuses on applying statistical techniques to analyze data related to zinc concentration in drinking water. The solution begins by differentiating between primary and secondary data, highlighting the methods of collection for each. It then introduces various statistical techniques, including frequency distribution, mean, standard deviation, and hypothesis testing. A case study on zinc concentration in bottom and surface water is presented, detailing the data collected from ten locations. The solution calculates frequency distributions, mean, and standard deviation using Excel. Furthermore, it performs a paired t-test to determine if the true average concentration in bottom water exceeds that of surface water. The hypothesis, significance level, critical value, test statistic, and conclusion are explicitly outlined, demonstrating the application of statistical methods to draw meaningful inferences from the data.
Question
Submit data (either primary (survey) or secondary) and apply any of the statistical techniques.
You may start with the profile of the respondents using frequency distribution, solving for mean
and standard deviation, then testing for hypothesis using tests of difference or tests of
relationship.
Solution
Primary data is data collected first-hand by the researcher. It can be collected through surveys,
personal interviews, questionnaires and similar methods, and it is available in raw form.
Secondary data is data that was collected previously by someone else. It can be obtained from
articles, government publications, journals, websites, book records and similar sources, and it
is available in refined form.
Statistical techniques are the formulas or methods used to collect, summarize, analyze and
interpret data in numeric terms. Examples include the frequency distribution, mean, median,
mode, standard deviation and hypothesis testing.
Case: Drinking Water
It is known that metals in drinking water can affect its flavor, and an unusually high
concentration can pose a health hazard. Zinc concentration was measured in bottom water and in
surface water at ten locations, giving ten pairs of data. The data collected is given below:
Location                               1     2     3     4     5     6     7     8     9     10
Zinc concentration in bottom water   .430  .266  .567  .531  .707  .716  .651  .589  .469  .723
Zinc concentration in surface water  .415  .238  .390  .410  .605  .609  .632  .523  .411  .612
A frequency distribution is a table that shows how often each outcome occurs in a sample. Each
entry in the table gives the frequency, or count, of the values that fall within a particular
group or interval, so the table summarizes the distribution of values in the sample.
The frequency distribution for the above data is calculated below. Every measured value is
distinct, so each one has a frequency of 1:
Location  Zinc concentration  Frequency (f)  Zinc concentration  Frequency (f)
          in bottom water                    in surface water
1         0.430               1              0.415               1
2         0.266               1              0.238               1
3         0.567               1              0.390               1
4         0.531               1              0.410               1
5         0.707               1              0.605               1
6         0.716               1              0.609               1
7         0.651               1              0.632               1
8         0.589               1              0.523               1
9         0.469               1              0.411               1
10        0.723               1              0.612               1
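The frequency table can also be reproduced programmatically. The following is a minimal Python sketch, not the method used in the original solution (which relied on Excel). The function names and the interval width of 0.1 used for the grouped version are assumptions chosen for illustration; grouping into intervals is shown only because a table in which every frequency is 1 says little about the shape of the distribution.

```python
from collections import Counter

# Zinc concentrations copied from the data table above.
bottom = [0.430, 0.266, 0.567, 0.531, 0.707, 0.716, 0.651, 0.589, 0.469, 0.723]
surface = [0.415, 0.238, 0.390, 0.410, 0.605, 0.609, 0.632, 0.523, 0.411, 0.612]


def frequency_table(values):
    """Map each distinct value to the number of times it occurs, sorted by value."""
    return dict(sorted(Counter(values).items()))


def binned_frequency(values, width_thousandths=100):
    """Count values per interval of the given width (default 0.1).

    The interval width is an illustrative assumption, not taken from the source.
    Values are converted to integer thousandths to avoid floating-point edge cases.
    Keys of the result are the lower bounds of the intervals.
    """
    counts = Counter(round(v * 1000) // width_thousandths for v in values)
    return {k * width_thousandths / 1000: counts[k] for k in sorted(counts)}


print(frequency_table(bottom))    # every value occurs exactly once
print(binned_frequency(bottom))   # counts per 0.1-wide interval
print(binned_frequency(surface))
```

With the grouped version, the bottom-water readings concentrate in the 0.5 and 0.7 intervals, while the surface-water readings concentrate in the 0.4 and 0.6 intervals.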
The statistical mean is the average used to describe the central tendency of the data. It is
determined by adding all the data points and then dividing the total by the number of points.
Standard deviation is a measure of the dispersion or variation in a distribution. The population
standard deviation is the square root of the arithmetic mean of the squared deviations from the
mean. The sample standard deviation, which matches the values reported for this data, divides
the sum of squared deviations by n - 1 instead of n before taking the square root.
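As a cross-check on these definitions, here is a short Python sketch using only the standard library `statistics` module. It is an alternative to the Excel calculation, not a reproduction of it; the variable names are illustrative. `statistics.stdev` uses the n - 1 divisor and `statistics.pstdev` uses the n divisor.

```python
import statistics

# Zinc concentrations copied from the data table above.
bottom = [0.430, 0.266, 0.567, 0.531, 0.707, 0.716, 0.651, 0.589, 0.469, 0.723]
surface = [0.415, 0.238, 0.390, 0.410, 0.605, 0.609, 0.632, 0.523, 0.411, 0.612]

for name, data in (("bottom", bottom), ("surface", surface)):
    mean = statistics.mean(data)          # sum of values / number of values
    sample_sd = statistics.stdev(data)    # divides by n - 1
    population_sd = statistics.pstdev(data)  # divides by n
    print(f"{name}: mean={mean:.4f} sample SD={sample_sd:.4f} "
          f"population SD={population_sd:.4f}")
```

The sample standard deviation is always slightly larger than the population version for the same data, and the difference is noticeable with only ten observations.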
The mean and standard deviation for the above data are calculated in Excel with the help of the
formulas shown below:
The computed values of the mean and standard deviation for the above data, calculated in Excel,
are shown below:
Hypothesis testing is a statistical procedure in which an analyst tests an assumption about a
population parameter. The method used depends on the nature of the data and the purpose of the
analysis.
Does the data suggest that the true average concentration in the bottom water exceeds that of
the surface water? To answer this, one needs to perform a paired t-test.
Before performing a paired t-test, the following checks are made.
1. Is this a paired sample? - Yes, both measurements are taken at each location.
2. Is this a large sample? - No.
3. Since the sample size is not large (n = 10, which is less than 30), we need to check whether
the differences follow a normal distribution.
The summary statistics can be computed as shown below:
            N   Mean    SD      SE Mean
Bottom      10  0.5649  0.1468  0.0464
Surface     10  0.4845  0.1312  0.0415
Difference  10  0.0804  0.0523  0.0165
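The summary rows above can be recomputed from the raw data. The sketch below is a hedged Python equivalent, with a hypothetical helper named `summary`; it assumes the sample standard deviation (n - 1 divisor) and the standard error defined as SD divided by the square root of N, which is consistent with the tabulated values.

```python
import math
import statistics

# Zinc concentrations copied from the data table above.
bottom = [0.430, 0.266, 0.567, 0.531, 0.707, 0.716, 0.651, 0.589, 0.469, 0.723]
surface = [0.415, 0.238, 0.390, 0.410, 0.605, 0.609, 0.632, 0.523, 0.411, 0.612]

# Paired differences, bottom minus surface, one per location.
differences = [b - s for b, s in zip(bottom, surface)]


def summary(data):
    """Return (N, mean, sample SD, standard error of the mean)."""
    n = len(data)
    sd = statistics.stdev(data)  # n - 1 divisor
    return n, statistics.mean(data), sd, sd / math.sqrt(n)


for label, data in (("Bottom", bottom), ("Surface", surface),
                    ("Difference", differences)):
    n, mean, sd, se = summary(data)
    print(f"{label:<11}{n:>3}  {mean:.4f}  {sd:.4f}  {se:.4f}")
```

Note that the standard deviation of the differences is much smaller than that of either water layer on its own, which is why pairing the measurements by location makes the test more sensitive.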
After obtaining the differences and then drawing a normal probability plot of them, the
following graph is obtained.
Therefore, it can be concluded that the differences plausibly come from a normal distribution.
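The coordinates behind a normal probability plot can be sketched without plotting software. The Python example below is an illustration, not the tool used in the original solution. It assumes Blom plotting positions, (i - 0.375) / (n + 0.25), which is one common convention among several, and the function names are hypothetical. The correlation between the ordered differences and the theoretical normal quantiles gives a rough numerical indication of how straight the plot is.

```python
import math
from statistics import NormalDist, mean

# Zinc concentrations copied from the data table above.
bottom = [0.430, 0.266, 0.567, 0.531, 0.707, 0.716, 0.651, 0.589, 0.469, 0.723]
surface = [0.415, 0.238, 0.390, 0.410, 0.605, 0.609, 0.632, 0.523, 0.411, 0.612]
differences = [b - s for b, s in zip(bottom, surface)]


def normal_plot_points(data):
    """Pair each ordered observation with a theoretical standard normal quantile.

    Uses Blom plotting positions (an assumed convention, see the lead-in).
    """
    n = len(data)
    ordered = sorted(data)
    quantiles = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25))
                 for i in range(1, n + 1)]
    return list(zip(quantiles, ordered))


def pearson_r(points):
    """Pearson correlation between the x and y coordinates of the points."""
    xs, ys = zip(*points)
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


points = normal_plot_points(differences)
r = pearson_r(points)
print(f"probability plot correlation: {r:.3f}")
```

A correlation close to 1 means the points lie near a straight line, which is consistent with approximate normality. With only ten observations this is a weak check, so it supports rather than proves the assumption.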
Step 1. Set up the hypotheses as:
H0: μd = 0 versus Ha: μd > 0
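where d is defined as the difference bottom minus surface, and μd is the true mean of d.
Step 2. Write down the significance level α.
Step 3. Determine the critical value and the rejection region. This is an upper-tailed test
with n - 1 = 9 degrees of freedom, so H0 is rejected if the test statistic exceeds the critical
value of the t distribution with 9 degrees of freedom at level α.
Step 4. Compute the value of the test statistic as:
t = (mean of d) / (SD of d / √n) = 0.0804 / (0.0523 / √10) ≈ 4.86
Step 5. Check whether the test statistic falls in the rejection region and determine whether to
reject H0.
The test statistic falls in the rejection region. Thus, one should reject H0.
Step 6. State the conclusion in words.
At the chosen significance level, it can be concluded that, on average, the bottom zinc
concentration is higher than the surface zinc concentration.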
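The six steps above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the original calculation. The significance level did not survive in this copy of the solution, so α = 0.05 is assumed here purely for illustration; the critical value 1.833 is the standard table value for the upper 5% point of the t distribution with 9 degrees of freedom and would change with a different α. The function name is hypothetical.

```python
import math
import statistics

# Zinc concentrations copied from the data table above.
bottom = [0.430, 0.266, 0.567, 0.531, 0.707, 0.716, 0.651, 0.589, 0.469, 0.723]
surface = [0.415, 0.238, 0.390, 0.410, 0.605, 0.609, 0.632, 0.523, 0.411, 0.612]

ALPHA = 0.05        # ASSUMPTION: the source's significance level is not preserved
T_CRITICAL = 1.833  # t table value for alpha = 0.05, one-sided, 9 degrees of freedom


def paired_t_statistic(x, y):
    """Return (t, degrees of freedom) for H0: mean(x - y) = 0."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    standard_error = statistics.stdev(d) / math.sqrt(n)  # sample SD, n - 1 divisor
    return statistics.mean(d) / standard_error, n - 1


t_stat, df = paired_t_statistic(bottom, surface)

# Upper-tailed test: reject H0 when t exceeds the critical value.
reject_h0 = t_stat > T_CRITICAL

print(f"t = {t_stat:.2f} with {df} degrees of freedom")
print("reject H0" if reject_h0 else "fail to reject H0")
```

Because the statistic is far above the critical value, the decision to reject H0 would be the same at the 0.01 level, where the table value for 9 degrees of freedom is 2.821.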