This article from Desklib, an online library of study material with solved assignments, essays, dissertations, etc., discusses hypothesis testing. It covers topics such as the independent t-test, confidence intervals, the difference between two proportions, the dependent t-test, and more.
Answer 1:
a) Units: 18 batteries
Treatment variable: Choice of battery brand (Energizer or Ultra-cell)
Response variable: Playing time in hours
The two independent groups were compared with an independent t-test. Let $\mu_e$ and $\mu_u$ be the average battery life of Energizer and Ultra-cell batteries. The summary statistics are as follows.
Table 1: Descriptive Values for Both Batteries
Batteries     Mean     Std. deviation   n
Energizer     8.2789   0.2174           9
Ultra-cell    8.2433   0.1628           9
Independent t-test
1. Parameter: $\mu_e - \mu_u$, the difference between the average playing time (in hours) of Energizer and Ultra-cell batteries.
2. Null hypothesis: $H_0: \mu_e - \mu_u = 0$
3. Alternate hypothesis: $H_1: \mu_e - \mu_u \neq 0$ (two-tailed)
4. Estimate: from the summary statistics of the sample, $\bar{x}_e - \bar{x}_u = 8.2789 - 8.2433 = 0.0356$.
5. Test statistic: $SE(\bar{x}_e - \bar{x}_u) = 0.0905$ (from the t-procedures tool on Canvas), so $t = \frac{\text{estimate} - \text{hypothesised value}}{\text{standard error}} = \frac{0.0356 - 0}{0.0905} = 0.3933$ with 8 degrees of freedom (the smaller of $n_e - 1$ and $n_u - 1$, which matches the t-multiplier 2.306 used in step 8).
6. P-value $= P(|t| > 0.3933) = 0.7044$ (from the t-procedures tool on Canvas)
Figure 1: Critical Region for |t| = 0.3933
7. P-value interpretation: The p-value was greater than 0.05, so there was not enough evidence for $H_1$ against the null hypothesis $H_0$. The observed difference in average playing time for the two types of batteries (0.0356 hours) was not statistically significant at the 5% level of significance. Hence, for lack of sufficient evidence, the null hypothesis could not be rejected: there was no statistically significant difference between the average playing times of the two brands (Hinton, 2014).
8. Confidence interval: the approximate 95% confidence interval for $\mu_e - \mu_u$ was calculated as $CI = (\bar{x}_e - \bar{x}_u) \pm t_{\text{multiplier}} \times SE = 0.0356 \pm 2.306 \times 0.0905 = [-0.1731, 0.2443]$, where the t-multiplier was obtained from the t-procedures tool on Canvas.
9. Confidence interval interpretation: With 95% confidence, the average battery life of Energizer batteries was somewhere between 0.17 hours less than and 0.24 hours more than the average battery life of Ultra-cell batteries. The upper limit of 0.24 hours (approximately 15 minutes) would be a practically significant difference.
10. Conclusion: From the sample data of 9 batteries of each brand, there was not enough evidence to establish a significant difference in battery life when playing the electronic game. Although the average battery life of Energizer batteries was greater than that of Ultra-cell batteries, the difference was not statistically significant, so the data gave no reason to prefer either brand.
b) The hypothesized value of the parameter, the difference in average battery life of Energizer and Ultra-cell batteries, was zero. This value lay well within the 95% confidence interval, which agrees with the conclusion drawn from the p-value: the null hypothesis could not be rejected.
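The figures in Answer 1 can be re-derived from Table 1 alone. The following is a minimal sketch, not the Canvas tool itself: it assumes the unpooled standard error and the conservative degrees of freedom min(n_e, n_u) - 1, and it approximates the p-value by numerical integration of the t density so that only the Python standard library is needed. The function names `t_pdf` and `two_sided_p` are illustrative.

```python
import math

# Summary statistics from Table 1 (mean, standard deviation, sample size).
mean_e, sd_e, n_e = 8.2789, 0.2174, 9   # Energizer
mean_u, sd_u, n_u = 8.2433, 0.1628, 9   # Ultra-cell

estimate = mean_e - mean_u
# Assumption: unpooled standard error of the difference of two independent means.
se = math.sqrt(sd_e**2 / n_e + sd_u**2 / n_u)
t_stat = (estimate - 0) / se

# Assumption: conservative degrees of freedom, min(n_e, n_u) - 1.
df = min(n_e, n_u) - 1


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value using Simpson's rule on [0, |t|]; steps must be even."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3          # P(0 < T < |t|)
    return 2 * (0.5 - central_area)


p_value = two_sided_p(t_stat, df)

t_multiplier = 2.306                      # multiplier quoted in step 8
ci = (estimate - t_multiplier * se, estimate + t_multiplier * se)

print(f"SE = {se:.4f}, t = {t_stat:.4f}, df = {df}, p = {p_value:.4f}")
print(f"95% CI = [{ci[0]:.4f}, {ci[1]:.4f}]")
```

Under these assumptions the results agree with the reported values to about three decimal places; small differences in the last digit come from rounding the standard error before it is reused.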
Answer 2
a) The sampling situation for this scenario was: one sample with multiple inclusive response categories.
b) The difference between the estimated proportions of responses supporting the legalization of cannabis-based products and responses feeling that the law should stay unchanged was tested with a t-test for the difference between two proportions.
1. Parameter: let $p_y$ and $p_n$ denote the proportions of responses supporting and opposing the legislation, so that $p_y - p_n$ is the difference between the two proportions.
2. Null hypothesis: $H_0: p_y - p_n = 0$
3. Alternate hypothesis: $H_1: p_y - p_n \neq 0$ (two-tailed)
4. Estimated difference: $\hat{p}_y - \hat{p}_n = \frac{384}{500} - \frac{79}{500} = 0.61$
5. Test statistic: $t = \frac{\text{estimated difference} - \text{hypothesized difference}}{\text{standard error}}$. For an estimated difference of 0.61 and a hypothesized difference of 0, the standard error was calculated using the t-procedures tool on Canvas as $SE(\hat{p}_y - \hat{p}_n) = 0.0333$. The test statistic was therefore $t = \frac{0.61 - 0}{0.0333} = 18.3183$ with degrees of freedom $= \infty$.
6. P-value $= P(|t| > 18.3183) = 0.0000$ (from the t-procedures tool on Canvas)
Figure 2: Rejection Region for |t| = 18.3183
7. Interpretation of p-value: The p-value was less than 0.05, so at the 5% level of significance there was enough evidence in favor of the alternate hypothesis against the null hypothesis. The difference between the proportions of adult New Zealanders in support of and against the legislation on the use of cannabis-based products for medicinal purposes was sufficient to reject the null hypothesis.
8. Confidence interval: the 95% confidence interval was calculated as $CI = (\hat{p}_y - \hat{p}_n) \pm \text{multiplier} \times SE = 0.61 \pm 1.96 \times 0.0333 \approx [0.5447, 0.6752]$, where the multiplier 1.96 was obtained from the t-procedures tool on Canvas.
9. Interpretation of confidence interval: With 95% confidence, the difference between the proportions of adult New Zealanders in support of and against the legislation lay somewhere between 0.5447 and 0.6752. In other words, support exceeded opposition by 54.47 to 67.52 percentage points, which indicates that the difference was also practically significant.
10. Conclusion: The null hypothesis was rejected, which signified that the proportion of adult New Zealanders in favor of the legislation was significantly different from (greater than) the proportion against it.
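The arithmetic in Answer 2 can be checked with a short script. This is a sketch under one assumption: the standard error uses the formula for two proportions taken from the same sample with mutually exclusive categories, sqrt((p_y + p_n - (p_y - p_n)^2) / n). The source only says the value came from the Canvas tool, but this formula reproduces the reported 0.0333.

```python
import math

n = 500                          # respondents in the single sample
count_yes, count_no = 384, 79    # counts quoted in step 4

p_yes = count_yes / n
p_no = count_no / n
estimate = p_yes - p_no

# Assumption: both proportions come from the SAME sample, so their
# covariance (-p_yes * p_no / n) enters the variance of the difference.
se = math.sqrt((p_yes + p_no - (p_yes - p_no) ** 2) / n)
t_stat = (estimate - 0) / se

multiplier = 1.96                # multiplier quoted in step 8
ci = (estimate - multiplier * se, estimate + multiplier * se)

print(f"estimate = {estimate:.4f}, SE = {se:.4f}, t = {t_stat:.2f}")
print(f"95% CI = [{ci[0]:.4f}, {ci[1]:.4f}]")
```

Because the script keeps the unrounded standard error, the test statistic and interval limits can differ from the reported ones in the third or fourth decimal place.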
Answer 3:
a) (i) The hypothesized value of the difference in productivity score was zero.
(ii) The estimated difference in productivity score (Case 1) was greater than the hypothesized value (it lay to its right) at the 5% level of significance. The standard deviation of the sampling distribution (standard error) was 0.682, indicating that the estimated difference in productivity scores was not close to the hypothesized value. Hence, the hypothesized value fell outside the confidence interval of the estimated difference, which was constructed using the standard error.
b) At the 5% level of significance, all the cases except Case 3 showed a statistically significant sample mean difference.
c) (i) In Case 1 the sample mean difference was practically significant.
(ii) In Case 4 and Case 5 the sample mean differences were not practically significant.
d) In Case 2 and Case 3 there was not sufficient evidence of a practically significant mean difference. Hence, nothing could be concluded from the confidence intervals of Case 2 and Case 3.
e) The observed mean difference in Case 6 was statistically significant at the 5% level of significance. With 95% confidence, the mean difference between the two pay-out systems was estimated to lie somewhere between 0.96 hours and 5.58 hours. Whether the actual difference was as low as 0.96 hours or as high as 5.58 hours, the result would be practically insignificant by the management's criterion. The difference in Case 6 was therefore statistically significant but practically insignificant. Statistically, the Bonus pay-out system would be preferred, but given the management's view of what difference in productivity score matters, the company would prefer to keep its previous pay-out model.
Answer 4
a) (i) Units were 12 short stories from the mystery, ironic, and literary genres.
(ii) Treatment was the inclusion or exclusion of a spoiler paragraph.
(iii) Response variable was the enjoyment rating of the readers.
b) (i) The graphs of the two treatments and their difference were constructed using iNZight and are shown in Figure 3 and Figure 4.
Figure 3: Side-by-Side Box Plots for the Two Treatments with Confidence Interval
Figure 4: Box Plot for Difference in Treatment Scores with Confidence Interval
(ii) From the side-by-side box plots in Figure 3, it was evident that the median enjoyment score was greater when a spoiler paragraph was placed in front of the story. The median enjoyment score for stories with a spoiler was nearly 7.0, whereas the median for stories without a spoiler paragraph was near 6.0. With a spoiler paragraph the distribution of enjoyment scores was highly left skewed compared with the scores without one. The box plot of the differences in enjoyment scores is shown in Figure 4. The median difference was around 0.5, and the distribution appeared approximately Gaussian.
c) The difference between the average enjoyment scores for the two treatments was tested with a t-test as follows. The descriptive values for both treatments are given in Table 2.
Table 2: Descriptive Values for Enjoyment Scores
Treatment      Mean    N    Std. Deviation   Std. Error Mean
With spoiler   6.217   12   1.2202           .3522
No spoiler     5.725   12   1.2563           .3627
1. Parameter: let $\mu_s$ and $\mu_{ns}$ be the average enjoyment scores for stories with and without a spoiler paragraph, so that $\mu_s - \mu_{ns}$ denotes the difference in average enjoyment scores.
2. Null hypothesis: $H_0: \mu_s - \mu_{ns} = 0$
3. Alternate hypothesis: $H_1: \mu_s - \mu_{ns} \neq 0$ (two-tailed)
4. Estimate: the difference in average enjoyment scores from the sample data was $\bar{x}_s - \bar{x}_{ns} = 6.217 - 5.725 = 0.4917$.
5. Test statistic: $t = \frac{(\bar{x}_s - \bar{x}_{ns}) - 0}{\text{std. error}} = \frac{0.4917 - 0}{0.1003} = 4.900$, where the standard error SE = 0.1003 (from SPSS), with 11 degrees of freedom.
6. P-value $= P(|t| > 4.900) = 0.000$ (from SPSS) (Cronk, 2017)
Figure 5: Rejection Region for |t| = 4.9
7. Interpretation of p-value: There was very strong evidence against the null hypothesis. The average enjoyment scores of the 12 stories differed significantly between the two treatments: at the 5% level of significance, the average enjoyment score with a spoiler paragraph added was statistically significantly greater than the average without one.
8. Confidence interval: the approximate 95% confidence interval for $\mu_s - \mu_{ns}$ was calculated as $CI = (\bar{x}_s - \bar{x}_{ns}) \pm t_{\text{multiplier}} \times SE = [0.2708, 0.7125]$ (from SPSS output).
9. Confidence interval interpretation: With 95% confidence, the average enjoyment score of stories with a spoiler was somewhere between 0.27 and 0.71 points higher than the average enjoyment score of stories without a spoiler. Both limits of the confidence interval would be practically significant.
10. Conclusion: From the sample data of 12 stories under both treatments, very strong evidence was found of a significant difference in enjoyment scores. The average enjoyment score of stories with a spoiler was greater than that of stories without one. Hence, readers enjoyed stories more when a spoiler paragraph was placed at the front.
d) The dependent variable was the difference between the enjoyment scores under the two treatments, and these difference scores were continuous. The treatment attribute had two categorical values. Figure 4 showed no outliers, and the distribution of the differences appeared approximately normal. The Shapiro-Wilk test in SPSS gave no evidence against normality of the differences (SW = 0.955, p = 0.712), since the p-value was well above 0.05. Therefore, the fourth condition for the dependent t-test, approximate normality of the differences, was satisfied (Kim, 2015).
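The reason a dependent (paired) t-test is used in Answer 4 can be seen by comparing standard errors. The sketch below recomputes the paired test from the reported values and, for contrast only, the standard error that would result if the two rows of Table 2 were wrongly treated as independent samples. The multiplier 2.201 is the standard two-sided 5% critical value of t with 11 degrees of freedom; the source does not quote it, so it is an assumption here.

```python
import math

mean_diff = 0.4917        # difference in mean enjoyment scores (step 4)
se_paired = 0.1003        # standard error of the paired differences (SPSS, step 5)
n_pairs = 12
t_multiplier = 2.201      # assumed critical value: t with 11 df, two-sided 5%

# Paired (dependent) t-test, as in the worked answer.
t_paired = mean_diff / se_paired
ci_paired = (mean_diff - t_multiplier * se_paired,
             mean_diff + t_multiplier * se_paired)

# Implied standard deviation of the 12 difference scores.
sd_diff = se_paired * math.sqrt(n_pairs)

# For contrast only: standard error if the two treatments were treated as
# independent samples, built from the "Std. Error Mean" column of Table 2.
se_independent = math.sqrt(0.3522**2 + 0.3627**2)
t_independent = mean_diff / se_independent

print(f"paired:      SE = {se_paired:.4f}, t = {t_paired:.3f}, "
      f"CI = [{ci_paired[0]:.4f}, {ci_paired[1]:.4f}]")
print(f"independent: SE = {se_independent:.4f}, t = {t_independent:.3f}")
```

Pairing removes the story-to-story variation, so the standard error is roughly a fifth of the independent-samples value and the same mean difference yields a much larger test statistic.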
Answer 5
a) (i) Units were 1000 cyclists who completed the 180 kilometer ride event in the years 2010 to 2017.
(ii) Treatment was the four age groups in the study.
(iii) Response variable was the time to complete K2.
b) (i) The graphs of the four treatments were constructed using iNZight and are shown in Figure 5 and Figure 6.
Figure 6: Distribution of Time to Complete K2 for Four Age Categories
(ii) The time taken by the cyclists to complete the K2 Road Cycle Classic increased with the age of the cyclist: the median completion time in hours was higher for the older age groups. From the spread of the box plots and the shape of the histograms it was evident that the variation in completion time also increased with age. Outliers were noted in each age group.
c) The SPSS output for the F-test is given in Table 3.
Table 3: ANOVA Output from SPSS
Time in hours
Source           Sum of Squares   df    Mean Square   F        Sig.
Between Groups   64.327           3     21.442        28.104   .000
Within Groups    759.914          996   .763
Total            824.241          999
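The mean squares and the F statistic in Table 3 follow directly from the sums of squares and degrees of freedom. A short sketch of that arithmetic, using only the values in the table:

```python
# Sums of squares and degrees of freedom from Table 3.
ss_between, df_between = 64.327, 3
ss_within, df_within = 759.914, 996

# Mean square = sum of squares / degrees of freedom.
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# F statistic = between-group mean square / within-group mean square.
f_stat = ms_between / ms_within

# Consistency checks against the "Total" row.
ss_total = ss_between + ss_within
df_total = df_between + df_within

print(f"MS between = {ms_between:.3f}, MS within = {ms_within:.3f}, F = {f_stat:.3f}")
print(f"SS total = {ss_total:.3f}, df total = {df_total}")
```

The between-groups degrees of freedom are the number of age groups minus one (4 - 1 = 3), and the within-groups degrees of freedom are the number of cyclists minus the number of groups (1000 - 4 = 996).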
d) Assumptions: the event completion times of the four age groups were independent of each other, the completion times within each of the four age groups were normally distributed, and the variances of the four age groups were equal.
e) The smallest standard deviation was found for the 18-34 age group (M = 6.62, SD = 0.762) and the largest for the 55+ age group (M = 7.32, SD = 0.913). The ratio was calculated as $\frac{0.762}{0.913} = 0.835$.
f) The completion times of the cyclists in the four age groups were independent of each other. The box plots in Figure 6 showed roughly symmetric distributions of completion times within the age groups. The Shapiro-Wilk test, however, gave p-values below 0.001 in every age group (Appendix, Table 7), so strict normality was not supported; with group sizes between 142 and 408, the F-test is reasonably robust to such departures. The assumption of homogeneity (equality) of variances of completion time across the four age groups was not rejected (L = 2.498, p = 0.058) (Appendix, Table 8) (Ross & Willson, 2017).
g) (i) Null hypothesis: the average event completion times of the four age groups were equal, $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$.
(ii) Alternate hypothesis: the average event completion time of at least one age group differed from the others.
(iii) The differences in average completion time among the four age groups were statistically significant (F = 28.104, p < 0.05) at the 5% level of significance. The youngest cyclists completed the event faster than the older age groups, and the trend was also practically significant.
h) (i) From the Tukey HSD post hoc analysis, the difference in average completion time between the 18-34 and 35-44 age groups was not statistically significant (MD = -0.0336, p = 0.982) at the 5% level of significance. The 95% confidence interval was [-0.2665, 0.1992]. With 95% confidence, the average time taken by cyclists aged 18-34 years was somewhere between 0.27 hours less than and 0.20 hours greater than that of cyclists aged 35-44 years (Murphy, Myors, & Wolach, 2014).
(ii) At the 5% level, statistically significant pairwise differences in average time to complete K2 were noted for the age groups 18-34 and 45-54, 18-34 and 55+, 35-44 and 45-54, 35-44 and 55+, and 45-54 and 55+.
(iii) At the 5% level of significance, the slowest of the four age groups was the 55+ age group.
i) The cyclists took a minimum of 5.45 hours and a maximum of 11.29 hours to complete the K2 event. Age was a significant factor in the completion time of the participants. Cyclists aged between 18 and 44 years completed the event with almost identical average completion times. Cyclists aged 55 years and above were significantly slower than participants in the other age groups. In every age group, however, some cyclists completed the event in markedly different times (outliers) (Mertler & Reinhart, 2016).
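The Tukey interval quoted in part h)(i) can be read with a few lines of arithmetic. This is only an illustrative sketch: a symmetric confidence interval is centred on the estimated mean difference, and the difference is not significant at the 5% level when the 95% interval contains zero.

```python
# 95% Tukey HSD interval for the 18-34 vs 35-44 comparison, from part h)(i).
lower, upper = -0.2665, 0.1992

# A symmetric interval is centred on the estimated mean difference.
midpoint = (lower + upper) / 2
half_width = (upper - lower) / 2

# The difference is not significant at the 5% level if zero lies inside.
contains_zero = lower <= 0 <= upper

print(f"midpoint = {midpoint:.4f} hours, half-width = {half_width:.4f} hours")
print(f"interval contains zero: {contains_zero}")
```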
Answer 6
a) (i) Scenario 1: Sex and Free
Scenario 2: First and Spent
Scenario 3: Age and Purchase
Scenario 4: Quantity_Stone and Quantity_Other
Scenario 5: Shop and Spent
(ii) Table 4: Variable Type Details
Variable         Type
Sex              Categorical
Age              Numerical
First            Categorical
Spent            Numerical
Quantity_Stone   Numerical
Quantity_Other   Numerical
Shop             Categorical
Free             Categorical
Purchase         Categorical
b) The tools that could be used for the scenarios are listed below.
Table 5: Tools for Different Scenarios
Scenario     Tool(s)
Scenario 1   Table of counts; side-by-side bar charts of proportions; side-by-side stacked bar charts
Scenario 2   Side-by-side plots on the same scale
Scenario 3   Side-by-side plots on the same scale
Scenario 4   Scatter plot; tile density plot
Scenario 5   Side-by-side plots on the same scale
c) The analyses that can be used to investigate the scenarios are given in the following table.
Table 6: Scenario with Test Matching
Scenario     Test
Scenario 1   E
Scenario 2   D
Scenario 3   F
Scenario 4   C
Scenario 5   F
References
Cronk, B. C. (2017). How to use SPSS®: A step-by-step guide to analysis and interpretation. Routledge.
Hinton, P. R. (2014). Statistics explained. Routledge.
Kim, T. K. (2015). T test as a parametric statistic. Korean Journal of Anesthesiology, 68(6), 540-546.
Mertler, C. A., & Reinhart, R. V. (2016). Advanced and multivariate statistical methods: Practical application and interpretation. Routledge.
Murphy, K. R., Myors, B., & Wolach, A. (2014). Statistical power analysis: A simple and general model for traditional and modern hypothesis tests. Routledge.
Ross, A., & Willson, V. L. (2017). One-Way Anova. In Basic and Advanced Statistical Tests (pp. 21-24). SensePublishers, Rotterdam.
Appendix:
Table 7: Normality Check of the Timing of the Cyclists (SPSS Outputs)
Time in hours
                Kolmogorov-Smirnov (a)       Shapiro-Wilk
Age Category    Statistic   df    Sig.       Statistic   df    Sig.
18-34           .115        142   .000       .930        142   .000
35-44           .122        271   .000       .872        271   .000
45-54           .094        408   .000       .943        408   .000
55+             .101        179   .000       .958        179   .000
a. Lilliefors Significance Correction
Table 8: Levene's Homogeneity Test of Variances