Descriptive Analytics And Visualisation Assignment 2022
Added on 2022/09/21
DESCRIPTIVE ANALYTICS AND VISUALISATION
ASSIGNMENT ONE
STUDENT ID:
[Pick the date]
Introduction
The objective of this report is to present the findings from the analysis of the provided
sample data of 200 people, in response to the questions raised in the email sent by Edmond
Kendrick. The sample data has been analysed using appropriate descriptive and inferential
statistical tools in order to draw useful conclusions about the population. It has been
assumed that the data provided is random and representative of the underlying target
population.
Discussion
Based on the results obtained from the statistical analysis, the various claims highlighted in
the email have been addressed in the order in which they were raised.
Question 1
a) The objective is to determine whether there is a significant difference in the proportion
of “MILD” or “MEDIUM” claims by gender. The suitable hypothesis test for this claim is a
two-sample proportion test, and a two-tailed z-test has been conducted. Table 1 in
Appendix 1 indicates that the null hypothesis is rejected. Hence, at the 5% level of
significance, it can be concluded that the proportion of “MILD” or “MEDIUM” claims
differs significantly between the two genders (male and female).
b) The objective is to determine whether the given sample data lends support to the claim that
representation of a claimant by a private attorney leads to a higher claim amount on average.
The appropriate hypothesis test here is a two-sample independent t-test. A t-test has been
preferred over a z-test because the population standard deviations of the two groups are
unknown. A right-tailed t-test has been conducted assuming unequal variances for the two
data sets. Table 2 in Appendix 1 indicates that the null hypothesis is rejected. This
implies that the sample data supports the claim of a higher average claim amount for
claimants who are represented by a private attorney.
c) The key objective is to determine the validity of the claim that private attorney
representation is higher for severe cases than for medium cases. A right-tailed
two-sample proportion z-test has been conducted. Table 3 in Appendix 1 indicates a
failure to reject the null hypothesis at the 5% level of significance.
Hence, at the 5% level of significance, the sample data does not support the industry
claim of higher private attorney representation for severe claims.
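The two-sample proportion z-tests used in Question 1 can be sketched in Python as follows. This is only an illustrative sketch, not the spreadsheet calculation behind the appendix tables: it assumes the pooled form of the test, the function names are hypothetical, and the counts in the example are made up rather than taken from the sample data.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int, tail: str = "two"):
    """Pooled two-sample z-test for H0: p1 == p2.

    x1, x2 are the counts of interest in each group; n1, n2 are group sizes.
    tail is "two", "right" (H1: p1 > p2) or "left" (H1: p1 < p2).
    Returns the z statistic and the p-value.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    if tail == "right":
        p_value = 1 - normal_cdf(z)
    elif tail == "left":
        p_value = normal_cdf(z)
    elif tail == "two":
        p_value = 2 * (1 - normal_cdf(abs(z)))
    else:
        raise ValueError("tail must be 'two', 'right' or 'left'")
    return z, p_value


# Hypothetical counts for illustration only (NOT the assignment data):
# 60 of 100 claims in group 1 versus 45 of 100 claims in group 2.
z, p = two_proportion_z_test(60, 100, 45, 100, tail="two")
print(f"z = {z:.3f}, p-value = {p:.4f}")
print("Reject H0 at 5%" if p < 0.05 else "Fail to reject H0 at 5%")
```

The null hypothesis is rejected when the p-value falls below 0.05, which matches the 5% level of significance used throughout the report.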
Question 2
a) The claim to be tested is that the percentage of “SEVERE” claims for orthopaedic
surgeons is lower in comparison to all other specialists. A two-sample proportion test
has been used to test this claim, and a left-tailed z-test has been performed on the
sample data. Table 1 in Appendix 2 indicates a failure to reject the null hypothesis at
the 5% level of significance. Hence, your claim regarding a lower percentage of “SEVERE”
claims for orthopaedic surgeons is not supported by the sample data.
b) The claim to be tested is that the average claim amount for “SEVERE” claims is higher for
orthopaedic surgeons in comparison with other specialists. The appropriate hypothesis test
here is a two-sample independent t-test. A t-test has been preferred over a z-test because
the population standard deviations of the two groups are unknown. A right-tailed t-test has
been conducted assuming unequal variances for the two data sets. Table 2 in Appendix 2
indicates that the null hypothesis is not rejected. This implies that your assertion does
not find support from the given sample data.
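The unequal-variance (Welch) t-test used in Questions 1 b) and 2 b) can be sketched as below. This is a minimal illustration that uses only the Python standard library; the function name `welch_t` and the sample values are hypothetical and are not taken from the assignment data.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Tests H0: mean_a == mean_b without assuming equal variances.
    A large positive t supports the right-tailed alternative mean_a > mean_b.
    """
    na, nb = len(sample_a), len(sample_b)
    va = variance(sample_a) / na  # squared standard error of mean_a
    vb = variance(sample_b) / nb  # squared standard error of mean_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical claim amounts for illustration only (NOT the assignment data).
with_attorney = [10.0, 12.0, 14.0, 16.0, 18.0]
without_attorney = [8.0, 9.0, 10.0, 11.0, 12.0]
t, df = welch_t(with_attorney, without_attorney)
print(f"t = {t:.3f}, df = {df:.2f}")
```

The standard library has no t distribution, so the sketch stops at the statistic and degrees of freedom. If SciPy is available, the right-tailed p-value could be obtained with `scipy.stats.t.sf(t, df)`, or the whole test could be run with `scipy.stats.ttest_ind(a, b, equal_var=False, alternative="greater")`.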
Question 3
a) The objective is to estimate whether the claim amount tends to differ across the marital
status of the claimants. Since there are four categories of marital status, testing every
pair of categories with a t-test would be cumbersome. Hence, for each of these four
categories, a 95% confidence interval for the mean claim amount has been constructed. If
there were no difference in the average claim amount by marital status, the confidence
intervals of the categories would be expected to overlap. However, Table 1 of Appendix 3
indicates that there is no overlap between the confidence intervals of the divorced and
widowed categories. This implies that the average claim amount differs by the marital
status of claimants.
b) The objective is to estimate whether the claim amount tends to differ across the surgeon
specialities. Since there are four categories of surgeon specialists, testing every pair with a t-test
would be cumbersome. Hence, for each of these four categories, a 95% confidence interval
for the mean claim amount has been constructed. If there were no difference in the average
claim amount by surgeon speciality, the confidence intervals of the categories would be
expected to overlap. Table 2 of Appendix 3 indicates that the confidence intervals of all
four surgeon specialities overlap. This implies that the average claim amount does not
differ significantly by surgeon speciality.
c) The objective is to estimate whether the proportion of claimants represented by a private
attorney tends to differ across the marital status of the claimants. Since there are
four categories of marital status, testing every pair of categories would be
cumbersome. Hence, for each of these four categories, a 95% confidence interval for the
proportion of claimants represented by a private attorney has been constructed. If there
were no difference in this proportion by marital status, the confidence intervals of the
categories would be expected to overlap. Table 3 of Appendix 3 indicates that the
confidence intervals of all four categories overlap. This implies that the proportion of
claimants represented by a private attorney does not differ significantly by marital status.
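The interval-overlap reasoning used in Question 3 can be sketched as follows. Two assumptions should be noted: the sketch substitutes the normal critical value (about 1.96) for a t critical value, which is only a large-sample approximation and gives slightly narrower intervals than a t-based calculation, and all function names and figures are hypothetical rather than taken from the appendix tables.

```python
import math
from statistics import NormalDist, mean, stdev

# Two-sided 95% critical value from the standard normal (about 1.96).
# ASSUMPTION: used here in place of a t critical value.
Z_95 = NormalDist().inv_cdf(0.975)


def mean_ci(values, crit=Z_95):
    """Approximate 95% confidence interval for a group mean."""
    m = mean(values)
    half_width = crit * stdev(values) / math.sqrt(len(values))
    return m - half_width, m + half_width


def proportion_ci(successes, n, crit=Z_95):
    """Approximate 95% confidence interval for a group proportion."""
    p = successes / n
    half_width = crit * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def intervals_overlap(a, b):
    """True when the intervals a = (low, high) and b = (low, high) share a point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical figures for illustration only (NOT the assignment data).
group_1 = mean_ci([10.0, 12.0, 14.0, 16.0, 18.0])
group_2 = mean_ci([30.0, 31.0, 32.0, 33.0, 34.0])
print(group_1, group_2, intervals_overlap(group_1, group_2))
print(proportion_ci(30, 100))
```

Non-overlapping intervals point to a difference between two groups. Overlapping intervals are weaker evidence, since two means can still differ significantly when their intervals overlap slightly.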
Question 4
A pivot table has been drawn which considers the impact of private attorney representation
and insurance type on the amount claimed. Based on the pivot table obtained from the data in
the experiment tab, a graph has been produced, which is shown as Graph 1 in Appendix 4.
Based on this graph, it is apparent that in the given sample data, for every category of
insurance, the average claim amount is higher when a private attorney represents the
claimant. Also, the claim amount tends to vary across insurance types if representation
by a private attorney is not considered. Based on Table 1 in Appendix 4, claimants having
Medicare/Medicaid have the highest claim on average, while those with private insurance
have the lowest claim on average.
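The pivot table described above was built in a spreadsheet. An equivalent calculation of the average claim amount by insurance type and private attorney representation can be sketched in Python as below. The field names (`insurance`, `attorney`, `amount`) and all records are hypothetical and are not taken from the sample data.

```python
from collections import defaultdict
from statistics import mean


def pivot_mean(rows, row_key, col_key, value_key):
    """Average of value_key for every (row_key, col_key) combination."""
    cells = defaultdict(list)
    for record in rows:
        cells[(record[row_key], record[col_key])].append(record[value_key])
    return {cell: mean(values) for cell, values in cells.items()}


# Hypothetical records for illustration only (NOT the assignment data).
claims = [
    {"insurance": "Private", "attorney": "Yes", "amount": 120.0},
    {"insurance": "Private", "attorney": "Yes", "amount": 80.0},
    {"insurance": "Private", "attorney": "No", "amount": 50.0},
    {"insurance": "Medicare/Medicaid", "attorney": "Yes", "amount": 200.0},
    {"insurance": "Medicare/Medicaid", "attorney": "No", "amount": 90.0},
    {"insurance": "Medicare/Medicaid", "attorney": "No", "amount": 110.0},
]

pivot = pivot_mean(claims, "insurance", "attorney", "amount")
for (insurance, attorney), average in sorted(pivot.items()):
    print(f"{insurance:<18} attorney={attorney:<3} mean amount={average:.1f}")
```

Each cell of the result corresponds to one bar of a grouped bar chart such as Graph 1.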
Question 5
As a data analyst, I consider it important to meet deadlines so that clients receive their
analysis in a timely manner. If this is not done, the analysis may lose relevance,
considering the immense significance that information has assumed in the current business
environment. In order to ensure timely delivery, I plan each task in an organised manner by
breaking each task into small sub-tasks and determining the expected time required to
complete each sub-task. Further, I always keep a buffer, as urgent work may come up or a
task may get stuck, either of which could cause time overruns against the original
plan. I try to stick to these plans, and whenever there is any negative deviation, I try my
best to manage it during the project so that the original timelines are not pushed. This
implies that at times I have to put in longer hours than expected, but this is part of my job
responsibility. Further, I always take measures to reduce negative deviation from my
schedule, as I believe such deviation is a sign of inefficiency on my part. If I have to
stretch beyond my normal working hours, I typically view this as inefficiency, since
otherwise I would have met my targets and goals during the assigned time. This way I am
constantly able to push myself and deliver my work in a timely manner without compromising
quality.
Conclusion
Based on the given sample data, it has been found that the proportion of “MILD” or
“MEDIUM” claims tends to depend on the claimant's gender. Also, the average claim
amount is higher when the claimants are represented by private attorneys. However, the
sample data does not support the claim that private attorney representation is higher for
“SEVERE” claims when compared with “MEDIUM” claims. Neither the claim that the average
amount in “SEVERE” claims is higher when an orthopaedic surgeon is involved nor the claim
that the percentage of “SEVERE” claims is lower when an orthopaedic surgeon is
involved was supported by the sample data. Also, it can be concluded that the average
claim amount for the population of claimants would be influenced by their marital status but
not by the surgeon specialist involved in the claim. One limitation of the above
conclusions is that each test was conducted at the 5% level of significance, so there is a
5% chance of rejecting a null hypothesis that is actually true.
Appendices
Appendix 1
Table 1: Two sample proportion test
Table 2: Two sample independent T test
Table 3: Two Sample Proportion Mean Difference Test
Appendix 2
Table 1: Two sample proportion mean difference test
Table 2: Two sample independent T test
Appendix 3
Table 1: 95% confidence interval for average amount by marital status
Table 2: 95% confidence interval for average amount by specialist surgeon type
Table 3: 95% confidence interval for proportion of claimants represented by private
attorney by specialist surgeon type
Appendix 4
Graph 1
Table 1