Analyzing Bodfish Cruise, Polling Errors - Statistical Homework


Added on 2023/06/14

AI Summary

This assignment provides a comprehensive analysis of statistical concepts through various problems. It addresses the Central Limit Theorem and its applicability based on sample size, followed by an examination of the Bodfish Lot Cruise case, determining the adequacy of sample size for confidence interval estimation. The assignment further summarizes key themes related to polling biases and inaccuracies, highlighting issues such as leading questions, framing effects, and unrepresentative sampling. It explores the impact of poor sampling and confirmation bias on poll results, offering insights into potential problems consumers should be mindful of when interpreting poll data. Lastly, it discusses the potential causes of polling errors in the 2016 Brexit and U.S. presidential election, attributing them primarily to unrepresentative sample populations and potentially leading questions. The analysis references relevant articles to support its conclusions.

1. Write your answer to conceptual problem C.1 on page 390 in the text.
It is not correct to assume that the sample mean is normally distributed when the
sample size (n) is too small. According to Albright and Winston, the Central Limit Theorem
postulates that “when you sum or average n randomly selected values from any
distribution, normal or otherwise, the distribution of the sum or average is approximately
normal, provided that n is sufficiently large” (Albright & Winston, p. 322). As a rule of
thumb, the sample size should be at least 30. However, it’s important to note that
even in circumstances where the sample includes 30 or more observations, “if the
population distribution is very nonnormal—extremely skewed…the normal
approximation might not be accurate unless [the sample size] is considerably greater than
30”, yet if the “population distribution is…approximately symmetric”, then the mean of a
sample of fewer than 30 observations may still be approximately normal (Albright & Winston, p. 323).
Furthermore, if we do not know the population standard deviation, we should estimate it
from the sample and use the t-distribution rather than the normal distribution.
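The effect of sample size described above can be illustrated with a small simulation. This is only a sketch under assumptions that are not in the text: it draws from an exponential distribution as a stand-in for an "extremely skewed" population, and the function name, trial count, and seed are chosen purely for illustration.

```python
import random
import statistics


def skewness_of_sample_means(n, trials=4000, seed=1):
    """Estimate the skewness of the distribution of sample means.

    Each trial averages n draws from an exponential distribution, which is
    strongly right-skewed. A skewness near 0 indicates a roughly symmetric,
    bell-like distribution of the means.
    """
    rng = random.Random(seed)
    means = [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(trials)
    ]
    center = statistics.fmean(means)
    spread = statistics.pstdev(means)
    return statistics.fmean([((m - center) / spread) ** 3 for m in means])


if __name__ == "__main__":
    for n in (2, 30, 100):
        skew = skewness_of_sample_means(n)
        print(f"n = {n:3d}: skewness of sample means = {skew:.2f}")
```

The skewness of the sample means should shrink toward zero as n grows, but for a population this skewed it is typically still visible at n = 30, which is the caveat Albright and Winston raise.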
2. Answer the questions at the end of Case 8.4 The Bodfish Lot Cruise (pp. 399-400).
It is possible that the Bodfish cruise was a bit excessive. For the sample mean to be
approximately normally distributed, we generally need a sample size of at least 30. In the
case of the Bodfish Lot Cruise, the sample size of 89 was almost three times the
suggested sample size of 30. When I analyzed the data using all 89 cruise lines, I
found that, at a 95% confidence level for the mean, the lower limit of the number of trees
in a plot was 34.67 while the upper limit was 44.4 trees. When I calculated the
confidence interval using only the first 30 cruise lines, again at a 95% confidence
level, the lower limit was only 5.24 trees less than the lower limit using all 89
cruise lines. The upper limit was only 2.37 trees more than the upper limit using 89
cruise lines. Furthermore, the sample mean only went down by about one tree when I
used only 30 cruise lines. The similarity between these intervals and means suggests that
using only 30 cruise lines could save time and money in the decision-making process.
Ralph Butts could use half of the sample, about 44 or 45 cruise lines. This would
still exceed the suggested sample size of 30. However, using only a quarter of
the cruise lines, about 22, may be a bit low and could affect the reliability of
the estimates.
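The comparison between the two intervals can be made explicit with a little arithmetic. The sketch below uses only the figures reported in the answer above. It assumes both intervals are symmetric intervals for the mean, so that the midpoint equals the sample mean, and that the sample standard deviation is similar in both samples. The variable and function names are illustrative.

```python
import math

# Figures reported in the answer above (trees per plot, 95% confidence).
FULL_N, FULL_LOWER, FULL_UPPER = 89, 34.67, 44.40
SUB_N = 30
SUB_LOWER = FULL_LOWER - 5.24  # lower limit was 5.24 trees less
SUB_UPPER = FULL_UPPER + 2.37  # upper limit was 2.37 trees more


def half_width(lower, upper):
    """Half the interval length, i.e. the margin of error around the mean."""
    return (upper - lower) / 2


def midpoint(lower, upper):
    """Center of the interval; equals the sample mean if the interval is symmetric."""
    return (lower + upper) / 2


full_hw = half_width(FULL_LOWER, FULL_UPPER)
sub_hw = half_width(SUB_LOWER, SUB_UPPER)

# With a similar sample standard deviation, the standard error scales with
# 1 / sqrt(n), so the 30-line interval should be wider by about this factor
# (the larger t critical value adds slightly more).
expected_ratio = math.sqrt(FULL_N / SUB_N)

print(f"89 lines: center {midpoint(FULL_LOWER, FULL_UPPER):.2f}, half-width {full_hw:.2f}")
print(f"30 lines: center {midpoint(SUB_LOWER, SUB_UPPER):.2f}, half-width {sub_hw:.2f}")
print(f"observed width ratio {sub_hw / full_hw:.2f}, sqrt(89/30) = {expected_ratio:.2f}")
```

On these figures the 30-line interval is roughly 1.8 times as wide as the 89-line interval, close to what the square-root rule predicts. The trade-off in the case is between that loss of precision and the time and money saved by cruising fewer lines.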
3. Summarize, in bullet format, five key themes from the following two articles:
Themes: Polls can become inherently biased when:


1. Questions are leading because a multiple-choice format does not include all of the
potential answers. When these multiple-choice questions omit answers like “neither” or
“undecided”, voters are not able to express their true opinions (Johnson, 2016).
2. The ordering of questions is deliberately designed to affect decisions. Johnson
argues that “the order of questions, and whether certain negative or positive
issues are addressed before or after a series of questions, can also influence the
results” (Johnson, 2016).
3. Alberto argues that the “framing” that Johnson mentions does not affect poll
results at all. After an experiment showed that the results were the same regardless of
the order of questions, Alberto suggested that the biggest issue is not framing,
but rather expert sampling (Alberto, 2017).
4. Analysts or news agencies base trends on a single poll. Polls are “snapshots” of
the data and might not provide an accurate portrayal of the political climate and
public response (Johnson, 2016).
5. Samples are not representative enough. If samples are chosen just to meet a
quota rather than through a truly random process, data could be skewed
(Alberto, 2017).
How do poor sampling and confirmation bias affect poll results?
Poor sampling can lead to skewed results. As Alberto mentioned, in the “polling miss in
2015” the biggest issue was that the sample was “unrepresentative” of the population,
with an over-representation of the “Labour supporters” and an under-representation of
the “Conservative supporters” (Alberto, 2017). This meant that the data was skewed to
show that the overall population leant toward ideologies held by the Labour party.
Skewing data by over-representing certain populations can be a result of confirmation
bias. Perhaps the pollsters held beliefs that matched those of the Labour party and thus,
even subconsciously, sought responses from people from the same party to confirm
that their ideologies were popular and representative of the population as a whole.
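A short calculation shows how over-representation alone can flip a poll's headline result. This is a sketch with made-up numbers, not figures from Alberto's article. The function name and the "reach" probabilities (the chance that a supporter of each party ends up in the sample) are assumptions introduced for illustration.

```python
def polled_share(true_share_a, reach_a, reach_b):
    """Expected share of party A among poll respondents.

    true_share_a: fraction of the whole population supporting party A
    reach_a, reach_b: probability that a supporter of A (or B) ends up in
    the sample. Equal values mean the sample is representative.
    """
    respondents_a = true_share_a * reach_a
    respondents_b = (1 - true_share_a) * reach_b
    return respondents_a / (respondents_a + respondents_b)


# Hypothetical two-party electorate in which party A is actually behind.
print(f"representative sample: {polled_share(0.48, 0.10, 0.10):.1%}")
print(f"A over-represented:    {polled_share(0.48, 0.12, 0.10):.1%}")
```

In this hypothetical case, reaching party A's supporters only slightly more often is enough to make a party with 48% support appear to lead, regardless of how the questions are worded.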
What potential problems should consumers of poll results be mindful of when taking a
poll? List at least 3.
1. Questions may be leading or framed to influence thought processes. Consumers
should be aware of this and try to consider each question independently.
2. Questions could be purposefully confusing. Consumers should read questions
carefully to be sure that hidden meanings are not buried in the complexity of the
syntax.
3. If the poll is multiple choice, a respondent’s actual opinion might not be included as an
option. For example, an undecided voter might not be able to say so if that answer
has been omitted.
What questions should you ask when you hear poll results?

1. Was the sample population chosen randomly?
2. What are the demographics of the population sampled?
3. What are the survey questions and are they leading?
4. What is the margin of error on this poll?
5. Does the polling agency have political leanings?
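For the margin-of-error question in the list above, a rough figure can be computed from the sample size alone. The sketch below uses the standard normal approximation for a sample proportion, which assumes a simple random sample, the same condition the first question asks about. The poll figures are hypothetical and the function name is illustrative.

```python
import math
from statistics import NormalDist


def margin_of_error(p_hat, n, confidence=0.95):
    """Approximate margin of error for a sample proportion.

    Uses the normal approximation z * sqrt(p_hat * (1 - p_hat) / n), which
    assumes a simple random sample.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


# Hypothetical poll: 1,000 respondents, 50% favoring a candidate.
print(f"+/- {margin_of_error(0.50, 1000):.1%}")
```

For this hypothetical poll the margin is about three percentage points. It only describes random sampling error, so it says nothing about the bias that leading questions or an unrepresentative sample can introduce.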
4. During 2016 two major events seemed to have polling problems, including Great
Britain's decision to leave the European Union (Brexit), with polls showing
that G.B. would stay with the E.U., and the U.S. presidential election, with polls
showing that H. R. Clinton was going to win the election by a comfortable margin.
Post your thoughts, based on the articles included, about potential causes of
these two polling errors.
I believe that the sample populations for these polls were likely not representative. If
we consider the demographics of conservative and liberal voters, we see a trend of
younger people identifying as liberal and older generations identifying as
conservative. Furthermore, younger generations tend to spend much more time
online on social media. I saw polls posted by news agencies like Meet the Press and
others on Facebook gauging the political climate, the public’s values, opinions of
candidates, et cetera. Perhaps the polls were skewed in favor of Clinton because the
polls reached more of the younger, liberal population than the conservative
population.
I have seen many people say the upsets were a result of people being too shy to tell
the truth on surveys or admit to their conservative values, but Alberto claims this is
not the case, and rather those people were likely never even surveyed. He explained
that in the polls, “most voting intention questions include ‘Don’t know’, ‘Undecided’
and ‘Don’t want to tell’ as response options”. Alberto questions why anyone would lie
about who they would vote for when they could just say “Don’t know” or “I don’t want
to tell you” (Alberto, 2017, p. 3). This again supports my theory that the sample
populations were likely not representative.
Perhaps there were also some leading questions involved in the polls. For example, if
options like “neither” were not offered in the polls, people would be forced to choose
the lesser of two evils in their opinion. This could skew the data as well. Perhaps a
person who said she favored Clinton in a poll actually voted for Gary Johnson on
election day. She couldn’t say that in the poll because Johnson was not an option, so
the expectation for Clinton’s number of voters was artificially inflated.
All of these could have caused the issues with the polls in Brexit and the 2016
election, but I believe that using an unrepresentative sample population was likely the
biggest issue.

References
Alberto, C. (2017). Why do polls keep failing everywhere? Statistics Views. Retrieved
from https://ubonline.ubalt.edu/access/content/group/1182PUAD628WB1/Discussion%20Forum%20Articles/PUAD628%20Why%20do%20polls%20keep%20failing%20everywhere.pdf
Albright, S. C., & Winston, W. (2015). Business analytics: Data analysis and decision
making (5th ed.). Delhi, India: Cengage Learning.
Johnson, J. (2016). Can you rig a presidential poll? Huffington Post. Retrieved from
https://ubonline.ubalt.edu/access/content/group/1182PUAD628WB1/Discussion%20Forum%20Articles/PUAD628%20Can%20you%20rig%20a%20poll%20_question%20mark_.pdf


DISCUSSION POST
With regard to the normal distribution, while the given answer highlighted the need for
a big enough sample, emphasis on the random sampling method and on knowing the
population standard deviation is missing. In the Bodfish Lot Cruise case,
the approach is right, but there are errors in the determination of the
requisite 95% confidence interval. Further, the discussion of the adequacy of the
sample size should have been related to the confidence interval computed. In
relation to the problems of polls, negligence and inexperienced staff may also be
responsible for incorrect framing of questions. The issues with sampling
designs should also have been highlighted among the problems, as they are a key issue.
However, the questions that consumers of poll results must ask are quite
impressive and exhaustive. The discussion of the poll issues in the
two recent polls also seems to have been answered in a comprehensive manner with
apt examples, highlighting enhanced understanding.