BOOTSTRAP METHOD AND EXISTING LITERATURE
Name of Institution:
Name of Student:
Date:
Historically, statisticians have relied on already existing data to predict future trends. The approach commonly taken is to state a null hypothesis about the parameter being tested and, at the same time, a conflicting alternative hypothesis. In this kind of test, the researcher's primary objective is to establish whether the evidence in the data is strong enough to reject the null hypothesis in favour of the alternative.
The major challenge that arises from this approach to predictive data analytics is data snooping, which is the tendency of a researcher to be swayed towards the findings of past research (Notes, 2019). This leads to bias in the findings and, consequently, in the projections made. Bootstrapping was specifically adopted to help eliminate the effect of data snooping when making predictions. This paper highlights the main disadvantages of relying on already existing academic literature rather than the bootstrap method for predictive analysis.
To start with, bootstrap uses a large, randomly generated dataset as the sample (Friedman et al., 2013). Because the sample is large and the resamples are drawn at random, the probability of bias in the findings generated after data analysis is reduced. This aspect is not well captured when already existing literature is used. As a result, a more accurate prediction is likely to be made if the bootstrap technique is used rather than historical data alone.
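As an illustration, the following is a minimal sketch in Python of how a single bootstrap resample is drawn by sampling with replacement from an observed dataset; the data values and the seed are hypothetical choices made purely for demonstration.

import numpy as np

# Hypothetical observed sample (e.g. daily returns); the values are illustrative only.
observed = np.array([0.8, -1.2, 0.5, 2.1, -0.3, 1.4, 0.9, -0.7])

rng = np.random.default_rng(seed=42)

# A bootstrap resample has the same size as the original sample and is drawn
# with replacement, so some observations repeat and others are left out.
resample = rng.choice(observed, size=observed.size, replace=True)

print("original mean :", observed.mean())
print("resample mean :", resample.mean())

Because each resample is generated at random from the observed data, repeating this step many times produces the large, randomly generated dataset referred to above.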
Also, bootstrap does not make prior assumptions about the distribution of the sample data (Sengupta, 2016). When historical data is used, the sample is mostly assumed to follow a normal distribution with a given mean, say zero. This assumption makes the use of historical data limiting and thereby increases the chances of obtaining biased results. The accuracy of stock price projections made with the bootstrap technique would therefore be higher than if only the available historical data were used.
The scope of the hypotheses used with historical data is, in most cases, limited. For instance, when measuring how profitable a business is, the null hypothesis may be that the mean profit-after-tax equals zero, while the alternative hypothesis would state that the mean profit-after-tax is not equal to zero. The threshold for rejecting a null hypothesis when the bootstrap technique is used is generally more stringent than when the traditional approach is used, making the conclusion more reliable.
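One common way to carry out such a test with the bootstrap is to build a percentile confidence interval for the mean and check whether zero falls inside it. The sketch below assumes a small hypothetical sample of profit-after-tax figures and a 95% interval; it is only an illustration of the idea, not a calculation taken from the literature cited here.

import numpy as np

# Hypothetical profit-after-tax figures (in thousands); illustrative values only.
profits = np.array([12.0, -3.5, 8.2, 15.1, -1.0, 6.4, 9.8, 3.3, -2.2, 7.5])

rng = np.random.default_rng(seed=1)
n_boot = 10_000

# Bootstrap distribution of the sample mean.
boot_means = np.array([
    rng.choice(profits, size=profits.size, replace=True).mean()
    for _ in range(n_boot)
])

# 95% percentile confidence interval for the mean profit-after-tax.
lower, upper = np.percentile(boot_means, [2.5, 97.5])
print(f"95% bootstrap CI for the mean: ({lower:.2f}, {upper:.2f})")

# If zero lies outside the interval, the null hypothesis that the mean
# profit-after-tax equals zero is rejected at the 5% level.
print("Reject H0 (mean = 0):", not (lower <= 0.0 <= upper))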
Furthermore, the bootstrap technique allows for multiple repetitions in the calculation of the required statistic(s) (Brownlee, 2018). When existing literature is used, the calculation is mostly done once. Calculating the statistic of interest repeatedly and using the mean of the individual estimates makes the result more reliable and accurate. Normally, the recommended minimum number of repetitions ranges from 30 to 40.
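The repetition idea can be sketched as follows: the statistic of interest (the median is used here purely as an example) is recomputed on many resamples, and the average and spread of the replicates give the bootstrap estimate and its standard error. The sample values and the choice of 40 repetitions are assumptions made for illustration.

import numpy as np

# Hypothetical sample; any statistic of interest could be substituted here.
sample = np.array([4.1, 5.7, 3.9, 6.2, 5.0, 4.8, 7.1, 3.3, 5.5, 6.0])

rng = np.random.default_rng(seed=7)
n_repetitions = 40  # within the 30-40 range mentioned above

# Recompute the statistic on each bootstrap resample.
replicates = np.array([
    np.median(rng.choice(sample, size=sample.size, replace=True))
    for _ in range(n_repetitions)
])

# The average of the replicates serves as the bootstrap estimate, and their
# standard deviation estimates the standard error of the statistic.
print("bootstrap estimate of the median:", replicates.mean())
print("estimated standard error        :", replicates.std(ddof=1))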
Lastly, when the bootstrap technique is used, outlier data are automatically disregarded in the sample dataset generated, a feature not present in the alternative technique under discussion. When there are no outliers in the sample used, the measures of central tendency for the data are more accurate than when outliers enter the analysis. This aspect makes bootstrap unique and explains why it has continuously been applied in big data, a continuously growing field.
References
Brownlee, J., 2018. A Gentle Introduction to the Bootstrap Method. [Online] Available at: https://machinelearningmastery.com/a-gentle-introduction-to-the-bootstrap-method/ [Accessed 1 May 2019].
Friedman, N., Goldszmidt, M. & Wyner, A., 2013. Data Analysis with Bayesian Networks: A Bootstrap Approach. Machine Learning.
Notes, L., 2019. Risk Management and Trading. East Anglia: s.n.
Sengupta, S., 2016. Statistical analysis of networks with community structure and bootstrap methods for big data. Illinois: http://hdl.handle.net/2142/92763.