Statistics: Nonparametrics Lab 6 - Density Estimation and Smoothing

Added on  2023/01/19

Homework Assignment
AI Summary
This assignment focuses on non-parametric statistical methods, specifically density estimation and smoothing techniques using the R programming language. The first question analyzes the eruption times of the Kiama Blowhole: fitting histograms using Sturges', Scott's, and FD rules, estimating the density using Gaussian and Epanechnikov kernels with different bandwidths, comparing kernel density estimates to a normal distribution, assessing normality, and testing the median eruption time. The second question analyzes body fat data: plotting the relationship between body fat and skinfold measurements, creating residual plots, applying different regression techniques (ksmooth and lowess), and bootstrapping confidence intervals for regression coefficients.
Assessment
Question 1
(a). Rules for Number of Bins and Histograms
Rule             Number of Bins
Sturges' rule    7
Scott's rule     6
FD rule          8
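The bin counts in the table correspond to rules that are built into R. The following is a minimal sketch, assuming the eruption times sit in a numeric vector; the vector `x` here is simulated placeholder data (an assumption), not the Kiama Blowhole measurements, so its Scott and FD counts will not match the table.

```r
# Placeholder data (assumption): 64 simulated right-skewed times.
set.seed(1)
x <- rexp(64, rate = 1 / 40)

# Number of bins suggested by each rule (functions from grDevices).
bins <- c(Sturges = nclass.Sturges(x),
          Scott   = nclass.scott(x),
          FD      = nclass.FD(x))
print(bins)

# hist() accepts the rule name directly; the resulting breaks are
# passed through pretty(), so the drawn count can differ slightly.
hist(x, breaks = "Sturges", main = "Histogram (Sturges' rule)")
hist(x, breaks = "Scott",   main = "Histogram (Scott's rule)")
hist(x, breaks = "FD",      main = "Histogram (FD rule)")
```

Sturges' rule depends only on the sample size, so any vector of 64 values gives 7 bins; Scott's and FD rules also depend on the spread of the data.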
Histogram According to Sturges' rule
Histogram According to Scott’s rule
Histogram According to FD rule
(b). Density Plots using Kernel Density Estimators
Gaussian Kernel
NRD
UCV
BCV
Epanechnikov Kernel
NRD
UCV
BCV
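The six density plots in this part pair two kernels with three bandwidth selectors. A minimal sketch of how such estimates could be produced with `density()`, assuming placeholder data in `x` (simulated, not the eruption times):

```r
set.seed(1)
x <- rexp(64, rate = 1 / 40)  # placeholder data (assumption)

kernels    <- c("gaussian", "epanechnikov")
bandwidths <- c("nrd", "ucv", "bcv")

# One density estimate per kernel/bandwidth pair.
# bw.ucv() and bw.bcv() may warn when the minimum lies at the edge of
# their search range; the warnings are silenced here for brevity.
fits <- list()
for (k in kernels) {
  for (b in bandwidths) {
    fits[[paste(k, b)]] <- suppressWarnings(density(x, kernel = k, bw = b))
  }
}

# Bandwidth chosen for each estimate.
print(sapply(fits, function(f) f$bw))

plot(fits[["gaussian nrd"]], main = "Gaussian kernel, bw = \"nrd\"")
```

In `density()` the bandwidth is the standard deviation of the kernel, and the selectors depend only on the data, so a given selector returns the same `bw` value for both kernels.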
(c). Comparison of the Epanechnikov Kernel (bw="ucv") with the Best-Fitting Normal
The best-fitting normal curve is the Gaussian estimate with bw="bcv". Comparing this with the Epanechnikov
kernel estimate at bw="ucv", we see that the normal curve is a better fit for the eruption time data. This is
because the bandwidth of the Gaussian plot is considerably larger than the bandwidth used by the Epanechnikov
kernel at bw="ucv". The normal curve places the majority of the data between 0 and 100, while the
Epanechnikov kernel plot places the majority of the data between 0 and 50.
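A comparison of this kind can be drawn by overlaying a normal curve on the kernel estimate. The sketch below assumes one common reading of "best-fitting normal", namely the normal density with the sample mean and standard deviation; that reading and the simulated vector `x` are both assumptions, not taken from the assignment.

```r
set.seed(1)
x <- rexp(64, rate = 1 / 40)  # placeholder data (assumption)

# Epanechnikov kernel estimate with the unbiased cross-validation bandwidth.
kde <- suppressWarnings(density(x, kernel = "epanechnikov", bw = "ucv"))

# Normal curve fitted by the sample mean and standard deviation (assumption).
mu    <- mean(x)
sigma <- sd(x)

plot(kde, main = "Epanechnikov (ucv) vs. fitted normal")
curve(dnorm(x, mean = mu, sd = sigma), add = TRUE, lty = 2)
legend("topright", legend = c("Epanechnikov, ucv", "Fitted normal"),
       lty = c(1, 2))
```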
(d). Normality Assessment
Yes, I believe the data are normally distributed, even though they are slightly skewed to the right.
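The source does not say which diagnostic was used for this assessment. Two standard options in base R are a normal Q-Q plot and the Shapiro-Wilk test; the sketch below shows both on simulated placeholder data (an assumption), so its result says nothing about the eruption times themselves.

```r
set.seed(1)
x <- rexp(64, rate = 1 / 40)  # placeholder data (assumption)

# Visual check: points far from the reference line suggest non-normality.
qqnorm(x)
qqline(x)

# Formal check: a small p-value is evidence against normality.
sw <- shapiro.test(x)
print(sw)
```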
(e). Testing that the Median is equivalent to 30
Based on the results below, the p-value of 0.1682 is greater than 0.05, so we fail to reject the hypothesis that the median is equal to 30.
Wilcoxon signed rank test with continuity correction
data:  x
V = 1246.5, p-value = 0.1682
alternative hypothesis: true location is not equal to 30
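Output of this form comes from a one-sample `wilcox.test()` call with `mu = 30`. A minimal sketch, assuming simulated placeholder data in `x` and a 0.05 significance level (the source states neither the call nor the level):

```r
set.seed(1)
x <- rexp(64, rate = 1 / 40)  # placeholder data (assumption)

# Two-sided signed rank test of H0: location = 30.
wt <- wilcox.test(x, mu = 30, alternative = "two.sided", correct = TRUE)
print(wt)

# Decision rule applied to the p-value reported in the output above.
alpha      <- 0.05     # assumed significance level
reported_p <- 0.1682   # from the reported test output
reject     <- reported_p < alpha
cat("Reject H0 at alpha =", alpha, "?", reject, "\n")
```

With the reported p-value the rule does not reject the null hypothesis at the 0.05 level.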
Question 2
(a). Plot of Bfat Versus SSF
(b). Residual Plot
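A residual plot of this kind is typically drawn from a fitted linear model. The sketch below assumes a simple linear regression of SSF on Bfat and uses simulated stand-ins `bfat` and `ssf` (both hypothetical) in place of the `ais` columns.

```r
set.seed(1)
bfat <- runif(100, 5, 35)                       # placeholder predictor
ssf  <- 0.6 + 5 * bfat + rnorm(100, sd = 10)    # placeholder response

# Least-squares fit of SSF on Bfat (assumed model).
fit <- lm(ssf ~ bfat)

# Residuals against fitted values; a patternless band around zero
# supports the linear model.
plot(fitted(fit), resid(fit),
     xlab = "Fitted values", ylab = "Residuals", main = "Residual plot")
abline(h = 0, lty = 2)
```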
(c). Different Regressions
ksmooth in R
Lowess Regression
(d). Program and Output
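> # Kernel regression of SSF on Bfat: normal kernel, bandwidth 12
> plot(ais$Bfat, ais$SSF, xlab = 'Bfat', ylab = 'SSF', main = "Adjusted kernel regression")
> lines(ksmooth(ais$Bfat, ais$SSF, "normal", bandwidth = 12))
> # Lowess fit: each local fit uses one third of the points (f = 1/3)
> plot(ais$Bfat, ais$SSF, xlab = 'Bfat', ylab = 'SSF', main = "Adjusted Lowess Regression")
> lines(lowess(ais$Bfat, ais$SSF, f = 1/3))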
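The effect of the two tuning parameters used above, `bandwidth` for `ksmooth()` and `f` for `lowess()`, can be seen by fitting each smoother twice. This is a sketch on simulated stand-ins `bfat` and `ssf` (hypothetical data, not the `ais` columns); the narrow bandwidth of 2 is an arbitrary choice for contrast.

```r
set.seed(1)
bfat <- sort(runif(100, 5, 35))                 # placeholder predictor
ssf  <- 0.6 + 5 * bfat + rnorm(100, sd = 10)    # placeholder response

# Kernel regression: a larger bandwidth averages over more neighbours.
ks_narrow <- ksmooth(bfat, ssf, kernel = "normal", bandwidth = 2)
ks_wide   <- ksmooth(bfat, ssf, kernel = "normal", bandwidth = 12)

# Lowess: f is the proportion of points used in each local fit.
lw_default <- lowess(bfat, ssf)            # default f = 2/3
lw_local   <- lowess(bfat, ssf, f = 1/3)

plot(bfat, ssf, xlab = "Bfat", ylab = "SSF", main = "Smoother comparison")
lines(ks_narrow, lty = 3)
lines(ks_wide,   lty = 1)
lines(lw_local,  lty = 2)
legend("topleft", lty = c(3, 1, 2),
       legend = c("ksmooth, bw = 2", "ksmooth, bw = 12", "lowess, f = 1/3"))
```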
(e). Bootstrapping Confidence Interval for the Relationship between Bfat and SSF
Using the R commands below, we bootstrap the regression to obtain confidence intervals for the intercept and
for the coefficient of the independent variable. The intervals are computed from the original estimates and
the bootstrap standard errors. From the results below, we can compute the 95% confidence interval for both
coefficients.
> meme <- Boot(fatty, R = 1999)
> summary(meme, high.moments = TRUE)
Number of bootstrap replications R = 1999
            original  bootBias  bootSE bootMed  bootSkew bootKurtosis
(Intercept)  0.58595  0.108860 1.61562 0.66005  0.082363     0.166875
ais$Bfat     5.06653 -0.009145 0.13811 5.06174 -0.150524     0.052488
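The quantities in this summary can be illustrated with a hand-rolled case-resampling bootstrap in base R. This is a sketch of the general technique, not the internals of `Boot()`; the data frame `dat` and its columns are simulated stand-ins (assumptions), so the numbers will differ from the output above.

```r
set.seed(1)
n   <- 100
dat <- data.frame(bfat = runif(n, 5, 35))           # placeholder predictor
dat$ssf <- 0.6 + 5 * dat$bfat + rnorm(n, sd = 10)   # placeholder response

fit <- lm(ssf ~ bfat, data = dat)

# Refit the model on R resampled data sets (rows drawn with replacement).
R <- 1999
boot_coef <- t(replicate(R, {
  idx <- sample(n, replace = TRUE)
  coef(lm(ssf ~ bfat, data = dat[idx, ]))
}))

# Bootstrap standard error and bias for each coefficient.
boot_se   <- apply(boot_coef, 2, sd)
boot_bias <- colMeans(boot_coef) - coef(fit)
print(cbind(original = coef(fit), bootBias = boot_bias, bootSE = boot_se))
```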
The 95% confidence interval for critical Z = 1.96 is:
      Original   bootSE    Lower Limit   Upper Limit
B0    0.58595    1.61562   -2.5806652    3.7525652
B1    5.06653    0.13811    4.7958344    5.3372256
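The limits in the table follow from original ± 1.96 × bootSE. A short check of that arithmetic, using only the estimates and standard errors reported above:

```r
# Estimates and bootstrap standard errors from the reported output.
est <- c(B0 = 0.58595, B1 = 5.06653)
se  <- c(B0 = 1.61562, B1 = 0.13811)

z  <- 1.96  # critical value used in the table; qnorm(0.975) is about 1.96
ci <- cbind(Lower = est - z * se, Upper = est + z * se)
print(round(ci, 7))
```

The interval for B0 contains zero, while the interval for B1 lies entirely above zero.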