Statistical Inference and Modeling: Homework 1, Spring 2019

Added on 2023/04/20

AI Summary

This document presents a comprehensive solution to a statistics homework assignment from Columbia University's Statistical Inference and Modeling course. The assignment covers several key concepts, including estimation of quantiles for an exponential distribution, finding the Maximum Likelihood Estimator (MLE) for quantiles, constructing approximate and exact confidence intervals, and hypothesis testing. The solution includes detailed calculations, derivations, and explanations for each exercise. Exercise 1 focuses on the exponential distribution, calculating the pth quantile, finding the MLE of the quantile, and deriving confidence intervals. Exercise 2 explores independent discoveries in science and technology, specifically the Poisson distribution. Exercise 3 delves into goodness of fit and statistical investigation models. Exercise 4 focuses on regression analysis and interpreting results. Exercise 5 addresses hypothesis testing, including one-tailed tests and simulation of data with outliers.

Statistics

Table of Contents
Exercise-1
Exercise-2
Exercise-3
Exercise-4
Exercise-5
References

Exercise-1
ANSWER:
The pth quantile of the population is computed as follows.
Let $Q_D(p)$ denote the pth quantile of $D$, where $D$ has the exponential density
$$f(d; \lambda) = \lambda e^{-\lambda d}, \quad d \ge 0.$$
For $0 < p < 1$, the pth quantile is the value $Q_D(p)$ that solves $F(Q_D(p)) = p$, where $F(d; \lambda) = 1 - e^{-\lambda d}$ is the distribution function. Solving for $Q_D(p)$ gives
$$Q_D(p) = -\frac{\log(1 - p)}{\lambda}.$$
ANSWER:
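The MLE of the quantile follows from the MLE of $\lambda$. The log-likelihood of the sample $D_1, \dots, D_n$ is
$$\ell(\lambda) = n \log \lambda - \lambda \sum_{i=1}^{n} D_i,$$
which is maximised at $\hat\lambda = 1 / \bar D_n$, where $\bar D_n$ is the sample mean. By the invariance property of the MLE,
$$\hat Q_D(p) = -\frac{\log(1 - p)}{\hat\lambda} = -\bar D_n \log(1 - p).$$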
ANSWER:
The approximate confidence interval of $Q_D(p)$ based on its MLE $\hat Q_D(p)$ is given below.
Approximate confidence intervals and hypothesis tests rely on the large-sample behaviour of the estimator. By the asymptotic normality of the MLE, $\hat Q_D(p)$ is approximately normal with mean $Q_D(p)$ and standard error $\hat Q_D(p) / \sqrt{n}$, so an approximate $100(1 - \alpha)\%$ confidence interval for $Q_D(p)$ is
$$\hat Q_D(p) \pm z_{\alpha/2} \frac{\hat Q_D(p)}{\sqrt{n}}.$$
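The plug-in estimate and the approximate interval can be sketched in a few lines of Python. This is a minimal illustration under the exponential model, not the assignment's own code: the function names and the sample values are hypothetical, and the standard error used is the large-sample one, $\hat Q_D(p) / \sqrt{n}$.

```python
import math
from statistics import NormalDist, fmean

def quantile_mle(sample, p):
    """Plug-in MLE of the p-th quantile of an exponential sample."""
    lam_hat = 1.0 / fmean(sample)           # MLE of the rate: 1 / sample mean
    return -math.log(1.0 - p) / lam_hat     # quantile formula with the rate plugged in

def approx_ci(sample, p, alpha=0.05):
    """Large-sample (Wald-type) interval: q_hat +/- z * q_hat / sqrt(n)."""
    n = len(sample)
    q_hat = quantile_mle(sample, p)
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * q_hat / math.sqrt(n)
    return q_hat - half_width, q_hat + half_width

# Hypothetical data, used only to exercise the functions.
sample = [0.5, 1.2, 0.3, 2.1, 0.9, 1.7, 0.4, 1.1]
q_hat = quantile_mle(sample, 0.5)
low, high = approx_ci(sample, 0.5)
print(q_hat, low, high)
```

With only eight observations the interval is wide, which is expected: the normal approximation is a large-sample device.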
ANSWER:
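We can calculate the pivot point estimator with the considered value $Q_D(5)$, $\bar D_n$, which is the mean of the distribution.
$$\bar D_n = \frac{D_1 + D_2 + \dots + D_n}{n}$$
$$P\left(-0.5 \le \sqrt{D}\, \lambda e^{-\lambda d} \le 0.6\right) = 0.5$$
Rearranging the terms gives the equivalent statement,
$$P\left(Q_D(5) - \frac{0.5}{\sqrt{n}} \le Q \le Q_D(5) + \frac{0.6}{\sqrt{n}}\right) = 0.5$$
This gives the interval
$$\left(Q_D(5) - \frac{0.5}{\sqrt{n}},\; \bar D_n + \frac{0.6}{\sqrt{n}}\right)$$
as our 95% confidence interval of $Q_D(5)$.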
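For comparison, a standard exact pivot for exponential data is sketched below. It is an assumption that this is the pivot the exercise intends; the notation $\chi^2_{2n,\,q}$ for the $q$-quantile of the chi-square distribution with $2n$ degrees of freedom is introduced here for illustration.

```latex
% Pivot: for D_1, ..., D_n i.i.d. exponential with rate \lambda,
2\lambda \sum_{i=1}^{n} D_i = 2 n \lambda \bar D_n \sim \chi^2_{2n}

% so that
P\left( \chi^2_{2n,\,\alpha/2} \le 2 n \lambda \bar D_n \le \chi^2_{2n,\,1-\alpha/2} \right) = 1 - \alpha

% Substituting \lambda = -\log(1-p) / Q_D(p) and solving for Q_D(p):
\left[ \frac{-2 n \bar D_n \log(1-p)}{\chi^2_{2n,\,1-\alpha/2}},\;
       \frac{-2 n \bar D_n \log(1-p)}{\chi^2_{2n,\,\alpha/2}} \right]
```

Because the pivot's distribution does not depend on $\lambda$, this interval has exact coverage $1 - \alpha$ for every $n$.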

Exercise-2
ANSWER:
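The likelihood function of the sample $R_1, \dots, R_n$ is
$$L(\mu, \sigma^2 \mid R) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ -\frac{(R_i - \mu)^2}{2\sigma^2} \right\}$$
The log-likelihood function is
$$\ell(\mu, \sigma^2 \mid R) = -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log(\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (R_i - \mu)^2$$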
2.a)
ANSWER:
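Derive the bias of $\hat\gamma$, where $\gamma = E[R_i^3]$ and
$$\hat\gamma = \frac{1}{n} \sum_{i=1}^{n} R_i^3.$$
Taking expectations,
$$E[\hat\gamma] = \frac{1}{n} \sum_{i=1}^{n} E[R_i^3] = \frac{1}{n} \cdot n\gamma = \gamma,$$
so that
$$\mathrm{Bias}(\hat\gamma) = E[\hat\gamma] - \gamma = \gamma - \gamma = 0.$$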
b)
ANSWER:
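The log-likelihood function is
$$\ell(\mu, \sigma^2 \mid R) = -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log(\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (R_i - \mu)^2$$
The estimator
$$\hat\gamma = \frac{1}{n} \sum_{i=1}^{n} R_i^3$$
is consistent for $\gamma$: its bias is zero and, by the law of large numbers, it converges in probability to $\gamma$ as $n$ increases.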

3.
ANSWER:
The estimator of the third moment $\mu_3$ is unbiased. As in part a), the sample mean of the cubes, $\frac{1}{n} \sum_{i=1}^{n} R_i^3$, serves as the unbiased estimator $\hat\gamma$ (Miller, 2012).
4.
ANSWER:
The minimum variance unbiased estimator has lower variance than any other unbiased estimator for all possible values of the parameter $\gamma$. It can be obtained by using the Rao-Blackwell theorem.
5.
ANSWER:
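First, in the proof of the Rao-Blackwell theorem, the conditional expectation of the estimator given the statistic does not depend on $\gamma$, because the statistic is sufficient.
Secondly, the estimator is unbiased, since iterating the expectations yields,
$$E[\gamma] = E\{E(\gamma \mid R)\} = E(\gamma) = R(\gamma)$$
$$\lambda = R / \gamma$$
The minimum variance unbiased values are,
$$V(R + \lambda(R' - R)) = V(R) + R^2/\gamma - 2R^2/\gamma = V(R) - R^2/\gamma$$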
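The unbiasedness of $\hat\gamma = \frac{1}{n}\sum R_i^3$ claimed in this exercise can be checked with a small simulation. The sketch below assumes normally distributed $R_i$, for which $E[R^3] = \mu^3 + 3\mu\sigma^2$; the parameter values, sample size, and function names are hypothetical choices made for illustration.

```python
import random
from statistics import fmean

def gamma_hat(sample):
    """Sample mean of the cubes, the estimator of E[R^3]."""
    return fmean(r ** 3 for r in sample)

def average_estimate(mu, sigma, n, reps, seed=0):
    """Average of gamma_hat over many simulated normal samples of size n."""
    rng = random.Random(seed)               # fixed seed so the run is repeatable
    estimates = [
        gamma_hat([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(reps)
    ]
    return fmean(estimates)

mu, sigma = 1.0, 0.5                         # hypothetical parameter values
true_gamma = mu ** 3 + 3 * mu * sigma ** 2   # third raw moment of N(mu, sigma^2)
avg_estimate = average_estimate(mu, sigma, n=20, reps=20000)
print(true_gamma, avg_estimate)
```

If the estimator is unbiased, the average of the simulated estimates should sit close to `true_gamma` even for a small sample size such as `n=20`.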

Exercise-3
1. We have chosen the summary on goodness of fit, which was written by the author Merton. The summary refers to two tables that assess the frequency of multiples. Table 1 gives the fit of the Poisson distribution to the observed frequencies of multiples, using Merton's method as the general source for the table and for calculating the value of μ (Raghuvanshi, 2016). Table 2 gives the fit of the Poisson distribution to the reconstructed frequencies of multiples. In both Table 1 and Table 2 the Poisson distribution gives fairly good fits. There is about one chance in a thousand that the discrepancies are due to mere chance or random noise in the data.
2. Tables 1 and 2 show that the Poisson distributions offer fairly good fits, although the fit is perfect in neither case. The optimal μ is 1.2, a value slightly less than that in Table 1. Even so, the two values are close enough to suggest that the true value is probably somewhat more than unity and certainly less than two.
3. The benefit of the statistical investigation model is that it can be applied to the Poisson frequency function, with about one chance in a thousand that any discrepancies are due to mere chance or random noise in the data.
4. The exception of the variance 96% to share the distribution.
5. Probability plotting is a graphical method that allows a visual assessment of the model fit. Once the model's parameters have been estimated, the probability plot can be created. The following figure shows a comparison of the probability plots of two choices of distribution using the same data set. When MLE solutions are plotted alongside the median ranks (that is, the points are plotted according to the median ranks and the line according to the MLE solutions), the plot is not completely representative. The MLE method is actually independent of any kind of ranks. Therefore, the MLE solution often appears not to track the data on the probability plot.
6. We can use the algorithm of the linear and distributed model, and use the following code.

7.
The asymptotic distribution of the first derivative of the likelihood function is,
8. The plot shows the coverage probability of the nominal 95% standard confidence interval with p = 0.5 and n = 10 to 100. At n = 97, the coverage is still only about 0.933, so the coverage probability does not get steadily closer to the nominal confidence level as n increases. At n = 17, the coverage probability is 0.951, but at the much larger value n = 40, the coverage is only 0.919.
9. The analysis of the results in Table 1 and Table 2 gives the value μ = 0.992. The investigator's desired level of confidence is then chosen (most commonly 95%, but any level between 0 and 100 percent can be selected). Z is the value from the standard normal distribution for the selected confidence level (e.g., for a 95% confidence level, Z = 1.96). There is a 95% probability (0.95) that the confidence interval will contain the true population mean.
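The Z value in item 9 and the coverage figures in item 8 can be reproduced with a short calculation. The sketch below assumes that the "standard" interval is the Wald interval for a binomial proportion, $\hat p \pm z\sqrt{\hat p(1-\hat p)/n}$, and computes its exact coverage by summing binomial probabilities. The function name `wald_coverage` is hypothetical.

```python
import math
from statistics import NormalDist

def wald_coverage(n, p, level=0.95):
    """Exact coverage of the Wald interval for a binomial proportion.

    Sums the binomial probabilities of every outcome x whose interval
    p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n) contains the true p.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    covered = 0.0
    for x in range(n + 1):
        p_hat = x / n
        half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
        if p_hat - half_width <= p <= p_hat + half_width:
            covered += math.comb(n, x) * p ** x * (1.0 - p) ** (n - x)
    return covered

z_95 = NormalDist().inv_cdf(0.975)   # two-sided 95% critical value, about 1.96
cov_17 = wald_coverage(17, 0.5)      # item 8 reports 0.951
cov_40 = wald_coverage(40, 0.5)      # item 8 reports 0.919
print(z_95, cov_17, cov_40)
```

Looping `wald_coverage` over n = 10 to 100 gives the oscillating pattern that item 8 describes: a larger n does not guarantee coverage closer to the nominal level.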

Exercise-4
1.
[Figure: plot of "age.tot"; horizontal axis from 1 to 157, vertical axis from 0 to 1]
2. Interpreting the slope of a regression line. The slope is interpreted in algebra as rise over run. For example, if the slope is 2, you can write this as 2/1 and say that as you move along the line, as the value of the X variable increases by 1, the value of the Y variable increases by 2.
3. The two expected parameters are,
a) Main variance(x)
b) Linear regression(x)
4. The noise variable can consider,
5. The p-value for each term tests the null hypothesis that the coefficient is equal to zero (no effect). A low p-value (< 0.05) indicates that you can reject the null hypothesis. However, the p-value for East (0.092) is greater than the common alpha level of 0.05, which indicates that it is not statistically significant.
6.
• Nature is not interested in units of measurement.
• Changing the units of your explanatory variables in a regression model.
• The values increase, on average, with the measurement.
• The simple plot of the line graph.
7.
• Eat right and lose excess weight.
• Control high blood pressure.
• Control diabetes.
• Do not smoke.
8.
"age.tot"
"18 2.44" "19 3.86" "19 -1.22" "20 2.3" "21 0.98" "21 -0.5"
"21 2.74" "22 -0.12" "22 -1.21" "22 0.99" "22 2.07" "22 1.55"
"22 5.19" "23 -0.38" "23 1.62" "23 0.2" "23 0.15" "23 -1.28"
"23 0.78" "23 2.94" "23 -2.12" "23 2.46" "23 4.63" "23 1.64"
"24 -0.28" "24 0.67" "24 -0.43" "24 -2.1" "24 -0.82" "24 3.92"
"24 1.63" "24 1.24" "24 -0.66" "24 0.15" "24 3.17" "24 -0.58"
"25 2.69" "25 -0.24" "25 1.74" "25 5.07" "25 -3.3" "25 1.51"
"25 3.93" "25 -0.87" "25 0.38" "25 -0.47" "26 2.33" "26 -0.05"
"26 -1.1" "26 2.07" "26 -1.35" "26 -2.45" "26 -1.39" "27 0.59"
"27 1.79" "27 1.27" "27 0.41" "27 -1.57" "27 0.83" "27 1.76"
"27 -1.44" "27 2.55" "27 1.81" "27 2.82" "27 1.37" "28 -1.37"
"28 1.27" "28 1.22" "28 -0.05" "29 -1.09" "29 3.83" "29 0.89"
"29 1.19" "29 -0.07" "30 1.52" "30 0.56" "30 0.05" "30 -0.12"
"31 1.69" "31 -0.84" "31 0.6" "31 -0.6" "31 0.05" "31 1.31"
"31 -1.8" "32 1.8" "32 2.38" "32 2.86" "32 4.53" "32 1.32"
"32 0.27" "32 -2.29" "33 0.3" "33 1.99" "33 -1.04" "33 1.44"
"33 1.58" "33 1.38" "34 0.44" "34 1.15" "34 -0.09" "34 0.55"
"35 -0.91" "35 1.68" "36 -1.41" "36 -1.47" "36 0.33" "36 -1.94"
"36 3.65" "37 -0.87" "38 -0.97" "38 1.04" "39 -1.55" "41 1.87"
"41 -4" "41 -0.16" "42 -2.13" "43 -0.86" "44 -0.93" "45 -0.53"
"45 -0.6" "46 -1.62" "47 -3.81" "48 0.11" "51 -0.56" "53 3.22"
"54 -1.87" "54 2.56" "54 -1.25" "54 -2.95" "55 -0.01" "56 -2.9"
"57 -1.31" "57 -4.89" "60 -0.1" "60 -2.81" "60 -1.72" "61 -3.6"
"61 -0.51" "62 -0.5" "62 -3.12" "62 -4.46" "63 -3.56" "63 0.88"
"65 -1.56" "65 -6.45" "68 -3.1" "69 -1.62" "70 1.01" "71 -3.77"
"72 -2.53" "73 -2.45" "73 -0.33" "74 -5.73" "80 -5.14" "82 -2.08"
9. The plot shows the coverage probability of the nominal 95% standard interval for p = 0.2. The number of trials n varies from 25 to 100. It is clear from the plot that the oscillation is significant and that the coverage probability does not steadily get closer to the nominal confidence level as n increases.
10. I am using side-by-side box plots with the actual data points plotted in a layer on top, but jittered to avoid overplotting. I used coord_flip() to make it easier to compare the distributions and because a short, wide plot fits the page better. I outlined the box plot in red and used red for the outlier colours so that they can be distinguished from the plotted data points.
11. Using leave-one-out (LOO) cross-validation, our results showed a correct classification rate of 83%. We also compared the proposed model with another based on overall brain functional connectivity.
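The leave-one-out procedure mentioned in item 11 can be sketched as follows. The classifier (nearest class centroid on a single feature) and the six data points are hypothetical stand-ins chosen for illustration; they are not the model or the data behind the 83% figure.

```python
from statistics import fmean

def predict_nearest_centroid(train, x):
    """Predict the label whose class mean is closest to x."""
    labels = {label for _, label in train}
    centroids = {
        label: fmean(value for value, lab in train if lab == label)
        for label in labels
    }
    return min(centroids, key=lambda label: abs(x - centroids[label]))

def loo_accuracy(data):
    """Leave-one-out cross-validation: hold out each point once,
    fit on the remaining points, and count correct predictions."""
    correct = 0
    for i, (x, y) in enumerate(data):
        train = data[:i] + data[i + 1:]      # every point except the i-th
        if predict_nearest_centroid(train, x) == y:
            correct += 1
    return correct / len(data)

# Hypothetical toy data: (feature, label). The last point lies close to
# class "a", so it is misclassified when it is the held-out point.
data = [(1.0, "a"), (1.2, "a"), (0.8, "a"), (3.0, "b"), (3.2, "b"), (1.4, "b")]
accuracy = loo_accuracy(data)
print(accuracy)
```

Each point is predicted by a model that never saw it, so the resulting rate is an estimate of out-of-sample classification accuracy rather than a measure of training fit.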