Supplemental notes on simple linear regression, covering the basic least squares estimation problem, the moments of the estimators, and the fundamental optimality property of these estimators (the Gauss-Markov Theorem).
ESE 302                                                                Tony E. Smith

NOTES ON SIMPLE LINEAR REGRESSION

1. INTRODUCTION

The purpose of these notes is to supplement the mathematical development of linear regression in Devore (2008). This development also draws on the treatments in Johnston (1963) and Larsen and Marx (1986). We begin with the basic least squares estimation problem, and next develop the moments of the estimators. Finally, the fundamental optimality property of these estimators is established in terms of the Gauss-Markov Theorem.

2. LINEAR LEAST SQUARES ESTIMATION

The basic linear model assumes the existence of a linear relationship between two variables, $x$ and $y$, which is disturbed by some random error, $\varepsilon$. Hence for each value of $x$ the corresponding $y$-value is a random variable of the form

(2.1)  $Y = \beta_0 + \beta_1 x + \varepsilon$

where $\beta_0$ and $\beta_1$ are designated, respectively, as the intercept parameter and the slope parameter of the linear function, $\beta_0 + \beta_1 x$. If $n$ values $(x_i : i = 1,\ldots,n)$ of $x$ are observed, with corresponding errors $(\varepsilon_i : i = 1,\ldots,n)$, then the resulting random variables, $(Y_i : i = 1,\ldots,n)$, are given by

(2.2)  $Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$,  $i = 1,\ldots,n$

In this context it is assumed that the random errors $(\varepsilon_i : i = 1,\ldots,n)$ are independently and identically distributed (iid) with mean zero and variance $\sigma^2$, so that

(2.3)  $E(\varepsilon_i) = 0$,  $i = 1,\ldots,n$

(2.4)  $\operatorname{var}(\varepsilon_i) = \sigma^2$,  $i = 1,\ldots,n$

If values of $y$ corresponding to $(x_i : i = 1,\ldots,n)$ are also observed, and are denoted by $(y_i : i = 1,\ldots,n)$, then the least squares estimation problem is to find estimates, $\hat\beta_0$ and $\hat\beta_1$, of the unknown parameter values, $\beta_0$ and $\beta_1$, which minimize the sum of squared residuals [designated as $f(b_0, b_1)$ in Devore, p. 455]:
(2.5)  $S(\beta_0, \beta_1) = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2$

This function is easily seen to be convex and differentiable in $\beta_0$ and $\beta_1$, so that the unique solution $(\hat\beta_0, \hat\beta_1)$ is given by the first-order conditions:

(2.6)  $0 = \dfrac{\partial}{\partial \beta_0} S(\hat\beta_0, \hat\beta_1) = \sum_i 2\,(y_i - \hat\beta_0 - \hat\beta_1 x_i)(-1)$

(2.7)  $0 = \dfrac{\partial}{\partial \beta_1} S(\hat\beta_0, \hat\beta_1) = \sum_i 2\,(y_i - \hat\beta_0 - \hat\beta_1 x_i)(-x_i)$

If we let $\bar{x} = \frac{1}{n}\sum_i x_i$ and $\bar{y} = \frac{1}{n}\sum_i y_i$, then by (2.6)

(2.8)  $0 = \sum_i y_i - n\hat\beta_0 - \hat\beta_1 \sum_i x_i \;\Rightarrow\; 0 = \bar{y} - \hat\beta_0 - \hat\beta_1 \bar{x} \;\Rightarrow\; \hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x}$

and by (2.7)

(2.9)  $0 = \sum_i (y_i - \hat\beta_0 - \hat\beta_1 x_i)\, x_i$

To simplify (2.9), let the estimated $y$-value corresponding to $(\hat\beta_0, \hat\beta_1)$ be defined by

(2.10)  $\hat{y}_i = \hat\beta_0 + \hat\beta_1 x_i$,  $i = 1,\ldots,n$

and rewrite (2.9) as

(2.11)  $0 = \sum_i (y_i - \hat{y}_i)\, x_i$

Note also from (2.8) that

(2.12)  $\sum_i (y_i - \hat{y}_i) = \sum_i y_i - \sum_i (\hat\beta_0 + \hat\beta_1 x_i) = n\bar{y} - n\hat\beta_0 - \hat\beta_1 n\bar{x} = n(\bar{y} - \hat\beta_0 - \hat\beta_1 \bar{x}) = 0$
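As an illustration of the minimization problem (2.5), the following minimal sketch (hypothetical data values; numpy and scipy are assumptions, not part of the original notes) minimizes $S(\beta_0, \beta_1)$ numerically with a general-purpose optimizer and checks that the first-order conditions (2.6) and (2.7) hold at the minimizer. The closed-form solution is derived below.

```python
# Minimal sketch of the least squares problem (2.5), using hypothetical data.
import numpy as np
from scipy.optimize import minimize

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # hypothetical x-values
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])   # hypothetical observed y-values

def S(b):
    b0, b1 = b
    return np.sum((y - b0 - b1 * x) ** 2)   # sum of squared residuals (2.5)

res = minimize(S, x0=[0.0, 0.0])            # numerical minimizer of S
b0_hat, b1_hat = res.x

# First-order conditions (2.6)-(2.7): both sums should be approximately zero
print(np.sum(y - b0_hat - b1_hat * x))        # ~ 0, condition (2.6)
print(np.sum((y - b0_hat - b1_hat * x) * x))  # ~ 0, condition (2.7)
```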
To solve for $\hat\beta_1$ we first observe by subtracting (2.8) from (2.10) that

(2.13)  $\hat{y}_i - \bar{y} = \hat\beta_1 (x_i - \bar{x}) \;\Rightarrow\; y_i - \bar{y} = (y_i - \hat{y}_i) + \hat\beta_1 (x_i - \bar{x})$,  $i = 1,\ldots,n$

Hence, multiplying both sides by $(x_i - \bar{x})$ and summing over $i$, we obtain

(2.14)  $\sum_i (y_i - \bar{y})(x_i - \bar{x}) = \sum_i (y_i - \hat{y}_i)(x_i - \bar{x}) + \hat\beta_1 \sum_i (x_i - \bar{x})^2$

But since (2.11) and (2.12) imply

(2.15)  $\sum_i (y_i - \hat{y}_i)(x_i - \bar{x}) = \sum_i (y_i - \hat{y}_i)\, x_i - \bar{x} \sum_i (y_i - \hat{y}_i) = 0$

we may conclude from (2.14) that

(2.16)  $\hat\beta_1 = \dfrac{\sum_i (y_i - \bar{y})(x_i - \bar{x})}{\sum_i (x_i - \bar{x})^2}$

[See expression (12.2) in Devore, p. 456.] Finally, by employing (2.8), we may solve for $\hat\beta_0$ in terms of $\hat\beta_1$ as

(2.17)  $\hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x}$

[See expression (12.3) in Devore, p. 456.]
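A minimal numerical sketch of the closed-form estimates (2.16) and (2.17), using the same hypothetical data as above (numpy is an assumption, not part of the original notes): the estimates are compared with numpy's built-in straight-line fit, and the residual identities (2.11) and (2.12) are verified.

```python
# Closed-form least squares estimates (2.16)-(2.17) on hypothetical data.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])

x_bar, y_bar = x.mean(), y.mean()
Sxx = np.sum((x - x_bar) ** 2)

b1_hat = np.sum((y - y_bar) * (x - x_bar)) / Sxx   # slope estimate (2.16)
b0_hat = y_bar - b1_hat * x_bar                    # intercept estimate (2.17)

# Agreement with numpy's least squares line (polyfit returns slope, then intercept)
slope, intercept = np.polyfit(x, y, deg=1)
print(np.allclose([b1_hat, b0_hat], [slope, intercept]))   # True

# Residual identities (2.11) and (2.12) at the fitted values (2.10)
y_hat = b0_hat + b1_hat * x
print(np.isclose(np.sum((y - y_hat) * x), 0.0))    # (2.11): True
print(np.isclose(np.sum(y - y_hat), 0.0))          # (2.12): True
```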
3. MOMENTS OF THE ESTIMATORS

The estimators in (2.16) and (2.17) depend on the values of the random variables, $(Y_i : i = 1,\ldots,n)$, and hence are themselves random variables. In particular, if the sample mean of the $Y_i$'s is denoted by

(3.1)  $\bar{Y} = \dfrac{1}{n}\sum_i Y_i = \beta_0 + \beta_1 \Big(\dfrac{1}{n}\sum_i x_i\Big) + \dfrac{1}{n}\sum_i \varepsilon_i = \beta_0 + \beta_1 \bar{x} + \dfrac{1}{n}\sum_i \varepsilon_i$

then it follows at once from (2.16) that $\hat\beta_1$ is a random variable of the form

(3.2)  $\hat\beta_1 = \dfrac{\sum_i (Y_i - \bar{Y})(x_i - \bar{x})}{\sum_i (x_i - \bar{x})^2}$

and, similarly, that $\hat\beta_0$ is a random variable of the form

(3.3)  $\hat\beta_0 = \bar{Y} - \hat\beta_1 \bar{x}$

To compute the moments of the slope estimator, $\hat\beta_1$, it is convenient to simplify expression (3.2) as follows. By breaking (3.2) into two terms

(3.4)  $\hat\beta_1 = \dfrac{\sum_i Y_i (x_i - \bar{x})}{\sum_i (x_i - \bar{x})^2} - \dfrac{\bar{Y} \sum_i (x_i - \bar{x})}{\sum_i (x_i - \bar{x})^2}$

and observing that

(3.5)  $\sum_i (x_i - \bar{x}) = \sum_i x_i - n\bar{x} = \sum_i x_i - n\Big(\dfrac{1}{n}\sum_i x_i\Big) = 0$

we see that the second term vanishes, and hence that the estimator $\hat\beta_1$ can be written as a linear combination of the $Y_i$'s

(3.6)  $\hat\beta_1 = \sum_i w_i Y_i$

where the coefficients $w_i$ are of the form

(3.7)  $w_i = \dfrac{x_i - \bar{x}}{\sum_j (x_j - \bar{x})^2}$,  $i = 1,\ldots,n$

and hence are non-random (i.e., depend only on the given values of the $x_i$'s). To analyze (3.6) we begin with several observations about the coefficient values in (3.7). First observe from (3.5) that

(3.8)  $\sum_i w_i = \dfrac{\sum_i (x_i - \bar{x})}{\sum_i (x_i - \bar{x})^2} = 0$

and moreover that

(3.9)  $\sum_i w_i^2 = \dfrac{\sum_i (x_i - \bar{x})^2}{\big[\sum_i (x_i - \bar{x})^2\big]^2} = \dfrac{1}{\sum_i (x_i - \bar{x})^2}$

which together with (3.8) also implies

(3.10)  $\sum_i w_i x_i = \sum_i w_i x_i - \bar{x}\sum_i w_i = \sum_i w_i (x_i - \bar{x}) = \dfrac{\sum_i (x_i - \bar{x})^2}{\sum_i (x_i - \bar{x})^2} = 1$
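The following sketch (same hypothetical data as before; numpy is an assumption) checks that the weighted-sum form (3.6) with weights (3.7) reproduces the ratio formula (2.16), and verifies the identities (3.8), (3.9), and (3.10) numerically.

```python
# Weights w_i from (3.7), computed for a hypothetical design.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])

x_bar = x.mean()
Sxx = np.sum((x - x_bar) ** 2)
w = (x - x_bar) / Sxx                               # (3.7): non-random weights

# (3.6): the slope estimate is a linear combination of the observations
b1_hat = np.sum((y - y.mean()) * (x - x_bar)) / Sxx
print(np.isclose(np.sum(w * y), b1_hat))            # True

# Identities (3.8)-(3.10)
print(np.isclose(np.sum(w), 0.0))                   # (3.8): weights sum to 0
print(np.isclose(np.sum(w ** 2), 1.0 / Sxx))        # (3.9): sum of squared weights
print(np.isclose(np.sum(w * x), 1.0))               # (3.10): weighted sum of x is 1
```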
To compute the mean of $\hat\beta_1$, observe from (2.2) and (2.3) that

(3.11)  $E(Y_i) = \beta_0 + \beta_1 x_i + E(\varepsilon_i) = \beta_0 + \beta_1 x_i$

so that by (3.6), together with (3.8) and (3.10),

(3.12)  $E(\hat\beta_1) = \sum_i w_i\, E(Y_i) = \sum_i w_i (\beta_0 + \beta_1 x_i) = \beta_0 \sum_i w_i + \beta_1 \sum_i w_i x_i = \beta_0 (0) + \beta_1 (1) = \beta_1$

Thus $\hat\beta_1$ is an unbiased estimator of $\beta_1$. Moreover, since (3.1) and (2.3) imply that

(3.13)  $E(\bar{Y}) = \beta_0 + \beta_1 \bar{x} + \dfrac{1}{n}\sum_i E(\varepsilon_i) = \beta_0 + \beta_1 \bar{x}$

it follows from (3.3) together with (3.13) that

(3.14)  $E(\hat\beta_0) = E(\bar{Y}) - E(\hat\beta_1)\, \bar{x} = (\beta_0 + \beta_1 \bar{x}) - \beta_1 \bar{x} = \beta_0$

and thus that $\hat\beta_0$ is also an unbiased estimator of $\beta_0$.

To compute the variance of $\hat\beta_1$, we again observe from (3.6) that

(3.15)  $\hat\beta_1 = \sum_i w_i Y_i = \sum_i w_i (\beta_0 + \beta_1 x_i + \varepsilon_i) = \sum_i w_i (\beta_0 + \beta_1 x_i) + \sum_i w_i \varepsilon_i = \text{const} + \sum_i w_i \varepsilon_i$

and hence (from the independence of the $\varepsilon_i$'s) that

(3.16)  $\operatorname{var}(\hat\beta_1) = \operatorname{var}\Big(\sum_i w_i \varepsilon_i\Big) = \sum_i w_i^2 \operatorname{var}(\varepsilon_i)$

Hence we may conclude from (2.4) and (3.7) that

(3.17)  $\operatorname{var}(\hat\beta_1) = \sigma^2 \sum_i w_i^2 = \sigma^2\, \dfrac{\sum_i (x_i - \bar{x})^2}{\big[\sum_j (x_j - \bar{x})^2\big]^2} = \dfrac{\sigma^2}{\sum_i (x_i - \bar{x})^2}$

[See expression (12.4) in Devore, p. 470.]
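A small Monte Carlo sketch illustrating the unbiasedness results (3.12) and (3.14) and the variance formula (3.17). The parameter values ($\beta_0 = 1$, $\beta_1 = 2$, $\sigma = 0.5$) and the design are hypothetical, and numpy is an assumption; the estimators are recomputed over many simulated error draws and their empirical moments compared with the theoretical values.

```python
# Monte Carlo check of (3.12), (3.14), and (3.17) under hypothetical parameter values.
import numpy as np

rng = np.random.default_rng(0)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])          # fixed design values
beta0, beta1, sigma = 1.0, 2.0, 0.5              # assumed "true" parameters
n_rep = 100_000                                  # number of simulated samples

x_bar = x.mean()
Sxx = np.sum((x - x_bar) ** 2)

eps = rng.normal(0.0, sigma, size=(n_rep, x.size))    # iid errors as in (2.3)-(2.4)
Y = beta0 + beta1 * x + eps                           # model (2.2), one row per sample
b1 = ((Y - Y.mean(axis=1, keepdims=True)) * (x - x_bar)).sum(axis=1) / Sxx   # (2.16)
b0 = Y.mean(axis=1) - b1 * x_bar                                             # (2.17)

print(b1.mean(), beta1)              # (3.12): empirical mean of b1 is close to 2.0
print(b0.mean(), beta0)              # (3.14): empirical mean of b0 is close to 1.0
print(b1.var(), sigma**2 / Sxx)      # (3.17): empirical variance close to sigma^2/Sxx
```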
Similarly, to determine the variance of $\hat\beta_0$, we observe from the above relations that

(3.18)  $\hat\beta_0 = \bar{Y} - \hat\beta_1 \bar{x} = \dfrac{1}{n}\sum_i Y_i - \bar{x}\sum_i w_i Y_i = \sum_i \Big(\dfrac{1}{n} - \bar{x} w_i\Big) Y_i$
        $= \sum_i \Big(\dfrac{1}{n} - \bar{x} w_i\Big)(\beta_0 + \beta_1 x_i + \varepsilon_i) = \sum_i \Big(\dfrac{1}{n} - \bar{x} w_i\Big)(\beta_0 + \beta_1 x_i) + \sum_i \Big(\dfrac{1}{n} - \bar{x} w_i\Big)\varepsilon_i = \text{const} + \sum_i \Big(\dfrac{1}{n} - \bar{x} w_i\Big)\varepsilon_i$

and hence that

(3.19)  $\operatorname{var}(\hat\beta_0) = \sum_i \Big(\dfrac{1}{n} - \bar{x} w_i\Big)^2 \operatorname{var}(\varepsilon_i) = \sigma^2 \sum_i \Big(\dfrac{1}{n^2} - \dfrac{2\bar{x} w_i}{n} + \bar{x}^2 w_i^2\Big)$
        $= \sigma^2 \Big(\dfrac{1}{n} - \dfrac{2\bar{x}}{n}\sum_i w_i + \bar{x}^2 \sum_i w_i^2\Big) = \sigma^2 \Big(\dfrac{1}{n} - \dfrac{2\bar{x}}{n}(0) + \dfrac{\bar{x}^2}{\sum_i (x_i - \bar{x})^2}\Big)$
        $= \sigma^2\, \dfrac{\sum_i (x_i - \bar{x})^2 + n\bar{x}^2}{n \sum_i (x_i - \bar{x})^2} = \sigma^2\, \dfrac{\sum_i x_i^2 - 2\bar{x}\sum_i x_i + n\bar{x}^2 + n\bar{x}^2}{n \sum_i (x_i - \bar{x})^2} = \dfrac{\sigma^2 \sum_i x_i^2}{n \sum_i (x_i - \bar{x})^2}$
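The last steps of (3.19) rest on the identity $\sum_i (x_i - \bar{x})^2 + n\bar{x}^2 = \sum_i x_i^2$. The short sketch below (hypothetical design values; numpy is an assumption) checks this identity numerically, together with the equivalence of the intermediate form $\sigma^2\big(\tfrac{1}{n} + \bar{x}^2/\sum_i (x_i - \bar{x})^2\big)$ and the final form of (3.19).

```python
# Numerical check of the algebra in (3.19), using a hypothetical design.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
sigma = 0.5
n = x.size
x_bar = x.mean()
Sxx = np.sum((x - x_bar) ** 2)

# Identity behind the final step: Sxx + n*x_bar^2 equals the raw sum of squares
print(np.isclose(Sxx + n * x_bar**2, np.sum(x**2)))           # True

# Equivalent forms of var(beta0_hat): the intermediate and final lines of (3.19)
var_b0_a = sigma**2 * (1.0 / n + x_bar**2 / Sxx)
var_b0_b = sigma**2 * np.sum(x**2) / (n * Sxx)
print(np.isclose(var_b0_a, var_b0_b))                         # True
```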
4. THE GAUSS-MARKOV THEOREM

Finally we establish the fundamental optimality property of the above estimators. To do so, recall that for an independent random sample $(Y_1,\ldots,Y_n)$ from a population with mean $\mu = E(Y)$, the sample mean, $\bar{Y}_n$, was shown to be a best linear unbiased (BLU) estimator of $\mu$. This optimality property turns out to be shared by the least-squares estimators $(\hat\beta_0, \hat\beta_1)$ above. This result, known as the Gauss-Markov Theorem, provides the single strongest justification for linear least-squares estimation, and can be stated as follows:

GAUSS-MARKOV THEOREM. For any linear function, $L = a_0 \beta_0 + a_1 \beta_1$, of $(\beta_0, \beta_1)$, the least squares estimator, $\hat{L} = a_0 \hat\beta_0 + a_1 \hat\beta_1$, has minimum variance among all linear unbiased estimators of $L$.

Proof: We shall prove this assertion only for the linear function with coefficients $(a_0, a_1) = (0, 1)$, i.e., for the estimator, $\hat\beta_1$, of the slope parameter, $\beta_1$ (which is by far the more important of the two individual parameters). The argument for any other linear function of $\beta_0$ and $\beta_1$ is essentially the same. To begin with, observe from (3.6) that $\hat\beta_1$ is indeed a linear estimator, i.e., is a linear function of the random variables $(Y_i : i = 1,\ldots,n)$. Moreover, it was shown in (3.12) that $\hat\beta_1$ is also an unbiased estimator of $\beta_1$. Hence it remains only to show that the variance of $\hat\beta_1$ never exceeds that of any other linear unbiased estimator. To do so, consider any other linear estimator, say

(4.1)  $\tilde\beta_1 = \sum_i c_i Y_i$

and suppose that $\tilde\beta_1$ is also an unbiased estimator of $\beta_1$. Then, as in (3.12), we must have

(4.2)  $\beta_1 = E(\tilde\beta_1) = \sum_i c_i\, E(Y_i) = \sum_i c_i (\beta_0 + \beta_1 x_i) = \beta_0 \sum_i c_i + \beta_1 \sum_i c_i x_i$

But since unbiasedness requires that (4.2) hold for all values of the unknown parameters $\beta_0$ and $\beta_1$, it follows by setting $\beta_0 = 1$ and $\beta_1 = 0$ that

(4.3)  $\sum_i c_i = 0$

and in turn, by setting $\beta_1 = 1$, that

(4.4)  $\sum_i c_i x_i = 1$

Hence, in a manner identical with (3.15), these two conditions are seen to imply that

(4.5)  $\tilde\beta_1 = \sum_i c_i Y_i = \sum_i c_i (\beta_0 + \beta_1 x_i) + \sum_i c_i \varepsilon_i = \beta_1 + \sum_i c_i \varepsilon_i$

and thus that the variance of $\tilde\beta_1$ is given by
(4.6)  $\operatorname{var}(\tilde\beta_1) = \operatorname{var}\Big(\sum_i c_i \varepsilon_i\Big) = \sigma^2 \sum_i c_i^2$

To compare this with $\operatorname{var}(\hat\beta_1)$, observe first that if the differences between the coefficients of $\tilde\beta_1$ and $\hat\beta_1$ in (4.1) and (3.6) are denoted by $d_i = c_i - w_i$, $i = 1,\ldots,n$, then (4.6) can be rewritten as

(4.7)  $\operatorname{var}(\tilde\beta_1) = \sigma^2 \sum_i (w_i + d_i)^2 = \sigma^2 \Big(\sum_i w_i^2 + \sum_i d_i^2\Big) + 2\sigma^2 \sum_i w_i d_i$

But by (4.3) and (4.4) together with (3.8) and (3.10) we must have

(4.8)  $0 = \sum_i c_i = \sum_i (w_i + d_i) = \sum_i w_i + \sum_i d_i = (0) + \sum_i d_i \;\Rightarrow\; \sum_i d_i = 0$

(4.9)  $1 = \sum_i c_i x_i = \sum_i w_i x_i + \sum_i d_i x_i = 1 + \sum_i d_i x_i \;\Rightarrow\; \sum_i d_i x_i = 0$

which together imply that

(4.10)  $\sum_i w_i d_i = \dfrac{\sum_i d_i (x_i - \bar{x})}{\sum_j (x_j - \bar{x})^2} = \dfrac{\sum_i d_i x_i - \bar{x}\sum_i d_i}{\sum_j (x_j - \bar{x})^2} = 0$

Hence, recalling (3.17), we see that (4.7) reduces to

(4.11)  $\operatorname{var}(\tilde\beta_1) = \sigma^2 \sum_i w_i^2 + \sigma^2 \sum_i d_i^2 = \operatorname{var}(\hat\beta_1) + \sigma^2 \sum_i d_i^2$

and may conclude from the nonnegativity of $\sigma^2 \sum_i d_i^2$ that

(4.12)  $\operatorname{var}(\hat\beta_1) \le \operatorname{var}(\tilde\beta_1)$

Thus $\hat\beta_1$ has minimum variance among all linear unbiased estimators, and the result is established.

5. REFERENCES

Devore, J.L. (2008) Probability and Statistics for Engineering and the Sciences, Seventh Edition, Duxbury Press, Belmont, California.

Larsen, R.J. and M.L. Marx (1986) An Introduction to Mathematical Statistics and its Applications, Second Edition, Prentice-Hall, Englewood Cliffs, N.J.

Johnston, J. (1963) Econometric Methods, McGraw-Hill, New York.
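As a numerical illustration of the theorem, the following sketch (hypothetical design and parameter values; numpy is an assumption) compares the variance of the least squares slope, $\sigma^2 \sum_i w_i^2$ from (3.17), with the variance $\sigma^2 \sum_i c_i^2$ from (4.6) of another linear unbiased estimator of $\beta_1$, namely the two-point slope $(Y_n - Y_1)/(x_n - x_1)$.

```python
# Numerical illustration of the Gauss-Markov result for the slope.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # hypothetical design
sigma = 0.5
x_bar = x.mean()
Sxx = np.sum((x - x_bar) ** 2)

# Least squares weights w_i from (3.7)
w = (x - x_bar) / Sxx

# Alternative coefficients c_i: uses only the two extreme observations.
# They satisfy (4.3) and (4.4), so sum(c_i * Y_i) is also linear and unbiased.
c = np.zeros_like(x)
c[0], c[-1] = -1.0 / (x[-1] - x[0]), 1.0 / (x[-1] - x[0])
print(np.isclose(c.sum(), 0.0), np.isclose(np.sum(c * x), 1.0))   # True True

# Variances sigma^2 * sum(w_i^2) and sigma^2 * sum(c_i^2), as in (3.17) and (4.6)
var_ls  = sigma**2 * np.sum(w**2)
var_alt = sigma**2 * np.sum(c**2)
print(var_ls <= var_alt)    # True: the least squares slope has the smaller variance
```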