# Linear Regression


## Model

This is a recap of the theory of linear regression developed as part of linear normal models.

Suppose $Y$ is a random variable which depends on the variable $X$ in the sense that

$$ ( Y \mid X = x ) \;\sim\; N( \alpha + \beta x ,\, \sigma^2 ) $$

Suppose we have $n$ observations of the pair $( Y , X )$, which we denote $( y_i , x_i )$. Alternatively we can regard these as observations of the $n$ independent random variables $( Y_i , X_i )$, $i = 1, \dots, n$, and write the model as

$$ Y_i \;\sim\; N( \alpha + \beta x_i ,\, \sigma^2 ) $$

The individual density functions are therefore

$$ f( y_i ) \;=\; \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left( \frac{ -( y_i - \alpha - \beta x_i )^2 }{ 2\sigma^2 } \right) $$

So the likelihood function is:

$$ L( \mathbf{y} ) \;=\; ( 2\pi\sigma^2 )^{-n/2} \exp\!\left( \frac{ -\sum_i ( y_i - \alpha - \beta x_i )^2 }{ 2\sigma^2 } \right) $$

where $\mathbf{y} = ( y_1 , y_2 , \dots , y_n )$. The log of the likelihood is

$$ l( \mathbf{y} ) \;=\; -\frac{n}{2} \ln( 2\pi ) \,-\, n \ln( \sigma ) \,-\, \frac{ \sum_i ( y_i - \alpha - \beta x_i )^2 }{ 2\sigma^2 } $$

$$ \;=\; \text{constant} \,-\, n \ln( \sigma ) \,-\, \frac{S}{2\sigma^2} $$

putting $S = \sum_i ( y_i - \alpha - \beta x_i )^2$.

For fixed $\sigma$, $l$ (and therefore also $L$) attains its maximum when $S$ attains its minimum, i.e. at the following values of $\beta$ and $\alpha$ respectively:

$$ \hat{\beta} \;:=\; \frac{ \sum x_i y_i - \sum x_i \sum y_i / n }{ \sum x_i^2 - \left( \sum x_i \right)^2 / n } \;=\; \frac{ \sum x_i y_i - n \bar{x} \bar{y} }{ \sum x_i^2 - n \bar{x}^2 }, \qquad \hat{\alpha} \;:=\; \bar{y} - \hat{\beta} \bar{x} $$
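As an illustration, the closed-form estimates above can be computed directly. This is a minimal sketch with invented data; the function name `fit_line` is ours, not from the source.

```python
def fit_line(x, y):
    """Return (alpha_hat, beta_hat) minimising S = sum (y_i - a - b x_i)^2."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # beta_hat = (sum x_i y_i - n x_bar y_bar) / (sum x_i^2 - n x_bar^2)
    beta_hat = (sum(xi * yi for xi, yi in zip(x, y)) - n * x_bar * y_bar) / \
               (sum(xi ** 2 for xi in x) - n * x_bar ** 2)
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat

# Invented data, roughly following y = 2x
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
alpha_hat, beta_hat = fit_line(x, y)
```

For these data the fitted line is approximately $\hat{\alpha} = 0.05$, $\hat{\beta} = 1.99$.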

## Sums of Squares

We now introduce the following quantities:

$$ S_{xx} \;:=\; \sum ( x_i - \bar{x} )^2 \;=\; \sum x_i^2 - \left( \sum x_i \right)^2 / n $$

$$ S_{yy} \;:=\; \sum ( y_i - \bar{y} )^2 \;=\; \sum y_i^2 - \left( \sum y_i \right)^2 / n $$

$$ S_{xy} \;:=\; \sum ( x_i - \bar{x} )( y_i - \bar{y} ) \;=\; \sum x_i y_i - \left( \sum x_i \right)\left( \sum y_i \right) / n $$

These are loosely known as "**sums of squares**", although they are also sometimes, more appropriately, referred to as "**sums of squares of deviations**" and "**sums of squares of differences**", and in the case of the third quantity as the "**sum of the products of deviations**".

The term *sum of squares* is not well defined in statistics, and is used for a plethora of expressions of the form "the sum, over a certain range, of squares of the differences between the observations and the average of those observations over that range". See, for example, the residual sum of squares below.

We can now define the estimates in terms of these sums:

$$ \hat{\beta} \;=\; S_{xy} / S_{xx}, \qquad \hat{\alpha} \;=\; \bar{y} - \hat{\beta} \bar{x} $$
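A quick numerical check (with invented data) that the two forms of each "sum of squares" agree, and that $\hat{\beta} = S_{xy} / S_{xx}$ reproduces the earlier estimate:

```python
# Invented data for illustration only
x = [1.0, 2.0, 3.0, 4.0]
y = [1.2, 1.9, 3.2, 3.8]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

# Deviation form of the sums of squares
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Shortcut form (the right-hand sides of the definitions above)
s_xx_alt = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
s_xy_alt = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n

beta_hat = s_xy / s_xx
alpha_hat = y_bar - beta_hat * x_bar
```

Both forms give the same values (up to floating-point rounding), as the algebra guarantees.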

## Residuals

Now that we have calculated $\hat{\alpha}$ and $\hat{\beta}$, we define the **residual** of an observation as

$$ \rho_i \;=\; y_i - \hat{\alpha} - \hat{\beta} x_i $$

Note that a residual is an *observable quantity*: it is the distance from the observation to the estimated regression line $\hat{\alpha} + \hat{\beta} x$, and not the distance to the hypothetical (unknown) line $\alpha + \beta x$.

The **residual sum of squares** is defined as

$$ \mathrm{RSS} \;:=\; \sum ( y_i - \hat{\alpha} - \hat{\beta} x_i )^2 \;=\; \sum \left( ( y_i - \bar{y} ) - \hat{\beta} ( x_i - \bar{x} ) \right)^2 \qquad [\text{substituting for } \hat{\alpha}] $$

$$ \;=\; \sum ( y_i - \bar{y} )^2 + \hat{\beta}^2 \sum ( x_i - \bar{x} )^2 - 2 \hat{\beta} \sum ( y_i - \bar{y} )( x_i - \bar{x} ) $$

$$ \;=\; S_{yy} + \left( S_{xy}^2 / S_{xx}^2 \right) S_{xx} - 2 \left( S_{xy} / S_{xx} \right) S_{xy} \;=\; S_{yy} - S_{xy}^2 / S_{xx} $$

$$ \mathrm{RSS} \;=\; S_{yy} - \hat{\beta} S_{xy} $$
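The identity $\mathrm{RSS} = S_{yy} - \hat{\beta} S_{xy}$ can be checked numerically; this sketch (with invented data) compares it against the direct definition as a sum of squared residuals:

```python
# Invented data for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.1, 2.3, 2.8, 4.2, 4.9]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

beta_hat = s_xy / s_xx
alpha_hat = y_bar - beta_hat * x_bar

# Direct definition: sum of the squared residuals rho_i
rss_direct = sum((yi - alpha_hat - beta_hat * xi) ** 2 for xi, yi in zip(x, y))
# Shortcut derived above
rss_identity = s_yy - beta_hat * s_xy
```

The two values agree up to floating-point rounding.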

## Sampling Distribution of Estimates

$$ \hat{\beta} \;\sim\; N( \beta ,\, \sigma^2 / S_{xx} ) \quad\Rightarrow\quad ( \hat{\beta} - \beta ) \sqrt{S_{xx}} \;\sim\; N( 0 ,\, \sigma^2 ) $$

$$ \mathrm{RSS} / \sigma^2 \;\sim\; \chi^2( n - 2 ) \quad\Rightarrow\quad \mathrm{RSS} / ( n - 2 ) \;\sim\; \sigma^2 \, \chi^2( n - 2 ) / ( n - 2 ) $$

The quantities $( \hat{\beta} - \beta ) \sqrt{S_{xx}}$ and $\mathrm{RSS} / ( n - 2 )$ are independent, so

$$ \frac{ ( \hat{\beta} - \beta ) \sqrt{S_{xx}} }{ \sqrt{ \mathrm{RSS} / ( n - 2 ) } } \;\sim\; t( n - 2 ) $$

[ Note: $\sqrt{ \mathrm{RSS} / ( ( n - 2 ) S_{xx} ) }$ is the estimated standard deviation of $\hat{\beta}$. ]

So to test the hypothesis $H: \beta = 0$, we can refer the quantity

$$ \frac{ \hat{\beta} }{ \sqrt{ \mathrm{RSS} / ( ( n - 2 ) S_{xx} ) } } $$

to a $t( n - 2 )$ distribution.
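As a worked illustration (with invented, deliberately near-linear data), the test statistic can be computed as follows:

```python
import math

# Invented data, close to the line y = x
x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, 2.1, 2.9, 4.0]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

beta_hat = s_xy / s_xx                        # = 0.98 for these data
rss = s_yy - beta_hat * s_xy                  # residual sum of squares
se_beta = math.sqrt(rss / ((n - 2) * s_xx))   # estimated standard deviation of beta_hat
t = beta_hat / se_beta                        # refer to t(n - 2), here t(2)
```

Here $t \approx 23.1$ on $2$ degrees of freedom, so $H: \beta = 0$ would be firmly rejected, as expected for data this close to a straight line.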

## ANOVA Table for Regression

We have

$$ \mathrm{TSS} \;=\; S_{yy} $$

$$ \mathrm{RSS} \;=\; S_{yy} - \hat{\beta} S_{xy}, \qquad \text{where } \hat{\beta} = S_{xy} / S_{xx} $$

$$ \mathrm{ESS} \;=\; \mathrm{TSS} - \mathrm{RSS} \;=\; S_{xy}^2 / S_{xx} $$

So we can complete the ANOVA table using the "sums of squares":

| Source | Sum of Squares | d.f. | Mean Square | $r$ |
|---|---|---|---|---|
| Explained | $S_{xy}^2 / S_{xx}$ | $1$ | $\mathrm{ESS} / 1$ | $\mathrm{MS_E} / \mathrm{MS_R}$ |
| Residual | $S_{yy} - S_{xy}^2 / S_{xx}$ | $n - 2$ | $\mathrm{RSS} / ( n - 2 )$ | |
| Total | $S_{yy}$ | $n - 1$ | | |

The significance probability for the hypothesis $H: \beta = 0$ is $1 - F(r)$, where $r$ is the mean square ratio $\mathrm{MS_E} / \mathrm{MS_R}$ and $F$ is the cumulative distribution function of the $F( 1 , n - 2 )$ distribution.
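The ANOVA quantities can be assembled numerically; this sketch (same invented data as before) also checks the standard fact that the mean square ratio equals the square of the $t$ statistic from the previous section:

```python
import math

# Invented data, close to the line y = x
x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, 2.1, 2.9, 4.0]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
beta_hat = s_xy / s_xx

tss = s_yy                     # total sum of squares
rss = s_yy - beta_hat * s_xy   # residual sum of squares
ess = tss - rss                # explained sum of squares (= s_xy**2 / s_xx)

ms_e = ess / 1                 # explained mean square
ms_r = rss / (n - 2)           # residual mean square
r = ms_e / ms_r                # mean square ratio, refer to F(1, n - 2)

# Since F(1, n - 2) is the distribution of the square of a t(n - 2) variable,
# r coincides with the square of the t statistic for H: beta = 0
t = beta_hat / math.sqrt(rss / ((n - 2) * s_xx))
```

Both routes lead to the same test: $r = t^2$, and the $F(1, n-2)$ significance probability equals that of the two-sided $t$ test.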