Comparing Regression Slopes

Dual Slopes

Suppose that the observations are divided into two groups, with ~n_1 and ~n_2 observations respectively, so that ~n_1 + ~n_2 = ~n. We will assume a linear regression model for each group, but with different parameters for each group.

I.e. _ &mu. = &alpha._~k + &beta._~k ~x _ for observations in group ~k = 1, 2.

This model can be represented by the linear space _ L = span ( #~v_1 , #~v_2 , #~x_1 , #~x_2 ) _ where

#{~v}_1 = ( 1, ... , 1, 0, ... , 0 )

#{~v}_2 = ( 0, ... , 0, 1, ... , 1 )

#{~x}_1 = ( ~x_1, ... , ~x_{~n_1}, 0, ... , 0 )

#{~x}_2 = ( 0, ... , 0, ~x_{~n_1+1}, ... , ~x_~n )

p ( ~#y ) _ = _ est{&alpha._1} #~v_1 + est{&alpha._2} #~v_2 + est{&beta.}_1 #~x_1 + est{&beta.}_2 #~x_2

note that _ #~v_1 #. #~v_2 = 0 , _ #~v_1 #. #~x_2 = 0 , _ etc., so

~#y #. #~x_1 _ = _ p ( ~#y ) #. #~x_1 _ = _ est{&alpha._1} #~v_1 #. #~x_1 + est{&beta.}_1 #~x_1 #. #~x_1

~#y #. #~v_1 _ = _ p ( ~#y ) #. #~v_1 _ = _ est{&alpha._1} #~v_1 #. #~v_1 + est{&beta.}_1 #~x_1 #. #~v_1

And similarly for #~v_2 and #~x_2. _ These have exactly the same form as the equivalent expressions for simple regression, applied to each group separately, which is exactly what you'd expect. The parameters can therefore be estimated by taking the estimates for the individual groups:

_ est{&beta.}_~k _ = _ ~S_{~x~y}^{(~k)} ./ ~S_{~x~x}^{(~k)}

_ est{&alpha._~k} _ = _ $~y^{(~k)} - est{&beta.}_~k $~x^{(~k)} , _ _ _ ~k = 1 , 2

where $~y^{(~k)}, $~x^{(~k)} are the means of ~y and ~x in the ~k^{th} group, and

~S_{~x~x}^{(~k)} = &sum. ( ~x - $~x^{(~k)} ) ^2 , _ etc.

the sums being over the values in the individual groups.
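As a rough illustration of these formulas, the following Python sketch computes the per-group estimates and then checks numerically that the residual is orthogonal to each of the four spanning vectors. The data values and the helper names (mean, s_cross, group_fit) are made up for this example and are not part of the derivation itself.

```python
def mean(values):
    return sum(values) / len(values)

def s_cross(a, b):
    """Centred sum of products: s_cross(x, y) is S_xy, s_cross(x, x) is S_xx."""
    a_bar, b_bar = mean(a), mean(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))

def group_fit(x, y):
    """Return (alpha_hat, beta_hat) for one group, using only that group's data."""
    beta = s_cross(x, y) / s_cross(x, x)
    alpha = mean(y) - beta * mean(x)
    return alpha, beta

# Hypothetical data: group 1 has n_1 = 5 observations, group 2 has n_2 = 4.
x1, y1 = [1.0, 2.0, 3.0, 4.0, 5.0], [1.1, 2.9, 5.2, 7.1, 8.8]
x2, y2 = [2.0, 4.0, 6.0, 8.0], [3.0, 3.9, 5.2, 5.8]

alpha1, beta1 = group_fit(x1, y1)
alpha2, beta2 = group_fit(x2, y2)

# Residuals y - p(y), group by group.
res1 = [yi - alpha1 - beta1 * xi for xi, yi in zip(x1, y1)]
res2 = [yi - alpha2 - beta2 * xi for xi, yi in zip(x2, y2)]

# The residual should be orthogonal to v_k and x_k (zero up to rounding error).
dot_v1 = sum(res1)
dot_x1 = sum(r * xi for r, xi in zip(res1, x1))
dot_v2 = sum(res2)
dot_x2 = sum(r * xi for r, xi in zip(res2, x2))

print("group 1:", alpha1, beta1)
print("group 2:", alpha2, beta2)
print("orthogonality checks:", dot_v1, dot_x1, dot_v2, dot_x2)
```

The four dot products come out as zero apart from floating-point rounding, which is the numerical counterpart of the equations ~#y #. #~x_1 = p ( ~#y ) #. #~x_1 , etc.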

RSS for Dual Slope

The residual sum of squares in this case is

RSS_~d _ = _ || #~y - p ( #~y ) ||^2 _ = _ || #~y - ( $~y ^{(1)} - est{&beta.}_1 $~x ^{(1)} ) #~v_1 - est{&beta.}_1 #~x_1 - ( $~y ^{(2)} - est{&beta.}_2 $~x ^{(2)} ) #~v_2 - est{&beta.}_2 #~x_2 ||^2

_ _ _ = _ || ( #~y_1 - $~y ^{(1)} #~v_1 ) - est{&beta.}_1 ( #~x_1 - $~x ^{(1)} #~v_1 ) + ( #~y_2 - $~y ^{(2)} #~v_2 ) - est{&beta.}_2 ( #~x_2 - $~x ^{(2)} #~v_2 ) ||^2

where _ #{~y}_1 = ( ~y_1, ... , ~y_{~n_1}, 0, ... , 0 ) , _ #{~y}_2 = ( 0, ... , 0, ~y_{~n_1+1}, ... , ~y_~n ) .

Because of the orthogonality of the two sets of vectors (#~v_1 #. #~x_2 = 0 , _ etc.), the norm can be split in two:

RSS_~d _ = _ || ( #~y_1 - $~y ^{(1)} #~v_1 ) - est{&beta.}_1 ( #~x_1 - $~x ^{(1)} #~v_1 ) ||^2 _ + _ || ( #~y_2 - $~y ^{(2)} #~v_2 ) - est{&beta.}_2 ( #~x_2 - $~x ^{(2)} #~v_2 ) ||^2

_ _ _ = _ RSS_1 + RSS_2

_ _ _ = _ S_{~y~y}^{(1)} - ( S_{~y~x}^{(1)} )^2 ./ S_{~x~x}^{(1)} _ + _ S_{~y~y}^{(2)} - ( S_{~y~x}^{(2)} )^2 ./ S_{~x~x}^{(2)}

_ _ _ = _ S_{~y~y}^{(1)} - est{&beta.}_1 S_{~y~x}^{(1)} + S_{~y~y}^{(2)} - est{&beta.}_2 S_{~y~x}^{(2)}

This has ~n - 4 degrees of freedom, since four parameters are estimated, so the estimate of the variance is

~s_~d^2 _ = _ RSS_~d ./ ( ~n - 4 )
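The following sketch, in Python, evaluates the dual-slope residual sum of squares in two ways: by the shortcut formula in terms of the S quantities, and directly by summing squared residuals. The data and the function names (rss_group, rss_direct) are invented for the illustration.

```python
def mean(values):
    return sum(values) / len(values)

def s_cross(a, b):
    """Centred sum of products: s_cross(x, y) is S_xy, s_cross(y, y) is S_yy."""
    a_bar, b_bar = mean(a), mean(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))

def rss_group(x, y):
    """RSS for one group via the shortcut S_yy - (S_yx)^2 / S_xx."""
    return s_cross(y, y) - s_cross(x, y) ** 2 / s_cross(x, x)

def rss_direct(x, y):
    """RSS for one group by fitting the line and summing squared residuals."""
    beta = s_cross(x, y) / s_cross(x, x)
    alpha = mean(y) - beta * mean(x)
    return sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))

# Hypothetical data for the two groups.
x1, y1 = [1.0, 2.0, 3.0, 4.0, 5.0], [1.1, 2.9, 5.2, 7.1, 8.8]
x2, y2 = [2.0, 4.0, 6.0, 8.0], [3.0, 3.9, 5.2, 5.8]

n = len(x1) + len(x2)

# RSS_d = RSS_1 + RSS_2, on n - 4 degrees of freedom.
rss_d = rss_group(x1, y1) + rss_group(x2, y2)
rss_d_check = rss_direct(x1, y1) + rss_direct(x2, y2)
s2_d = rss_d / (n - 4)

print("RSS_d (shortcut):", rss_d)
print("RSS_d (direct):  ", rss_d_check)
print("s_d^2:", s2_d)
```

Note the brackets in rss_d / (n - 4): the whole of ~n - 4 is the divisor.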

Uniform Slope

Note that the simple linear regression model is a submodel of the one just described, and we could do the usual model reduction test for this. Accepting this would mean that the two groups of observations had the same intercept and the same slope.

However, there are a couple of intermediate models which it could be interesting to test for first: whether the two groups have the same intercept but differing slopes, or whether they have differing intercepts and the same slope. It is the second model we concentrate on here.

The space for this model is generated by the three vectors:

#{~v}_1 = ( 1, ... , 1, 0, ... , 0 )

#{~v}_2 = ( 0, ... , 0, 1, ... , 1 )

#~x = ( ~x_1, ... , ~x_~n )

p ( ~#y ) _ = _ est{&alpha._1} #~v_1 + est{&alpha._2} #~v_2 + est{&beta.} #~x

Note that the estimates for &alpha._1 and &alpha._2 will not be the same as in the previous model. We still have _ #~v_1 #. #~v_2 = 0 , _ so:

~#y #. #~x _ = _ p ( ~#y ) #. #~x _ = _ est{&alpha._1} #~v_1 #. #~x + est{&alpha._2} #~v_2 #. #~x + est{&beta.} #~x #. #~x

~#y #. #~v_1 _ = _ p ( ~#y ) #. #~v_1 _ = _ est{&alpha._1} #~v_1 #. #~v_1 + est{&beta.} #~x #. #~v_1

~#y #. #~v_2 _ = _ p ( ~#y ) #. #~v_2 _ = _ est{&alpha._2} #~v_2 #. #~v_2 + est{&beta.} #~x #. #~v_2

The last two equations give:

_ est{&alpha._1} _ = _ $~y ^{(1)} - est{&beta.} $~x ^{(1)}

_ est{&alpha._2} _ = _ $~y ^{(2)} - est{&beta.} $~x ^{(2)}

So the first equation becomes

~#y #. #~x _ = _ ( $~y ^{(1)} - est{&beta.} $~x ^{(1)} ) ~n_1 $~x ^{(1)} + ( $~y ^{(2)} - est{&beta.} $~x ^{(2)} ) ~n_2 $~x ^{(2)} + est{&beta.} #~x #. #~x

~#y #. #~x - ~n_1 $~y ^{(1)} $~x ^{(1)} - ~n_2 $~y ^{(2)} $~x ^{(2)} _ = _ est{&beta.} #~x #. #~x - est{&beta.} $~x ^{(1)} ~n_1 $~x ^{(1)} - est{&beta.} $~x ^{(2)} ~n_2 $~x ^{(2)}

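Now the sums _ ~#y #. #~x _ and _ ~#x #. #~x _ can be split over the first ~n_1 terms and the last ~n_2 terms. Within group ~k we have _ &sum. ~x ~y - ~n_~k $~x^{(~k)} $~y^{(~k)} = ~S_{~y~x}^{(~k)} , _ and similarly for ~S_{~x~x}^{(~k)}, so this reduces to: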

~S_{~y~x}^{(1)} + ~S_{~y~x}^{(2)} _ = _ est{&beta.} ( ~S_{~x~x}^{(1)} + ~S_{~x~x}^{(2)} )

est{&beta.} _ = _ fract{~S_{~y~x}^{(1)} + ~S_{~y~x}^{(2)},~S_{~x~x}^{(1)} + ~S_{~x~x}^{(2)}}
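To make the pooled estimate concrete, here is a small Python sketch that computes the common slope and the two intercepts, then checks that the residual satisfies the three equations above (orthogonality to #~v_1, #~v_2 and #~x). The data and the name uniform_slope_fit are assumptions made for this example only.

```python
def mean(values):
    return sum(values) / len(values)

def s_cross(a, b):
    """Centred sum of products: s_cross(x, y) is S_xy, s_cross(x, x) is S_xx."""
    a_bar, b_bar = mean(a), mean(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))

def uniform_slope_fit(x1, y1, x2, y2):
    """Return (alpha1_hat, alpha2_hat, beta_hat) for the common-slope model."""
    beta = (s_cross(x1, y1) + s_cross(x2, y2)) / (s_cross(x1, x1) + s_cross(x2, x2))
    alpha1 = mean(y1) - beta * mean(x1)
    alpha2 = mean(y2) - beta * mean(x2)
    return alpha1, alpha2, beta

# Hypothetical data for the two groups.
x1, y1 = [1.0, 2.0, 3.0, 4.0, 5.0], [1.1, 2.9, 5.2, 7.1, 8.8]
x2, y2 = [2.0, 4.0, 6.0, 8.0], [3.0, 3.9, 5.2, 5.8]

alpha1, alpha2, beta = uniform_slope_fit(x1, y1, x2, y2)

# Residuals y - p(y): separate intercepts, one shared slope.
res1 = [yi - alpha1 - beta * xi for xi, yi in zip(x1, y1)]
res2 = [yi - alpha2 - beta * xi for xi, yi in zip(x2, y2)]

# Normal equations: residual orthogonal to v_1, v_2 and the full vector x.
dot_v1 = sum(res1)
dot_v2 = sum(res2)
dot_x = sum(r * xi for r, xi in zip(res1 + res2, x1 + x2))

print("alpha1, alpha2, beta:", alpha1, alpha2, beta)
print("orthogonality checks:", dot_v1, dot_v2, dot_x)
```

The pooled slope is a weighted average of the two separate slopes, with weights ~S_{~x~x}^{(1)} and ~S_{~x~x}^{(2)}, so it always lies between them.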

RSS for Uniform Slope

The residual sum of squares in the case of a uniform slope is

RSS_~u _ = _ || #~y - p ( #~y ) ||^2 _ = _ || #~y - ( $~y ^{(1)} - est{&beta.} $~x ^{(1)} ) #~v_1 - ( $~y ^{(2)} - est{&beta.} $~x ^{(2)} ) #~v_2 - est{&beta.} #~x ||^2

Now _ #~x _ = _ #~x_1 + #~x_2 , _ so:

RSS_~u _ = _ || ( #~y_1 - $~y ^{(1)} #~v_1 ) - est{&beta.} ( #~x_1 - $~x ^{(1)} #~v_1 ) + ( #~y_2 - $~y ^{(2)} #~v_2 ) - est{&beta.} ( #~x_2 - $~x ^{(2)} #~v_2 ) ||^2

Proceeding in the same way as for the dual slope case above, the orthogonality of the two groups lets us split the norm:

RSS_~u _ = _ || ( #~y_1 - $~y ^{(1)} #~v_1 ) - est{&beta.} ( #~x_1 - $~x ^{(1)} #~v_1 ) ||^2 _ + _ || ( #~y_2 - $~y ^{(2)} #~v_2 ) - est{&beta.} ( #~x_2 - $~x ^{(2)} #~v_2 ) ||^2

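_ _ _ = _ S_{~y~y}^{(1)} + est{&beta.}^2 S_{~x~x}^{(1)} - 2 est{&beta.} S_{~y~x}^{(1)} + S_{~y~y}^{(2)} + est{&beta.}^2 S_{~x~x}^{(2)} - 2 est{&beta.} S_{~y~x}^{(2)}

and since _ est{&beta.} ( S_{~x~x}^{(1)} + S_{~x~x}^{(2)} ) = S_{~y~x}^{(1)} + S_{~y~x}^{(2)} , _ the terms in est{&beta.}^2 cancel half of the cross terms, leaving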

_ _ _ = _ S_{~y~y}^{(1)} + S_{~y~y}^{(2)} - est{&beta.} ( S_{~y~x}^{(1)} + S_{~y~x}^{(2)} )

_ _ _ = _ S_{~y~y}^{(1)} + S_{~y~y}^{(2)} - ( S_{~y~x}^{(1)} + S_{~y~x}^{(2)} )^2 ./ ( S_{~x~x}^{(1)} + S_{~x~x}^{(2)} )

and this has ~n - 3 degrees of freedom.
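A short Python sketch of this calculation follows, comparing the closed form for RSS_~u with a direct sum of squared residuals from the common-slope fit. The data and the variable names are hypothetical, chosen only for the illustration.

```python
def mean(values):
    return sum(values) / len(values)

def s_cross(a, b):
    """Centred sum of products: s_cross(x, y) is S_xy, s_cross(y, y) is S_yy."""
    a_bar, b_bar = mean(a), mean(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))

# Hypothetical data for the two groups.
x1, y1 = [1.0, 2.0, 3.0, 4.0, 5.0], [1.1, 2.9, 5.2, 7.1, 8.8]
x2, y2 = [2.0, 4.0, 6.0, 8.0], [3.0, 3.9, 5.2, 5.8]

n = len(x1) + len(x2)

# Pooled within-group sums of squares and products.
s_xx = s_cross(x1, x1) + s_cross(x2, x2)
s_yx = s_cross(x1, y1) + s_cross(x2, y2)
s_yy = s_cross(y1, y1) + s_cross(y2, y2)

# Closed form: RSS_u = S_yy - (S_yx)^2 / S_xx, using the pooled sums.
rss_u = s_yy - s_yx ** 2 / s_xx

# Direct calculation from the fitted common-slope model.
beta = s_yx / s_xx
alpha1 = mean(y1) - beta * mean(x1)
alpha2 = mean(y2) - beta * mean(x2)
rss_u_direct = (
    sum((yi - alpha1 - beta * xi) ** 2 for xi, yi in zip(x1, y1))
    + sum((yi - alpha2 - beta * xi) ** 2 for xi, yi in zip(x2, y2))
)

df_u = n - 3  # three parameters estimated: alpha_1, alpha_2, beta

print("RSS_u (closed form):", rss_u)
print("RSS_u (direct):     ", rss_u_direct)
print("degrees of freedom: ", df_u)
```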

Test for Uniform Slope

Suppose we have two groups of observations as described in the Dual Slope case above, and we want to test to see if the two slopes are the same. This is testing for model reduction from the dual case to the Uniform Slope case.

In the ANOVA table for this test, the total sum of squares is the residual sum of squares for the reduced model ( RSS_~u ), the residual sum of squares is that of the original model ( RSS_~d ), and the explained sum of squares is the difference. The degrees of freedom are calculated similarly: ( ~n - 3 ) - ( ~n - 4 ) = 1 for the explained sum of squares.

These quantities are displayed in the following table:

	Sum of Squares	d.f.	Mean Square	~r
Explained	RSS_~u - RSS_~d	1	MS_E = ( RSS_~u - RSS_~d ) ./ 1	MS_E ./ MS_R
Residual	RSS_~d	~n - 4	MS_R = RSS_~d ./ ( ~n - 4 )
Total	RSS_~u	~n - 3
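The entries of this table can be computed as in the following Python sketch, which uses made-up data and hypothetical variable names. It also checks the explained sum of squares against the identity _ RSS_~u - RSS_~d = ~S_{~x~x}^{(1)} ~S_{~x~x}^{(2)} ( est{&beta.}_1 - est{&beta.}_2 )^2 ./ ( ~S_{~x~x}^{(1)} + ~S_{~x~x}^{(2)} ) , _ which is not derived above but follows from the two RSS formulas by direct algebra.

```python
def mean(values):
    return sum(values) / len(values)

def s_cross(a, b):
    """Centred sum of products: s_cross(x, y) is S_xy, s_cross(y, y) is S_yy."""
    a_bar, b_bar = mean(a), mean(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))

# Hypothetical data for the two groups.
x1, y1 = [1.0, 2.0, 3.0, 4.0, 5.0], [1.1, 2.9, 5.2, 7.1, 8.8]
x2, y2 = [2.0, 4.0, 6.0, 8.0], [3.0, 3.9, 5.2, 5.8]

n = len(x1) + len(x2)

sxx1, sxy1, syy1 = s_cross(x1, x1), s_cross(x1, y1), s_cross(y1, y1)
sxx2, sxy2, syy2 = s_cross(x2, x2), s_cross(x2, y2), s_cross(y2, y2)

# Residual sums of squares for the dual-slope and uniform-slope models.
rss_d = (syy1 - sxy1 ** 2 / sxx1) + (syy2 - sxy2 ** 2 / sxx2)
rss_u = (syy1 + syy2) - (sxy1 + sxy2) ** 2 / (sxx1 + sxx2)

# ANOVA table entries.
explained = rss_u - rss_d          # 1 degree of freedom
ms_e = explained / 1
ms_r = rss_d / (n - 4)             # n - 4 degrees of freedom
r = ms_e / ms_r                    # the test ratio

# Cross-check of the explained sum of squares via the two separate slopes.
beta1, beta2 = sxy1 / sxx1, sxy2 / sxx2
explained_check = sxx1 * sxx2 * (beta1 - beta2) ** 2 / (sxx1 + sxx2)

print("Explained:", explained, "d.f. 1")
print("Residual: ", rss_d, "d.f.", n - 4)
print("Total:    ", rss_u, "d.f.", n - 3)
print("r =", r)
```

Under the usual assumption of independent normal errors with common variance, and if the two slopes really are equal, the ratio ~r follows an F distribution with 1 and ~n - 4 degrees of freedom, so large values of ~r are evidence against a uniform slope. If SciPy happens to be available, a p-value could be obtained with scipy.stats.f.sf(r, 1, n - 4).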