Linear Spaces and Projection Matrices

Linear Space associated with a Factor

Suppose F is a factor on a set of observations Y, and suppose we index the observations, _ Y = \{ ~y_1, ~y_2, ... , ~y_~n\}. Then with each non-empty level ~f of F we can associate a vector _ #~v &in. &reals.^~n , _ #~v = (~v_1, ... , ~v_~n )^T , _ where _ ~v_~i = 1 _ if _ F( ~y_~i ) = ~f , _ and _ ~v_~i = 0 _ otherwise.

#{Example}
If ~n = 5 and the factor F has three non-empty levels, _ ~f_1 , _ ~f_2 , _ and _ ~f_3 , _ where _ ~y_1 &rightarrow. ~f_1, _ ~y_2 &rightarrow. ~f_1, _ ~y_3 &rightarrow. ~f_3, _ ~y_4 &rightarrow. ~f_2, _ ~y_5 &rightarrow. ~f_3, _ then writing #~v_~j for the vector associated with ~f_~j:

#~v_1 _ = _ matrix{1/1/0/0/0}, _ _ _ #~v_2 _ = _ matrix{0/0/0/1/0}, _ _ _ #~v_3 _ = _ matrix{0/0/1/0/1}

These vectors are orthogonal ( ~#v_~j&dot.~#v_~k = 0 _ if _ ~j != ~k ) and non-zero (since each level is non-empty), and therefore linearly independent. So they form a basis for a subspace L_F of &reals.^~n . _ L_F _ = _ span \{#~v_~j\}_{~j = 1, ... , | F |} , _ dim ( L_F ) _ = _ | F | _ _ the number of (non-empty) levels of F.
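As an illustration, here is a small numerical sketch of these indicator vectors in Python with NumPy (the array F encoding the example factor, and all names used, are our own, purely illustrative choices):

    import numpy as np

    # The example factor: y_1 -> f_1, y_2 -> f_1, y_3 -> f_3, y_4 -> f_2, y_5 -> f_3
    F = np.array([1, 1, 3, 2, 3])

    # Indicator vector v_j for each non-empty level f_j
    V = [(F == j).astype(float) for j in np.unique(F)]

    print(V[0])               # [1. 1. 0. 0. 0.]  -- the vector v_1 above
    print(V[0] @ V[2])        # 0.0 -- vectors for distinct levels are orthogonal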

Design Matrix

Suppose the factor F has ~k non-empty levels. We can now construct an ~n × ~k matrix whose columns are just the vectors associated with the levels of the factor; this is called the ~#{design matrix} of the factor:

( X_F )_{~i , ~j} _ = _ array{ 1, _ if _ F( ~i ) = ~f_~j/ 0,otherwise}

In the above example,

X_F _ = _ matrix{ 1, 0, 0 / 1, 0, 0 / 0, 0, 1 / 0, 1, 0 / 0, 0, 1 }

Note that changing the order of indexing of the levels would produce a slightly different design matrix, with interchanged columns. The linear space L_F itself is unchanged, however, since it is the span of the same set of vectors; only the ordering of the basis differs. We will demonstrate later that changing the order does not affect the statistical properties of the factor.
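A minimal sketch of the design matrix construction, continuing the NumPy example above (names are illustrative only):

    import numpy as np

    F = np.array([1, 1, 3, 2, 3])
    levels = np.unique(F)                              # [1, 2, 3]

    # n x k design matrix: column j is the indicator vector of level f_j
    X = (F[:, None] == levels[None, :]).astype(float)

    # Re-indexing the levels merely permutes the columns
    X_permuted = X[:, [2, 0, 1]]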

Linear Map associated with a Factor

The design matrix induces a linear map _ &psi.#: &reals.^{~k} &rightarrow. &reals.^{~n} , _ _ &psi. ( #~a ) _ = _ X_F #~a _ for _ #~a &in. &reals.^{~k}.

Clearly _ L_F _ = _ im &psi. , _ since the image of &psi. is precisely the column space of X_F, i.e. the span of the vectors ~#v_1 , ... , ~#v_~k.
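A quick numerical illustration (a sketch, reusing the assumed NumPy encoding from above): every vector in the image of &psi. is constant on each group of observations, which is exactly what it means to lie in L_F.

    import numpy as np

    F = np.array([1, 1, 3, 2, 3])
    X = (F[:, None] == np.unique(F)[None, :]).astype(float)

    a = np.array([10.0, 20.0, 30.0])   # arbitrary coefficients in R^k
    print(X @ a)                       # [10. 10. 30. 20. 30.] -- constant on each level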

Projection associated with a Factor

Given that we have found the linear space, L_F, associated with a factor on the (indexed) set of observations Y (with respect to a given indexing of the levels of the factor), another quantity that will prove useful in the subsequent theory is the (orthogonal) #~{projection} onto L_F. The factor splits the observations into | F | groups.

This is the situation described in One-way Analysis of Variance (Linear Normal Models). There we saw that the orthogonal projection onto L_F, which we will call ~p_F, is given by:

~p_F (#{~x}) _ = _ ( ${~x}_{F(1)}, ${~x}_{F(2)}, ... , ${~x}_{F(~n)} )

where

${~x}_{F(~i)} _ = _ fract{sum{~{x_r},{\{ ~r | F(~r) = F(~i) \}},}, ~n_{F(~i)}}

or, simplifying by putting F(~i) = ~j :

${~x}_{~j} _ = _ fract{sum{~{x_r},{\{ ~r | F(~r) = ~j \}},}, ~n_{~j}} _ _ _ _ the average or mean of the observations in the ~j^{th} level, ~n_~j being the number of observations in that level.
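This level-averaging is straightforward to implement directly; a sketch (the function name p_F and the data are our own):

    import numpy as np

    def p_F(x, F):
        """Orthogonal projection onto L_F: replace each x_i by the
        mean of the observations sharing its level F(i)."""
        out = np.empty_like(x, dtype=float)
        for j in np.unique(F):
            mask = (F == j)
            out[mask] = x[mask].mean()   # xbar_j, the mean of the j-th level
        return out

    F = np.array([1, 1, 3, 2, 3])
    x = np.array([1.0, 3.0, 2.0, 5.0, 4.0])
    print(p_F(x, F))                     # [2. 2. 3. 5. 3.]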

Projection Matrix

Following from the above, we can see that ~p_F has the associated ~n × ~n matrix P_F, given by:

P_F _ = _ ( ~{p_{i, r}} )_{~i = 1 ... ~n, ~r = 1 ... ~n} , _ _ _ _ _ _ _ _ _ _ _ ~{p_{i, r}} _ = _ array{ {1/~{n_j}}, _ if _ F(~i) = F(~r) = ~j/ 0,otherwise}

This is easier to see in a concrete case. For instance, in the above example,

P_F _ = _ {matrix{ { 1/2 } , { 1/2 } , 0, 0, 0 / { 1/2 } , { 1/2 } , 0, 0, 0 / 0, 0, { 1/2 } , 0, { 1/2 } / 0, 0, 0, 1, 0 / 0, 0, { 1/2 } , 0, { 1/2 } }}

applying this to the vector _ ~#u _ = _ ( ~u_1, ~u_2, ~u_3, ~u_4, ~u_5 ) _ gives

{matrix{ { 1/2 } , { 1/2 } , 0, 0, 0 / { 1/2 } , { 1/2 } , 0, 0, 0 / 0, 0, { 1/2 } , 0, { 1/2 } / 0, 0, 0, 1, 0 / 0, 0, { 1/2 } , 0, { 1/2 } }} {matrix{ ~u_1/ ~u_2/ ~u_3/ ~u_4/ ~u_5}} _ = _ {matrix{{(~u_1+~u_2) ./ 2}/{(~u_1+~u_2) ./ 2}/{(~u_3+~u_5) ./ 2}/~u_4/{(~u_3+~u_5) ./ 2}}}
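The matrix P_F can be built directly from its defining formula and checked against the level means; a sketch under the same assumed encoding:

    import numpy as np

    F = np.array([1, 1, 3, 2, 3])
    n = len(F)
    counts = {j: int((F == j).sum()) for j in np.unique(F)}   # n_j for each level

    # (P_F)_{i,r} = 1/n_j when F(i) = F(r) = j, and 0 otherwise
    P = np.array([[1.0 / counts[F[i]] if F[i] == F[r] else 0.0
                   for r in range(n)] for i in range(n)])

    u = np.array([1.0, 3.0, 2.0, 5.0, 4.0])
    print(P @ u)        # [2. 2. 3. 5. 3.] -- each entry replaced by its group mean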

Projection Matrix from Design Matrix

In fact we can connect the projection matrix with the design matrix for the factor through the formula:

P_F _ = _ X_F ( X_F^TX_F )^{-1} X_F^T

To see this, suppose ~#v_1 ... ~#v_~k are any ~k non-zero, mutually orthogonal vectors; then we can write

X_F _ = _ #( ~#v_1 ... ~#v_~k #) , _ _ _ _ X_F^T _ = _ matrix{ ~#v_1^T/ ... /~#v_~k^T }

So, remembering that #~u^T #~w = #~u&dot.#~w

X_F^T X_F _ _ = _ _ matrix{ ~#v_1&dot.~#v_1 , ~#v_1&dot.~#v_2 , ... , ~#v_1&dot.~#v_~k /, , ... , / , , ... , / ~#v_~k&dot.~#v_1 , ~#v_~k&dot.~#v_2 , ... , ~#v_~k&dot.~#v_~k } _ _ = _ _ matrix{ || #~v_1 ||^2 , 0 , ... , 0 / 0, || #~v_2 ||^2 , ... , 0 /, , ... , / 0, 0 , ... , || #~v_~k ||^2 } _ _ _ _ _ by orthogonality

( X_F^T X_F )^{-1} _ _ = _ _ matrix{ {1 ./ || #~v_1 ||^2} , 0 , ... , 0 / 0, {1 ./ || #~v_2 ||^2} , ... , 0 / , , ... , / 0, 0 , ... , {1 ./ || #~v_~k ||^2} } _ _ _ _ _ as all the || #~v_~j ||^2 will be non-zero.

X_F ( X_F^T X_F )^{-1} _ _ = _ _ #( ~#v_1 ... ~#v_~k #) ( X_F^T X_F )^{-1} _ _ = _ _ matrix{ {#~v_1 ./ || #~v_1 ||^2} , {#~v_2 ./ || #~v_2 ||^2} , ... , {#~v_~k ./ || #~v_~k ||^2} }

Now _ _ _ X_F^T ~#y _ = _ matrix{ ~#v_1^T/ ... /~#v_~k^T } ~#y _ = _ matrix{ ~#v_1&dot.#~y / ... / ~#v_~k&dot.#~y }

X_F ( X_F^T X_F )^{-1} X_F^T ~#y _ _ = _ _ sum{ ~#v_~j&dot.#~y #~v_~j ./ || #~v_~j ||^2, ~j = 1,~k }

which is just the expression for the projection onto the space generated by ~#v_1 ... ~#v_~k. _ Leaving out the #~y, we get the matrix sum:

X_F ( X_F^T X_F )^{-1} X_F^T _ _ = _ _ sum{ ~#v_~j #~v_~j^T ./ || #~v_~j ||^2, ~j = 1,~k }

In the case of factor vectors _ || #~v_~j ||^2 = ~n_~j , _ the number of elements mapped to the ~j^{th} level, so this is just the projection matrix defined in the previous section.


In the example

X_F^T X_F _ = _ matrix{ 2, 0, 0 / 0, 1, 0 / 0, 0, 2 } , _ _ _ _ ( X_F^T X_F )^{-1} _ = _ matrix{ { 1/2 } , 0, 0 / 0, 1, 0 / 0, 0, { 1/2 } }

and consequently

X_F ( X_F^T X_F )^{-1} X_F^T _ = _ {matrix{ { 1/2 } , { 1/2 } , 0, 0, 0 / { 1/2 } , { 1/2 } , 0, 0, 0 / 0, 0, { 1/2 } , 0, { 1/2 } / 0, 0, 0, 1, 0 / 0, 0, { 1/2 } , 0, { 1/2 } }} _ = _ P_F

recovering the projection matrix found directly above.
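Both identities are easy to verify numerically; a sketch comparing the hat-matrix formula and the rank-one sum, under the same assumed encoding as the earlier snippets:

    import numpy as np

    F = np.array([1, 1, 3, 2, 3])
    X = (F[:, None] == np.unique(F)[None, :]).astype(float)

    print(X.T @ X)      # diag(2, 1, 2) -- the level counts n_j

    # P_F via the hat-matrix formula
    P_hat = X @ np.linalg.inv(X.T @ X) @ X.T

    # P_F as a sum of rank-one projections v_j v_j^T / ||v_j||^2
    P_sum = sum(np.outer(v, v) / (v @ v) for v in X.T)

    print(np.allclose(P_hat, P_sum))    # True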