~X and ~Y are jointly distributed ( 2-dimensional ) normal random variables if their joint density function is
f ( ~x , ~y ) _ = _ fract{1,2&pi. &sigma._~X &sigma._~Y &sqrt.( 1 - &rho.^2 )} _ _ e^A
Where:
A = rndb{fract{ - 1,2( 1 - &rho.^2 )}} rndb{ fract{( ~x - &mu._~X )^2,&sigma._~X^2} + fract{( ~y - &mu._~Y )^2,&sigma._~Y^2} - 2&rho.fract{( ~x - &mu._~X )( ~y - &mu._~Y ),&sigma._~X &sigma._~Y}}
Where &rho., _ - 1 <= &rho. <= 1, _ is called the correlation coefficient of ~X and ~Y.
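As a concrete check, the density can be evaluated directly from this formula. The following is a minimal Python sketch (the function name bivariate_normal_pdf and the parameter values are illustrative choices, and NumPy is assumed available):
```python
# Minimal sketch: evaluate the joint density f(x, y) directly from the
# formula above. Function name and parameter values are illustrative.
import numpy as np

def bivariate_normal_pdf(x, y, mu_x, mu_y, sigma_x, sigma_y, rho):
    # normalising constant 1 / (2 pi sigma_X sigma_Y sqrt(1 - rho^2))
    norm = 1.0 / (2.0 * np.pi * sigma_x * sigma_y * np.sqrt(1.0 - rho**2))
    zx = (x - mu_x) / sigma_x          # standardised x
    zy = (y - mu_y) / sigma_y          # standardised y
    # exponent A = [-1 / (2(1 - rho^2))] (zx^2 + zy^2 - 2 rho zx zy)
    a = -(zx**2 + zy**2 - 2.0 * rho * zx * zy) / (2.0 * (1.0 - rho**2))
    return norm * np.exp(a)

print(bivariate_normal_pdf(1.0, 2.0, mu_x=0.0, mu_y=1.0,
                           sigma_x=1.0, sigma_y=2.0, rho=0.5))
```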
We write
#~X _ = _ ( ~X, ~Y ) _ _ ~ _ _ N_2 ( #{&mu.} , &Sigma. )
where _ _ #{&mu.} _ = _ ( &mu._~X , &mu._~Y ) _ _ and _ _ &Sigma. _ = _ matrix{&sigma._~X^2, &rho.&sigma._~X&sigma._~Y/ &rho.&sigma._~X&sigma._~Y,&sigma._~Y^2}
The quantity _ &rho.&sigma._~X&sigma._~Y _ is known as the #~{covariance}, and is denoted _ &sigma._{~X~Y} , _ so
&Sigma. _ _ = _ _ matrix{&sigma._~X^2, &sigma._{~X~Y}/ &sigma._{~X~Y},&sigma._~Y^2}
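In code, #{&mu.} and &Sigma. can be assembled from the five parameters and cross-checked against a library density. A sketch, assuming NumPy and SciPy are available (parameter values illustrative):
```python
# Build mu and Sigma from (mu_X, mu_Y, sigma_X, sigma_Y, rho) and cross-check
# the density against SciPy's multivariate normal. Values are illustrative.
import numpy as np
from scipy.stats import multivariate_normal

mu_x, mu_y, sigma_x, sigma_y, rho = 0.0, 1.0, 1.0, 2.0, 0.5
cov_xy = rho * sigma_x * sigma_y            # the covariance sigma_XY
mu = np.array([mu_x, mu_y])
Sigma = np.array([[sigma_x**2, cov_xy],
                  [cov_xy, sigma_y**2]])

# density at (x, y) = (1, 2); should agree with f(x, y) as defined above
print(multivariate_normal(mean=mu, cov=Sigma).pdf([1.0, 2.0]))
```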
Another way of writing the joint density function is:
f ( ~x,~y ) _ = _ fract{1,&sqrt.( 2&pi.&sigma._~X^2 )} exp rndb{ fract{ - ( ~x - &mu._~X )^2,2&sigma._~X^2}} fract{1,&sqrt.( 2&pi.&omega.^2 )} exp rndb{ fract{ - ( ~y - ( &alpha. + &beta.~x ) )^2,2&omega.^2}} _ _ _ _ ... #{( 1 )}
Where _ _ &omega.^2 _ = _ &sigma._~Y^2( 1 - &rho.^2 ) , _ _ &beta. _ = _ &rho.( &sigma._~Y ./ &sigma._~X ), _ _ &alpha. _ = _ &mu._~Y - &beta.&mu._~X
[ An equivalent formula can be obtained by interchanging _ ~x, _ &mu._~X, _ &sigma._~X _ with _ ~y, _ &mu._~Y, _ &sigma._~Y .]
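A quick numerical check of this factorisation, as a Python sketch (parameter values and the helper norm_pdf are illustrative; NumPy assumed):
```python
# Check that the factorised form (1) reproduces the direct joint density.
import numpy as np

mu_x, mu_y, sigma_x, sigma_y, rho = 0.0, 1.0, 1.0, 2.0, 0.5
omega2 = sigma_y**2 * (1.0 - rho**2)     # omega^2 = sigma_Y^2 (1 - rho^2)
beta = rho * sigma_y / sigma_x           # beta = rho (sigma_Y / sigma_X)
alpha = mu_y - beta * mu_x               # alpha = mu_Y - beta mu_X

def norm_pdf(z, mean, var):
    # one-dimensional normal density with the given mean and variance
    return np.exp(-(z - mean)**2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

x, y = 1.0, 2.0
factorised = norm_pdf(x, mu_x, sigma_x**2) * norm_pdf(y, alpha + beta * x, omega2)
zx, zy = (x - mu_x) / sigma_x, (y - mu_y) / sigma_y
direct = np.exp(-(zx**2 + zy**2 - 2.0*rho*zx*zy) / (2.0*(1.0 - rho**2))) \
         / (2.0 * np.pi * sigma_x * sigma_y * np.sqrt(1.0 - rho**2))
print(np.isclose(factorised, direct))    # True
```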
The marginal distribution of ~X is given by
f ( ~x ) _ = _ fract{1,&sqrt.( 2&pi.&sigma._~X^2 )} exp rndb{ fract{ - ( ~x - &mu._~X )^2,2&sigma._~X^2}}
and the conditional density of ~Y given that ~X = ~x is
f ( ~y | X = ~x ) _ = _ fract{1,&sqrt.( 2&pi.&omega.^2 )} exp rndb{ fract{ - ( ~y - ( &alpha. + &beta.~x ) )^2,2&omega.^2}}
i.e.
( ~Y | ~X = ~x ) _ ~ _ N ( &alpha. + &beta.~x, &omega.^2 )
[ Equivalent formulae for the marginal distribution of ~Y and the conditional distribution of ~X given that _ ~Y = ~y _ can be obtained.]
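The marginal/conditional decomposition gives a natural way to simulate the joint distribution: draw ~x from the marginal of ~X, then ~y from N ( &alpha. + &beta.~x, &omega.^2 ). A sketch with illustrative parameter values, assuming NumPy:
```python
# Sample (X, Y) by drawing X from its marginal, then Y | X = x from
# N(alpha + beta x, omega^2). Parameter values are illustrative.
import numpy as np

rng = np.random.default_rng(0)
mu_x, mu_y, sigma_x, sigma_y, rho = 0.0, 1.0, 1.0, 2.0, 0.5
beta = rho * sigma_y / sigma_x
alpha = mu_y - beta * mu_x
omega = sigma_y * np.sqrt(1.0 - rho**2)

x = rng.normal(mu_x, sigma_x, size=100_000)   # X ~ N(mu_X, sigma_X^2)
y = rng.normal(alpha + beta * x, omega)       # Y | X = x ~ N(alpha + beta x, omega^2)

print(np.corrcoef(x, y)[0, 1])                # sample correlation, close to rho = 0.5
```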
Given a sample of ~n independent observations, _ #~x_1, ... , #~x_{~n}, _ #~x_{~i} = ( ~{x_i}, ~{y_i} ), _ from the two-dimensional normal distribution described above, the likelihood function is
L( &mu._~X, &mu._~Y, &sigma._~X, &sigma._~Y, &rho. ) _ _ _ _ where _ &mu._~X &in. &reals., &mu._~Y &in. &reals., &sigma._~X &in. &reals.^+, &sigma._~Y &in. &reals.^+, &rho. &in. ( - 1, +1 )
or
L( &mu._~X, &sigma._~X, &alpha., &beta., &omega. ) _ _ _ _ _ where _ &mu._~X &in. &reals., &sigma._~X &in. &reals.^+, &alpha. &in. &reals., &beta. &in. &reals., &omega. &in. &reals.^+
but
L( &mu._~X, &sigma._~X, &alpha., &beta., &omega. ) _ _ = _ _ L_1( &mu._~X, &sigma._~X ) L_2( &alpha., &beta., &omega. )
so these two factors can be maximised independently.
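Equivalently, the log-likelihood splits as log L = log L_1 + log L_2, the first term involving only ( &mu._~X, &sigma._~X ) and the second only ( &alpha., &beta., &omega. ). A numerical sketch (simulated data and illustrative parameter values, NumPy assumed):
```python
# The log-likelihood of the sample splits into a marginal part (x only)
# plus a regression part (y given x). Data and parameters are illustrative.
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, size=50)
y = 1.0 + 0.5 * x + rng.normal(0.0, 0.8, size=50)

mu_x, sigma_x = 0.0, 1.0
alpha, beta, omega = 1.0, 0.5, 0.8

def log_norm_pdf(z, mean, var):
    return -0.5 * np.log(2.0 * np.pi * var) - (z - mean)**2 / (2.0 * var)

log_L1 = np.sum(log_norm_pdf(x, mu_x, sigma_x**2))            # depends on mu_X, sigma_X
log_L2 = np.sum(log_norm_pdf(y, alpha + beta * x, omega**2))  # depends on alpha, beta, omega
print(log_L1 + log_L2)                                        # total log-likelihood via (1)
```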
L_1 is just the likelihood function for a one-dimensional normal distribution, and from it we can deduce the ML estimators:
est{&mu._~X} _ = _ ${~x} _ = _ fract{1,~n} sum{~{x_i}, ~i = 1, ~n} _ _ _ _ _ _ est{&sigma._~X}^2 _ = _ fract{S_{~{xx}},~n},
where S_{~{xx}} _ = _ sum{( ~{x_i} - ${~x} )^2, ~i = 1, ~n}
and, as the distribution is symmetric in ~X and ~Y, an equivalent argument for ~Y yields:
est{&mu._~Y} _ = _ ${~y} _ = _ fract{1,~n} sum{~{y_i}, ~i = 1, ~n} _ _ _ _ _ _ est{&sigma._~Y}^2 _ = _ fract{S_{~{yy}},~n},
where _ _ _ _ S_{~{yy}} _ = _ sum{( ~{y_i} - ${~y} )^2, ~i = 1, ~n}
Note that it is more usual to use the following unbiased estimates for &sigma._~X^2 and &sigma._~Y^2 :
~{s_x}^2 _ = _ fract{S_{~{xx}},~n - 1} _ _ and _ ~{s_y}^2 _ = _ fract{S_{~{yy}},~n - 1}
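As a sketch of these estimates in Python (the sample here is simulated purely for illustration):
```python
# ML estimates of the marginal parameters, with the usual n - 1 variant.
import numpy as np

rng = np.random.default_rng(2)
n = 1_000
x = rng.normal(0.0, 1.0, size=n)                 # simulated sample, illustrative
y = 1.0 + 0.5 * x + rng.normal(0.0, 0.8, size=n)

x_bar, y_bar = x.mean(), y.mean()                # mu_X-hat and mu_Y-hat
S_xx = np.sum((x - x_bar)**2)
S_yy = np.sum((y - y_bar)**2)

sigma_x2_hat = S_xx / n                          # ML estimate of sigma_X^2
s_x2 = S_xx / (n - 1)                            # the more usual unbiased estimate
print(x_bar, sigma_x2_hat, s_x2)
```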
L_2 has exactly the form of the likelihood function in regression analysis, from which we deduce:
est{&beta.} _ = _ fract{S_{~{xy}},S_{~{xx}}} _ _ _ _ _ where _ S_{~{xy}} _ = _ sum{( ~{x_i} - ${~x} )( ~{y_i} - ${~y} ), ~i = 1, ~n}
and so:
est{&rho.} _ = _ fract{est{&sigma._~X} est{&beta.},est{&sigma._~Y}} _ = _ fract{S_{~{xy}},&sqrt.( S_{~{xx}}S_{~{yy}} )}
Finally, the estimators for &alpha. and &omega.^2 are given in terms of the above estimators:
est{&alpha.} _ = _ ${~y} - est{&beta.}${~x}
est{&omega.}^2 _ = _ est{&sigma._~Y}^2( 1 - est{&rho.}^2 )
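Pulling the remaining estimators together in the same sketch (simulated data, illustrative values; est{&rho.} is cross-checked against NumPy's sample correlation):
```python
# ML estimators beta-hat, rho-hat, alpha-hat, omega^2-hat from a sample.
import numpy as np

rng = np.random.default_rng(3)
n = 1_000
x = rng.normal(0.0, 1.0, size=n)                 # simulated sample, illustrative
y = 1.0 + 0.5 * x + rng.normal(0.0, 0.8, size=n)

x_bar, y_bar = x.mean(), y.mean()
S_xx = np.sum((x - x_bar)**2)
S_yy = np.sum((y - y_bar)**2)
S_xy = np.sum((x - x_bar) * (y - y_bar))

beta_hat = S_xy / S_xx
rho_hat = S_xy / np.sqrt(S_xx * S_yy)
alpha_hat = y_bar - beta_hat * x_bar
omega2_hat = (S_yy / n) * (1.0 - rho_hat**2)     # sigma_Y^2-hat (1 - rho-hat^2)

print(beta_hat, rho_hat, alpha_hat, omega2_hat)
print(np.corrcoef(x, y)[0, 1])                   # agrees with rho_hat
```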