Now that we have seen the maximum likelihood method applied to both discrete and continuous distributions, this page presents a summary of the results.
In an experiment we observe an instance, $x$, of a random variable $X$ whose distribution we assume to be of known form (e.g. normal, binomial, etc.) but to depend on an unknown parameter $\theta \in \Theta$. The **likelihood function** is:
$$ L(\theta) \;:=\; L(\theta \mid x) \;:=\; \begin{cases} p(x \mid \theta), & \text{discrete distribution} \\ f(x \mid \theta), & \text{continuous distribution} \end{cases} $$
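For example, if $X \sim \mathrm{Binomial}(n, \theta)$ with $n$ known, the likelihood of observing $x$ successes is
$$ L(\theta \mid x) = \binom{n}{x}\, \theta^{x} (1 - \theta)^{n - x}, \qquad \theta \in \Theta = [0, 1]. $$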
The **maximum likelihood estimator** of $\theta$, $\hat{\theta}$, is such that
$$ L(\hat{\theta}) = \sup_{\theta \in \Theta} \{ L(\theta) \} $$
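Continuing the binomial illustration, the maximum is found by setting the derivative of $\log L(\theta \mid x) = \log \binom{n}{x} + x \log \theta + (n - x) \log(1 - \theta)$ to zero:
$$ \frac{x}{\hat{\theta}} - \frac{n - x}{1 - \hat{\theta}} = 0 \quad \Longrightarrow \quad \hat{\theta} = \frac{x}{n}. $$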
Now suppose we want to test a hypothesis $H_0 : \theta \in \Theta_0 \subset \Theta$. Let $\theta_0$ maximize $L(\theta)$ over $\Theta_0$, i.e.
$$ L(\theta_0) = \sup_{\theta \in \Theta_0} \{ L(\theta) \} $$
Define the **likelihood ratio** as
$$ LR(x) \;=\; \frac{L(\theta_0 \mid x)}{L(\hat{\theta} \mid x)} \;=\; \frac{\sup_{\theta \in \Theta_0} \{ L(\theta) \}}{\sup_{\theta \in \Theta} \{ L(\theta) \}} $$
That is, $LR$ is the ratio of the maximum likelihood under the hypothesis ($\theta \in \Theta_0$) to the maximum likelihood under the alternative ($\theta \in \Theta$). Note that $LR(x)$ always lies between 0 and 1, and the closer it is to 1, the more acceptable the hypothesis.
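As a sketch of how this might be computed in practice, here is a small Python example; the binomial model, the data values, and the hypothesized value $\theta_0 = 0.5$ are all assumptions made for illustration, not taken from this page:

    from scipy.stats import binom

    # Made-up data for illustration: x successes in n Bernoulli trials.
    n, x = 20, 14
    theta_0 = 0.5      # hypothesized value under H0: theta = theta_0

    # Unrestricted MLE of the binomial parameter: theta_hat = x / n.
    theta_hat = x / n

    # Likelihood ratio LR(x) = L(theta_0 | x) / L(theta_hat | x).
    LR = binom.pmf(x, n, theta_0) / binom.pmf(x, n, theta_hat)
    print(LR)          # lies in (0, 1]; values near 1 favour H0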
The **log likelihood ratio statistic** is defined as:
$$ LL(x) \;=\; -2 \log_e LR(x) $$
and, since $-2 \log_e$ is strictly decreasing, we have
$$ LR(y) < LR(x) \;\iff\; LL(y) > LL(x) $$
$$ LR(y) = LR(x) \;\iff\; LL(y) = LL(x) $$
The **significance probability** (SP) of $x$ is defined as:
$$ SP(x) \;=\; P\{\, y \mid LR(y) \leq LR(x) \,\} \;=\; F_{LR}(LR(x)) $$
where $F_{LR}$ is the distribution function of the random variable $LR(X)$. Alternatively,
$$ SP(x) \;=\; P\{\, y \mid LL(y) \geq LL(x) \,\} \;=\; 1 - F_{LL}(LL(x)) \quad (*) $$
where $F_{LL}$ is the distribution function of the random variable $LL(X)$.
(*) Note that for discrete distributions $1 - F_{LL}(LL(x)) = P\{\, y \mid LL(y) > LL(x) \,\}$, which omits the ties, so $SP(x) = 1 - F_{LL}(LL(x)) + P\{\, y \mid LL(y) = LL(x) \,\}$; when $x$ is the only outcome attaining the value $LL(x)$, the correction term is just $P(x)$, the probability of the observed value.
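Since the binomial model in the sketch above is discrete, its SP can be computed by direct enumeration over all possible outcomes $y$. A minimal sketch, continuing the same made-up data:

    from scipy.stats import binom

    n, x = 20, 14
    theta_0 = 0.5

    def likelihood_ratio(y):
        # LR(y) = L(theta_0 | y) / L(theta_hat(y) | y), with theta_hat(y) = y / n.
        return binom.pmf(y, n, theta_0) / binom.pmf(y, n, y / n)

    # SP(x) = P{ y : LR(y) <= LR(x) }, with the probability computed under H0.
    # A tiny tolerance guards against floating-point ties.
    lr_x = likelihood_ratio(x)
    sp = sum(binom.pmf(y, n, theta_0) for y in range(n + 1)
             if likelihood_ratio(y) <= lr_x + 1e-12)
    print(sp)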
The $k\%$ **confidence interval** is defined as the range of values of $\theta_0$ for which $SP > (100 - k)\%$, i.e. the range of values of $\theta_0$ for which we would accept the hypothesis $H : \theta = \theta_0$ at the $(100 - k)\%$ level.
[E.g. the usual value for the level of the test, $\alpha$, is 0.05, or 5%. The corresponding confidence interval is then the $(100 - 5)\% = 95\%$ interval.]
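The interval can therefore be obtained numerically by inverting the test: scan candidate values of $\theta_0$ and keep those whose SP exceeds $\alpha$. A sketch under the same made-up binomial data (the grid and its resolution are arbitrary choices):

    import numpy as np
    from scipy.stats import binom

    n, x = 20, 14
    alpha = 0.05       # 5% level, giving a 95% confidence interval

    def significance_probability(theta_0):
        # SP under H0: theta = theta_0, by enumeration as in the SP sketch above.
        def lr(y):
            return binom.pmf(y, n, theta_0) / binom.pmf(y, n, y / n)
        lr_x = lr(x)
        return sum(binom.pmf(y, n, theta_0) for y in range(n + 1)
                   if lr(y) <= lr_x + 1e-12)

    # Keep every candidate theta_0 whose SP exceeds alpha.
    grid = np.linspace(0.001, 0.999, 999)
    accepted = [t for t in grid if significance_probability(t) > alpha]
    print(min(accepted), max(accepted))   # approximate 95% confidence interval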
See examples of MLE statistics calculated for some common distributions.