The log-likelihood statistic was defined as _ _ LL(~x) _ = _ -2 log_~e LR(~x)
So for the binomial model:
LL(~x) _ _ = _ _ - 2 ~x log_~e rndb{fract{~n &theta._0,~x}} _ - _ 2 ( ~n - ~x ) log_~e rndb{fract{~n ( 1 - &theta._0 ),~n - ~x}}
Or we can write this as
LL( ~x ) _ _ = _ _ - 2 ~x log_~e rndb{fract{~n &theta._0,~x}} _ - _ 2 ~y log_~e rndb{fract{~n &zeta._0 ,~y}}
( where _ ~y = ~n - ~x , _ and _ &zeta._0 = 1 - &theta._0 )
_ _ _ _ _ _ _ = _ _ - 2 ~x log_~e rndb{1 - fract{~x - ~n &theta._0,~x}} _ - _ 2 ~y log_~e rndb{1 - fract{~y - ~n &zeta._0,~y}}
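As a concrete cross-check of these two forms, here is a minimal Python sketch (standard library only; the function name ll_binomial and the sample numbers are our own, not from the text):

```python
from math import log

def ll_binomial(x, n, theta0):
    """-2 log_e LR(x) for the binomial model (valid for 0 < x < n)."""
    y = n - x                    # number of "failures"
    zeta0 = 1.0 - theta0         # probability of a "failure"
    return -2 * x * log(n * theta0 / x) - 2 * y * log(n * zeta0 / y)

# The rewritten form agrees, since 1 - (x - n*theta0)/x = n*theta0/x,
# and likewise for y:
x, n, theta0 = 60, 100, 0.5
u = (x - n * theta0) / x
v = ((n - x) - n * (1 - theta0)) / (n - x)
alt = -2 * x * log(1 - u) - 2 * (n - x) * log(1 - v)
print(ll_binomial(x, n, theta0), alt)   # both ~ 4.02712
```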
We showed that the distribution of the log-likelihood function could be used to calculate the significance probability. We will now show that, for large ~n, this distribution is approximately a Chi-squared distribution, which can then be used to determine the significance probability.
Using the Taylor expansion around 0, we have
log_~e ( 1 - ~u ) _ = _ - ~u _ - _ fract{~u^2,2} _ - _ ~f ( ~u )
where _ | ~f ( ~u ) | =< ~K | ~u |^3 , _ for _ | ~u | =< 1/2 . _ Using this on the expression for log-likelihood above, we have:
LL( ~x ) _ _ = _ _ 2 ~x rndb{fract{~x - ~n &theta._0,~x}} _ + _ ~x script{rndb{fract{~x - ~n &theta._0,~x}},,,2,} _ + _ 2 ~x ~f rndb{fract{~x - ~n &theta._0,~x}}
_ _ _ _ _ _ _ _ _ _ + _ 2 ~y rndb{fract{~y - ~n &zeta._0,~y}} _ + _ ~y script{rndb{fract{~y - ~n &zeta._0,~y}},,,2,} _ + _ 2 ~y ~f rndb{fract{~y - ~n &zeta._0,~y}}
Note that the first and fourth terms cancel out, as _ ~x + ~y = ~n , _ and _ &theta._0 + &zeta._0 = 1 . _ So
LL( ~x ) _ _ = _ _ fract{( ~x - ~n &theta._0 )^2,~x} _ + _ 2 ~x ~f rndb{fract{~x - ~n &theta._0,~x}} _ + _ fract{( ~y - ~n &zeta._0 )^2,~y} _ + _ 2 ~y ~f rndb{fract{~y - ~n &zeta._0,~y}}
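Before bounding the remainder terms, the Taylor bound itself is easy to check numerically. A tiny Python sketch (the constant K = 2/3 below is one valid choice of K, our assumption rather than the text's):

```python
from math import log

# The remainder is f(u) = -log(1 - u) - u - u**2/2; for |u| <= 1/2
# the ratio |f(u)| / |u|**3 should stay below K = 2/3:
for u in (0.5, 0.25, 0.1, -0.25, -0.5):
    f = -log(1 - u) - u - u ** 2 / 2
    print(u, f, abs(f) / abs(u) ** 3)
```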
Now
0 _ =< _ mod{~x ~f rndb{fract{~x - ~n &theta._0,~x}}} _ =< _ ~K fract{| ~x - ~n &theta._0 |^3,~x^2} _ = _ ~K fract{| ~x - ~n &theta._0 |^3,~n^2} fract{~n^2,~x^2}
which tends to 0 (in probability) as _ ~n -> &infty. , _ since, if ~x is binomially distributed with parameter &theta._0 , then _ ( ~x - ~n &theta._0 ) ./ &sqrt.${~n} _ remains bounded in probability ( so _ | ~x - ~n &theta._0 |^3 ./ ~n^2 -> 0 _ ) and _ ~x ./ ~n -> &theta._0 _ as _ ~n -> &infty. .
Similarly _ ~y ~f ( ( ~y - ~n &zeta._0 ) ./ ~y ) _ -> _ 0 , so, for large ~n, we have the approximation:
LL( ~x ) _ _ ~~ _ _ fract{( ~x - ~n &theta._0 )^2,~x} _ + _ fract{( ~y - ~n &zeta._0 )^2,~y}
_ _ _ _ _ _ _ = _ _ fract{( ~x - ~n &theta._0 )^2,~n &theta._0} fract{~n &theta._0 ,~x} _ + _ fract{( ~y - ~n &zeta._0 )^2,~n &zeta._0} fract{~n &zeta._0,~y}
_ _ _ _ _ _ _ ~~ _ _ fract{( ~x - ~n &theta._0 )^2,~n &theta._0} _ + _ fract{( ~y - ~n &zeta._0 )^2,~n &zeta._0} _ _ =#: _ _ &Chi.^2
since _ ~x ./ ~n _ -> _ &theta._0 , _ and _ ~y ./ ~n _ -> _ &zeta._0 _ as _ ~n -> &infty. . _ _ Note that:
&Chi.^2 _ _ = _ _ fract{( ~x - ~n &theta._0 )^2,~n &theta._0} _ + _ fract{( ~x - ~n &theta._0 )^2,~n ( 1 - &theta._0 )} _ _ = _ _ fract{( ~x - ~n &theta._0 )^2,~n &theta._0 ( 1 - &theta._0 )}
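How good is the approximation? Under the hypothesis the deviation _ ~x - ~n &theta._0 _ is typically of order &sqrt.${~n}, so a rough sketch (Python standard library; the helper names are ours) scales it that way:

```python
from math import log, sqrt

def ll_binomial(x, n, theta0):
    """-2 log_e LR(x) for the binomial model."""
    y, zeta0 = n - x, 1.0 - theta0
    return -2 * x * log(n * theta0 / x) - 2 * y * log(n * zeta0 / y)

def chi_sq(x, n, theta0):
    """The chi-squared statistic from the display above."""
    return (x - n * theta0) ** 2 / (n * theta0 * (1 - theta0))

# With x = n/2 + sqrt(n) the chi-squared statistic is exactly 4,
# and LL(x) approaches it as n grows:
for n in (100, 1600, 10000):
    x = n // 2 + int(sqrt(n))
    print(n, round(ll_binomial(x, n, 0.5), 5), chi_sq(x, n, 0.5))
```

The printed LL values fall from about 4.027 towards 4.000 as ~n grows, as the argument above predicts.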
Now if _ ~x ~ B ( ~n , &theta. ) _ then
fract{( ~x - ~n &theta. ),&sqrt.${~n &theta. ( 1 - &theta. )}}
has approximately the standard normal distribution for large ~n. So, under the hypothesis, &Chi.^2 is the square of an approximately standard normally distributed variable, and is therefore Chi-squared distributed with one degree of freedom.
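This is easy to confirm by simulation; a rough Monte Carlo sketch (Python standard library; the sample size, seed and replication count are arbitrary choices of ours):

```python
import random

random.seed(1)
n, theta0, reps = 400, 0.5, 5000
stats = []
for _ in range(reps):
    x = sum(random.random() < theta0 for _ in range(n))   # one B(n, theta0) draw
    z = (x - n * theta0) / (n * theta0 * (1 - theta0)) ** 0.5
    stats.append(z * z)

# A chi-squared(1) variable has mean 1 and P( X > 3.8415 ) = 5%:
print(sum(stats) / reps)                        # ~ 1.0
print(sum(s > 3.8415 for s in stats) / reps)    # ~ 0.05
```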
The significance probability is then:
SP ( ~x ) _ = _ P\{ ~y | LR(~y) &leq. LR(~x) \}
_ _ _ _ _ _ _ = _ P\{ ~y | - 2 log_~e LR(~y) >= - 2 log_~e LR(~x) \}
_ _ _ _ _ _ _ = _ 1 - ~F ( - 2 log_~e LR(~x) )
where ~F is the distribution function of _ - 2 log_~e LR(~x) . _ But, as we have seen, for large ~n this is approximately
_ _ = _ _ 1 - ~F rndb{fract{( ~x - ~n &theta._0 )^2,~n &theta._0 ( 1 - &theta._0 )}}
where now ~F is the &chi.^2 (1 d.f.) distribution function.
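In code, the approximate significance probability therefore needs only the &chi.^2 (1) distribution function; a sketch assuming SciPy is available (scipy.stats.chi2 supplies the c.d.f.; the function name is ours):

```python
from scipy.stats import chi2   # SciPy assumed available

def significance_probability(x, n, theta0):
    """Approximate SP(x) = 1 - F(chi-squared statistic), F the chi^2(1) c.d.f."""
    stat = (x - n * theta0) ** 2 / (n * theta0 * (1 - theta0))
    return 1.0 - chi2.cdf(stat, df=1)
```

Equivalently, since &Chi.^2 is the square of an approximately standard normal variable, one could apply the standard normal distribution function to the square root of the statistic and double the tail probability.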
#{Example}
Suppose we wanted to test whether a coin is "fair", i.e. whether the probabilities of a "head" and a "tail" on a toss are equal, each being 1/2. The coin is tossed 1000 times, giving 534 "heads" and 466 "tails".
In this case _ ~n = 1000 , _ ~x = 534 , _ and _ &theta._0 = 0.5 , _ so
&Chi.^2 _ = _ fract{(534 - 1000 # 0.5)^2, 1000 # 0.5 # 0.5} _ = _ fract{34^2,250} _ = _ 4.624
This has a cumulative probability of 96.85% in the &chi.^2 distribution with one degree of freedom ( see the Mathyma chi-squared distribution look-up facility ), so the significance probability is 3.15%. This makes the hypothesis of a fair coin unlikely, and we would reject it if testing at the 5% level.
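These numbers are easy to reproduce (again assuming SciPy; the look-up facility gives the same values):

```python
from scipy.stats import chi2

n, x, theta0 = 1000, 534, 0.5
stat = (x - n * theta0) ** 2 / (n * theta0 * (1 - theta0))
print(stat)                       # 4.624
print(chi2.cdf(stat, df=1))       # ~ 0.9685, i.e. 96.85%
print(1 - chi2.cdf(stat, df=1))   # ~ 0.0315 -> reject at the 5% level
```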
For what results would we accept that it is a fair coin, i.e. what is the acceptance interval? This depends, of course, on the level of the test. Suppose that we test at the 5% level, i.e. any result with a significance probability of less than 5% will cause us to reject the hypothesis. The value which gives a 5% significance probability is the value with a 95% cumulated distribution in the &chi.^2 (1) distribution; this is 3.8415.
Now _ _ 3.8415 ~~ 961 / 250 = 31^2 / 250 . _ So a result in the interval _ 500 +- 31 , _ i.e. _ ~x &in. ( 469 , 531 ) , _ will lead us to accept the hypothesis that it is a fair coin.
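The acceptance interval can also be computed directly rather than by the hand approximation above; a short sketch (SciPy assumed; chi2.ppf is the quantile function):

```python
from math import sqrt
from scipy.stats import chi2

n, theta0, level = 1000, 0.5, 0.05
crit = chi2.ppf(1 - level, df=1)                 # 3.8415
half = sqrt(crit * n * theta0 * (1 - theta0))    # ~ 31.0
print(n * theta0 - half, n * theta0 + half)      # ~ 469.0 and 531.0
```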