Chi-square distribution

For any positive integer k, the chi-square distribution with k degrees of freedom is the probability distribution of the random variable

X=Z_1^2 + \cdots + Z_k^2

where Z_1, \cdots, Z_k are independent standard normal variables (zero expected value and unit variance). This distribution is usually written

X\sim\chi^2_k
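The definition can be checked by simulation. The following is a minimal sketch using only the Python standard library; the helper name `sample_chi_square`, the seed, and the sample size are illustrative choices, not part of the article. It sums k squared standard normal draws and compares the sample mean and variance with k and 2k, the values a chi-square variable with k degrees of freedom is known to have.

```python
import random
import statistics


def sample_chi_square(k, rng):
    """Draw one value of X = Z_1^2 + ... + Z_k^2 with independent standard normal Z_i."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))


rng = random.Random(0)  # fixed seed so the run is repeatable
k = 5
draws = [sample_chi_square(k, rng) for _ in range(20000)]

sample_mean = statistics.fmean(draws)    # should be close to k
sample_var = statistics.variance(draws)  # should be close to 2k

print(f"mean ~ {sample_mean:.3f}, variance ~ {sample_var:.3f}")
```

With 20000 draws the estimates are subject to sampling noise, so they should only be expected to land near 5 and 10, not on them.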

The chi-square test can be used to test independence as well as goodness of fit.

An example of a test of independence is asking whether sex and political affiliation are associated. You gather a sample, compute the counts expected under the null hypothesis of independence, compute the chi-square test statistic, and compare it with the critical value for the chosen significance level. If the statistic exceeds the critical value, you reject the null hypothesis; otherwise you fail to reject it (you never accept the null hypothesis).
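The procedure can be sketched in a few lines of Python. The counts below are hypothetical, invented only for illustration, and the function name `chi_square_independence` is an assumption of this sketch. It computes the Pearson statistic (the sum of (observed - expected)^2 / expected over all cells) without any continuity correction, and compares it with 3.841, the usual tabulated 5% critical value for 1 degree of freedom.

```python
import math


def chi_square_independence(table):
    """Return (statistic, degrees_of_freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed_count in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed_count - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical data: rows are the two sexes, columns are two political affiliations.
observed = [[30, 20],
            [20, 30]]

statistic, dof = chi_square_independence(observed)

CRITICAL_5_PERCENT_DF1 = 3.841  # tabulated critical value, 1 degree of freedom
reject_null = statistic > CRITICAL_5_PERCENT_DF1

# For 1 degree of freedom only, the upper tail probability has a closed form.
p_value = math.erfc(math.sqrt(statistic / 2))

print(statistic, dof, reject_null, round(p_value, 4))
```

For tables larger than 2 by 2 the degrees of freedom exceed 1, so both the critical value and the p-value formula used here would need to be replaced.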

The chi-square probability density function is

p_k(x) = \frac{(1/2)^{k/2}}{\Gamma(k/2)} x^{k/2 - 1} e^{-x/2} \quad \mbox{ for }x > 0

and p_k(x) = 0 for x \le 0. Here Γ denotes the gamma function. Tables of this distribution, usually in its cumulative form, are widely available, and the function is included in many spreadsheets (for example OpenOffice.org Calc or Microsoft Excel) and in virtually all statistical packages.
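As a sanity check on the density, the sketch below evaluates it with `math.gamma` and integrates it numerically with a simple midpoint rule. The function names, the integration range, and the step count are assumptions chosen for illustration; a statistical package would normally be used instead.

```python
import math


def chi_square_pdf(x, k):
    """Density of the chi-square distribution with k degrees of freedom."""
    if x <= 0:
        return 0.0
    half_k = k / 2
    return (0.5 ** half_k / math.gamma(half_k)) * x ** (half_k - 1) * math.exp(-x / 2)


def integrate_midpoint(f, lower, upper, steps):
    """Approximate the integral of f over [lower, upper] with the midpoint rule."""
    width = (upper - lower) / steps
    return width * sum(f(lower + (i + 0.5) * width) for i in range(steps))


# The density should integrate to (approximately) 1; the tail beyond 100 is negligible for k = 4.
total_mass = integrate_midpoint(lambda x: chi_square_pdf(x, 4), 0.0, 100.0, 100000)
print(total_mass)
```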

If p independent linear homogeneous constraints are imposed on the variables Z_1, \cdots, Z_k, the distribution of X conditional on these constraints is \chi^2_{k-p}, justifying the term "degrees of freedom". The characteristic function of the chi-square distribution is

\varphi(t) = (1 - 2it)^{-k/2}
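The characteristic function is the expected value of e^{itX}, so the closed form (1 - 2it)^{-k/2} can be compared with a direct numerical integration of e^{itx} against the density. The sketch below does this for one arbitrary choice of k and t; all function names, the integration range, and the step count are assumptions of the example.

```python
import cmath
import math


def chi_square_pdf(x, k):
    """Density of the chi-square distribution with k degrees of freedom."""
    if x <= 0:
        return 0.0
    half_k = k / 2
    return (0.5 ** half_k / math.gamma(half_k)) * x ** (half_k - 1) * math.exp(-x / 2)


def characteristic_function_numeric(t, k, upper=200.0, steps=200000):
    """Midpoint-rule approximation of E[exp(itX)] over (0, upper)."""
    width = upper / steps
    total = 0j
    for i in range(steps):
        x = (i + 0.5) * width
        total += cmath.exp(1j * t * x) * chi_square_pdf(x, k)
    return total * width


def characteristic_function_closed_form(t, k):
    """Closed form (1 - 2it)^(-k/2), using the principal branch of the power."""
    return (1 - 2j * t) ** (-k / 2)


numeric = characteristic_function_numeric(0.3, 4)
closed = characteristic_function_closed_form(0.3, 4)
print(numeric, closed)
```

The real part of 1 - 2it is always 1, so the principal branch used by Python's complex power is the one that equals 1 at t = 0.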

The chi-square distribution has numerous applications in inferential statistics, for instance in chi-square tests and in estimating variances. It enters the problem of estimating the mean of a normally distributed population and the problem of estimating the slope of a regression line via its role in Student's t-distribution. It enters all analysis of variance problems via its role in the F-distribution, which is the distribution of the ratio of two independent chi-square random variables, each divided by its degrees of freedom.
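The role in estimating variances can be illustrated by simulation. The sketch below relies on a standard result that the text does not state explicitly: for a sample of size n from a normal population with variance \sigma^2, the quantity (n - 1)S^2/\sigma^2, where S^2 is the sample variance, follows a chi-square distribution with n - 1 degrees of freedom. The function name, the seed, and the values of n and \sigma are arbitrary choices for illustration.

```python
import random
import statistics


def scaled_sample_variance(n, sigma, rng):
    """Return (n - 1) * S^2 / sigma^2 for one normal sample of size n."""
    sample = [rng.gauss(0.0, sigma) for _ in range(n)]
    return (n - 1) * statistics.variance(sample) / sigma ** 2


rng = random.Random(1)  # fixed seed so the run is repeatable
n, sigma = 5, 3.0
values = [scaled_sample_variance(n, sigma, rng) for _ in range(20000)]

# A chi-square variable with n - 1 = 4 degrees of freedom has mean 4.
mean_value = statistics.fmean(values)
print(f"mean of scaled sample variance ~ {mean_value:.3f}")
```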

The normal approximation

If X\sim\chi^2_k, then as k tends to infinity, the distribution of (X - k)/\sqrt{2k} tends to the standard normal distribution. However, the convergence is slow (the skewness is \sqrt{8/k} and the excess kurtosis is 12/k), so two transformations are commonly considered, each of which approaches normality faster than X itself:

Fisher showed that \sqrt{2X} is approximately normally distributed with mean \sqrt{2k-1} and unit variance.

Wilson and Hilferty showed in 1931 that \sqrt[3]{X/k} is approximately normally distributed with mean 1 - 2 / (9k) and variance 2 / (9k).
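The two approximations can be compared against an exact value. The sketch below assumes an even number of degrees of freedom, because in that case the cumulative distribution function reduces to a finite sum, 1 - e^{-x/2} \sum_{j=0}^{k/2-1} (x/2)^j / j!, which needs no special functions. The function names and the test point (k = 10, x = 18.307, the tabulated 95% point) are choices made for this example.

```python
import math
from statistics import NormalDist


def chi_square_cdf_even(x, k):
    """Exact CDF of the chi-square distribution for even k (finite-sum form)."""
    if k <= 0 or k % 2 != 0:
        raise ValueError("this sketch only handles positive even k")
    if x <= 0:
        return 0.0
    half_x = x / 2
    term = 1.0   # (x/2)^j / j! for j = 0
    total = 1.0
    for j in range(1, k // 2):
        term *= half_x / j
        total += term
    return 1.0 - math.exp(-half_x) * total


def fisher_cdf(x, k):
    """Fisher: sqrt(2X) is roughly normal with mean sqrt(2k - 1) and unit variance."""
    return NormalDist().cdf(math.sqrt(2 * x) - math.sqrt(2 * k - 1))


def wilson_hilferty_cdf(x, k):
    """Wilson-Hilferty: (X/k)^(1/3) is roughly normal, mean 1 - 2/(9k), variance 2/(9k)."""
    mean = 1 - 2 / (9 * k)
    sd = math.sqrt(2 / (9 * k))
    return NormalDist(mean, sd).cdf((x / k) ** (1 / 3))


k, x = 10, 18.307
exact = chi_square_cdf_even(x, k)
fisher = fisher_cdf(x, k)
wilson_hilferty = wilson_hilferty_cdf(x, k)
print(exact, fisher, wilson_hilferty)
```

At this point the Wilson and Hilferty value is noticeably closer to the exact probability than the Fisher value, though a single test point says nothing about other values of k and x.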

The expected value of a random variable having a chi-square distribution with k degrees of freedom is k, and the variance is 2k. The median is given approximately by

k-\frac{2}{3}+\frac{4}{27k}-\frac{8}{729k^2}
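The quality of the median approximation can be checked numerically. The expression is the expansion of k(1 - 2/(9k))^3, that is, k times the cube of the Wilson and Hilferty mean. The sketch below assumes even k so that the exact cumulative distribution function is a finite sum, and finds the median by bisection; the function names and tolerance are illustrative choices.

```python
import math


def chi_square_cdf_even(x, k):
    """Exact CDF of the chi-square distribution for even k (finite-sum form)."""
    if k <= 0 or k % 2 != 0:
        raise ValueError("this sketch only handles positive even k")
    if x <= 0:
        return 0.0
    half_x = x / 2
    term = 1.0
    total = 1.0
    for j in range(1, k // 2):
        term *= half_x / j
        total += term
    return 1.0 - math.exp(-half_x) * total


def median_approximation(k):
    """Approximate median: k - 2/3 + 4/(27k) - 8/(729k^2)."""
    return k - 2 / 3 + 4 / (27 * k) - 8 / (729 * k ** 2)


def median_by_bisection(k, tolerance=1e-10):
    """Solve CDF(x) = 1/2 by bisection on (0, 2k), which contains the median."""
    lower, upper = 0.0, 2.0 * k
    while upper - lower > tolerance:
        mid = (lower + upper) / 2
        if chi_square_cdf_even(mid, k) < 0.5:
            lower = mid
        else:
            upper = mid
    return (lower + upper) / 2


for k in (2, 4, 10, 20):
    print(k, median_by_bisection(k), median_approximation(k))
```

For k = 2 the exact median is 2 ln 2, which gives an independent check on the bisection routine.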

Note that the chi-square distribution with 2 degrees of freedom is an exponential distribution with mean 2.

The chi-square distribution is a special case of the gamma distribution, with shape parameter k/2 and scale parameter 2.
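Both special-case relations can be verified pointwise from the densities. The sketch below compares the chi-square density with a gamma density of shape k/2 and scale 2 (the shape and scale parameterization is an assumption of the example), and with an exponential density of mean 2 when k = 2. The function names are illustrative.

```python
import math


def chi_square_pdf(x, k):
    """Density of the chi-square distribution with k degrees of freedom."""
    if x <= 0:
        return 0.0
    half_k = k / 2
    return (0.5 ** half_k / math.gamma(half_k)) * x ** (half_k - 1) * math.exp(-x / 2)


def gamma_pdf(x, shape, scale):
    """Density of the gamma distribution in the shape/scale parameterization."""
    if x <= 0:
        return 0.0
    return x ** (shape - 1) * math.exp(-x / scale) / (math.gamma(shape) * scale ** shape)


def exponential_pdf(x, mean):
    """Density of the exponential distribution with the given mean."""
    if x < 0:
        return 0.0
    return math.exp(-x / mean) / mean


for x in (0.5, 1.0, 3.0, 7.0):
    # k = 2: chi-square, gamma(shape 1, scale 2) and exponential(mean 2) coincide.
    print(x, chi_square_pdf(x, 2), gamma_pdf(x, 1.0, 2.0), exponential_pdf(x, 2.0))
```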

The information entropy is given by:

H = -\int_{-\infty}^\infty p(x)\ln(p(x))\, dx = \frac{k}{2} + \ln\left(2\Gamma\left(\frac{k}{2}\right)\right) + \left(1 - \frac{k}{2}\right)\psi\left(\frac{k}{2}\right)

where ψ(x) is the digamma function and the integrand is taken to be 0 wherever p(x) = 0.
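The closed form can be compared with a direct numerical evaluation of -\int p \ln p. The Python standard library has no digamma function, so this sketch restricts itself to even k, where k/2 is an integer n and \psi(n) = -\gamma + \sum_{j=1}^{n-1} 1/j, with \gamma the Euler-Mascheroni constant. The function names, integration range, and step count are assumptions of the example.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def chi_square_pdf(x, k):
    """Density of the chi-square distribution with k degrees of freedom."""
    if x <= 0:
        return 0.0
    half_k = k / 2
    return (0.5 ** half_k / math.gamma(half_k)) * x ** (half_k - 1) * math.exp(-x / 2)


def digamma_integer(n):
    """Digamma function at a positive integer n."""
    return -EULER_GAMMA + sum(1 / j for j in range(1, n))


def entropy_closed_form(k):
    """k/2 + ln(2 Gamma(k/2)) + (1 - k/2) psi(k/2), for even k only."""
    if k <= 0 or k % 2 != 0:
        raise ValueError("this sketch only handles positive even k")
    half_k = k // 2
    return k / 2 + math.log(2 * math.gamma(half_k)) + (1 - k / 2) * digamma_integer(half_k)


def entropy_numeric(k, upper=200.0, steps=200000):
    """Midpoint-rule approximation of -integral of p ln p over (0, upper)."""
    width = upper / steps
    total = 0.0
    for i in range(steps):
        p = chi_square_pdf((i + 0.5) * width, k)
        if p > 0.0:  # skip points where the density underflows to zero
            total -= p * math.log(p)
    return total * width


for k in (2, 4):
    print(k, entropy_closed_form(k), entropy_numeric(k))
```

For k = 2 the closed form reduces to 1 + ln 2, which matches the entropy of an exponential distribution with mean 2.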

The contents of this article are licensed from Wikipedia.org under the
GNU Free Documentation License.