Your American History Reference Guide!
- Mahalanobis distance

HistoryMania Information Site on Mahalanobis distance American History American History Search        American History Browse welcome to our free resource site for all enthusiasts!

Mahalanobis distance

In statistics, Mahalanobis distance is a distance measure invented by P. C. Mahalanobis in 1936. It is based on correlations between variables by which different patterns can be identified and analysed. It is a useful way of determining similarity of an unknown sample set to a known one. It differs from Euclidean distance in that it takes into account the correlations of the data set.

Formally, the Mahalanobis distance from a group of values with mean \mu = ( \mu_1, \mu_2, \mu_3, \dots , \mu_p ) and covariance matrix Σ for a multivariate vector x = ( x_1, x_2, x_3, \dots, x_p ) is defined as:

D_M(x) = \sqrt{(x - \mu)' \Sigma^{-1} (x-\mu)}.\,

Mahalanobis distance can also be defined as dissimilarity measure between two random vectors \vec{x} and \vec{y} of the same distribution with the covariance matrix Σ :

d(\vec{x},\vec{y})=\sqrt{(\vec{x}-\vec{y})'\Sigma^{-1} (\vec{x}-\vec{y})}.\,

If the covariance matrix is the identity matrix then it is the same as Euclidean distance. If covariance matrix is diagonal, then it is called normalized Euclidean distance:

d(\vec{x},\vec{y})= \sqrt{\sum_{i=1}^p  {(x_i - y_i)^2 \over \sigma_i^2}},

where σi is the standard deviation of the xi over the sample set.


The contents of this article are licensed from Wikipedia.org under the
GNU Free Documentation License. How to see transparent copy
Search | Browse | Contact | Legal info