Clear Understanding on Mahalanobis Distance

In multivariate (multi-characteristic) data, one often needs a measure of divergence, or distance, between groups in terms of several characteristics at once.

Suppose you are interested in measuring the difference (distance) between two groups, G₁ and G₂, each described by a p-dimensional random vector X. A common assumption is that X has the same variation about its mean within either group.

The difference between the groups can then be considered in terms of the difference between the mean vectors of X in each group, relative to the common within-group variation (described by the common covariance matrix).

The most often used measure for data with multiple characteristics is the Mahalanobis distance, Δ (uppercase delta).

The square of the Mahalanobis distance is given by

Δ² = (µ₁ - µ₂)ᵀ Σ⁻¹ (µ₁ - µ₂), or equivalently Δ² = (µ₁ - µ₂)′ Σ⁻¹ (µ₁ - µ₂),

where the superscript T or ′ denotes matrix transpose, µ₁ and µ₂ are the mean vectors of X in G₁ and G₂, and Σ denotes the common covariance matrix of X in each of G₁ and G₂.

As Σ is a nonsingular covariance matrix, it is positive-definite. Hence Δ is a metric.

If the variables in X were uncorrelated in each group and were scaled to have unit variances, then Σ would be the identity matrix, I, and Δ² would reduce to the squared Euclidean distance between the group-mean vectors µ₁ and µ₂.
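The formula and the identity-matrix special case can be checked numerically. The following is a minimal sketch, assuming NumPy; the helper name `mahalanobis_sq` and the example means and covariance matrix are made up for illustration and do not come from any data set.

```python
import numpy as np

def mahalanobis_sq(mu1, mu2, sigma):
    """Squared Mahalanobis distance between two mean vectors (hypothetical helper)."""
    diff = np.asarray(mu1, dtype=float) - np.asarray(mu2, dtype=float)
    # Solve sigma @ z = diff instead of forming the inverse of sigma explicitly.
    z = np.linalg.solve(np.asarray(sigma, dtype=float), diff)
    return float(diff @ z)

# Illustrative (made-up) group means and common covariance matrix.
mu1 = np.array([1.0, 2.0])
mu2 = np.array([0.0, 0.0])
sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])

delta_sq = mahalanobis_sq(mu1, mu2, sigma)

# With the identity matrix as covariance, the result is the squared
# Euclidean distance between the two mean vectors.
identity_case = mahalanobis_sq(mu1, mu2, np.eye(2))
euclid_sq = float(np.sum((mu1 - mu2) ** 2))

print(delta_sq, identity_case, euclid_sq)
```

Using `np.linalg.solve` rather than `np.linalg.inv` is a design choice for numerical stability; both give the same value for a well-conditioned Σ.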

Because Σ is a covariance matrix, it is symmetric: its transpose equals itself, and the same holds for its inverse, (Σ⁻¹)ᵀ = Σ⁻¹.

The inverse of the covariance matrix Σ of X appears in the quadratic form of the Mahalanobis distance formula to allow for the different scales on which the variables are measured and for non-zero correlations between the variables.

Alternatively, the quadratic form has the effect of transforming the variables to uncorrelated, standardized variables Y and then computing the squared Euclidean distance between the mean vectors of Y in the two groups.
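This equivalence can be illustrated with a short sketch. It assumes NumPy and uses a Cholesky factor of Σ as the standardizing transformation; that is only one possible choice (a symmetric square root of Σ⁻¹ would work equally well), and the numbers are made up for illustration.

```python
import numpy as np

# Illustrative (made-up) group means and common covariance matrix.
mu1 = np.array([1.0, 2.0])
mu2 = np.array([0.0, 0.0])
sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])

# Cholesky factor: sigma = chol @ chol.T, with chol lower triangular.
chol = np.linalg.cholesky(sigma)

def whiten(v):
    """Map v to y = chol^{-1} v; the transformed variables have identity covariance."""
    return np.linalg.solve(chol, v)

# Covariance of Y = chol^{-1} X is chol^{-1} sigma chol^{-T}, the identity matrix.
chol_inv = np.linalg.inv(chol)
cov_y = chol_inv @ sigma @ chol_inv.T

# Squared Euclidean distance between the transformed mean vectors ...
diff_y = whiten(mu1) - whiten(mu2)
euclid_sq_y = float(diff_y @ diff_y)

# ... compared with the squared Mahalanobis distance in the original coordinates.
diff = mu1 - mu2
delta_sq = float(diff @ np.linalg.solve(sigma, diff))

print(euclid_sq_y, delta_sq)
```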

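To understand the quadratic form of a matrix: if A is a square p × p matrix and x is a p-dimensional vector, the quadratic form is the scalar xᵀAx. For example, with p = 2, A having rows (a, b) and (c, d), and x = (x₁, x₂)ᵀ,

xᵀAx = a x₁² + (b + c) x₁x₂ + d x₂².

By looking at the exponents in the final expression, you can see why this is called a quadratic form of A: every term is of degree two in the elements of x.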
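A small numerical check of the quadratic form xᵀAx for a 2 × 2 matrix is sketched below, assuming NumPy. The matrix and vector are arbitrary illustrative values; the expanded polynomial has only squared terms and one cross-product term.

```python
import numpy as np

# Arbitrary illustrative square matrix and vector.
A = np.array([[2.0, 1.0],
              [3.0, 4.0]])
x = np.array([1.0, 2.0])

# Quadratic form computed with matrix products: x^T A x.
q_matrix = float(x @ A @ x)

# The same quantity written out as a degree-two polynomial in x1 and x2.
a, b = A[0]
c, d = A[1]
q_expanded = float(a * x[0] ** 2 + (b + c) * x[0] * x[1] + d * x[1] ** 2)

print(q_matrix, q_expanded)
```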

It is now known that many standard distance measures such as Kolmogorov's variational distance, the Hellinger distance, Rao's distance, etc., are increasing functions of Mahalanobis distance under assumptions of normality and homoscedasticity and in certain other situations.

Sample Version of the Mahalanobis Distance, D²:

In practice, the means µ₁ and µ₂ and the common covariance matrix Σ of the two groups G₁ and G₂ are generally unknown and must be estimated from random samples of sizes n₁ and n₂ from G₁ and G₂, yielding sample means x̅₁ and x̅₂ and (bias-corrected) sample covariance matrices S₁ and S₂.

The common covariance matrix Σ can then be estimated by the pooled estimate, given by

S = {(n₁ - 1)S₁ + (n₂ - 1)S₂} / N,

where N = n₁ + n₂ - 2.

The sample version of Δ² is denoted by D² and is given by

D² = (x̅₁ - x̅₂)ᵀ S⁻¹ (x̅₁ - x̅₂),

where S is the pooled estimate of Σ.

The sample Mahalanobis distance, D², is known to overestimate its population counterpart, Δ².
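The sample quantities can be sketched in code as follows, assuming NumPy. The pooled covariance is the average of S₁ and S₂ weighted by n₁ - 1 and n₂ - 1 and divided by n₁ + n₂ - 2, and D² is the quadratic form in the difference of sample means. The function names, the simulation settings (p = 3, n₁ = n₂ = 10, 2000 replications, identity covariance) and the random seed are illustrative assumptions; the small simulation only gives a rough picture of the overestimation.

```python
import numpy as np

def pooled_covariance(x1, x2):
    """Pooled estimate of the common covariance matrix (hypothetical helper)."""
    n1, n2 = len(x1), len(x2)
    s1 = np.cov(x1, rowvar=False)  # bias-corrected: divides by n1 - 1
    s2 = np.cov(x2, rowvar=False)  # bias-corrected: divides by n2 - 1
    return ((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2)

def sample_mahalanobis_sq(x1, x2):
    """Sample squared Mahalanobis distance D^2 between two samples."""
    diff = x1.mean(axis=0) - x2.mean(axis=0)
    s = pooled_covariance(x1, x2)
    return float(diff @ np.linalg.solve(s, diff))

# Simulated normal groups with known population distance (assumed settings).
rng = np.random.default_rng(0)
p, n1, n2 = 3, 10, 10
mu1 = np.zeros(p)
mu2 = np.array([1.0, 0.0, 0.0])
sigma = np.eye(p)
delta_sq = float((mu1 - mu2) @ np.linalg.solve(sigma, mu1 - mu2))

# Average D^2 over repeated samples, to compare with the population value.
d_sq_values = [
    sample_mahalanobis_sq(
        rng.multivariate_normal(mu1, sigma, size=n1),
        rng.multivariate_normal(mu2, sigma, size=n2),
    )
    for _ in range(2000)
]
mean_d_sq = float(np.mean(d_sq_values))

print(delta_sq, mean_d_sq)
```

With small samples like these, the average of D² over replications comes out noticeably larger than Δ², in line with the statement above.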

In situations where D² is used, knowledge of the sampling distribution of D² is often needed.

It follows under the assumption of normality that cD² is distributed as a noncentral F-distribution with p and N - p + 1 degrees of freedom and noncentrality parameter cΔ², where c = k(N - p + 1)/(pN) and k = n₁n₂/(n₁ + n₂).

The Mahalanobis distance can also be used to test whether an observed random sample x₁, ..., x_n comes from a multivariate normal distribution. Under the null hypothesis, the squared distances D_j² (j = 1, ..., n, where n is the sample size) should be approximately independent, with a common distribution that can be approximated by a chi-squared distribution with p degrees of freedom.

The Mahalanobis distance of the jth observation from the sample mean is given by

D_j² = (x_j - x̅)ᵀ S⁻¹ (x_j - x̅),

where x̅ denotes the sample mean and S denotes the (bias-corrected) sample covariance matrix of the n observations in the observed sample.
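The per-observation distances can be computed in a vectorized way, as in this minimal sketch assuming NumPy. The function name and the simulated data are illustrative assumptions. Because the sample mean and S are computed from the same observations, the D_j² always sum to (n - 1)p, which is one reason they are only approximately independent.

```python
import numpy as np

def mahalanobis_sq_per_observation(x):
    """D_j^2 of each row of x from the sample mean (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean(axis=0)
    s = np.cov(x, rowvar=False)  # bias-corrected sample covariance matrix
    # Row j of `solved` is S^{-1} (x_j - x_bar).
    solved = np.linalg.solve(s, centered.T).T
    # Row-wise dot products give (x_j - x_bar)^T S^{-1} (x_j - x_bar).
    return np.einsum("ij,ij->i", centered, solved)

# Simulated multivariate normal sample (assumed settings).
rng = np.random.default_rng(1)
n, p = 50, 3
x = rng.normal(size=(n, p))

d_sq = mahalanobis_sq_per_observation(x)

# The distances sum to (n - 1) * p, so their average is close to p,
# the mean of a chi-squared distribution with p degrees of freedom.
print(d_sq.sum(), (n - 1) * p, d_sq.mean())
```

The ordered values of `d_sq` could then be compared with chi-squared quantiles with p degrees of freedom, for example in a quantile-quantile plot.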

Alternatively, we can form the modified Mahalanobis distances d₁², ..., d_n², where

d_j² = (x_j - x̅_(j))ᵀ S_(j)⁻¹ (x_j - x̅_(j)),

and x̅_(j) and S_(j) denote respectively the sample mean and (bias-corrected) sample covariance matrix of the n - 1 observations remaining after the deletion of x_j (j = 1, ..., n).

In this case, the d_j² can be taken to be approximately independent, with the common distribution of qd_j² given exactly by an F-distribution with p and n - p - 1 degrees of freedom, where q = (n - 1)(n - p - 1)/{pn(n - 2)}.
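The leave-one-out distances and the scaling constant q can be sketched as follows, assuming NumPy. The function names `deleted_mahalanobis_sq` and `f_scale` and the simulated data are illustrative assumptions; the loop recomputes the mean and covariance for each deleted observation, which is simple rather than efficient.

```python
import numpy as np

def deleted_mahalanobis_sq(x):
    """Modified distances d_j^2, each using the n - 1 other observations."""
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    out = np.empty(n)
    for j in range(n):
        rest = np.delete(x, j, axis=0)   # sample with x_j removed
        diff = x[j] - rest.mean(axis=0)  # x_j minus the deleted-sample mean
        s_j = np.cov(rest, rowvar=False) # bias-corrected covariance of the rest
        out[j] = diff @ np.linalg.solve(s_j, diff)
    return out

def f_scale(n, p):
    """Constant q such that q * d_j^2 follows F(p, n - p - 1) under normality."""
    return (n - 1) * (n - p - 1) / (p * n * (n - 2))

# Simulated multivariate normal sample (assumed settings).
rng = np.random.default_rng(2)
n, p = 30, 2
x = rng.normal(size=(n, p))

d_sq = deleted_mahalanobis_sq(x)
scaled = f_scale(n, p) * d_sq

print(f_scale(n, p), scaled.mean())
```

The values in `scaled` could be compared with quantiles of an F-distribution with p and n - p - 1 degrees of freedom, for instance via `scipy.stats.f` if SciPy is available.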

Interesting questions that can be answered using the Mahalanobis distance

How different are the metabolic characteristics of normal persons, chemical diabetics and overt diabetics as determined by a total glucose tolerance test and how to make a diagnosis?
On the basis of remote sensing data from satellite, how do you classify various tracts of land by vegetation type, rock type, etc.?

Answers to the above questions are important in developing methods for medical diagnosis and in developing Geographical Information Systems (GIS).

Other problems that can be addressed are:

Pattern recognition or discriminant analysis (using the optimal discriminant function, assessed in terms of Δ²).
Classification problems (and how they differ from discriminant analysis).

References

Article on the Mahalanobis distance by G. J. McLachlan, Resonance, 1999. https://www.ias.ac.in/article/fulltext/reso/004/06/0020-0026
Chapter 17, Quadratic Form of a Matrix, in Matrix Algebra for Educational Scientists (zief0002.github.io)
