Simple matching coefficient
index | symbol | formula |
---|---|---|
SMC (simple matching) | S | (a + d)/(a + b + c + d) |
Jaccard | J | a/(a + b + c) |
Czekanowski | C | 2a/(2a + b + c) |
Russell & Rao | R | a/(a + b + c + d) |
Euclidean distance (dissimilarity) | ∂ | $\left[\sum_i (x_{Ai} - x_{Bi})^{2}\right]^{1/2}$ |

where:
- a = species common to both groups
- b = species exclusive to group 1
- c = species exclusive to group 2
- d = species absent from both groups
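
As an illustration of the table above, the following minimal sketch computes the four similarity indices from the counts a, b, c and d. The function name `similarity_indices` and the example counts are hypothetical, and returning 0.0 for Jaccard and Czekanowski when a + b + c = 0 is an assumed convention, not something the table specifies.

```python
def similarity_indices(a, b, c, d):
    """Compute S, J, C and R from the species counts a, b, c, d.

    a: species common to both groups
    b: species exclusive to group 1
    c: species exclusive to group 2
    d: species absent from both groups
    """
    n = a + b + c + d
    if n == 0:
        raise ValueError("at least one species is required")

    present = a + b + c  # species found in at least one group
    return {
        "S": (a + d) / n,  # simple matching: shared absences count
        # Assumed convention: 0.0 when no species is present anywhere.
        "J": a / present if present else 0.0,
        "C": 2 * a / (2 * a + b + c) if present else 0.0,
        "R": a / n,  # Russell & Rao: only shared presences count
    }


# Hypothetical counts for two groups surveyed over 20 species.
indices = similarity_indices(a=6, b=2, c=2, d=10)
print(indices)  # {'S': 0.8, 'J': 0.6, 'C': 0.75, 'R': 0.3}
```

With many shared absences (large d), S is much higher than J, which is the practical difference between the two indices.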
The simple matching coefficient (SMC) or Rand similarity coefficient is a statistic used for comparing the similarity and diversity of sample sets.[1]
counts | A = 0 | A = 1 |
---|---|---|
B = 0 | $M_{00}$ | $M_{10}$ |
B = 1 | $M_{01}$ | $M_{11}$ |

Given two objects, A and B, each with n binary attributes, SMC is defined as:
$$\begin{aligned}\text{SMC} &= \frac{\text{number of matching attributes}}{\text{number of attributes}}\\[8pt] &= \frac{M_{00}+M_{11}}{M_{00}+M_{01}+M_{10}+M_{11}}\end{aligned}$$
where:
- $M_{11}$ is the total number of attributes where A and B both have a value of 1.
- $M_{01}$ is the total number of attributes where the attribute of A is 0 and the attribute of B is 1.
- $M_{10}$ is the total number of attributes where the attribute of A is 1 and the attribute of B is 0.
- $M_{00}$ is the total number of attributes where A and B both have a value of 0.

The simple matching distance (SMD), which measures dissimilarity between sample sets, is given by $1 - \text{SMC}$.[2]
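
The definition above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `simple_matching` and the example vectors are hypothetical, and the inputs are assumed to be equal-length sequences containing only 0 and 1.

```python
def simple_matching(x, y):
    """Return (SMC, SMD) for two equal-length binary sequences x and y."""
    if len(x) != len(y):
        raise ValueError("both objects must have the same number of attributes")
    if len(x) == 0:
        raise ValueError("at least one attribute is required")

    # Tally the four cells of the contingency table; the first index
    # refers to object A (x), the second to object B (y).
    m11 = sum(1 for p, q in zip(x, y) if p == 1 and q == 1)
    m10 = sum(1 for p, q in zip(x, y) if p == 1 and q == 0)
    m01 = sum(1 for p, q in zip(x, y) if p == 0 and q == 1)
    m00 = sum(1 for p, q in zip(x, y) if p == 0 and q == 0)

    smc = (m00 + m11) / (m00 + m01 + m10 + m11)
    return smc, 1 - smc


# Hypothetical objects with n = 6 binary attributes.
A = [1, 1, 0, 0, 1, 0]
B = [1, 0, 0, 1, 1, 0]
smc, smd = simple_matching(A, B)
# Here M11 = 2, M10 = 1, M01 = 1, M00 = 2, so SMC = 4/6 and SMD = 2/6.
print(smc, smd)
```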
SMC is linearly related to Hamann similarity: $\text{SMC} = (\text{Hamann} + 1)/2$. Also, $\text{SMC} = 1 - D^{2}/n$, where $D^{2}$ is the squared Euclidean distance between the two objects (binary vectors) and n is the number of attributes.
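
Both identities can be checked numerically. The sketch below assumes the usual definition of Hamann similarity, (matches − mismatches)/n, which the text does not spell out. It also relies on the fact that for binary vectors each squared difference is 1 exactly where the attributes differ, so $D^{2}$ equals the number of mismatches. The function name `smc_three_ways` and the example vectors are hypothetical.

```python
def smc_three_ways(x, y):
    """Return SMC computed directly, via Hamann, and via squared distance.

    x and y are assumed to be equal-length, non-empty sequences of 0s and 1s.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(x)
    matches = sum(1 for p, q in zip(x, y) if p == q)
    mismatches = n - matches

    smc_direct = matches / n

    # Assumed definition: Hamann = (matches - mismatches) / n.
    hamann = (matches - mismatches) / n
    smc_from_hamann = (hamann + 1) / 2

    # Squared Euclidean distance between the two binary vectors.
    d_squared = sum((p - q) ** 2 for p, q in zip(x, y))
    smc_from_distance = 1 - d_squared / n

    return smc_direct, smc_from_hamann, smc_from_distance


# Hypothetical binary vectors with 4 matches and 2 mismatches out of 6.
A = [1, 1, 0, 0, 1, 0]
B = [1, 0, 0, 1, 1, 0]
direct, via_hamann, via_distance = smc_three_ways(A, B)
# All three values agree up to floating-point rounding (4/6 here).
print(direct, via_hamann, via_distance)
```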