Difference between revisions of «Índice de Czekanowski» (Czekanowski index)
Line 6:
d = species absent from both groups
[http://people.revoledu.com/kardi/tutorial/Similarity/SimpleMatching.html Tutorial on similarity measurement]
[http://media.wiley.com/product_data/excerpt/61/04714696/0471469661.pdf Incomplete tutorial, reference still to be found. Simple explanation of similarity and distance indices]
<!--
With binary variables, the traditional focus is on similarity rather than
distance (or dissimilarity). Consider two binary vectors x and y, made up of
the binary strings [xk] and [yk]; compare them coordinate by coordinate and
count the occurrences:
a = number of occurrences when xk and yk are both equal to 1
b = number of occurrences when xk = 0 and yk = 1
c = number of occurrences when xk = 1 and yk = 0
d = number of occurrences when xk and yk are both equal to 0
These four numbers can be arranged in a 2 by 2 co-occurrence matrix (contingency
table) that shows how “close” the two strings are to each other.
::    1   0
::1   a   b
::0   c   d
Zero off-diagonal entries of this matrix (b = c = 0) indicate a perfect match,
that is, the highest similarity. Several commonly used measures of similarity
between binary vectors x and y are built from these four entries. The simple
matching coefficient is the ratio
:<math>\frac{a + d}{a + b + c + d}</math> (1.4)
The Russell and Rao measure of similarity consists of the quotient
:<math>\frac{a}{a + b + c + d}</math> (1.5)
The Jaccard index counts the cases where both inputs equal 1, relative to the
positions where at least one input equals 1:
:<math>\frac{a}{a + b + c}</math> (1.6)
The Czekanowski index is closely related to the Jaccard index, but its weight
factor of 2 emphasizes the cases where the entries of x and y are both equal
to 1:
:<math>\frac{2a}{2a + b + c}</math> (1.7)
-->
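The four coefficients in the notes above (1.4 to 1.7) can be sketched in Python as follows. This is only an illustrative sketch: the function names `cooccurrence_counts` and `similarity_coefficients` and the sample vectors are hypothetical, and it assumes that a, b, c and d label the four counts in the order the notes list them. It also assumes that at least one position holds a 1, since otherwise the Jaccard and Czekanowski denominators are zero.

```python
def cooccurrence_counts(x, y):
    """Return the counts (a, b, c, d) for two equal-length binary sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    pairs = list(zip(x, y))
    a = sum(1 for xk, yk in pairs if xk == 1 and yk == 1)  # both equal to 1
    b = sum(1 for xk, yk in pairs if xk == 0 and yk == 1)  # only yk equals 1
    c = sum(1 for xk, yk in pairs if xk == 1 and yk == 0)  # only xk equals 1
    d = sum(1 for xk, yk in pairs if xk == 0 and yk == 0)  # both equal to 0
    return a, b, c, d


def similarity_coefficients(x, y):
    """Compute the four coefficients (1.4)-(1.7) from the counts a, b, c, d.

    Assumes at least one position where x or y equals 1; otherwise the
    Jaccard and Czekanowski denominators would be zero.
    """
    a, b, c, d = cooccurrence_counts(x, y)
    n = a + b + c + d
    return {
        "simple_matching": (a + d) / n,          # (1.4)
        "russell_rao": a / n,                    # (1.5)
        "jaccard": a / (a + b + c),              # (1.6)
        "czekanowski": 2 * a / (2 * a + b + c),  # (1.7)
    }


# Hypothetical sample vectors, for illustration only
x = [1, 1, 0, 1, 0, 0]
y = [1, 0, 0, 1, 1, 0]
coeffs = similarity_coefficients(x, y)
for name, value in coeffs.items():
    print(name, round(value, 4))
```

Because b and c enter every formula symmetrically, swapping x and y leaves all four coefficients unchanged.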
[[Categoría:Glosario]] [[Categoría:Esbozo]]
Revision as of 20:35, 1 Oct 2006
<math>Cz = \frac{2a}{2a + b + c}</math>
a = species common to both groups
b = species exclusive to group 1
c = species exclusive to group 2
d = species absent from both groups
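As a rough illustration, the index can be computed directly from two species lists. The sketch below assumes the form Cz = 2a / (2a + b + c) with a, b and c defined as above. The function name `czekanowski` and the species lists are hypothetical, and raising an error when both groups are empty is a choice made here, not part of the definition.

```python
def czekanowski(group1, group2):
    """Czekanowski index 2a / (2a + b + c) for two collections of species."""
    s1, s2 = set(group1), set(group2)
    a = len(s1 & s2)  # species common to both groups
    b = len(s1 - s2)  # species exclusive to group 1
    c = len(s2 - s1)  # species exclusive to group 2
    denominator = 2 * a + b + c
    if denominator == 0:
        # Assumption of this sketch: treat two empty groups as undefined
        raise ValueError("both groups are empty; the index is undefined")
    return 2 * a / denominator


# Hypothetical species lists, for illustration only
site1 = ["Quercus", "Pinus", "Fagus", "Betula"]
site2 = ["Quercus", "Pinus", "Acer"]
cz = czekanowski(site1, site2)
print(round(cz, 4))
```

Here a = 2, b = 2 and c = 1, so Cz = 4 / 7. The count d (species absent from both groups) does not enter the formula, so the function never needs the full species pool.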
Tutorial on similarity measurement
Incomplete tutorial, reference still to be found. Simple explanation of similarity and distance indices