K-medoids is similar to “k-means” clustering, but it does not create new points as centroids. Instead, it picks “k” of the existing data points to act as cluster centres (the medoids).

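The algorithm starts from “k” initial medoids and computes the distance of every other point from them; each point joins its nearest medoid. Simple variants choose the initial medoids at random, while “pam” in R selects them greedily in a “build” phase and then improves them in a “swap” phase, exchanging a medoid with a non-medoid whenever that lowers the average distance.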
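To make the idea concrete, here is a minimal base-R sketch of the assignment step only: given candidate medoids taken from the data, each point joins the nearest one, and the quality of the choice is the average distance to the assigned medoid. The toy vector `x` and the function name `assign_to_medoids` are made up for illustration; this is not how `pam` is implemented internally.

```r
# Toy one-dimensional data (hypothetical values, for illustration only)
x <- c(1, 2, 3, 10, 11, 12, 30, 31)

# Assign every point to its nearest medoid.
# medoid_idx holds positions in x, so medoids are always real data points.
assign_to_medoids <- function(x, medoid_idx) {
  # distance matrix: rows = points, columns = medoids
  d <- abs(outer(x, x[medoid_idx], "-"))
  cluster <- apply(d, 1, which.min)
  list(cluster = cluster,
       # average distance of each point to its own medoid
       objective = mean(d[cbind(seq_along(x), cluster)]))
}

fit <- assign_to_medoids(x, medoid_idx = c(2, 5, 7))
fit$cluster    # cluster label of each point
fit$objective  # smaller is better for a fixed k
```

A full k-medoids run would repeat this evaluation for different candidate medoids and keep the set with the lowest objective.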

“K-medoids” is a NON-hierarchical clustering method: it produces a flat partition in which every point belongs to exactly one cluster.

Implementation in R: function “pam” (Partitioning Around Medoids) from the “cluster” package.

I will use the same example data (“aggtime”) as for the “K-means” method.

Let’s try 3 clusters:

> library(cluster)
> pamfit3<-pam(aggtime,3)
> pamfit3
Medoids:
     ID     
[1,]  2  1.9
[2,] 10 26.8
[3,] 12 55.0
Clustering vector:
 [1] 1 1 1 1 1 1 1 1 2 2 2 3 3 3
Objective function:
   build     swap 
3.914286 2.500000 

Available components:
 [1] "medoids"    "id.med"     "clustering" "objective" 
 [5] "isolation"  "clusinfo"   "silinfo"    "diss"      
 [9] "call"       "data"      

> pamfit3$clusinfo
     size max_diss  av_diss diameter separation
[1,]    8      5.9 2.000000      6.9       17.2
[2,]    3      1.8 1.000000      3.0       17.2
[3,]    3      9.0 5.333333     16.0       20.0
> pamfit3$silinfo
$widths
   cluster neighbor sil_width
5        1        2 0.9078341
2        1        2 0.9074610
3        1        2 0.9055208
1        1        2 0.8980392
4        1        2 0.8921623
6        1        2 0.8429248
7        1        2 0.8216028
8        1        2 0.7279635
10       2        1 0.9361022
11       2        1 0.9148936
9        2        1 0.8892734
12       3        2 0.7183099
14       3        2 0.6657754
13       3        2 0.4626168

$clus.avg.widths
[1] 0.8629386 0.9134231 0.6155674

$avg.width
[1] 0.8207486
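The numbers in this output can be checked by hand. The sketch below assumes the 14 values of `aggtime` listed in the code; they are not printed in this post. Twelve of them appear as medoid values in the outputs here, and the remaining two (points 1 and 6) were inferred from the `clusinfo` tables, so treat the vector as a reconstruction. The helper name `silhouette_width` is mine.

```r
# Reconstructed values (assumption: inferred from the printed medoids and
# clusinfo tables; the original aggtime is not shown in this post)
aggtime <- c(1.1, 1.9, 1.5, 0.9, 1.8, 5.5, 6.1, 7.8,
             25.0, 26.8, 28.0, 55.0, 48.0, 64.0)

# Copied from the 3-cluster output above
clustering <- c(1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3)
medoid_id  <- c(2, 10, 12)

# Objective ("swap" value): mean distance of each point to its own medoid
objective <- mean(abs(aggtime - aggtime[medoid_id[clustering]]))

# Silhouette width of point i: (b - a) / max(a, b), where
#   a = mean distance to the other points of its own cluster
#   b = smallest mean distance to the points of any other cluster
# (not defined this way for single-point clusters; pam reports 0 there)
silhouette_width <- function(i) {
  d   <- abs(aggtime - aggtime[i])
  own <- clustering == clustering[i]
  a <- sum(d[own]) / (sum(own) - 1)
  b <- min(tapply(d[!own], clustering[!own], mean))
  (b - a) / max(a, b)
}

objective             # compare with the "swap" value in the output
silhouette_width(13)  # compare with sil_width of point 13
```

Point 13 has the lowest silhouette width because it sits at 48.0, fairly far from its medoid at 55.0 and comparatively close to the middle cluster.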

Now 4 clusters:

> pamfit4<-pam(aggtime,4)
> pamfit4
Medoids:
     ID     
[1,]  3  1.5
[2,]  7  6.1
[3,] 10 26.8
[4,] 12 55.0
Clustering vector:
 [1] 1 1 1 1 1 2 2 2 3 3 3 4 4 4
Objective function:
   build     swap 
1.764286 1.642857 

Available components:
 [1] "medoids"    "id.med"     "clustering" "objective" 
 [5] "isolation"  "clusinfo"   "silinfo"    "diss"      
 [9] "call"       "data"      

> pamfit4$clusinfo
     size max_diss   av_diss diameter separation
[1,]    5      0.6 0.3400000      1.0        3.6
[2,]    3      1.7 0.7666667      2.3        3.6
[3,]    3      1.8 1.0000000      3.0       17.2
[4,]    3      9.0 5.3333333     16.0       20.0
> pamfit4$silinfo
$widths
   cluster neighbor sil_width
3        1        2 0.9144295
1        1        2 0.9021739
5        1        2 0.8928571
4        1        2 0.8787425
2        1        2 0.8740876
7        2        1 0.7532189
8        2        1 0.6855346
6        2        1 0.6428571
10       3        2 0.9262295
11       3        2 0.9024768
9        3        2 0.8705036
12       4        3 0.7183099
14       4        3 0.6657754
13       4        3 0.4626168

$clus.avg.widths
[1] 0.8924581 0.6938702 0.8997366 0.6155674

$avg.width
[1] 0.7921295

 
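And 10 clusters. The objective function looks absolutely great (it drops to 0.086), but seven of the ten clusters contain a single point and the average silhouette width falls to 0.33, so this solution really does not make sense: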

> pamfit10<-pam(aggtime,10)
> pamfit10
Medoids:
      ID     
 [1,]  4  0.9
 [2,]  5  1.8
 [3,]  7  6.1
 [4,]  8  7.8
 [5,]  9 25.0
 [6,] 10 26.8
 [7,] 11 28.0
 [8,] 12 55.0
 [9,] 13 48.0
[10,] 14 64.0
Clustering vector:
 [1]  1  2  2  1  2  3  3  4  5  6  7  8  9 10
Objective function:
     build       swap 
0.10714286 0.08571429 

Available components:
 [1] "medoids"    "id.med"     "clustering" "objective" 
 [5] "isolation"  "clusinfo"   "silinfo"    "diss"      
 [9] "call"       "data"      
> pamfit10$clusinfo
      size max_diss   av_diss diameter separation
 [1,]    2      0.2 0.1000000      0.2        0.4
 [2,]    3      0.3 0.1333333      0.4        0.4
 [3,]    2      0.6 0.3000000      0.6        1.7
 [4,]    1      0.0 0.0000000      0.0        1.7
 [5,]    1      0.0 0.0000000      0.0        1.8
 [6,]    1      0.0 0.0000000      0.0        1.2
 [7,]    1      0.0 0.0000000      0.0        1.2
 [8,]    1      0.0 0.0000000      0.0        7.0
 [9,]    1      0.0 0.0000000      0.0        7.0
[10,]    1      0.0 0.0000000      0.0        9.0
> pamfit10$silinfo
$widths
   cluster neighbor sil_width
4        1        2 0.7600000
1        1        2 0.6842105
5        2        1 0.7500000
2        2        1 0.7222222
3        2        1 0.3000000
6        3        4 0.7391304
7        3        4 0.6470588
8        4        3 0.0000000
9        5        6 0.0000000
10       6        7 0.0000000
11       7        6 0.0000000
12       8        9 0.0000000
13       9        8 0.0000000
14      10        8 0.0000000

$clus.avg.widths
 [1] 0.7221053 0.5907407 0.6930946 0.0000000 0.0000000
 [6] 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000

$avg.width
[1] 0.3287587
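Instead of inspecting each fit separately, the average silhouette width can be collected for a range of “k” in one go. This is only a sketch: the `aggtime` values are a reconstruction inferred from the outputs in this post (the original vector is not shown), and picking the “k” with the largest average width is one common rule of thumb, not the only criterion.

```r
library(cluster)  # provides pam()

# Reconstructed values (assumption: inferred from the printed medoids and
# clusinfo tables; the original aggtime is not shown in this post)
aggtime <- c(1.1, 1.9, 1.5, 0.9, 1.8, 5.5, 6.1, 7.8,
             25.0, 26.8, 28.0, 55.0, 48.0, 64.0)

# Average silhouette width for each candidate number of clusters
ks <- 2:10
avg_width <- sapply(ks, function(k) pam(aggtime, k)$silinfo$avg.width)
names(avg_width) <- ks
round(avg_width, 3)

# k with the largest average silhouette width among those tried
best_k <- ks[which.max(avg_width)]
best_k
```

The objective function is not a good guide here, because it keeps shrinking as “k” grows; the silhouette width penalises clusters that are split too finely.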