K-means is one of the most popular clustering algorithms. However, it has some limitations: it requires the user to specify the number of clusters in advance, and it selects the initial centroids randomly. The final k-means solution is sensitive to this random initial selection of cluster centers, so the result may differ (slightly) each time you run k-means.
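
To make this sensitivity concrete, the following minimal sketch runs k-means several times on the standardized USArrests data, each time from a single random start. The choice of four clusters, the seeds, and the variable name `tot_wss` are assumptions made for illustration only; the runs are not guaranteed to differ.

```r
# Standardize the built-in USArrests data
df <- scale(USArrests)

# Run k-means five times, each from one random start (nstart = 1),
# and record the total within-cluster sum of squares of each run
tot_wss <- sapply(1:5, function(seed) {
  set.seed(seed)
  kmeans(df, centers = 4, nstart = 1)$tot.withinss
})

# Differing values indicate that the runs converged to different solutions
round(tot_wss, 2)
```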

In this chapter, we describe a hybrid method, named **hierarchical k-means clustering** (hkmeans), for improving k-means results.

#### Related Book

Practical Guide to Cluster Analysis in R

## Algorithm

The algorithm is summarized as follows:

1. Compute hierarchical clustering and cut the tree into k clusters
2. Compute the center (i.e., the mean) of each cluster
3. Compute k-means by using the set of cluster centers (defined in step 2) as the initial cluster centers

Note that the k-means algorithm will improve the initial partitioning generated in step 1. Hence, the initial partitioning can be slightly different from the final partitioning obtained in step 3.
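
The steps above can be sketched with base R functions only. This is an illustrative sketch, not necessarily how any particular package implements the method: the Euclidean distance, the `"ward.D2"` linkage, `k = 4`, and the variable names (`hc`, `groups`, `init_centers`, `km`) are assumptions chosen for the example.

```r
# Standardize the built-in USArrests data
df <- scale(USArrests)
k <- 4  # number of clusters (assumed for this example)

# Step 1: hierarchical clustering, then cut the tree into k groups
hc <- hclust(dist(df, method = "euclidean"), method = "ward.D2")
groups <- cutree(hc, k = k)

# Step 2: center (mean of each variable) of each group;
# the result is a k x p matrix, one row per cluster
init_centers <- apply(df, 2, function(x) tapply(x, groups, mean))

# Step 3: k-means started from these centers instead of random ones
km <- kmeans(df, centers = init_centers)

# Cross-tabulate the initial and final partitions to see which
# observations were reassigned by k-means
table(hierarchical = groups, kmeans = km$cluster)
```

Because the starting centers are computed from the data rather than drawn at random, repeated runs of this sketch give the same partition.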

## R code

The R function *hkmeans*() [in *factoextra*] provides an easy way to compute hierarchical k-means clustering. The format of the result is similar to the one returned by the standard kmeans() function (see the chapter on k-means clustering).

To install factoextra, type this: `install.packages("factoextra")`.

We’ll use the USArrests data set, and we start by standardizing the data:

`df <- scale(USArrests)`
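
As a quick check of what standardizing does, the sketch below verifies that each column of the scaled data has mean 0 and standard deviation 1, so that no variable dominates the distance computations because of its units. The helper names `col_means` and `col_sds` are introduced here for illustration only.

```r
# Standardize: subtract each column's mean and divide by its standard deviation
df <- scale(USArrests)

# After scaling, every column should have mean ~0 and standard deviation 1
col_means <- colMeans(df)
col_sds <- apply(df, 2, sd)

round(col_means, 10)
round(col_sds, 10)
```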

```
# Compute hierarchical k-means clustering
library(factoextra)
res.hk <- hkmeans(df, 4)  # k = 4 clusters
# Elements returned by hkmeans()
names(res.hk)
```

```
## [1] "cluster" "centers" "totss" "withinss"
## [5] "tot.withinss" "betweenss" "size" "iter"
## [9] "ifault" "data" "hclust"
```
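
The components listed above can be accessed with `$`, as with a standard kmeans() result. The following is a minimal sketch, assuming factoextra is installed; it recomputes `res.hk` so that it runs on its own, and the per-cluster summary of the unscaled data is just one possible way to interpret the clusters.

```r
library(factoextra)

# Recompute the clustering so this example is self-contained
df <- scale(USArrests)
res.hk <- hkmeans(df, 4)

# Number of observations in each cluster
res.hk$size

# Cluster assignment of the first few states
head(res.hk$cluster)

# Mean of each original (unscaled) variable within each cluster
aggregate(USArrests, by = list(cluster = res.hk$cluster), mean)
```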

To print all the results, type this:

```
# Print the results
res.hk
```

```
# Visualize the tree
fviz_dend(res.hk, cex = 0.6, palette = "jco",
          rect = TRUE, rect_border = "jco", rect_fill = TRUE)
```

```
# Visualize the hkmeans final clusters
fviz_cluster(res.hk, palette = "jco", repel = TRUE,
             ggtheme = theme_classic())
```

## Summary

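We described the hybrid **hierarchical k-means clustering** method, which improves k-means results by using the centers of clusters obtained from hierarchical clustering as the initial k-means centers.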

Dear Alboukadel

If I were to use hierarchical k-means clustering, what function method should I use in the eclust function when doing cluster validation to initialise the function?

Hi Bryan,

In the current version of factoextra, the hkmeans method is not implemented in the eclust() function.

I created an issue on github: https://github.com/kassambara/factoextra/issues/78