- MATLAB for Machine Learning
- Giuseppe Ciaburro
- 2025-04-04 18:32:34
Cluster analysis
Cluster analysis is a multivariate analysis technique that groups statistical units so as to minimize the logical distance within each group and maximize the logical distance between groups. This logical distance is quantified by measures of similarity or dissimilarity defined between the statistical units.
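As a small illustration of how a dissimilarity measure quantifies this distance, the following sketch computes pairwise distances with `pdist` from the Statistics and Machine Learning Toolbox. The three data points are made up for the example; they are not taken from the book.

```matlab
% Three hypothetical statistical units (rows) measured on two variables (columns).
X = [0 0; 3 4; 6 8];

% Pairwise Euclidean distances, returned as a condensed row vector
% ordered as the pairs (1,2), (1,3), (2,3).
dEuclid = pdist(X, 'euclidean');

% City-block (Manhattan) distances over the same pairs, for comparison.
dCity = pdist(X, 'cityblock');

% squareform expands the condensed vector into a symmetric distance matrix
% with zeros on the diagonal.
D = squareform(dEuclid);
disp(D)
```

Different metrics can rank the same pairs of units differently, so the choice of measure affects which units end up grouped together.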
The Statistics and Machine Learning Toolbox provides several algorithms to carry out cluster analysis. Available algorithms include:
- k-means
- k-medoids
- Hierarchical clustering
- Gaussian mixture models (GMM)
- Hidden Markov models (HMM)
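The sketch below shows one way the first four algorithms in this list might be called on the same data. The synthetic data set and the choice of `k = 2` are assumptions made for illustration, and hidden Markov models are left out because they apply to sequence data rather than to a plain matrix of observations.

```matlab
rng(1);  % fix the random seed so the run is repeatable

% Synthetic data (an assumption for this example): two well-separated 2-D groups.
X = [randn(50,2); randn(50,2) + 6];
k = 2;   % number of clusters, chosen by hand here

% k-means: each cluster is represented by the mean of its points.
idxKmeans = kmeans(X, k);

% k-medoids: each cluster is represented by one of the observed points.
idxKmedoids = kmedoids(X, k);

% Hierarchical clustering: build the binary tree, then cut it into k clusters.
Z = linkage(X, 'ward');
idxHier = cluster(Z, 'maxclust', k);

% Gaussian mixture model: fit k components, then assign each point
% to the component with the highest posterior probability.
gm = fitgmdist(X, k);
idxGmm = cluster(gm, X);
```

Each call returns a column vector of cluster indices, one per row of `X`. The numeric labels are arbitrary, so cluster 1 from one algorithm need not correspond to cluster 1 from another.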
When the number of clusters is unknown, we can use cluster evaluation techniques to determine the number of clusters present in the data based on a specified metric.
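A minimal sketch of such an evaluation, assuming k-means as the clustering algorithm and the silhouette criterion as the metric, could use the toolbox function `evalclusters`. The synthetic data and the candidate range `2:6` are illustrative choices, not values from the book.

```matlab
rng(1);  % fix the random seed so the run is repeatable

% Synthetic data (an assumption for this example): three 2-D groups.
X = [randn(50,2); ...
     randn(50,2) + 6; ...
     [randn(50,1) + 6, randn(50,1) - 6]];

% Evaluate k-means solutions for 2 to 6 clusters using the silhouette criterion.
eva = evalclusters(X, 'kmeans', 'silhouette', 'KList', 2:6);

% Number of clusters with the best criterion value among the candidates.
bestK = eva.OptimalK;
fprintf('Suggested number of clusters: %d\n', bestK);
```

Other criteria, such as `'CalinskiHarabasz'`, `'DaviesBouldin'`, or `'gap'`, can be passed in place of `'silhouette'` and may suggest a different number of clusters on the same data.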
A typical cluster analysis result is shown in the following figure:

In addition, the Statistics and Machine Learning Toolbox lets us visualize clusters by creating a dendrogram plot that displays a hierarchical binary cluster tree. We can then optimize the leaf order to maximize the sum of the similarities between adjacent leaves. Finally, for grouped data with multiple measurements per group, we can create a dendrogram plot based on the group means computed by a multivariate analysis of variance (MANOVA).
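These three steps can be sketched with the toolbox functions `dendrogram`, `optimalleaforder`, and `manovacluster`. The random data, the `'average'` linkage method, and the use of the `fisheriris` sample data set that ships with the toolbox are assumptions made for this example.

```matlab
rng(1);  % fix the random seed so the run is repeatable

% Synthetic data (an assumption for this example): 20 points in two groups.
X = [randn(10,2); randn(10,2) + 5];

% Build the hierarchical binary cluster tree from pairwise distances.
D = pdist(X);
Z = linkage(D, 'average');

% Reorder the leaves so that adjacent leaves are as similar as possible.
order = optimalleaforder(Z, D);

% Plot the tree; 0 requests all leaves, 'Reorder' applies the optimized order.
figure;
dendrogram(Z, 0, 'Reorder', order);

% Grouped data: run a one-way MANOVA on the iris measurements by species,
% then plot a dendrogram of the group means.
load fisheriris
[~, ~, stats] = manova1(meas, species);
figure;
manovacluster(stats);
```

The first figure shows one leaf per observation, while the second shows one leaf per group, here the three iris species.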