Clustering Time Series with k-Medoids Based Algorithms

Abstract

Time Series Clustering (TSCL) involves grouping unlabelled time series into homogeneous groups. A popular approach to TSCL is to use the partitional clustering algorithms k-means or k-medoids in conjunction with an elastic distance function such as Dynamic Time Warping (DTW). We explore TSCL using nine different elastic distance measures. Both partitional algorithms characterise clusters with an exemplar series, but use different techniques to do so: k-means uses an averaging algorithm to find an exemplar, whereas k-medoids chooses a training case (medoid). Traditionally, the arithmetic mean of a collection of time series was used with k-means. However, this ignores any offset. In 2011, an averaging technique specific to DTW, called DTW Barycentre Averaging (DBA), was proposed. Since, k-means with DBA has been the algorithm of choice for the majority of partition-based TSCL and much of the research using medoids-based approaches for TSCL stopped. We revisit k-medoids based TSCL with a range of elastic distance measures. Our results show k-medoids approaches are significantly better than k-means on a standard test suite, independent of the elastic distance measure used. We also compare the most commonly used alternating k-medoids approach against the Partition Around Medoids (PAM) algorithm. PAM significantly outperforms the default k-medoids for all nine elastic measures used. Additionally, we evaluate six variants of PAM designed to speed up TSCL. Finally, we show PAM with the best elastic distance measure is significantly better than popular alternative TSCL algorithms, including the k-means DBA approach, and competitive with the best deep learning algorithms.

Publication
International Workshop on Advanced Analytics and Learning on Temporal Data