Bayesian Hierarchical Clustering // The method performs bottom-up
hierarchical clustering, using a Dirichlet Process (infinite
mixture) to model uncertainty in the data and Bayesian model
selection to decide at each step which clusters to merge. This
avoids several limitations of traditional methods, for example
having to fix the number of clusters in advance and having to
choose a distance metric without a principled basis. This
implementation accepts
multinomial (i.e. discrete, with two or more categories) or time-series
data. This version also includes a randomised algorithm which
is more efficient for larger data sets.
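
To make the merge rule concrete, here is a minimal Python sketch of greedy bottom-up merging driven by Bayesian model selection, for discrete data only. It is not this package's implementation. It assumes the commonly described BHC recursion (each candidate merge is scored by the posterior probability r that the merged items belong to a single cluster, and the tree is cut where r falls below 0.5), a symmetric Dirichlet prior with parameter beta on every feature, and a Dirichlet Process concentration alpha. All function names (log_evidence, merge, bhc, cut) are hypothetical, and the sketch uses the plain quadratic search over pairs rather than the randomised algorithm.

```python
import math
from itertools import combinations


def log_evidence(counts, beta):
    """Log marginal likelihood of one cluster (assumed Dirichlet-categorical model).

    counts[f][c] is the number of items with category c in feature f.
    """
    total = 0.0
    for row in counts:
        k, n = len(row), sum(row)
        total += math.lgamma(k * beta) - math.lgamma(k * beta + n)
        total += sum(math.lgamma(beta + c) - math.lgamma(beta) for c in row)
    return total


def log_add(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    hi, lo = max(a, b), min(a, b)
    return hi + math.log1p(math.exp(lo - hi))


def merge(a, b, alpha, beta):
    """Score the hypothesis that subtrees a and b form a single cluster."""
    counts = [[x + y for x, y in zip(ra, rb)]
              for ra, rb in zip(a["counts"], b["counts"])]
    members = a["members"] + b["members"]
    log_h1 = log_evidence(counts, beta)                   # one-cluster evidence
    log_ag = math.log(alpha) + math.lgamma(len(members))  # log(alpha * Gamma(n))
    log_dd = a["log_d"] + b["log_d"]
    log_d = log_add(log_ag, log_dd)
    # Evidence of the whole subtree: mixture of "merged" and "kept separate".
    log_tree = log_add(log_ag + log_h1,
                       log_dd + a["log_tree"] + b["log_tree"]) - log_d
    log_r = log_ag + log_h1 - log_d - log_tree            # log posterior of merging
    return {"members": members, "counts": counts, "log_d": log_d,
            "log_tree": log_tree, "log_r": log_r, "children": (a, b)}


def bhc(items, n_categories, alpha=1.0, beta=1.0):
    """Build the full tree by repeatedly merging the pair with the highest r."""
    nodes = []
    for i, item in enumerate(items):
        counts = [[int(c == v) for c in range(n_categories)] for v in item]
        nodes.append({"members": [i], "counts": counts,
                      "log_d": math.log(alpha),
                      "log_tree": log_evidence(counts, beta),
                      "log_r": 0.0, "children": ()})
    while len(nodes) > 1:
        best = max((merge(a, b, alpha, beta) for a, b in combinations(nodes, 2)),
                   key=lambda node: node["log_r"])
        left, right = best["children"]
        nodes = [n for n in nodes if n is not left and n is not right]
        nodes.append(best)
    return nodes[0]


def cut(node, threshold=math.log(0.5)):
    """Split the tree top-down wherever the merge probability is below 0.5."""
    if not node["children"] or node["log_r"] >= threshold:
        return [sorted(node["members"])]
    left, right = node["children"]
    return cut(left, threshold) + cut(right, threshold)


# Toy data: four all-zero items and four all-one items, six binary features.
items = [[0] * 6] * 4 + [[1] * 6] * 4
root = bhc(items, n_categories=2)
clusters = cut(root)
print(clusters)
```

On this toy input the cut separates the all-zero items from the all-one items, and no cluster count or distance metric is supplied anywhere: the number of clusters falls out of the model comparison at each merge. The search over all pairs at every step is quadratic in the number of items, which is the cost that a randomised variant would aim to reduce.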