msmbuilder.clustering._clarans

msmbuilder.clustering._clarans(metric, ptraj, k, num_local_minima, max_neighbors, local_swap=True, initial_medoids='kcenters', initial_assignments=None, initial_distance=None, verbose=True)

Run the CLARANS clustering algorithm on the frames in a trajectory.

Parameters:

metric : msmbuilder.metrics.AbstractDistanceMetric

A metric capable of handling ptraj

ptraj : prepared trajectory

ptraj returned by the action of the preceding metric on an msmbuilder trajectory

k : int

number of desired clusters

num_local_minima : int

number of local minima to identify in the set of all possible clusterings. Execution time scales linearly with this parameter. The best of these local minima is returned.

max_neighbors : int

number of consecutive rejected swaps required to declare a proposed clustering a local minimum

local_swap : bool, optional

If true, proposed swaps will be between a medoid and a data point currently assigned to that medoid. If false, the data point for the proposed swap is selected randomly.

initial_medoids : {‘kcenters’, ‘random’, ndarray}, optional

If ‘kcenters’, run kcenters clustering first to get the initial medoids, and then run the swaps to improve it. If ‘random’, select the medoids at random. Otherwise, initial_medoids should be a numpy array of the indices of the medoids.

initial_assignments : {None, ndarray}, optional

If None, initial_assignments will be computed based on the initial_medoids. If you pass in your own initial_medoids, you can also pass in initial_assignments to avoid recomputing them.

initial_distance : {None, ndarray}, optional

If None, the initial distances will be computed based on the initial_medoids. If you pass in your own initial_medoids, you can also pass in initial_distance to avoid recomputing them.

verbose : bool, optional

Print information about the swaps being attempted
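
The parameters above can be read against a minimal sketch of a CLARANS-style search. This is not msmbuilder's implementation. It works on a precomputed NumPy distance matrix rather than a metric and ptraj, it always starts from random medoids, and it assumes the clustering cost is the sum of frame-to-medoid distances, which the documentation above does not state. It also assumes that assignments store the frame index of each frame's medoid. The names `clarans_sketch` and `assign` are hypothetical.

```python
import numpy as np

def assign(dist, medoids):
    """Assign every point to its nearest medoid (hypothetical helper)."""
    d = dist[:, medoids]                      # shape (n_points, k)
    nearest = np.argmin(d, axis=1)            # column of the closest medoid
    return medoids[nearest], d[np.arange(len(dist)), nearest]

def clarans_sketch(dist, k, num_local_minima, max_neighbors,
                   local_swap=True, seed=0):
    """Illustrative CLARANS-style search; assumed cost = sum of distances."""
    rng = np.random.default_rng(seed)
    n = len(dist)
    best = None
    for _ in range(num_local_minima):         # one restart per local minimum
        medoids = rng.choice(n, size=k, replace=False)
        assignments, distances = assign(dist, medoids)
        cost = distances.sum()
        rejected = 0
        while rejected < max_neighbors:       # stop after this many failures in a row
            i = rng.integers(k)               # medoid proposed for replacement
            if local_swap:
                # only points currently assigned to this medoid
                candidates = np.flatnonzero(assignments == medoids[i])
            else:
                # any point in the data set
                candidates = np.arange(n)
            candidates = np.setdiff1d(candidates, medoids)
            if candidates.size == 0:
                rejected += 1
                continue
            trial = medoids.copy()
            trial[i] = rng.choice(candidates)
            t_assign, t_dist = assign(dist, trial)
            t_cost = t_dist.sum()
            if t_cost < cost:                 # accept only strict improvements
                medoids, assignments, distances, cost = trial, t_assign, t_dist, t_cost
                rejected = 0
            else:
                rejected += 1
        if best is None or cost < best[3]:    # keep the best local minimum found
            best = (medoids, assignments, distances, cost)
    return best[:3]

# Usage on synthetic 2D points (illustration data only).
rng = np.random.default_rng(1)
points = np.concatenate([rng.normal(0.0, 0.1, (20, 2)),
                         rng.normal(5.0, 0.1, (20, 2))])
dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
medoids, assignments, distances = clarans_sketch(
    dist, k=2, num_local_minima=3, max_neighbors=20)
```

In this sketch the outer loop runs once per requested local minimum, which is why run time grows linearly with num_local_minima, and the inner loop ends once max_neighbors proposals in a row fail to lower the cost.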

Returns:

generator_indices : ndarray

indices (with respect to ptraj) of the frames to be considered cluster centers

assignments : ndarray

the cluster center to which each frame is assigned (1D)

distances : ndarray

distance from each of the frames to the cluster center it was assigned to
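
As a rough illustration of how the three returned arrays fit together, the sketch below summarizes a clustering with plain NumPy. The array values are hypothetical stand-ins, not real output from `_clarans`. Whether `assignments` holds frame indices or cluster numbers is not stated above, so the code relabels whatever identifiers it finds.

```python
import numpy as np

# Hypothetical stand-ins for the returned arrays (6 frames, 2 clusters).
generator_indices = np.array([1, 4])
assignments = np.array([1, 1, 1, 4, 4, 4])
distances = np.array([0.3, 0.0, 0.2, 0.5, 0.0, 0.1])

# Map the identifiers found in `assignments` to labels 0..k-1.
centers, labels = np.unique(assignments, return_inverse=True)

# Number of frames in each cluster.
cluster_sizes = np.bincount(labels)

# Mean distance from the frames of each cluster to their center.
mean_distance = np.bincount(labels, weights=distances) / cluster_sizes

# Frame that lies farthest from its assigned center.
worst_frame = int(np.argmax(distances))
```

Summaries like these give a quick check on whether k is large enough: a cluster with a large mean distance, or a single frame far from its center, suggests a poorly covered region.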