msmbuilder.clustering._clarans
- msmbuilder.clustering._clarans(metric, ptraj, k, num_local_minima, max_neighbors, local_swap=True, initial_medoids='kcenters', initial_assignments=None, initial_distance=None, verbose=True)
Run the CLARANS clustering algorithm on the frames in a trajectory.
Parameters: metric : msmbuilder.metrics.AbstractDistanceMetric
A metric capable of handling ptraj
ptraj : prepared trajectory
The prepared trajectory returned by applying the preceding metric to an msmbuilder trajectory
k : int
number of desired clusters
num_local_minima : int
number of local minima in the set of all possible clusterings to identify. Execution time will scale linearly with this parameter. The best of these local minima will be returned.
max_neighbors : int
number of consecutive rejected swaps required to declare a proposed clustering a local minimum
local_swap : bool, optional
If true, proposed swaps will be between a medoid and a data point currently assigned to that medoid. If false, the data point for the proposed swap is selected randomly.
initial_medoids : {‘kcenters’, ‘random’, ndarray}, optional
If ‘kcenters’, run kcenters clustering first to get the initial medoids, and then run the swaps to improve it. If ‘random’, select the medoids at random. Otherwise, initial_medoids should be a numpy array of the indices of the medoids.
initial_assignments : {None, ndarray}, optional
If None, initial_assignments will be computed based on the initial_medoids. If you pass in your own initial_medoids, you can also pass in initial_assignments to avoid recomputing them.
initial_distance : {None, ndarray}, optional
If None, initial_distance will be computed based on the initial_medoids. If you pass in your own initial_medoids, you can also pass in initial_distance to avoid recomputing it.
verbose : bool, optional
Print information about the swaps being attempted
Returns: generator_indices : ndarray
indices (with respect to ptraj) of the frames to be considered cluster centers
assignments : ndarray
the cluster center to which each frame is assigned (1D)
distances : ndarray
distance from each of the frames to the cluster center it was assigned to
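Example: the interplay of num_local_minima, max_neighbors, and local_swap can be easier to follow in code. The block below is a minimal, self-contained sketch of a CLARANS-style search on 1-D integer points. It is not the msmbuilder implementation. The function name clarans_sketch, the absolute-difference distance, the sum-of-distances cost, the random initialization, and the choice to store each frame's medoid as a frame index are all assumptions made for illustration.

```python
import random


def clarans_sketch(points, k, num_local_minima, max_neighbors,
                   local_swap=True, seed=0):
    """Toy CLARANS-style search (hypothetical helper, not msmbuilder code)."""
    rng = random.Random(seed)
    n = len(points)

    def dist(i, j):
        # Assumed metric: absolute difference between 1-D points.
        return abs(points[i] - points[j])

    def assign(medoids):
        # For every frame, find the nearest medoid and the distance to it.
        assignments, distances = [], []
        for i in range(n):
            d, m = min((dist(i, m), m) for m in medoids)
            assignments.append(m)
            distances.append(d)
        return assignments, distances

    best = None
    for _ in range(num_local_minima):
        # Assumed 'random' initialization: k distinct frames as medoids.
        medoids = rng.sample(range(n), k)
        assignments, distances = assign(medoids)
        cost = sum(distances)

        rejected = 0
        while rejected < max_neighbors:
            slot = rng.randrange(k)
            old = medoids[slot]
            if local_swap:
                # Propose only non-medoid frames currently assigned to `old`.
                candidates = [i for i in range(n)
                              if assignments[i] == old and i not in medoids]
            else:
                # Propose any non-medoid frame.
                candidates = [i for i in range(n) if i not in medoids]
            if not candidates:
                rejected += 1
                continue

            trial = list(medoids)
            trial[slot] = rng.choice(candidates)
            t_assign, t_dist = assign(trial)
            t_cost = sum(t_dist)
            if t_cost < cost:
                # Accept the swap and reset the rejection counter.
                medoids, assignments, distances = trial, t_assign, t_dist
                cost = t_cost
                rejected = 0
            else:
                rejected += 1

        # Keep the best local minimum found so far.
        if best is None or cost < best[0]:
            best = (cost, medoids, assignments, distances)

    _, medoids, assignments, distances = best
    return medoids, assignments, distances


points = [0, 1, 2, 10, 11, 12]
generator_indices, assignments, distances = clarans_sketch(
    points, k=2, num_local_minima=3, max_neighbors=50)
print(sorted(generator_indices), assignments, distances)
```

In this sketch each restart ends only after max_neighbors proposals in a row fail to lower the cost, so a larger max_neighbors makes it less likely that a restart stops early, at the price of more distance evaluations. The outer loop runs num_local_minima times, which is why run time grows linearly with that parameter.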