msmbuilder.clustering.SubsampledClarans.__init__

SubsampledClarans.__init__(metric, trajectories=None, prep_trajectories=None, k=None, num_samples=None, shrink_multiple=None, num_local_minima=10, max_neighbors=20, local_swap=False, parallel=None)

Run the CLARANS algorithm (see the Clarans class for more description) on multiple subsamples of the data drawn randomly.

Parameters:

metric : msmbuilder.metrics.AbstractDistanceMetric

A distance metric capable of handling prepared trajectories (ptraj)

trajectories : Trajectory or list of msmbuilder.Trajectory

data to cluster

prep_trajectories : np.ndarray or None

prepared trajectories, supplied instead of msmbuilder.Trajectory objects

k : int

number of desired clusters

num_samples : int

number of random subsamples to draw

shrink_multiple : int

Each subsample drawn will have a size equal to the total number of frames divided by this number.

num_local_minima : int, optional

number of local minima in the set of all possible clusterings to identify. Execution time will scale linearly with this parameter. The best of these local minima will be returned.

max_neighbors : int, optional

number of consecutive rejected swaps required to declare a proposed clustering a local minimum

local_swap : bool, optional

If true, proposed swaps will be between a medoid and a data point currently assigned to that medoid. If false, the data point for the proposed swap is selected randomly.

parallel : {None, ‘multiprocessing’, ‘dtm’}

Which parallelization library to use. Each of the random subsamples is run independently.
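To make the roles of num_samples, shrink_multiple and local_swap concrete, here is a minimal standard-library sketch. It is not msmbuilder code: the helpers draw_subsamples and propose_swap_point are hypothetical, and integer division, sampling without replacement, and excluding the medoid itself from the candidates are all assumptions rather than documented behavior.

```python
import random


def draw_subsamples(n_frames, num_samples, shrink_multiple, seed=0):
    """Hypothetical helper: draw `num_samples` random subsamples of frame indices.

    Each subsample has n_frames // shrink_multiple frames (integer division
    and sampling without replacement are assumptions of this sketch).
    """
    rng = random.Random(seed)
    subsample_size = n_frames // shrink_multiple
    return [
        sorted(rng.sample(range(n_frames), subsample_size))
        for _ in range(num_samples)
    ]


def propose_swap_point(assignments, medoid, local_swap, rng):
    """Hypothetical helper: pick the data point for a proposed swap with `medoid`.

    assignments[i] is the index of the medoid that frame i is assigned to.
    With local_swap=True only frames assigned to `medoid` are candidates;
    otherwise any frame other than the medoid itself may be chosen.
    """
    if local_swap:
        candidates = [
            i for i, m in enumerate(assignments) if m == medoid and i != medoid
        ]
    else:
        candidates = [i for i in range(len(assignments)) if i != medoid]
    return rng.choice(candidates)


subsamples = draw_subsamples(n_frames=1000, num_samples=5, shrink_multiple=10)
print(len(subsamples), len(subsamples[0]))  # 5 100
```

Under these assumptions, 1000 frames with shrink_multiple=10 give five independent subsamples of 100 frames each. Because the subsamples do not depend on one another, each could be clustered in a separate worker, which is the property the parallel option relies on.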