Clustering: msmbuilder.clustering
MSMBuilder clusters MD trajectories to discretize phase space. Several clustering algorithms are provided, and each can be combined with a variety of metrics, giving a large set of possible discretizations.
Currently, the following clustering algorithms are available:
KCenters, HybridKMedoids, Clarans, SubsampledClarans, Hierarchical
Abstract Classes
- class msmbuilder.clustering.BaseFlatClusterer(metric, trajectories=None, prep_trajectories=None)
(Abstract) base class / mixin that Clusterers can extend. Provides convenience functions for the user.
To implement a clusterer using this base class, subclass it and define an __init__ method that performs the clustering, then store the results in self._generator_indices, self._assignments, and self._distances.
For convenience (and to enable some of its functionality), let BaseFlatClusterer prepare the trajectories for you: call BaseFlatClusterer’s __init__ method, then run your clustering on the prepared, concatenated trajectory self.ptraj.
BaseFlatClusterer.get_distances() | Extract the distance from each frame to its assigned cluster kcenter |
BaseFlatClusterer.get_assignments() | Assign the trajectories you passed into the constructor based on |
BaseFlatClusterer.get_generators_as_traj() | Get a trajectory containing the generators |
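The subclassing recipe described above can be sketched roughly as follows. This is a minimal illustration, not msmbuilder code: `StandInBaseFlatClusterer` is a hypothetical stand-in that only mimics the behaviour described here (it concatenates the input and exposes it as `self.ptraj`), and the toy one-dimensional "frames", the `abs_distance` metric, and the `FirstKClusterer` class are all assumptions. Only the attribute names `_generator_indices`, `_assignments`, `_distances`, and `ptraj` come from the text.

```python
class StandInBaseFlatClusterer:
    """Hypothetical stand-in for BaseFlatClusterer (NOT the msmbuilder class)."""

    def __init__(self, metric, trajectories):
        self.metric = metric
        # Concatenate all trajectories into one long list of frames.
        self.ptraj = [frame for traj in trajectories for frame in traj]

    def get_assignments(self):
        return self._assignments

    def get_distances(self):
        return self._distances


class FirstKClusterer(StandInBaseFlatClusterer):
    """Toy clusterer (assumption): the first k frames act as generators."""

    def __init__(self, metric, trajectories, k):
        # Let the base class prepare and concatenate the trajectories.
        super().__init__(metric, trajectories)
        generators = list(range(k))
        assignments, distances = [], []
        for frame in self.ptraj:
            # Distance from this frame to every generator frame.
            d = [metric(frame, self.ptraj[g]) for g in generators]
            nearest = min(range(k), key=d.__getitem__)
            assignments.append(generators[nearest])
            distances.append(d[nearest])
        # Store the results in the three attributes named in the recipe.
        self._generator_indices = generators
        self._assignments = assignments
        self._distances = distances


def abs_distance(a, b):
    """Toy metric (assumption): absolute difference of two numbers."""
    return abs(a - b)


clusterer = FirstKClusterer(abs_distance, [[0.0, 1.0, 0.2], [0.9, 1.1]], k=2)
print(clusterer.get_assignments())
```

The point of the pattern is that the subclass only does the clustering itself; the base class owns trajectory preparation and the accessor methods.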
Flat Clustering Classes
- class msmbuilder.clustering.KCenters(metric, trajectories=None, prep_trajectories=None, k=None, distance_cutoff=None, seed=0)
Bases: msmbuilder.clustering.BaseFlatClusterer
KCenters.__init__(metric[, trajectories, ...]) | Run kcenters clustering algorithm. |
KCenters.get_distances() | Extract the distance from each frame to its assigned cluster kcenter |
KCenters.get_assignments() | Assign the trajectories you passed into the constructor based on |
KCenters.get_generators_as_traj() | Get a trajectory containing the generators |
- class msmbuilder.clustering.HybridKMedoids(metric, trajectories=None, prep_trajectories=None, k=None, distance_cutoff=None, local_num_iters=10, global_num_iters=0, norm_exponent=2.0, too_close_cutoff=0.0001, ignore_max_objective=False)
Bases: msmbuilder.clustering.BaseFlatClusterer
HybridKMedoids.__init__(metric[, ...]) | Run the hybrid kmedoids clustering algorithm on a set of trajectories |
HybridKMedoids.get_distances() | Extract the distance from each frame to its assigned cluster kcenter |
HybridKMedoids.get_assignments() | Assign the trajectories you passed into the constructor based on |
HybridKMedoids.get_generators_as_traj() | Get a trajectory containing the generators |
- class msmbuilder.clustering.Clarans(metric, trajectories=None, prep_trajectories=None, k=None, num_local_minima=10, max_neighbors=20, local_swap=False)
Bases: msmbuilder.clustering.BaseFlatClusterer
Clarans.__init__(metric[, trajectories, ...]) | Run the CLARANS clustering algorithm on the frames in a trajectory |
Clarans.get_distances() | Extract the distance from each frame to its assigned cluster kcenter |
Clarans.get_assignments() | Assign the trajectories you passed into the constructor based on |
Clarans.get_generators_as_traj() | Get a trajectory containing the generators |
- class msmbuilder.clustering.SubsampledClarans(metric, trajectories=None, prep_trajectories=None, k=None, num_samples=None, shrink_multiple=None, num_local_minima=10, max_neighbors=20, local_swap=False, parallel=None)
Bases: msmbuilder.clustering.BaseFlatClusterer
SubsampledClarans.__init__(metric[, ...]) | Run the CLARANS algorithm (see the Clarans class for more description) on |
SubsampledClarans.get_distances() | Extract the distance from each frame to its assigned cluster kcenter |
SubsampledClarans.get_assignments() | Assign the trajectories you passed into the constructor based on |
SubsampledClarans.get_generators_as_traj() | Get a trajectory containing the generators |
Hierarchical Clustering
- class msmbuilder.clustering.Hierarchical(metric, trajectories, method='single', precomputed_values=None)
Hierarchical.get_assignments([k, ...]) | Assign the frames into clusters. |
Hierarchical.load_from_disk(filename) | Load up a clusterer from disk |
Hierarchical.save_to_disk(filename) | Save this clusterer to disk. |
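To give a feel for what assigning frames into k clusters with `method='single'` could mean, here is a small pure-Python sketch of generic single-linkage agglomeration. It is an illustration under stated assumptions, not the Hierarchical class's implementation: the function name `single_linkage_sketch`, the toy one-dimensional frames, and the choice to stop merging once k clusters remain are all hypothetical.

```python
def single_linkage_sketch(distance, frames, k):
    """Merge the two closest clusters until only k remain; return one label per frame."""
    if k < 1:
        raise ValueError("k must be at least 1")
    # Start with every frame in its own cluster.
    clusters = [[i] for i in range(len(frames))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: cluster distance is the closest pair of members.
                d = min(distance(frames[i], frames[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        # Merge cluster b into cluster a.
        clusters[a].extend(clusters[b])
        del clusters[b]
    labels = [None] * len(frames)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


toy_frames = [0.0, 0.1, 5.0, 5.2, 10.0]
labels = single_linkage_sketch(lambda a, b: abs(a - b), toy_frames, k=2)
print(labels)
```

This brute-force version recomputes cluster distances at every merge, so it is only suitable for a handful of frames; it is meant to show the idea, not to be efficient.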
Clustering Functions
_kcenters(metric, ptraj[, k, ...]) | Run kcenters clustering algorithm. |
_hybrid_kmedoids(metric, ptraj[, k, ...]) | Run the hybrid kmedoids clustering algorithm to cluster a trajectory |
_clarans(metric, ptraj, k, num_local_minima, ...) | Run the CLARANS clustering algorithm on the frames in a trajectory |
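As a rough illustration of the k-centers idea, the sketch below implements the generic farthest-point heuristic in pure Python. It is not the `_kcenters` function itself. The name `kcenters_sketch`, the toy frames, and the way `k`, `distance_cutoff`, and `seed` are interpreted (a maximum number of centers, a stopping distance, and the index of the first center) are assumptions made for this example.

```python
def kcenters_sketch(distance, frames, k=None, distance_cutoff=None, seed=0):
    """Farthest-point k-centers over a list of frames.

    Returns (generator_indices, assignments, distances), where
    assignments[i] is the index of the center frame that frame i belongs to.
    """
    if k is None and distance_cutoff is None:
        raise ValueError("supply k, distance_cutoff, or both")
    # Assumption: the first center is the frame at index `seed`.
    generator_indices = [seed]
    distances = [distance(f, frames[seed]) for f in frames]
    assignments = [seed] * len(frames)

    while k is None or len(generator_indices) < k:
        # The next center is the frame farthest from all current centers.
        new_center = max(range(len(frames)), key=distances.__getitem__)
        if distances[new_center] == 0:
            break  # every frame already coincides with a center
        if distance_cutoff is not None and distances[new_center] <= distance_cutoff:
            break  # every frame is within the cutoff of some center
        generator_indices.append(new_center)
        # Reassign frames that are closer to the new center.
        for i, frame in enumerate(frames):
            d = distance(frame, frames[new_center])
            if d < distances[i]:
                distances[i] = d
                assignments[i] = new_center
    return generator_indices, assignments, distances


toy_frames = [0.0, 0.1, 5.0, 5.2, 10.0]
generators, assignments, dists = kcenters_sketch(
    lambda a, b: abs(a - b), toy_frames, k=3)
print(generators, assignments)
```

Each new center is the frame currently worst served by the existing centers, which is why the heuristic tends to spread centers evenly over the sampled space.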
Utility Functions
_assign(metric, ptraj, generator_indices) | Assign the frames in ptraj to the centers with indices generator_indices |
concatenate_trajectories(trajectories) | Concatenate a list of trajectories into a single long trajectory |
unconcatenate_trajectory(trajectory, lengths) | Take a single trajectory that was created by concatenating separate trajectories and unconcatenate it, returning the original trajectories. |
split(longlist, lengths) | Split a long list into segments |
stochastic_subsample(trajectories, ...) | Randomly subsample from a trajectory |
deterministic_subsample(trajectories, stride) | Given a list of trajectories, return a single trajectory |
empty_trajectory_like(traj) | Get a trajectory with the right metadata, but no xyz coordinates |
p_norm(data[, p]) | p_norm of an ndarray with XYZ coordinates |
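The concatenate and split helpers listed above are inverses of each other in spirit. The following sketch shows that round trip on plain Python lists. The function names `concatenate_sketch` and `split_sketch` are hypothetical stand-ins, the frames are placeholder strings, and returning the lengths from the concatenation step is an assumption made so the example is self-contained; the real functions operate on trajectory objects.

```python
def concatenate_sketch(trajectories):
    """Join a list of frame lists into one long list, remembering each length."""
    lengths = [len(traj) for traj in trajectories]
    longlist = [frame for traj in trajectories for frame in traj]
    return longlist, lengths


def split_sketch(longlist, lengths):
    """Cut longlist back into consecutive segments of the given lengths."""
    if sum(lengths) != len(longlist):
        raise ValueError("lengths must sum to len(longlist)")
    segments, start = [], 0
    for n in lengths:
        segments.append(longlist[start:start + n])
        start += n
    return segments


trajectories = [["a0", "a1", "a2"], ["b0"], ["c0", "c1"]]
longlist, lengths = concatenate_sketch(trajectories)
restored = split_sketch(longlist, lengths)
print(restored == trajectories)
```

Keeping the per-trajectory lengths is what makes the concatenation reversible, which matters when per-frame results computed on the long trajectory need to be mapped back to the original trajectories.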