multipers.ml package
Submodules
multipers.ml.accuracies module
multipers.ml.invariants_with_persistable module
multipers.ml.kernels module
- class multipers.ml.kernels.DistanceList2DistanceMatrix
Bases:
BaseEstimator, TransformerMixin
- _sklearn_auto_wrap_output_keys = {'transform'}
- fit(X, y=None)
- transform(X)
- class multipers.ml.kernels.DistanceMatrices2DistancesList
Bases:
BaseEstimator, TransformerMixin
Input: (degree) x (distance matrix), or (axis) x (degree) x (distance matrix D)
Output: (D1) x opt (axis) x (degree) x (D2), with indices first
- _sklearn_auto_wrap_output_keys = {'transform'}
- fit(X, y=None)
- predict(X)
- transform(X)
- class multipers.ml.kernels.DistanceMatrix2DistanceList
Bases:
BaseEstimator, TransformerMixin
- _sklearn_auto_wrap_output_keys = {'transform'}
- fit(X, y=None)
- transform(X)
- class multipers.ml.kernels.DistanceMatrix2Kernel(sigma=1, axis=None, weights=1)
Bases:
BaseEstimator, TransformerMixin
Input: (degree) x (distance matrix), or (axis) x (degree) x (distance matrix). In the second case, axis has to be specified (meant for cross-validation).
Output: kernel of the same shape as the distance matrix.
- Parameters:
sigma (float | Iterable[float])
axis (int | None)
weights (Iterable[float] | float)
- _sklearn_auto_wrap_output_keys = {'transform'}
- fit(X, y=None)
- transform(X)
- Return type:
ndarray
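The mapping from a distance matrix to a kernel of the same shape can be sketched as follows. This is a minimal illustration in plain Python, assuming a Gaussian-type formula exp(-d / (2 * sigma**2)); the formula actually used by DistanceMatrix2Kernel is not stated on this page, and the helper name `distance_matrix_to_kernel` is hypothetical.

```python
import math


def distance_matrix_to_kernel(D, sigma=1.0):
    """Turn a square distance matrix into a similarity (kernel) matrix.

    Assumption: the exp(-d / (2 * sigma**2)) form is illustrative only.
    """
    scale = 2.0 * sigma ** 2
    return [[math.exp(-d / scale) for d in row] for row in D]


# A small symmetric distance matrix with a zero diagonal.
D = [
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 1.5],
    [2.0, 1.5, 0.0],
]
K = distance_matrix_to_kernel(D, sigma=1.0)
```

The output has the same shape as the input, a unit diagonal, and stays symmetric, which is what a downstream kernel method with a precomputed kernel would expect.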
- class multipers.ml.kernels.DistancesLists2DistanceMatrices
Bases:
BaseEstimator, TransformerMixin
Input: (D1) x opt (axis) x (degree) x (D2, with indices first)
Output: opt (axis) x (degree) x (distance matrix (D1, D2))
- _sklearn_auto_wrap_output_keys = {'transform'}
- fit(X, y=None)
- Parameters:
X (ndarray)
- transform(X)
multipers.ml.mma module
- class multipers.ml.mma.FilteredComplex2MMA(n_jobs=-1, expand_dim=None, prune_degrees_above=None, progress=False, minpres_degrees=None, plot=False, **persistence_kwargs)
Bases:
BaseEstimator, TransformerMixin
Turns a list of lists of simplextrees or slicers into MMA approximations.
- Parameters:
n_jobs (int)
expand_dim (int | None)
prune_degrees_above (int | None)
minpres_degrees (Iterable[int] | None)
plot (bool)
- _infer_bounding_box(X)
- _input_checks(X)
- static _is_filtered_complex(input)
- _sklearn_auto_wrap_output_keys = {'transform'}
- fit(X, y=None)
- transform(X)
- class multipers.ml.mma.MMA2IMG(degrees, bandwidth=0.1, power=1, normalize=False, resolution=50, plot=False, box=None, n_jobs=-1, flatten=False, progress=False, grid_strategy='regular', kernel='linear', signed=False)
Bases:
BaseEstimator, TransformerMixin
- Parameters:
degrees (list)
bandwidth (float)
power (float)
normalize (bool)
resolution (list | int)
plot (bool)
signed (bool)
- _sklearn_auto_wrap_output_keys = {'transform'}
- fit(X, y=None)
- transform(X)
- class multipers.ml.mma.MMA2Landscape(resolution=[100, 100], degrees=[0, 1], ks=range(0, 5), phi=<function sum>, box=None, plot=False, n_jobs=-1, filtration_quantile=0.01)
Bases:
BaseEstimator, TransformerMixin
Turns a list of MMA approximations into landscape vectorisations.
- Parameters:
degrees (list[int] | None)
ks (Iterable[int])
phi (Callable)
plot (bool)
filtration_quantile (float)
- _sklearn_auto_wrap_output_keys = {'transform'}
- fit(X, y=None)
- transform(X)
- Return type:
list[ndarray]
- class multipers.ml.mma.MMAFormatter(degrees=None, axis=None, verbose=False, normalize=False, weights=None, quantiles=None, dump=False, from_dump=False)
Bases:
BaseEstimator, TransformerMixin
- Parameters:
degrees (list[int] | None)
verbose (bool)
normalize (bool)
- static _get_module_bound(x, degree)
Output format : (2,num_parameters)
- static _infer_axis(X)
- static _infer_bounds(X, degrees=None, axis=[slice(None, None, None)], quantiles=None)
Compute bounds of filtration values of a list of modules.
Output Format
m,M of shape : (num_axis,num_degrees,2,num_parameters)
- _infer_degrees(X)
- static _infer_grid(X, strategy, resolution, degrees=None)
Given a list of PyModules, computes a multiparameter discrete grid, with a given strategy, from the filtration values of the summands of the modules.
- Parameters:
X (List[PyModule_f64 | PyModule_f32 | PyModule_i32 | PyModule_i64])
strategy (str)
resolution (int)
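The idea of building a discrete grid from filtration values with a given strategy can be sketched as follows. This is a plain-Python illustration, not the code of _infer_grid: the input is assumed to be one list of filtration values per parameter (the real method takes PyModules), only the 'regular' and 'quantile' strategies are shown, and all helper names are hypothetical.

```python
def regular_grid(values, resolution):
    """Evenly spaced points between the smallest and largest value."""
    lo, hi = min(values), max(values)
    if resolution == 1:
        return [lo]
    step = (hi - lo) / (resolution - 1)
    return [lo + i * step for i in range(resolution)]


def quantile_grid(values, resolution):
    """Evenly spaced ranks in the sorted values (nearest-rank quantiles)."""
    ordered = sorted(values)
    n = len(ordered)
    if resolution == 1:
        return [ordered[0]]
    return [ordered[round(i * (n - 1) / (resolution - 1))] for i in range(resolution)]


def infer_grid(filtration_values, strategy, resolution):
    """Build one 1D grid per parameter; their product is the multiparameter grid."""
    make = {"regular": regular_grid, "quantile": quantile_grid}[strategy]
    return [make(values, resolution) for values in filtration_values]


# Two parameters; the first has an outlier at 4.0.
values = [[0.0, 0.1, 0.2, 0.3, 4.0], [1.0, 2.0, 3.0, 4.0, 5.0]]
reg = infer_grid(values, "regular", 3)
qua = infer_grid(values, "quantile", 3)
```

On the first parameter the regular grid spends a point on the empty region around 2.0, while the quantile grid follows the data, which is the usual reason to prefer a quantile strategy for skewed filtration values.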
- static _infer_num_parameters(X, ax=slice(None, None, None))
- static _maybe_from_dump(X_in)
- _sklearn_auto_wrap_output_keys = {'transform'}
- static copy_transform(mod, degrees, translation, rescale_factors, new_box)
- fit(X_in, y=None)
- set_fit_request(*, X_in='$UNCHANGED$')
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- X_in: str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for the X_in parameter in fit.
- self: object
The updated object.
- Parameters:
self (MMAFormatter)
X_in (bool | None | str)
- Return type:
MMAFormatter
- set_transform_request(*, X_in='$UNCHANGED$')
Configure whether metadata should be requested to be passed to the transform method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to transform.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- X_in: str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for the X_in parameter in transform.
- self: object
The updated object.
- Parameters:
self (MMAFormatter)
X_in (bool | None | str)
- Return type:
MMAFormatter
- transform(X_in)
- class multipers.ml.mma.SimplexTree2MMA(n_jobs=-1, expand_dim=None, prune_degrees_above=None, progress=False, minpres_degrees=None, **persistence_kwargs)
Bases:
FilteredComplex2MMA
- Parameters:
n_jobs (int)
expand_dim (int | None)
prune_degrees_above (int | None)
minpres_degrees (Iterable[int] | None)
- _sklearn_auto_wrap_output_keys = {'transform'}
multipers.ml.one module
multipers.ml.point_clouds module
- class multipers.ml.point_clouds.PointCloud2FilteredComplex(bandwidths=[], masses=[], threshold=-inf, complex='rips', sparse=None, num_collapses=-2, kernel='gaussian', log_density=True, expand_dim=1, progress=False, n_jobs=None, fit_fraction=1, verbose=False, safe_conversion=False, output_type=None, reduce_degrees=None)
Bases:
BaseEstimator, TransformerMixin
- Parameters:
threshold (float)
complex (Literal['alpha', 'rips', 'delaunay'])
sparse (float | None)
num_collapses (int)
kernel (Literal['gaussian', 'exponential', 'exponential_kernel', 'multivariate_gaussian', 'sinc'] | ~collections.abc.Callable)
log_density (bool)
expand_dim (int)
progress (bool)
n_jobs (int | None)
fit_fraction (float)
verbose (bool)
safe_conversion (bool)
output_type (Literal['slicer', 'simplextree', 'slicer_vine', 'slicer_novine'] | None)
reduce_degrees (Iterable[int] | None)
- _define_bandwidths(X)
- _define_sts()
- _get_codensities(x_fit, x_sample)
- _get_distance_quantiles_and_threshold(X, qs)
- _get_sts_alpha(x, return_alpha=False)
- Parameters:
x (ndarray)
- _get_sts_delaunay(x)
- Parameters:
x (ndarray)
- _get_sts_rips(x)
- _sklearn_auto_wrap_output_keys = {'transform'}
- fit(X, y=None)
- Parameters:
X (ndarray | list)
- transform(X)
- class multipers.ml.point_clouds.PointCloud2SimplexTree(bandwidths=[], masses=[], threshold=inf, complex='rips', sparse=None, num_collapses=-2, kernel='gaussian', log_density=True, expand_dim=1, progress=False, n_jobs=None, fit_fraction=1, verbose=False, safe_conversion=False, output_type=None, reduce_degrees=None)
Bases:
PointCloud2FilteredComplex
- Parameters:
threshold (float)
complex (Literal['alpha', 'rips', 'delaunay'])
sparse (float | None)
num_collapses (int)
kernel (Literal['gaussian', 'exponential', 'exponential_kernel', 'multivariate_gaussian', 'sinc'] | ~collections.abc.Callable)
log_density (bool)
expand_dim (int)
progress (bool)
n_jobs (int | None)
fit_fraction (float)
verbose (bool)
safe_conversion (bool)
output_type (Literal['slicer', 'simplextree', 'slicer_vine', 'slicer_novine'] | None)
reduce_degrees (Iterable[int] | None)
- _sklearn_auto_wrap_output_keys = {'transform'}
multipers.ml.signed_measures module
- class multipers.ml.signed_measures.DegreeRips2SignedMeasure(degrees, min_rips_value, max_rips_value, max_normalized_degree, min_normalized_degree, grid_granularity, progress=False, n_jobs=1, sparse=False, _möbius_inversion=True, fit_fraction=1)
Bases:
BaseEstimator, TransformerMixin
- Parameters:
degrees (Iterable[int])
min_rips_value (float)
max_normalized_degree (float)
min_normalized_degree (float)
grid_granularity (int)
progress (bool)
sparse (bool)
- _sklearn_auto_wrap_output_keys = {'transform'}
- _transform1(data)
- Parameters:
data (ndarray)
- fit(X, y=None)
- Parameters:
X (ndarray | list)
- transform(X)
- class multipers.ml.signed_measures.FilteredComplex2SignedMeasure(degrees=[], rank_degrees=[], filtration_grid=None, progress=False, num_collapses=0, n_jobs=None, resolution=None, plot=False, filtration_quantile=0.0, expand=False, normalize_filtrations=False, grid_strategy='exact', seed=0, fit_fraction=1, out_resolution=None, individual_grid=None, enforce_null_mass=False, flatten=True, backend=None)
Bases:
BaseEstimator, TransformerMixin
Input
Iterable[SimplexTreeMulti]
Output
Iterable[ list[signed_measure for degree] ]
- signed measure is either
(points : (n x num_parameters) array, weights : (n) int array ) if sparse,
else an integer matrix.
Parameters
degrees : list of degrees to compute. None corresponds to the Euler characteristic.
filtration_grid : the grid on which to compute. If None, the fit will infer it from:
- fit_fraction : the fraction of data to consider for the fit; the randomness is controlled by the seed parameter
- resolution : the resolution of this grid
- filtration_quantile : quantile of filtration values to ignore
- grid_strategy : str, one of 'regular', 'quantile' or 'exact'
normalize_filtrations : if sparse, will normalize all filtrations.
expand : expands the simplextree to correctly compute the degree, for flag complexes.
invariant : the topological invariant to produce the signed measure. Choices are "hilbert" or "euler". Will add rank invariant later.
num_collapses : either an int or "full". Collapses the complex before doing the computation.
_möbius_inversion : if False, will not do the Möbius inversion. The output has to be a matrix then.
enforce_null_mass : if True, returns a zero-mass measure by thresholding the module.
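The two output formats described above (a sparse pair of points and weights, or a dense integer matrix) carry the same information once a grid is fixed. The sketch below shows the conversion for two parameters in plain Python. It assumes every point lies exactly on the grid; the helper names are hypothetical and this is not the library's own conversion code.

```python
from bisect import bisect_left


def sparse_to_dense(points, weights, grid):
    """Accumulate weights into a len(grid[0]) x len(grid[1]) integer matrix.

    Assumption: each coordinate of each point is a value of the grid.
    """
    dense = [[0] * len(grid[1]) for _ in grid[0]]
    for (x, y), w in zip(points, weights):
        i = bisect_left(grid[0], x)  # index of x along the first parameter
        j = bisect_left(grid[1], y)  # index of y along the second parameter
        dense[i][j] += w
    return dense


def dense_to_sparse(dense, grid):
    """Keep only the nonzero entries, as (points, weights)."""
    points, weights = [], []
    for i, row in enumerate(dense):
        for j, w in enumerate(row):
            if w != 0:
                points.append((grid[0][i], grid[1][j]))
                weights.append(w)
    return points, weights


grid = [[0.0, 1.0, 2.0], [0.0, 0.5]]
points = [(0.0, 0.0), (1.0, 0.5), (2.0, 0.5)]
weights = [1, -1, 2]
dense = sparse_to_dense(points, weights, grid)
```

The sparse form is usually much smaller, since a signed measure tends to have few points compared to the number of grid cells.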
- _infer_filtration(X)
- _input_checks(X)
- static _is_filtered_complex(input)
- _params_check()
- _sklearn_auto_wrap_output_keys = {'transform'}
- fit(X, y=None)
- transform(X)
- transform1(simplextree, ax, thread_id='')
- Parameters:
thread_id (str)
- Parameters:
degrees (list[int | None])
rank_degrees (list[int])
filtration_grid (Sequence[Sequence[ndarray]] | None)
num_collapses (int | str)
resolution (Iterable[int] | int | None)
plot (bool)
filtration_quantile (float)
normalize_filtrations (bool)
grid_strategy (str)
seed (int)
out_resolution (Iterable[int] | int | None)
individual_grid (bool | None)
enforce_null_mass (bool)
backend (str | None)
- class multipers.ml.signed_measures.SignedMeasure2Convolution(filtration_grid=None, kernel='gaussian', bandwidth=1.0, flatten=False, n_jobs=1, resolution=None, grid_strategy='regular', progress=False, backend='pykeops', plot=False, log_density=False, **kde_kwargs)
Bases:
BaseEstimator, TransformerMixin
Discrete convolution of a signed measure
Input
(data) x (degree) x (signed measure)
Parameters
filtration_grid : Iterable[array] For each filtration, the filtration values on which to evaluate the grid
resolution : int or (num_parameters) : If filtration grid is not given, will infer a grid, with this resolution
grid_strategy : the strategy to generate the grid. Available ones are regular, quantile, exact
flatten : if True, the output images will be flattened
kernel : kernel used to convolve the images.
progress : progress bar if True
backend : sklearn, pykeops or numba.
plot : Creates a plot Figure.
Output
(data) x (concatenation of imgs of degree)
- _plot_imgs(imgs, size=4)
- Parameters:
imgs (Iterable[ndarray])
- _sklearn_auto_wrap_output_keys = {'transform'}
- _sm2smi(signed_measures)
- _transform_from_sparse(X)
- fit(X, y=None)
- transform(X)
- Parameters:
filtration_grid (Iterable[ndarray])
kernel (Literal['gaussian', 'exponential', 'exponential_kernel', 'multivariate_gaussian', 'sinc'] | ~collections.abc.Callable)
bandwidth (float | Iterable[float])
flatten (bool)
n_jobs (int)
resolution (int | None)
grid_strategy (str)
progress (bool)
backend (str)
plot (bool)
log_density (bool)
- class multipers.ml.signed_measures.SignedMeasure2SlicedWassersteinDistance(n_jobs=None, num_directions=10, _sliced=True, epsilon=-1, ground_norm=1, progress=False, grid_reconversion=None, scales=None)
Bases:
BaseEstimator, TransformerMixin
Transformer from signed measure to distance matrix.
Input
(data) x (degree) x (signed measure)
Format
a signed measure : tuple of arrays. (point positions) : npts x (num_parameters) and weights : npts
each data point is a list of signed measures (e.g. for multiple degrees)
Output
(degree) x (distance matrix)
- _sklearn_auto_wrap_output_keys = {'transform'}
- fit(X, y=None)
- predict(X)
- transform(X)
- Parameters:
num_directions (int)
_sliced (bool)
- class multipers.ml.signed_measures.SignedMeasureFormatter(filtrations_weights=None, normalize=False, plot=False, unsparse=False, axis=-1, resolution=50, flatten=False, deep_format=False, unrag=True, n_jobs=1, verbose=False, integrate=False, grid_strategy='regular')
Bases:
BaseEstimator, TransformerMixin
Input
(data) x (degree) x (signed measure) or (data) x (axis) x (degree) x (signed measure)
Iterable[list[signed_measure_matrix of degree]] or Iterable[previous].
The second is meant to use multiple choices for signed measure input. An example of usage : they come from a Rips + Density with different bandwidth. It is controlled by the axis parameter.
Output
Iterable[list[(reweighted)_sparse_signed_measure of degree]]
or (deep format)
Tensor of shape (num_axis*num_degrees, data, max_num_pts, num_parameters)
- _check_axis(X)
- _check_backend(X)
- _check_measures(X)
- _check_resolution()
- static _check_sm(sm)
- Return type:
bool
- _check_weights()
- _get_filtration_bounds(X, axis)
- _infer_grids(X)
- static _integrate_measure(sm, filtrations)
- _plot_signed_measures(sms, size=4)
- Parameters:
sms (Iterable[ndarray])
- _print_stats(X)
- _rescale_measures(X)
- _sklearn_auto_wrap_output_keys = {'transform'}
- fit(X, y=None)
- transform(X)
- unsparse_signed_measure(sparse_signed_measure)
- Parameters:
filtrations_weights (Iterable[float] | None)
plot (bool)
unsparse (bool)
axis (int)
resolution (int | Iterable[int])
flatten (bool)
deep_format (bool)
unrag (bool)
n_jobs (int)
verbose (bool)
integrate (bool)
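The deep format requires a rectangular tensor, while signed measures generally have different numbers of points. One way to "unrag" them is to pad with dummy points of weight 0, as in the plain-Python sketch below. The layout with the weight stored as the last coordinate follows the input description of batch_signed_measure_convolutions on this page; whether SignedMeasureFormatter pads exactly this way is an assumption, and the helper name `unrag` is hypothetical.

```python
def unrag(signed_measures, num_parameters):
    """Pad (points, weights) pairs to shape (num_data, max_num_pts, num_parameters + 1).

    Dummy rows have weight 0, so they do not change the measure.
    """
    max_pts = max(len(weights) for _, weights in signed_measures)
    dummy = [0.0] * num_parameters + [0]
    padded = []
    for points, weights in signed_measures:
        # One row per point: coordinates followed by the weight.
        rows = [list(p) + [w] for p, w in zip(points, weights)]
        rows += [list(dummy) for _ in range(max_pts - len(rows))]
        padded.append(rows)
    return padded


sms = [
    ([(0.0, 1.0), (2.0, 3.0)], [1, -1]),  # two points
    ([(5.0, 5.0)], [2]),                  # one point, needs one dummy row
]
padded = unrag(sms, num_parameters=2)
```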
- class multipers.ml.signed_measures.SignedMeasures2SlicedWassersteinDistances(progress=False, n_jobs=1, scales=None, **kwargs)
Bases:
BaseEstimator, TransformerMixin
Transformer from signed measure to distance matrix.
Input
(data) x opt (axis) x (degree) x (signed measure)
Format
a signed measure : tuple of arrays. (point positions) : npts x (num_parameters) and weights : npts
each data point is a list of signed measures (e.g. for multiple degrees)
Output
(axis) x (degree) x (distance matrix)
- _sklearn_auto_wrap_output_keys = {'transform'}
- fit(X, y=None)
- transform(X)
- Parameters:
n_jobs (int)
scales (Iterable[Iterable[float]] | None)
- class multipers.ml.signed_measures.SimplexTree2RectangleDecomposition(filtration_grid, degrees, plot=False, reconvert_grid=True, num_collapses=0)
Bases:
BaseEstimator, TransformerMixin
Transformer from 2-parameter SimplexTrees to their respective rectangle decompositions.
- Parameters:
filtration_grid (ndarray)
degrees (Iterable[int])
num_collapses (int)
- _sklearn_auto_wrap_output_keys = {'transform'}
- fit(X, y=None)
TODO : infer grid from multiple simplextrees
- transform(X)
- Parameters:
X (Iterable[SimplexTreeMulti_KFi32 | SimplexTreeMulti_Fi32 | SimplexTreeMulti_KFi64 | SimplexTreeMulti_Fi64 | SimplexTreeMulti_KFf32 | SimplexTreeMulti_Ff32 | SimplexTreeMulti_KFf64 | SimplexTreeMulti_Ff64])
- class multipers.ml.signed_measures.SimplexTree2SignedMeasure(degrees=[], rank_degrees=[], filtration_grid=None, progress=False, num_collapses=0, n_jobs=None, resolution=None, plot=False, filtration_quantile=0.0, expand=False, normalize_filtrations=False, grid_strategy='exact', seed=0, fit_fraction=1, out_resolution=None, individual_grid=None, enforce_null_mass=False, flatten=True, backend=None)
Bases:
FilteredComplex2SignedMeasure
- Parameters:
degrees (list[int | None])
rank_degrees (list[int])
filtration_grid (Sequence[Sequence[ndarray]] | None)
num_collapses (int | str)
resolution (Iterable[int] | int | None)
plot (bool)
filtration_quantile (float)
normalize_filtrations (bool)
grid_strategy (str)
seed (int)
out_resolution (Iterable[int] | int | None)
individual_grid (bool | None)
enforce_null_mass (bool)
backend (str | None)
- _sklearn_auto_wrap_output_keys = {'transform'}
- multipers.ml.signed_measures._st2ranktensor(st, filtration_grid, degree, plot, reconvert_grid, num_collapse=0)
TODO
- Parameters:
st (SimplexTreeMulti_KFi32 | SimplexTreeMulti_Fi32 | SimplexTreeMulti_KFi64 | SimplexTreeMulti_Fi64 | SimplexTreeMulti_KFf32 | SimplexTreeMulti_Ff32 | SimplexTreeMulti_KFf64 | SimplexTreeMulti_Ff64)
filtration_grid (ndarray)
degree (int)
plot (bool)
reconvert_grid (bool)
num_collapse (int | str)
- multipers.ml.signed_measures.batch_signed_measure_convolutions(signed_measures, x, bandwidth, kernel, api=None)
Input
signed_measures: unragged, of shape (num_data, num_pts, D+1) where last coord is weights, (0 for dummy points)
x : the points to convolve (num_x,D)
bandwidth : the bandwidths or covariance matrix inverse or … of the kernel
kernel : “gaussian”, “multivariate_gaussian”, “exponential”, or Callable (x_i, y_i, bandwidth)->float
Output
Array of shape (num_convolutions, (num_axis), num_data, max_x_size)
- Parameters:
kernel (Literal['gaussian', 'exponential', 'exponential_kernel', 'multivariate_gaussian', 'sinc'] | ~collections.abc.Callable)
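Convolving a signed measure with a kernel amounts to summing, at each evaluation point, the weighted kernel values of the measure's points. The plain-Python sketch below follows the input layout described above (weight as last coordinate, weight 0 for dummy points) and the Callable (x_i, y_i, bandwidth) -> float convention. The unnormalized Gaussian and the helper names are assumptions; the library's kernels may be normalized differently.

```python
import math


def gaussian(x, p, bandwidth):
    """Unnormalized Gaussian kernel between two points (an assumption)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, p))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def convolve_signed_measures(signed_measures, x, bandwidth, kernel=gaussian):
    """signed_measures: (num_data, num_pts, D+1) nested lists; x: (num_x, D).

    Returns a (num_data, num_x) nested list of convolution values.
    """
    out = []
    for measure in signed_measures:
        row = []
        for xi in x:
            # The last coordinate is the weight; dummy points have weight 0.
            row.append(sum(pt[-1] * kernel(xi, pt[:-1], bandwidth) for pt in measure))
        out.append(row)
    return out


measures = [
    [[0.0, 0.0, 1], [1.0, 0.0, -1]],
    [[0.0, 0.0, 2], [0.0, 0.0, 0]],  # second row is a dummy point
]
x = [(0.0, 0.0), (1.0, 0.0)]
img = convolve_signed_measures(measures, x, bandwidth=1.0)
```

Because dummy points carry weight 0, padded (unragged) inputs give the same values as the original measures.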
- multipers.ml.signed_measures.deep_unrag(sms, api=None)
- multipers.ml.signed_measures.rescale_sparse_signed_measure(signed_measure, filtration_weights, normalize_scales=None)
- multipers.ml.signed_measures.sm2deep(signed_measure, api=None)
- multipers.ml.signed_measures.sm_convolution(sms, grid, bandwidth, kernel='gaussian', plot=False, **plt_kwargs)
- Parameters:
kernel (Literal['gaussian', 'exponential', 'exponential_kernel', 'multivariate_gaussian', 'sinc'] | ~collections.abc.Callable)
plot (bool)
- multipers.ml.signed_measures.tensor_möbius_inversion(tensor, grid_conversion=None, plot=False, raw=False, num_parameters=None)
- Parameters:
grid_conversion (Iterable[ndarray] | None)
plot (bool)
raw (bool)
num_parameters (int | None)
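On a grid, Möbius inversion can be understood as taking discrete differences along every axis, and it is undone by a cumulative sum. The plain-Python sketch below shows the two-parameter case on a small illustrative integer tensor. It is a sketch of the general technique under that assumption, not the implementation of tensor_möbius_inversion, and the helper names are hypothetical.

```python
def mobius_inversion_2d(tensor):
    """Inclusion-exclusion differences along both axes of a 2D integer tensor."""
    rows, cols = len(tensor), len(tensor[0])

    def at(i, j):
        # Entries outside the grid count as 0.
        return tensor[i][j] if i >= 0 and j >= 0 else 0

    return [
        [at(i, j) - at(i - 1, j) - at(i, j - 1) + at(i - 1, j - 1) for j in range(cols)]
        for i in range(rows)
    ]


def cumulative_sum_2d(measure):
    """Inverse operation: sum the measure over the lower-left quadrant of each cell."""
    rows, cols = len(measure), len(measure[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            out[i][j] = (
                measure[i][j]
                + (out[i - 1][j] if i > 0 else 0)
                + (out[i][j - 1] if j > 0 else 0)
                - (out[i - 1][j - 1] if i > 0 and j > 0 else 0)
            )
    return out


# Illustrative integer-valued function on a 3 x 3 grid.
values = [
    [0, 0, 0],
    [0, 1, 1],
    [0, 1, 0],
]
measure = mobius_inversion_2d(values)
```

The result is a signed measure on the grid: a +1 where the function starts being nonzero and a -1 at the corner where it vanishes again.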
multipers.ml.sliced_wasserstein module
- class multipers.ml.sliced_wasserstein.SlicedWassersteinDistance(num_directions=10, scales=None, n_jobs=None)
Bases:
BaseEstimator, TransformerMixin
This is a class for computing the sliced Wasserstein distance matrix from a list of signed measures. The Sliced Wasserstein distance is computed by projecting the signed measures onto lines, comparing the projections with the 1-norm, and finally integrating over all possible lines. See http://proceedings.mlr.press/v70/carriere17a.html for more details.
- _sklearn_auto_wrap_output_keys = {'transform'}
- fit(X, y=None)
Fit the SlicedWassersteinDistance class on a list of signed measures: signed measures are projected onto the different lines. The measures themselves are then stored in numpy arrays, called measures_.
- Parameters:
X (list of tuples): input signed measures.
y (n x 1 array): signed measure labels (unused).
- transform(X)
Compute all sliced Wasserstein distances between the signed measures that were stored after calling the fit() method, and a given list of (possibly different) signed measures.
- Parameters:
X (list of tuples): input signed measures.
- Returns:
numpy array of shape (number of measures in measures_) x (number of measures in X): matrix of pairwise sliced Wasserstein distances.
- class multipers.ml.sliced_wasserstein.WassersteinDistance(epsilon=1.0, ground_norm=1, n_jobs=None)
Bases:
BaseEstimator, TransformerMixin
This is a class for computing the Wasserstein distance matrix from a list of signed measures.
- _sklearn_auto_wrap_output_keys = {'transform'}
- fit(X, y=None)
Fit the WassersteinDistance class on a list of signed measures. The measures themselves are then stored in numpy arrays, called measures_.
- Parameters:
X (list of tuples): input signed measures.
y (n x 1 array): signed measure labels (unused).
- transform(X)
Compute all Wasserstein distances between the signed measures that were stored after calling the fit() method, and a given list of (possibly different) signed measures.
- Parameters:
X (list of tuples): input signed measures.
- Returns:
numpy array of shape (number of measures in measures_) x (number of measures in X): matrix of pairwise Wasserstein distances.
- multipers.ml.sliced_wasserstein._compute_signed_measure_parts(X)
This is a function for separating the positive and negative points of a list of signed measures. This function can be used as a preprocessing step in order to speed up the running time for computing all pairwise (sliced) Wasserstein distances on a list of signed measures.
- Parameters:
X (list of n tuples): list of signed measures.
- Returns:
list of n pairs of numpy arrays of shape (num x dimension): list of positive and negative signed measures.
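Separating a signed measure into positive and negative parts can be sketched as follows, in plain Python. Repeating each point according to the absolute value of its integer weight is an assumption about how multiplicities are handled, and the helper name is hypothetical.

```python
def signed_measure_parts(points, weights):
    """Split (points, weights) into lists of positive and negative points.

    Assumption: integer weights, each point repeated abs(weight) times.
    """
    positive, negative = [], []
    for p, w in zip(points, weights):
        target = positive if w > 0 else negative
        target.extend([tuple(p)] * abs(w))
    return positive, negative


points = [(0.0, 0.0), (1.0, 2.0), (3.0, 1.0)]
weights = [2, -1, 1]
pos, neg = signed_measure_parts(points, weights)
```

Doing this once per measure avoids repeating the split for every pair when computing a full distance matrix.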
- multipers.ml.sliced_wasserstein._compute_signed_measure_projections(X, num_directions, scales)
This is a function for projecting the points of a list of signed measures onto a fixed number of lines sampled uniformly. This function can be used as a preprocessing step in order to speed up the running time for computing all pairwise sliced Wasserstein distances on a list of signed measures.
- Parameters:
X (list of n tuples): list of signed measures.
num_directions (int): number of lines evenly sampled from [-pi/2,pi/2] in order to approximate and speed up the distance computation.
scales (array of shape D): scales associated to the dimensions.
- Returns:
list of n pairs of numpy arrays of shape (num x num_directions): list of positive and negative projected signed measures.
- multipers.ml.sliced_wasserstein._pairwise(fallback, skipdiag, X, Y, metric, n_jobs)
- multipers.ml.sliced_wasserstein._sklearn_wrapper(metric, X, Y, **kwargs)
This function is a wrapper for any metric between two signed measures that takes two numpy arrays of shapes (nxD) and (mxD) as arguments.
- multipers.ml.sliced_wasserstein._sliced_wasserstein_distance(meas1, meas2, num_directions, scales=None)
This is a function for computing the sliced Wasserstein distance from two signed measures. The Sliced Wasserstein distance is computed by projecting the signed measures onto lines, comparing the projections with the 1-norm, and finally averaging over the lines. See http://proceedings.mlr.press/v70/carriere17a.html for more details.
- Parameters:
meas1: ((n x D), (n)) tuple with numpy.array encoding the (finite points of the) first measure and their multiplicities. Must not contain essential points (i.e. with infinite coordinate).
meas2: ((m x D), (m)) tuple encoding the second measure.
num_directions (int): number of lines evenly sampled from [-pi/2,pi/2] in order to approximate and speed up the distance computation.
scales (array of shape D): scales associated to the dimensions.
- Returns:
float: the sliced Wasserstein distance between signed measures.
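The procedure described above (project onto lines, compare sorted projections with the 1-norm, average over the lines) can be sketched in plain Python for two parameters. This is an illustration under stated assumptions, not the library's code: weights are integers, the positive part of each measure is matched against the negative part of the other, both sides must end up with the same number of points, and the scales argument is left out.

```python
import math


def sliced_wasserstein(meas1, meas2, num_directions=10):
    """Sliced Wasserstein sketch for 2-parameter signed measures (points, weights)."""

    def parts(points, weights):
        pos, neg = [], []
        for p, w in zip(points, weights):
            (pos if w > 0 else neg).extend([p] * abs(w))
        return pos, neg

    pos1, neg1 = parts(*meas1)
    pos2, neg2 = parts(*meas2)
    # Comparing mu1 and mu2 as signed measures: match mu1+ + mu2- with mu2+ + mu1-.
    a, b = pos1 + neg2, pos2 + neg1
    if len(a) != len(b):
        raise ValueError("this sketch assumes measures of equal total mass")
    total = 0.0
    for k in range(num_directions):
        # Angles evenly sampled from [-pi/2, pi/2).
        theta = -math.pi / 2 + k * math.pi / num_directions
        c, s = math.cos(theta), math.sin(theta)
        proj_a = sorted(x * c + y * s for x, y in a)
        proj_b = sorted(x * c + y * s for x, y in b)
        total += sum(abs(u - v) for u, v in zip(proj_a, proj_b))
    return total / num_directions


m1 = ([(0.0, 0.0), (1.0, 1.0)], [1, -1])
m2 = ([(1.0, 0.0), (1.0, 1.0)], [1, -1])
d12 = sliced_wasserstein(m1, m2)
```

Sorting the one-dimensional projections is what makes each slice cheap: optimal transport on a line reduces to matching points in sorted order.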
- multipers.ml.sliced_wasserstein._sliced_wasserstein_distance_on_projections(meas1, meas2, scales=None)
This is a function for computing the sliced Wasserstein distance between two signed measures that have already been projected onto some lines. It simply amounts to comparing the sorted projections with the 1-norm, and averaging over the lines. See http://proceedings.mlr.press/v70/carriere17a.html for more details.
- Parameters:
meas1: pair of (n x number_of_lines) numpy.arrays containing the projected points of the positive and negative parts of the first measure.
meas2: pair of (m x number_of_lines) numpy.arrays containing the projected points of the positive and negative parts of the second measure.
scales (array of shape D): scales associated to the dimensions.
- Returns:
float: the sliced Wasserstein distance between the projected signed measures.
- multipers.ml.sliced_wasserstein._wasserstein_distance(meas1, meas2, epsilon, ground_norm)
This is a function for computing the Wasserstein distance from two signed measures.
- Parameters:
meas1: ((n x D), (n)) tuple with numpy.array encoding the (finite points of the) first measure and their multiplicities. Must not contain essential points (i.e. with infinite coordinate).
meas2: ((m x D), (m)) tuple encoding the second measure.
epsilon (float): entropy regularization parameter.
ground_norm (int): norm to use for ground metric cost.
- Returns:
float: the Wasserstein distance between signed measures.
- multipers.ml.sliced_wasserstein._wasserstein_distance_on_parts(ground_norm=1, epsilon=1.0)
This is a function for computing the Wasserstein distance between two signed measures that have already been separated into their positive and negative parts.
- Parameters:
meas1: pair of (n x dimension) numpy.arrays containing the points of the positive and negative parts of the first measure.
meas2: pair of (m x dimension) numpy.arrays containing the points of the positive and negative parts of the second measure.
- Returns:
float: the Wasserstein distance between the signed measures.
- multipers.ml.sliced_wasserstein.pairwise_signed_measure_distances(X, Y=None, metric='sliced_wasserstein', n_jobs=None, **kwargs)
This function computes the distance matrix between two lists of signed measures given as numpy arrays of shape (nxD).
- Parameters:
X (list of n tuples): first list of signed measures.
Y (list of m tuples): second list of signed measures (optional). If None, pairwise distances are computed from the first list only.
metric: distance to use. It can be either a string ("sliced_wasserstein", "wasserstein") or a function taking two tuples as inputs. If it is a function, make sure that it is symmetric and that it outputs 0 if called on the same two tuples.
n_jobs (int): number of jobs to use for the computation. This uses joblib.Parallel(prefer="threads"), so metrics that do not release the GIL may not scale unless run inside a joblib.parallel_backend block.
**kwargs: optional keyword parameters. Any further parameters are passed directly to the distance function. See the docs of the various distance classes in this module.
- Returns:
numpy array of shape (nxm): distance matrix
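The requirement that a custom metric be symmetric and return 0 on identical inputs matters because a pairwise routine can then skip the diagonal and reuse one triangle of the matrix. The plain-Python sketch below illustrates this with a toy metric (the gap in total mass). Both `pairwise_distances` and `mass_gap` are hypothetical names, and this is not the library's parallel implementation.

```python
def mass_gap(meas1, meas2):
    """Toy metric: absolute difference of total mass. Symmetric, 0 on equal inputs."""
    return abs(sum(meas1[1]) - sum(meas2[1]))


def pairwise_distances(X, Y=None, metric=mass_gap):
    """Distance matrix between two lists of signed measures (points, weights)."""
    symmetric = Y is None
    if symmetric:
        Y = X
    D = [[0] * len(Y) for _ in X]
    for i, a in enumerate(X):
        for j, b in enumerate(Y):
            if symmetric and j < i:
                D[i][j] = D[j][i]      # reuse the upper triangle
            elif symmetric and j == i:
                continue               # diagonal stays 0
            else:
                D[i][j] = metric(a, b)
    return D


X = [
    ([(0.0, 0.0)], [1]),
    ([(0.0, 0.0), (1.0, 1.0)], [2, -1]),
    ([(2.0, 2.0)], [3]),
]
D_self = pairwise_distances(X)
D_cross = pairwise_distances(X, [X[2]])
```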
multipers.ml.tools module
- class multipers.ml.tools.SimplexTreeEdgeCollapser(num_collapses=0, full=False, max_dimension=None, n_jobs=1)
Bases:
BaseEstimator, TransformerMixin
- Parameters:
num_collapses (int)
full (bool)
max_dimension (int | None)
n_jobs (int)
- _sklearn_auto_wrap_output_keys = {'transform'}
- fit(X, y=None)
- Parameters:
X (ndarray | list)
- transform(X)
- multipers.ml.tools.filtration_grid_to_coordinates(F, return_resolution)
- multipers.ml.tools.get_filtration_weights_grid(num_parameters=2, resolution=3, *, min=0, max=20, dtype=<class 'float'>, remove_homothetie=True, weights=None)
Provides a grid of weights, for filtration rescaling.
num_parameters : the dimension of the grid tensor
resolution : the size of each coordinate
min : minimum weight
max : maximum weight
weights : custom weights (instead of linspace between min and max)
dtype : the type of the grid values (useful for int weights)
- Parameters:
num_parameters (int)
resolution (int | Iterable[int])
min (float)
max (float)
remove_homothetie (bool)
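A grid of filtration weights can be thought of as all combinations of a few candidate weights, one per parameter. The plain-Python sketch below builds such combinations from either evenly spaced values between a minimum and a maximum or a custom list. It is only an illustration: the return layout of get_filtration_weights_grid is not documented here, remove_homothetie is not modelled, and the helper name `weights_grid` is hypothetical.

```python
from itertools import product


def weights_grid(num_parameters=2, resolution=3, min_weight=0.0, max_weight=20.0, weights=None):
    """All tuples of per-parameter weights taken from a shared list of candidates."""
    if weights is None:
        if resolution == 1:
            weights = [min_weight]
        else:
            # Evenly spaced candidates between min_weight and max_weight.
            step = (max_weight - min_weight) / (resolution - 1)
            weights = [min_weight + i * step for i in range(resolution)]
    return list(product(weights, repeat=num_parameters))


grid = weights_grid(num_parameters=2, resolution=3, min_weight=0.0, max_weight=20.0)
custom = weights_grid(num_parameters=3, weights=[1, 2])
```

Each tuple is one candidate rescaling of the filtrations, which is convenient when the weights are treated as hyperparameters in a cross-validated search.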
- multipers.ml.tools.get_simplex_tree_from_delayed(x)
- Return type:
SimplexTreeMulti
- multipers.ml.tools.get_simplextree(x)
- Return type:
SimplexTreeMulti