multipers.ml package

Submodules

multipers.ml.accuracies module

multipers.ml.invariants_with_persistable module

multipers.ml.kernels module

class multipers.ml.kernels.DistanceList2DistanceMatrix

Bases: BaseEstimator, TransformerMixin

_sklearn_auto_wrap_output_keys = {'transform'}
fit(X, y=None)
transform(X)
class multipers.ml.kernels.DistanceMatrices2DistancesList

Bases: BaseEstimator, TransformerMixin

Input: (degree) x (distance matrix), or (axis) x (degree) x (distance matrix D).

Output: (D1) x opt (axis) x (degree) x (D2), with indices first.

_sklearn_auto_wrap_output_keys = {'transform'}
fit(X, y=None)
predict(X)
transform(X)
class multipers.ml.kernels.DistanceMatrix2DistanceList

Bases: BaseEstimator, TransformerMixin

_sklearn_auto_wrap_output_keys = {'transform'}
fit(X, y=None)
transform(X)
class multipers.ml.kernels.DistanceMatrix2Kernel(sigma=1, axis=None, weights=1)

Bases: BaseEstimator, TransformerMixin

Input: (degree) x (distance matrix), or (axis) x (degree) x (distance matrix). In the second case, axis must be specified (this is meant for cross-validation).

Output: a kernel with the same shape as the distance matrix.

Parameters:
  • sigma (float | Iterable[float])

  • axis (int | None)

  • weights (Iterable[float] | float)

_sklearn_auto_wrap_output_keys = {'transform'}
fit(X, y=None)
transform(X)
Return type:

ndarray
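The input/output contract of DistanceMatrix2Kernel can be illustrated with a small NumPy sketch. The formula exp(-D / (2 sigma^2)) and the weighted mean over degrees are assumptions made for illustration only, since this page does not state the formula the class uses. The helper name distances_to_kernel is hypothetical and is not part of multipers.

```python
import numpy as np

def distances_to_kernel(distances, sigma=1.0, weights=1.0):
    """Hypothetical helper: (degree, n, m) distances -> (n, m) kernel."""
    distances = np.asarray(distances, dtype=float)
    num_degrees = distances.shape[0]
    # Broadcast scalar sigma / weights to one value per degree.
    sigma = np.broadcast_to(np.asarray(sigma, dtype=float), (num_degrees,))
    weights = np.broadcast_to(np.asarray(weights, dtype=float), (num_degrees,))
    # Assumed kernel: exp(-D / (2 sigma^2)), computed per degree.
    kernels = np.exp(-distances / (2.0 * sigma[:, None, None] ** 2))
    # Assumed aggregation: weighted mean over the degree axis.
    return np.mean(weights[:, None, None] * kernels, axis=0)

rng = np.random.default_rng(0)
points = rng.normal(size=(5, 2))
# Toy Euclidean distance matrix, reused (rescaled) for a second "degree".
D = np.linalg.norm(points[:, None] - points[None], axis=-1)
K = distances_to_kernel(np.stack([D, 2 * D]), sigma=[1.0, 2.0])
```

Under these assumptions K has the shape of a single distance matrix, is symmetric, and has ones on its diagonal.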

class multipers.ml.kernels.DistancesLists2DistanceMatrices

Bases: BaseEstimator, TransformerMixin

Input: (D1) x opt (axis) x (degree) x (D2, with indices first).

Output: opt (axis) x (degree) x (distance matrix (D1, D2)).

_sklearn_auto_wrap_output_keys = {'transform'}
fit(X, y=None)
Parameters:

X (ndarray)

transform(X)

multipers.ml.mma module

class multipers.ml.mma.FilteredComplex2MMA(n_jobs=-1, expand_dim=None, prune_degrees_above=None, progress=False, minpres_degrees=None, plot=False, **persistence_kwargs)

Bases: BaseEstimator, TransformerMixin

Turns a list of lists of simplex trees or slicers into MMA approximations.

Parameters:
  • n_jobs (int)

  • expand_dim (int | None)

  • prune_degrees_above (int | None)

  • minpres_degrees (Iterable[int] | None)

  • plot (bool)

_infer_bounding_box(X)
_input_checks(X)
static _is_filtered_complex(input)
_sklearn_auto_wrap_output_keys = {'transform'}
fit(X, y=None)
transform(X)
class multipers.ml.mma.MMA2IMG(degrees, bandwidth=0.1, power=1, normalize=False, resolution=50, plot=False, box=None, n_jobs=-1, flatten=False, progress=False, grid_strategy='regular', kernel='linear', signed=False)

Bases: BaseEstimator, TransformerMixin

Parameters:
  • degrees (list)

  • bandwidth (float)

  • power (float)

  • normalize (bool)

  • resolution (list | int)

  • plot (bool)

  • signed (bool)

_sklearn_auto_wrap_output_keys = {'transform'}
fit(X, y=None)
transform(X)
class multipers.ml.mma.MMA2Landscape(resolution=[100, 100], degrees=[0, 1], ks=range(0, 5), phi=<function sum>, box=None, plot=False, n_jobs=-1, filtration_quantile=0.01)

Bases: BaseEstimator, TransformerMixin

Turns a list of MMA approximations into landscape vectorisations.

Parameters:
  • degrees (list[int] | None)

  • ks (Iterable[int])

  • phi (Callable)

  • plot (bool)

  • filtration_quantile (float)

_sklearn_auto_wrap_output_keys = {'transform'}
fit(X, y=None)
transform(X)
Return type:

list[ndarray]

class multipers.ml.mma.MMAFormatter(degrees=None, axis=None, verbose=False, normalize=False, weights=None, quantiles=None, dump=False, from_dump=False)

Bases: BaseEstimator, TransformerMixin

Parameters:
  • degrees (list[int] | None)

  • verbose (bool)

  • normalize (bool)

static _get_module_bound(x, degree)

Output format : (2,num_parameters)

static _infer_axis(X)
static _infer_bounds(X, degrees=None, axis=[slice(None, None, None)], quantiles=None)

Compute bounds of filtration values of a list of modules.

Output Format

m,M of shape : (num_axis,num_degrees,2,num_parameters)

_infer_degrees(X)
static _infer_grid(X, strategy, resolution, degrees=None)

Given a list of PyModules, computes a multiparameter discrete grid, with a given strategy, from the filtration values of the summands of the modules.

Parameters:
  • X (List[PyModule_f64 | PyModule_f32 | PyModule_i32 | PyModule_i64])

  • strategy (str)

  • resolution (int)

static _infer_num_parameters(X, ax=slice(None, None, None))
static _maybe_from_dump(X_in)
_sklearn_auto_wrap_output_keys = {'transform'}
static copy_transform(mod, degrees, translation, rescale_factors, new_box)
fit(X_in, y=None)
set_fit_request(*, X_in='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

X_in (str, True, False, or None; default=sklearn.utils.metadata_routing.UNCHANGED): metadata routing for the X_in parameter in fit.

Returns:

self (object): the updated object.

Return type:

MMAFormatter

set_transform_request(*, X_in='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the transform method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to transform.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

X_in (str, True, False, or None; default=sklearn.utils.metadata_routing.UNCHANGED): metadata routing for the X_in parameter in transform.

Returns:

self (object): the updated object.

Return type:

MMAFormatter

transform(X_in)
class multipers.ml.mma.SimplexTree2MMA(n_jobs=-1, expand_dim=None, prune_degrees_above=None, progress=False, minpres_degrees=None, **persistence_kwargs)

Bases: FilteredComplex2MMA

Parameters:
  • n_jobs (int)

  • expand_dim (int | None)

  • prune_degrees_above (int | None)

  • minpres_degrees (Iterable[int] | None)

_sklearn_auto_wrap_output_keys = {'transform'}

multipers.ml.one module

multipers.ml.point_clouds module

class multipers.ml.point_clouds.PointCloud2FilteredComplex(bandwidths=[], masses=[], threshold=-inf, complex='rips', sparse=None, num_collapses=-2, kernel='gaussian', log_density=True, expand_dim=1, progress=False, n_jobs=None, fit_fraction=1, verbose=False, safe_conversion=False, output_type=None, reduce_degrees=None)

Bases: BaseEstimator, TransformerMixin

Parameters:
  • threshold (float)

  • complex (Literal['alpha', 'rips', 'delaunay'])

  • sparse (float | None)

  • num_collapses (int)

  • kernel (Literal['gaussian', 'exponential', 'exponential_kernel', 'multivariate_gaussian', 'sinc'] | ~collections.abc.Callable)

  • log_density (bool)

  • expand_dim (int)

  • progress (bool)

  • n_jobs (int | None)

  • fit_fraction (float)

  • verbose (bool)

  • safe_conversion (bool)

  • output_type (Literal['slicer', 'simplextree', 'slicer_vine', 'slicer_novine'] | None)

  • reduce_degrees (Iterable[int] | None)

_define_bandwidths(X)
_define_sts()
_get_codensities(x_fit, x_sample)
_get_distance_quantiles_and_threshold(X, qs)
_get_sts_alpha(x, return_alpha=False)
Parameters:

x (ndarray)

_get_sts_delaunay(x)
Parameters:

x (ndarray)

_get_sts_rips(x)
_sklearn_auto_wrap_output_keys = {'transform'}
fit(X, y=None)
Parameters:

X (ndarray | list)

transform(X)
class multipers.ml.point_clouds.PointCloud2SimplexTree(bandwidths=[], masses=[], threshold=inf, complex='rips', sparse=None, num_collapses=-2, kernel='gaussian', log_density=True, expand_dim=1, progress=False, n_jobs=None, fit_fraction=1, verbose=False, safe_conversion=False, output_type=None, reduce_degrees=None)

Bases: PointCloud2FilteredComplex

Parameters:
  • threshold (float)

  • complex (Literal['alpha', 'rips', 'delaunay'])

  • sparse (float | None)

  • num_collapses (int)

  • kernel (Literal['gaussian', 'exponential', 'exponential_kernel', 'multivariate_gaussian', 'sinc'] | ~collections.abc.Callable)

  • log_density (bool)

  • expand_dim (int)

  • progress (bool)

  • n_jobs (int | None)

  • fit_fraction (float)

  • verbose (bool)

  • safe_conversion (bool)

  • output_type (Literal['slicer', 'simplextree', 'slicer_vine', 'slicer_novine'] | None)

  • reduce_degrees (Iterable[int] | None)

_sklearn_auto_wrap_output_keys = {'transform'}

multipers.ml.signed_measures module

class multipers.ml.signed_measures.DegreeRips2SignedMeasure(degrees, min_rips_value, max_rips_value, max_normalized_degree, min_normalized_degree, grid_granularity, progress=False, n_jobs=1, sparse=False, _möbius_inversion=True, fit_fraction=1)

Bases: BaseEstimator, TransformerMixin

Parameters:
  • degrees (Iterable[int])

  • min_rips_value (float)

  • max_normalized_degree (float)

  • min_normalized_degree (float)

  • grid_granularity (int)

  • progress (bool)

  • sparse (bool)

_sklearn_auto_wrap_output_keys = {'transform'}
_transform1(data)
Parameters:

data (ndarray)

fit(X, y=None)
Parameters:

X (ndarray | list)

transform(X)
class multipers.ml.signed_measures.FilteredComplex2SignedMeasure(degrees=[], rank_degrees=[], filtration_grid=None, progress=False, num_collapses=0, n_jobs=None, resolution=None, plot=False, filtration_quantile=0.0, expand=False, normalize_filtrations=False, grid_strategy='exact', seed=0, fit_fraction=1, out_resolution=None, individual_grid=None, enforce_null_mass=False, flatten=True, backend=None)

Bases: BaseEstimator, TransformerMixin

Input

Iterable[SimplexTreeMulti]

Output

Iterable[ list[signed_measure for degree] ]

signed measure is either
  • (points : (n x num_parameters) array, weights : (n) int array ) if sparse,

  • else an integer matrix.
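The two signed-measure formats described above can be related by a short sketch: a dense integer tensor indexed by grid positions is converted to a sparse (points, weights) pair by keeping its nonzero entries. The helper dense_to_sparse and the toy grid are hypothetical illustrations, not multipers functions.

```python
import numpy as np

def dense_to_sparse(matrix, grid):
    """Hypothetical helper: integer tensor on a grid -> (points, weights)."""
    matrix = np.asarray(matrix)
    idx = np.nonzero(matrix)          # one index array per parameter
    weights = matrix[idx]             # the nonzero masses
    # Look up the filtration value of each index along each parameter.
    points = np.stack([np.asarray(g)[i] for g, i in zip(grid, idx)], axis=1)
    return points, weights

# Toy 2-parameter grid: 3 values on the first axis, 2 on the second.
grid = [np.array([0.0, 0.5, 1.0]), np.array([10.0, 20.0])]
dense = np.array([[1, 0], [0, -1], [0, 0]])
points, weights = dense_to_sparse(dense, grid)
```

Here points is an (n x num_parameters) array and weights an integer array of length n, matching the sparse format.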

Parameters

  • degrees : list of degrees to compute. None corresponds to the Euler characteristic.

  • filtration grid : the grid on which to compute. If None, fit infers it using fit_fraction, resolution, filtration_quantile and grid_strategy.

  • fit_fraction : the fraction of the data to consider for the fit; the seed is controlled by the seed parameter.

  • resolution : the resolution of this grid.

  • filtration_quantile : quantile of filtration values to ignore.

  • grid_strategy : str, one of 'regular', 'quantile' or 'exact'.

  • normalize filtration : if sparse, normalizes all filtrations.

  • expand : expands the simplex tree so that the degree is computed correctly, for flag complexes.

  • invariant : the topological invariant used to produce the signed measure. Choices are "hilbert" or "euler". The rank invariant will be added later.

  • num_collapse : either an int or "full". Collapses the complex before the computation.

  • _möbius_inversion : if False, the Möbius inversion is not applied; the output then has to be a matrix.

  • enforce_null_mass : if True, returns a zero-mass measure by thresholding the module.
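To make the _möbius_inversion option more concrete, here is a minimal sketch of Möbius inversion on a grid: taking successive finite differences along every axis turns a tensor of invariant values (for instance a Hilbert function) into a signed measure, and cumulative sums undo it. This is the standard inversion on a product of totally ordered sets; whether multipers computes it exactly this way is an assumption, and both function names are hypothetical.

```python
import numpy as np

def mobius_inversion(tensor):
    """Iterated finite differences along each axis (zero before the grid)."""
    out = np.asarray(tensor).astype(np.int64)
    for axis in range(out.ndim):
        out = np.diff(out, axis=axis, prepend=0)
    return out

def integrate(measure):
    """Inverse operation: cumulative sums along each axis."""
    out = np.asarray(measure)
    for axis in range(out.ndim):
        out = np.cumsum(out, axis=axis)
    return out

# Toy Hilbert function on a 3 x 3 grid.
hilbert = np.array([[0, 0, 0],
                    [0, 1, 1],
                    [0, 1, 0]])
measure = mobius_inversion(hilbert)
```

In this toy case the measure has a +1 mass at grid index (1, 1) and a -1 mass at (2, 2), and integrate(measure) recovers hilbert.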

_infer_filtration(X)
_input_checks(X)
static _is_filtered_complex(input)
_params_check()
_sklearn_auto_wrap_output_keys = {'transform'}
fit(X, y=None)
transform(X)
transform1(simplextree, ax, thread_id='')
Parameters:

thread_id (str)

Parameters:
  • degrees (list[int | None])

  • rank_degrees (list[int])

  • filtration_grid (Sequence[Sequence[ndarray]] | None)

  • num_collapses (int | str)

  • resolution (Iterable[int] | int | None)

  • plot (bool)

  • filtration_quantile (float)

  • normalize_filtrations (bool)

  • grid_strategy (str)

  • seed (int)

  • out_resolution (Iterable[int] | int | None)

  • individual_grid (bool | None)

  • enforce_null_mass (bool)

  • backend (str | None)

class multipers.ml.signed_measures.SignedMeasure2Convolution(filtration_grid=None, kernel='gaussian', bandwidth=1.0, flatten=False, n_jobs=1, resolution=None, grid_strategy='regular', progress=False, backend='pykeops', plot=False, log_density=False, **kde_kwargs)

Bases: BaseEstimator, TransformerMixin

Discrete convolution of a signed measure

Input

(data) x (degree) x (signed measure)

Parameters

  • filtration_grid : Iterable[array] For each filtration, the filtration values on which to evaluate the grid

  • resolution : int or (num_parameters) : If filtration grid is not given, will infer a grid, with this resolution

  • grid_strategy : the strategy to generate the grid. Available ones are regular, quantile, exact

  • flatten : if True, the output images are flattened

  • kernel : kernel used to convolve the images.

  • progress : progress bar if True

  • backend : sklearn, pykeops or numba.

  • plot : Creates a plot Figure.

Output

(data) x (concatenation of imgs of degree)
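As an illustration of the discrete convolution described above, the sketch below evaluates a sparse signed measure against an unnormalized Gaussian on the product of per-parameter filtration values. The unnormalized kernel exp(-|x - p|^2 / (2 b^2)) and the helper name convolve_signed_measure are assumptions for illustration; the library's kernels and backends may normalize or compute differently.

```python
import numpy as np

def convolve_signed_measure(points, weights, filtration_grid, bandwidth=1.0):
    """Hypothetical helper: sum_i w_i * exp(-|x - p_i|^2 / (2 b^2)) on a grid."""
    # Build every grid point from the per-parameter filtration values.
    mesh = np.meshgrid(*filtration_grid, indexing="ij")
    x = np.stack([m.ravel() for m in mesh], axis=1)          # (num_x, D)
    # Squared distances between grid points and measure points.
    sq_dist = ((x[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    values = np.exp(-sq_dist / (2.0 * bandwidth**2)) @ weights
    return values.reshape(mesh[0].shape)

points = np.array([[0.0, 0.0], [1.0, 1.0]])
weights = np.array([1.0, -1.0])
filtration_grid = [np.linspace(0, 1, 4), np.linspace(0, 1, 3)]
img = convolve_signed_measure(points, weights, filtration_grid, bandwidth=0.5)
```

The image has one axis per parameter; it is positive near the +1 point and negative near the -1 point.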

_plot_imgs(imgs, size=4)
Parameters:

imgs (Iterable[ndarray])

_sklearn_auto_wrap_output_keys = {'transform'}
_sm2smi(signed_measures)
_transform_from_sparse(X)
fit(X, y=None)
transform(X)
Parameters:
  • filtration_grid (Iterable[ndarray])

  • kernel (Literal['gaussian', 'exponential', 'exponential_kernel', 'multivariate_gaussian', 'sinc'] | ~collections.abc.Callable)

  • bandwidth (float | Iterable[float])

  • flatten (bool)

  • n_jobs (int)

  • resolution (int | None)

  • grid_strategy (str)

  • progress (bool)

  • backend (str)

  • plot (bool)

  • log_density (bool)

class multipers.ml.signed_measures.SignedMeasure2SlicedWassersteinDistance(n_jobs=None, num_directions=10, _sliced=True, epsilon=-1, ground_norm=1, progress=False, grid_reconversion=None, scales=None)

Bases: BaseEstimator, TransformerMixin

Transformer from signed measure to distance matrix.

Input

(data) x (degree) x (signed measure)

Format

  • a signed measure : tuple of arrays, (point positions) of shape npts x (num_parameters) and weights of shape npts

  • each data point is a list of signed measures (e.g. for multiple degrees)

Output

  • (degree) x (distance matrix)

_sklearn_auto_wrap_output_keys = {'transform'}
fit(X, y=None)
predict(X)
transform(X)
Parameters:
  • num_directions (int)

  • _sliced (bool)

class multipers.ml.signed_measures.SignedMeasureFormatter(filtrations_weights=None, normalize=False, plot=False, unsparse=False, axis=-1, resolution=50, flatten=False, deep_format=False, unrag=True, n_jobs=1, verbose=False, integrate=False, grid_strategy='regular')

Bases: BaseEstimator, TransformerMixin

Input

(data) x (degree) x (signed measure) or (data) x (axis) x (degree) x (signed measure)

Iterable[list[signed_measure_matrix of degree]] or Iterable[previous].

The second form allows several choices of signed measure input per data point, for example signed measures coming from Rips + Density filtrations with different bandwidths. It is controlled by the axis parameter.

Output

Iterable[list[(reweighted)_sparse_signed_measure of degree]]

or (deep format)

Tensor of shape (num_axis*num_degrees, data, max_num_pts, num_parameters)

_check_axis(X)
_check_backend(X)
_check_measures(X)
_check_resolution()
static _check_sm(sm)
Return type:

bool

_check_weights()
_get_filtration_bounds(X, axis)
_infer_grids(X)
static _integrate_measure(sm, filtrations)
_plot_signed_measures(sms, size=4)
Parameters:

sms (Iterable[ndarray])

_print_stats(X)
_rescale_measures(X)
_sklearn_auto_wrap_output_keys = {'transform'}
fit(X, y=None)
transform(X)
unsparse_signed_measure(sparse_signed_measure)
Parameters:
  • filtrations_weights (Iterable[float] | None)

  • plot (bool)

  • unsparse (bool)

  • axis (int)

  • resolution (int | Iterable[int])

  • flatten (bool)

  • deep_format (bool)

  • unrag (bool)

  • n_jobs (int)

  • verbose (bool)

  • integrate (bool)

class multipers.ml.signed_measures.SignedMeasures2SlicedWassersteinDistances(progress=False, n_jobs=1, scales=None, **kwargs)

Bases: BaseEstimator, TransformerMixin

Transformer from signed measure to distance matrix.

Input

(data) x opt (axis) x (degree) x (signed measure)

Format

  • a signed measure : tuple of arrays, (point positions) of shape npts x (num_parameters) and weights of shape npts

  • each data point is a list of signed measures (e.g. for multiple degrees)

Output

  • (axis) x (degree) x (distance matrix)

_sklearn_auto_wrap_output_keys = {'transform'}
fit(X, y=None)
transform(X)
Parameters:
  • n_jobs (int)

  • scales (Iterable[Iterable[float]] | None)

class multipers.ml.signed_measures.SimplexTree2RectangleDecomposition(filtration_grid, degrees, plot=False, reconvert_grid=True, num_collapses=0)

Bases: BaseEstimator, TransformerMixin

Transformer turning 2-parameter SimplexTrees into their respective rectangle decompositions.

Parameters:
  • filtration_grid (ndarray)

  • degrees (Iterable[int])

  • num_collapses (int)

_sklearn_auto_wrap_output_keys = {'transform'}
fit(X, y=None)

TODO : infer grid from multiple simplextrees

transform(X)
Parameters:

X (Iterable[SimplexTreeMulti_KFi32 | SimplexTreeMulti_Fi32 | SimplexTreeMulti_KFi64 | SimplexTreeMulti_Fi64 | SimplexTreeMulti_KFf32 | SimplexTreeMulti_Ff32 | SimplexTreeMulti_KFf64 | SimplexTreeMulti_Ff64])

class multipers.ml.signed_measures.SimplexTree2SignedMeasure(degrees=[], rank_degrees=[], filtration_grid=None, progress=False, num_collapses=0, n_jobs=None, resolution=None, plot=False, filtration_quantile=0.0, expand=False, normalize_filtrations=False, grid_strategy='exact', seed=0, fit_fraction=1, out_resolution=None, individual_grid=None, enforce_null_mass=False, flatten=True, backend=None)

Bases: FilteredComplex2SignedMeasure

Parameters:
  • degrees (list[int | None])

  • rank_degrees (list[int])

  • filtration_grid (Sequence[Sequence[ndarray]] | None)

  • num_collapses (int | str)

  • resolution (Iterable[int] | int | None)

  • plot (bool)

  • filtration_quantile (float)

  • normalize_filtrations (bool)

  • grid_strategy (str)

  • seed (int)

  • out_resolution (Iterable[int] | int | None)

  • individual_grid (bool | None)

  • enforce_null_mass (bool)

  • backend (str | None)

_sklearn_auto_wrap_output_keys = {'transform'}
multipers.ml.signed_measures._st2ranktensor(st, filtration_grid, degree, plot, reconvert_grid, num_collapse=0)

TODO

Parameters:
  • st (SimplexTreeMulti_KFi32 | SimplexTreeMulti_Fi32 | SimplexTreeMulti_KFi64 | SimplexTreeMulti_Fi64 | SimplexTreeMulti_KFf32 | SimplexTreeMulti_Ff32 | SimplexTreeMulti_KFf64 | SimplexTreeMulti_Ff64)

  • filtration_grid (ndarray)

  • degree (int)

  • plot (bool)

  • reconvert_grid (bool)

  • num_collapse (int | str)

multipers.ml.signed_measures.batch_signed_measure_convolutions(signed_measures, x, bandwidth, kernel, api=None)

Input

  • signed_measures: unragged, of shape (num_data, num_pts, D+1) where last coord is weights, (0 for dummy points)

  • x : the points to convolve (num_x,D)

  • bandwidth : the bandwidths or covariance matrix inverse or … of the kernel

  • kernel : “gaussian”, “multivariate_gaussian”, “exponential”, or Callable (x_i, y_i, bandwidth)->float

Output

Array of shape (num_convolutions, (num_axis), num_data, max_x_size)

Parameters:

kernel (Literal['gaussian', 'exponential', 'exponential_kernel', 'multivariate_gaussian', 'sinc'] | ~collections.abc.Callable)
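The unragged input layout described above, (num_data, num_pts, D+1) with the weight in the last coordinate and zero weight for dummy points, can be sketched as follows. Both helpers are hypothetical and use an unnormalized Gaussian as an assumed kernel; they only illustrate why zero-weight padding leaves the result unchanged.

```python
import numpy as np

def unrag(measures):
    """Pad (points, weights) pairs with zero-weight dummy points."""
    n = max(len(w) for _, w in measures)
    D = measures[0][0].shape[1]
    out = np.zeros((len(measures), n, D + 1))
    for k, (p, w) in enumerate(measures):
        out[k, : len(w), :-1] = p
        out[k, : len(w), -1] = w
    return out

def batch_convolutions(signed_measures, x, bandwidth):
    """(num_data, num_pts, D+1) and (num_x, D) -> (num_data, num_x)."""
    pts = signed_measures[..., :-1]                    # (num_data, num_pts, D)
    w = signed_measures[..., -1]                       # (num_data, num_pts)
    diff = x[None, :, None, :] - pts[:, None, :, :]    # (num_data, num_x, num_pts, D)
    kernel = np.exp(-(diff**2).sum(-1) / (2.0 * bandwidth**2))
    # Dummy points have weight 0, so they contribute nothing to the sum.
    return (kernel * w[:, None, :]).sum(-1)

measures = [
    (np.array([[0.0, 0.0]]), np.array([1.0])),
    (np.array([[0.0, 0.0], [1.0, 1.0]]), np.array([1.0, -1.0])),
]
x = np.array([[0.0, 0.0], [1.0, 1.0]])
out = batch_convolutions(unrag(measures), x, bandwidth=1.0)
```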

multipers.ml.signed_measures.deep_unrag(sms, api=None)
multipers.ml.signed_measures.rescale_sparse_signed_measure(signed_measure, filtration_weights, normalize_scales=None)
multipers.ml.signed_measures.sm2deep(signed_measure, api=None)
multipers.ml.signed_measures.sm_convolution(sms, grid, bandwidth, kernel='gaussian', plot=False, **plt_kwargs)
Parameters:
  • kernel (Literal['gaussian', 'exponential', 'exponential_kernel', 'multivariate_gaussian', 'sinc'] | ~collections.abc.Callable)

  • plot (bool)

multipers.ml.signed_measures.tensor_möbius_inversion(tensor, grid_conversion=None, plot=False, raw=False, num_parameters=None)
Parameters:
  • grid_conversion (Iterable[ndarray] | None)

  • plot (bool)

  • raw (bool)

  • num_parameters (int | None)

multipers.ml.sliced_wasserstein module

class multipers.ml.sliced_wasserstein.SlicedWassersteinDistance(num_directions=10, scales=None, n_jobs=None)

Bases: BaseEstimator, TransformerMixin

This is a class for computing the sliced Wasserstein distance matrix from a list of signed measures. The Sliced Wasserstein distance is computed by projecting the signed measures onto lines, comparing the projections with the 1-norm, and finally integrating over all possible lines. See http://proceedings.mlr.press/v70/carriere17a.html for more details.

_sklearn_auto_wrap_output_keys = {'transform'}
fit(X, y=None)

Fit the SlicedWassersteinDistance class on a list of signed measures: signed measures are projected onto the different lines. The measures themselves are then stored in numpy arrays, called measures_.

Parameters:

  • X (list of tuples): input signed measures.

  • y (n x 1 array): signed measure labels (unused).

transform(X)

Compute all sliced Wasserstein distances between the signed measures that were stored after calling the fit() method, and a given list of (possibly different) signed measures.

Parameters:

X (list of tuples): input signed measures.

Returns:

numpy array of shape (number of measures in measures_) x (number of measures in X): matrix of pairwise sliced Wasserstein distances.

class multipers.ml.sliced_wasserstein.WassersteinDistance(epsilon=1.0, ground_norm=1, n_jobs=None)

Bases: BaseEstimator, TransformerMixin

This is a class for computing the Wasserstein distance matrix from a list of signed measures.

_sklearn_auto_wrap_output_keys = {'transform'}
fit(X, y=None)

Fit the WassersteinDistance class on a list of signed measures. The measures themselves are then stored in numpy arrays, called measures_.

Parameters:

  • X (list of tuples): input signed measures.

  • y (n x 1 array): signed measure labels (unused).

transform(X)

Compute all Wasserstein distances between the signed measures that were stored after calling the fit() method, and a given list of (possibly different) signed measures.

Parameters:

X (list of tuples): input signed measures.

Returns:

numpy array of shape (number of measures in measures_) x (number of measures in X): matrix of pairwise Wasserstein distances.

multipers.ml.sliced_wasserstein._compute_signed_measure_parts(X)

This is a function for separating the positive and negative points of a list of signed measures. This function can be used as a preprocessing step in order to speed up the running time for computing all pairwise (sliced) Wasserstein distances on a list of signed measures.

Parameters:

X (list of n tuples): list of signed measures.

Returns:

list of n pairs of numpy arrays of shape (num x dimension): list of positive and negative signed measures.
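The separation into positive and negative parts can be sketched as below. Repeating each point according to the absolute value of its integer weight is an assumption about how multiplicities are handled, and split_signed_measure is a hypothetical name rather than the library function.

```python
import numpy as np

def split_signed_measure(points, weights):
    """Hypothetical helper: (points, weights) -> (positive points, negative points)."""
    points = np.asarray(points, dtype=float)
    weights = np.asarray(weights, dtype=int)
    # Repeat each point |weight| times, on the side given by the weight's sign.
    pos = np.repeat(points, np.clip(weights, 0, None), axis=0)
    neg = np.repeat(points, np.clip(-weights, 0, None), axis=0)
    return pos, neg

pos, neg = split_signed_measure(
    points=[[0.0, 0.0], [1.0, 2.0], [3.0, 1.0]],
    weights=[2, -1, 0],
)
```

In this example the first point appears twice in pos, the second once in neg, and the zero-weight point is dropped.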

multipers.ml.sliced_wasserstein._compute_signed_measure_projections(X, num_directions, scales)

This is a function for projecting the points of a list of signed measures onto a fixed number of lines sampled uniformly. This function can be used as a preprocessing step in order to speed up the running time for computing all pairwise sliced Wasserstein distances on a list of signed measures.

Parameters:

  • X (list of n tuples): list of signed measures.

  • num_directions (int): number of lines evenly sampled from [-pi/2,pi/2] in order to approximate and speed up the distance computation.

  • scales (array of shape D): scales associated to the dimensions.

Returns:

list of n pairs of numpy arrays of shape (num x num_directions): list of positive and negative projected signed measures.

multipers.ml.sliced_wasserstein._pairwise(fallback, skipdiag, X, Y, metric, n_jobs)
multipers.ml.sliced_wasserstein._sklearn_wrapper(metric, X, Y, **kwargs)

This function is a wrapper for any metric between two signed measures that takes two numpy arrays of shapes (nxD) and (mxD) as arguments.

multipers.ml.sliced_wasserstein._sliced_wasserstein_distance(meas1, meas2, num_directions, scales=None)

This is a function for computing the sliced Wasserstein distance from two signed measures. The Sliced Wasserstein distance is computed by projecting the signed measures onto lines, comparing the projections with the 1-norm, and finally averaging over the lines. See http://proceedings.mlr.press/v70/carriere17a.html for more details.

Parameters:

  • meas1: ((n x D), (n)) tuple with numpy.array encoding the (finite points of the) first measure and their multiplicities. Must not contain essential points (i.e. with infinite coordinate).

  • meas2: ((m x D), (m)) tuple encoding the second measure.

  • num_directions (int): number of lines evenly sampled from [-pi/2,pi/2] in order to approximate and speed up the distance computation.

  • scales (array of shape D): scales associated to the dimensions.

Returns:

float: the sliced Wasserstein distance between signed measures.

multipers.ml.sliced_wasserstein._sliced_wasserstein_distance_on_projections(meas1, meas2, scales=None)

This is a function for computing the sliced Wasserstein distance between two signed measures that have already been projected onto some lines. It simply amounts to comparing the sorted projections with the 1-norm, and averaging over the lines. See http://proceedings.mlr.press/v70/carriere17a.html for more details.

Parameters:

  • meas1: pair of (n x number_of_lines) numpy.arrays containing the projected points of the positive and negative parts of the first measure.

  • meas2: pair of (m x number_of_lines) numpy.arrays containing the projected points of the positive and negative parts of the second measure.

  • scales (array of shape D): scales associated to the dimensions.

Returns:

float: the sliced Wasserstein distance between the projected signed measures.
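The description above (project, sort, compare with the 1-norm, average over lines) can be sketched for 2-parameter measures already split into positive and negative parts. The sketch assumes both measures have the same total mass, compares pos1 together with neg2 against pos2 together with neg1, and ignores the scales argument. The function names are hypothetical and this is not the library's implementation.

```python
import numpy as np

def project(parts, num_directions):
    """Project the (positive, negative) point sets onto evenly sampled lines."""
    thetas = np.linspace(-np.pi / 2, np.pi / 2, num_directions)
    lines = np.stack([np.cos(thetas), np.sin(thetas)], axis=0)  # (2, num_directions)
    pos, neg = parts
    return pos @ lines, neg @ lines

def sliced_wasserstein_on_projections(proj1, proj2):
    """Sort the projections, take the 1-norm per line, average over lines."""
    pos1, neg1 = proj1
    pos2, neg2 = proj2
    a = np.sort(np.concatenate([pos1, neg2], axis=0), axis=0)
    b = np.sort(np.concatenate([pos2, neg1], axis=0), axis=0)
    if a.shape != b.shape:
        raise ValueError("the two measures must have the same total mass")
    return float(np.mean(np.sum(np.abs(a - b), axis=0)))

# Two toy measures of total mass 1, given as (positive points, negative points).
m1 = (np.array([[0.0, 0.0], [1.0, 1.0]]), np.array([[0.0, 1.0]]))
m2 = (np.array([[1.0, 0.0]]), np.empty((0, 2)))
p1, p2 = project(m1, 10), project(m2, 10)
d12 = sliced_wasserstein_on_projections(p1, p2)
d21 = sliced_wasserstein_on_projections(p2, p1)
d11 = sliced_wasserstein_on_projections(p1, p1)
```

Projecting once and reusing the projections for every pair is what makes the precomputation step worthwhile when a full distance matrix is needed.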

multipers.ml.sliced_wasserstein._wasserstein_distance(meas1, meas2, epsilon, ground_norm)

This is a function for computing the Wasserstein distance from two signed measures.

Parameters:

  • meas1: ((n x D), (n)) tuple with numpy.array encoding the (finite points of the) first measure and their multiplicities. Must not contain essential points (i.e. with infinite coordinate).

  • meas2: ((m x D), (m)) tuple encoding the second measure.

  • epsilon (float): entropy regularization parameter.

  • ground_norm (int): norm to use for ground metric cost.

Returns:

float: the Wasserstein distance between signed measures.

multipers.ml.sliced_wasserstein._wasserstein_distance_on_parts(ground_norm=1, epsilon=1.0)

This is a function for computing the Wasserstein distance between two signed measures that have already been separated into their positive and negative parts.

Parameters:

  • meas1: pair of (n x dimension) numpy.arrays containing the points of the positive and negative parts of the first measure.

  • meas2: pair of (m x dimension) numpy.arrays containing the points of the positive and negative parts of the second measure.

Returns:

float: the Wasserstein distance between the signed measures.

multipers.ml.sliced_wasserstein.pairwise_signed_measure_distances(X, Y=None, metric='sliced_wasserstein', n_jobs=None, **kwargs)

This function computes the distance matrix between two lists of signed measures given as numpy arrays of shape (nxD).

Parameters:

  • X (list of n tuples): first list of signed measures.

  • Y (list of m tuples): second list of signed measures (optional). If None, pairwise distances are computed from the first list only.

  • metric: distance to use. It can be either a string ("sliced_wasserstein", "wasserstein") or a function taking two tuples as inputs. If it is a function, make sure that it is symmetric and that it outputs 0 if called on the same two tuples.

  • n_jobs (int): number of jobs to use for the computation. This uses joblib.Parallel(prefer="threads"), so metrics that do not release the GIL may not scale unless run inside a joblib.parallel_backend block.

  • **kwargs: optional keyword parameters. Any further parameters are passed directly to the distance function. See the docs of the various distance classes in this module.

Returns:

numpy array of shape (nxm): distance matrix
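The requirement that a custom metric be symmetric and vanish on identical inputs is easier to see with a sketch of how a distance matrix can be filled when Y is None: only the upper triangle is evaluated, then mirrored, and the diagonal is left at 0. This is an assumed strategy shown for illustration, not the library's code, and both pairwise_distances and the toy metric total_mass_gap are hypothetical.

```python
import numpy as np

def pairwise_distances(X, metric):
    """Fill a symmetric matrix from the upper triangle only."""
    n = len(X)
    out = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            # Relies on metric(a, b) == metric(b, a) and metric(a, a) == 0.
            out[i, j] = out[j, i] = metric(X[i], X[j])
    return out

def total_mass_gap(meas1, meas2):
    """Toy symmetric metric on (points, weights) tuples."""
    return abs(float(np.sum(meas1[1])) - float(np.sum(meas2[1])))

X = [
    (np.zeros((2, 2)), np.array([1, -1])),    # total mass 0
    (np.zeros((1, 2)), np.array([2])),        # total mass 2
    (np.zeros((3, 2)), np.array([1, 1, 1])),  # total mass 3
]
dist = pairwise_distances(X, total_mass_gap)
```

A metric that is not symmetric, or not zero on equal inputs, would produce a matrix that silently disagrees with direct evaluation.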

multipers.ml.tools module

class multipers.ml.tools.SimplexTreeEdgeCollapser(num_collapses=0, full=False, max_dimension=None, n_jobs=1)

Bases: BaseEstimator, TransformerMixin

Parameters:
  • num_collapses (int)

  • full (bool)

  • max_dimension (int | None)

  • n_jobs (int)

_sklearn_auto_wrap_output_keys = {'transform'}
fit(X, y=None)
Parameters:

X (ndarray | list)

transform(X)
multipers.ml.tools.filtration_grid_to_coordinates(F, return_resolution)
multipers.ml.tools.get_filtration_weights_grid(num_parameters=2, resolution=3, *, min=0, max=20, dtype=<class 'float'>, remove_homothetie=True, weights=None)
Provides a grid of weights, for filtration rescaling.
  • num parameter : the dimension of the grid tensor

  • resolution : the size of each coordinate

  • min : minimum weight

  • max : maximum weight

  • weights : custom weights (instead of linspace between min and max)

  • dtype : the type of the grid values (useful for int weights)

Parameters:
  • num_parameters (int)

  • resolution (int | Iterable[int])

  • min (float)

  • max (float)

  • remove_homothetie (bool)

multipers.ml.tools.get_simplex_tree_from_delayed(x)
Return type:

SimplexTreeMulti

multipers.ml.tools.get_simplextree(x)
Return type:

SimplexTreeMulti

Module contents