multipers.ml package

Submodules

multipers.ml.accuracies module

multipers.ml.convolutions module

class multipers.ml.convolutions.DTM(masses, metric='euclidean', **_kdtree_kwargs)

Bases: object

Distance To Measure

Parameters:

metric (str)

fit(X, sample_weights=None, y=None)
score_samples(Y, X=None)

Returns the distance-to-measure (DTM) estimates of each point in Y.

Parameters

Y : tensor (m, d)

m points with d dimensions for which the distance to measure will be calculated

Returns

the DTMs of Y, for each mass in masses.

score_samples_diff(Y)

Returns the distance-to-measure estimates of each point in Y (differentiable version).

Parameters

Y : tensor (m, d)

m points with d dimensions for which the distance to measure will be calculated

X : tensor (n, d), optional

n points with d dimensions to which the estimator was fit. Provided to allow batch calculations. By default, X is None and all the points passed to fit are used.

Returns

tensor (m)

the DTMs for each of the queried points in Y
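
Example

A minimal usage sketch (not from the package documentation; shapes follow the signatures above and the point clouds are synthetic):

    import numpy as np
    from multipers.ml.convolutions import DTM

    X = np.random.uniform(size=(200, 2))   # reference point cloud
    Y = np.random.uniform(size=(50, 2))    # query points

    # One distance-to-measure value per query point and per requested mass.
    dtm = DTM(masses=[0.1, 0.2])
    dtm.fit(X)
    values = dtm.score_samples(Y)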

class multipers.ml.convolutions.KDE(bandwidth=1, kernel='gaussian', return_log=False)

Bases: object

Fast, scikit-learn-style, and differentiable kernel density estimation, using PyKeOps (see the usage sketch below this class).

Parameters:
  • bandwidth (Any)

  • kernel (Literal['gaussian', 'exponential', 'exponential_kernel', 'multivariate_gaussian', 'sinc'] | ~collections.abc.Callable)

  • return_log (bool)

fit(X, sample_weights=None, y=None)
score_samples(Y, X=None, return_kernel=False)

Returns the kernel density estimates of each point in Y.

Parameters

Y : tensor (m, d)

m points with d dimensions for which the probability density will be calculated

X : tensor (n, d), optional

n points with d dimensions to which KDE will be fit. Provided to allow batch calculations in log_prob. By default, X is None and all points used to initialize KernelDensityEstimator are included.

Returns

log_probs : tensor (m)

log probability densities for each of the queried points in Y

static to_lazy(X, Y, x_weights)
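
Example

A minimal usage sketch (assumes PyKeOps is installed; the data is synthetic):

    import numpy as np
    from multipers.ml.convolutions import KDE

    X = np.random.normal(size=(500, 2))    # sample defining the density
    Y = np.random.normal(size=(100, 2))    # evaluation points

    kde = KDE(bandwidth=0.3, kernel="gaussian", return_log=True)
    kde.fit(X)
    log_density = kde.score_samples(Y)     # shape (100,): log-densities at the points of Y
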
multipers.ml.convolutions._kernel(kernel='gaussian')
Parameters:

kernel (Literal['gaussian', 'exponential', 'exponential_kernel', 'multivariate_gaussian', 'sinc'] | ~collections.abc.Callable)

multipers.ml.convolutions._pts_convolution_pykeops(pts, pts_weights, grid_iterator, kernel='gaussian', bandwidth=0.1, **more_kde_args)

Pykeops convolution

Parameters:
  • pts (ndarray)

  • pts_weights (ndarray)

  • kernel (Literal['gaussian', 'exponential', 'exponential_kernel', 'multivariate_gaussian', 'sinc'] | ~collections.abc.Callable)

multipers.ml.convolutions._pts_convolution_sparse_old(pts, pts_weights, grid_iterator, kernel='gaussian', bandwidth=0.1, **more_kde_args)

Old version of convolution_signed_measures. scikit-learn's convolution is slower than the PyKeOps implementation above.

Parameters:
  • pts (ndarray)

  • pts_weights (ndarray)

  • kernel (Literal['gaussian', 'exponential', 'exponential_kernel', 'multivariate_gaussian', 'sinc'] | ~collections.abc.Callable)

multipers.ml.convolutions.batch_signed_measure_convolutions(signed_measures, x, bandwidth, kernel)

Input

  • signed_measures: unragged array of shape (num_data, num_pts, D+1), where the last coordinate is the weight (0 for dummy points)

  • x : the points at which to evaluate the convolutions, of shape (num_x, D)

  • bandwidth : the bandwidths or covariance matrix inverse or … of the kernel

  • kernel : “gaussian”, “multivariate_gaussian”, “exponential”, or Callable (x_i, y_i, bandwidth)->float

Output

Array of shape (num_convolutions, (num_axis), num_data, max_x_size)

Parameters:

kernel (Literal['gaussian', 'exponential', 'exponential_kernel', 'multivariate_gaussian', 'sinc'] | ~collections.abc.Callable)

multipers.ml.convolutions.convolution_signed_measures(iterable_of_signed_measures, filtrations, bandwidth, flatten=True, n_jobs=1, backend='pykeops', kernel='gaussian', **kwargs)

Evaluates the convolution of the signed measures Iterable[(pts, weights)] with a kernel (Gaussian by default) of bandwidth bandwidth, on the grid given by the filtrations (see the sketch after the parameter list).

Parameters

  • iterable_of_signed_measures : (num_signed_measure) x [ (npts) x (num_parameters), (npts)]

  • filtrations : (num_parameter) x (filtration values)

  • flatten : bool

  • n_jobs : int

Outputs

The concatenated images, for each signed measure (num_signed_measures) x (len(f) for f in filtration_values)

Parameters:
  • flatten (bool)

  • n_jobs (int)

  • kernel (Literal['gaussian', 'exponential', 'exponential_kernel', 'multivariate_gaussian', 'sinc'] | ~collections.abc.Callable)
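
Example

A sketch of the call (hedged: the signed measure and the grid are synthetic, and the nesting of iterable_of_signed_measures follows the parameter description above):

    import numpy as np
    from multipers.ml.convolutions import convolution_signed_measures

    # One toy signed measure in 2 parameters: 30 points with +/-1 weights.
    pts = np.random.uniform(size=(30, 2))
    weights = np.random.choice([-1, 1], size=30)
    signed_measures = [(pts, weights)]

    # Evaluation grid: one array of filtration values per parameter.
    filtrations = [np.linspace(0, 1, 50), np.linspace(0, 1, 50)]

    imgs = convolution_signed_measures(signed_measures, filtrations, bandwidth=0.1, flatten=True)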

multipers.ml.convolutions.exponential_kernel(x_i, y_j, bandwidth)
multipers.ml.convolutions.gaussian_kernel(x_i, y_j, bandwidth)
multipers.ml.convolutions.multivariate_gaussian_kernel(x_i, y_j, covariance_matrix_inverse)
multipers.ml.convolutions.sinc_kernel(x_i, y_j, bandwidth)

multipers.ml.invariants_with_persistable module

multipers.ml.kernels module

class multipers.ml.kernels.DistanceList2DistanceMatrix

Bases: BaseEstimator, TransformerMixin

_sklearn_auto_wrap_output_keys = {'transform'}
fit(X, y=None)
transform(X)
class multipers.ml.kernels.DistanceMatrices2DistancesList

Bases: BaseEstimator, TransformerMixin

Input: (degree) x (distance matrix) or (axis) x (degree) x (distance matrix D)

Output: (D1) x opt (axis) x (degree) x (D2, with indices first)

_sklearn_auto_wrap_output_keys = {'transform'}
fit(X, y=None)
predict(X)
transform(X)
class multipers.ml.kernels.DistanceMatrix2DistanceList

Bases: BaseEstimator, TransformerMixin

_sklearn_auto_wrap_output_keys = {'transform'}
fit(X, y=None)
transform(X)
class multipers.ml.kernels.DistanceMatrix2Kernel(sigma=1, axis=None, weights=1)

Bases: BaseEstimator, TransformerMixin

Input: (degree) x (distance matrix) or (axis) x (degree) x (distance matrix). In the second case, axis HAS to be specified (meant for cross-validation).

Output: kernel of the same shape as the distance matrix.

Parameters:
  • sigma (float | Iterable[float])

  • axis (int | None)

  • weights (Iterable[float] | float)

_sklearn_auto_wrap_output_keys = {'transform'}
fit(X, y=None)
transform(X)
Return type:

ndarray
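
Example

A hedged sketch of the call pattern (the distance matrices are synthetic; per the docstring, the output keeps the shape of the input):

    import numpy as np
    from multipers.ml.kernels import DistanceMatrix2Kernel

    # Two degrees, each with a symmetric 20 x 20 distance matrix.
    D = np.abs(np.random.normal(size=(2, 20, 20)))
    D = (D + D.transpose(0, 2, 1)) / 2
    for d in D:
        np.fill_diagonal(d, 0.0)

    kernels = DistanceMatrix2Kernel(sigma=1.0).fit_transform(D)
    # kernels has the same shape as D: one kernel matrix per degree.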

class multipers.ml.kernels.DistancesLists2DistanceMatrices

Bases: BaseEstimator, TransformerMixin

Input: (D1) x opt (axis) x (degree) x (D2, with indices first)

Output: opt (axis) x (degree) x (distance matrix (D1, D2))

_sklearn_auto_wrap_output_keys = {'transform'}
fit(X, y=None)
Parameters:

X (ndarray)

transform(X)

multipers.ml.mma module

class multipers.ml.mma.FilteredComplex2MMA(n_jobs=-1, expand_dim=None, prune_degrees_above=None, progress=False, minpres_degrees=None, plot=False, **persistence_kwargs)

Bases: BaseEstimator, TransformerMixin

Turns a list of lists of simplextrees or slicers into MMA approximations.

Parameters:
  • n_jobs (int)

  • expand_dim (int | None)

  • prune_degrees_above (int | None)

  • minpres_degrees (Iterable[int] | None)

  • plot (bool)

_infer_bounding_box(X)
_input_checks(X)
static _is_filtered_complex(input)
_sklearn_auto_wrap_output_keys = {'transform'}
fit(X, y=None)
transform(X)
class multipers.ml.mma.MMA2IMG(degrees, bandwidth=0.1, power=1, normalize=False, resolution=50, plot=False, box=None, n_jobs=-1, flatten=False, progress=False, grid_strategy='regular', kernel='linear', signed=False)

Bases: BaseEstimator, TransformerMixin

Parameters:
  • degrees (list)

  • bandwidth (float)

  • power (float)

  • normalize (bool)

  • resolution (list | int)

  • plot (bool)

  • signed (bool)

_sklearn_auto_wrap_output_keys = {'transform'}
fit(X, y=None)
transform(X)
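
Example

A pipeline sketch combining this module with multipers.ml.point_clouds (hedged: the parameter values are illustrative and the chaining assumes the default output format of each step):

    import numpy as np
    from sklearn.pipeline import Pipeline
    from multipers.ml.point_clouds import PointCloud2FilteredComplex
    from multipers.ml.mma import FilteredComplex2MMA, MMA2IMG

    point_clouds = [np.random.uniform(size=(100, 2)) for _ in range(5)]

    pipeline = Pipeline([
        ("complex", PointCloud2FilteredComplex(bandwidths=[0.2], complex="rips")),
        ("mma", FilteredComplex2MMA()),
        ("img", MMA2IMG(degrees=[0, 1], resolution=25, flatten=True)),
    ])
    images = pipeline.fit_transform(point_clouds)   # one flattened image vector per point cloud
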
class multipers.ml.mma.MMA2Landscape(resolution=[100, 100], degrees=[0, 1], ks=range(0, 5), phi=<function sum>, box=None, plot=False, n_jobs=-1, filtration_quantile=0.01)

Bases: BaseEstimator, TransformerMixin

Turns a list of MMA approximations into landscape vectorisations.

Parameters:
  • degrees (list[int] | None)

  • ks (Iterable[int])

  • phi (Callable)

  • plot (bool)

  • filtration_quantile (float)

_sklearn_auto_wrap_output_keys = {'transform'}
fit(X, y=None)
transform(X)
Return type:

list[ndarray]

class multipers.ml.mma.MMAFormatter(degrees=None, axis=None, verbose=False, normalize=False, weights=None, quantiles=None, dump=False, from_dump=False)

Bases: BaseEstimator, TransformerMixin

Parameters:
  • degrees (list[int] | None)

  • verbose (bool)

  • normalize (bool)

static _get_module_bound(x, degree)

Output format : (2,num_parameters)

static _infer_axis(X)
static _infer_bounds(X, degrees=None, axis=[slice(None, None, None)], quantiles=None)

Compute bounds of filtration values of a list of modules.

Output Format

m,M of shape : (num_axis,num_degrees,2,num_parameters)

_infer_degrees(X)
static _infer_grid(X, strategy, resolution, degrees=None)

Given a list of PyModules, computes a multiparameter discrete grid, with a given strategy, from the filtration values of the summands of the modules.

Parameters:
  • X (List[PyModule_i32 | PyModule_i64 | PyModule_f32 | PyModule_f64])

  • strategy (str)

  • resolution (int)

static _infer_num_parameters(X, ax=slice(None, None, None))
static _maybe_from_dump(X_in)
_sklearn_auto_wrap_output_keys = {'transform'}
static copy_transform(mod, degrees, translation, rescale_factors, new_box)
fit(X_in, y=None)
set_fit_request(*, X_in='$UNCHANGED$')

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters

X_in : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for X_in parameter in fit.

Returns

self : object

The updated object.

Return type:

MMAFormatter

set_transform_request(*, X_in='$UNCHANGED$')

Request metadata passed to the transform method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to transform.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters

X_in : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for X_in parameter in transform.

Returns

self : object

The updated object.

Return type:

MMAFormatter

transform(X_in)
class multipers.ml.mma.SimplexTree2MMA(n_jobs=-1, expand_dim=None, prune_degrees_above=None, progress=False, minpres_degrees=None, **persistence_kwargs)

Bases: FilteredComplex2MMA

Parameters:
  • n_jobs (int)

  • expand_dim (int | None)

  • prune_degrees_above (int | None)

  • minpres_degrees (Iterable[int] | None)

_sklearn_auto_wrap_output_keys = {'transform'}

multipers.ml.one module

multipers.ml.point_clouds module

class multipers.ml.point_clouds.PointCloud2FilteredComplex(bandwidths=[], masses=[], threshold=-inf, complex='rips', sparse=None, num_collapses=-2, kernel='gaussian', log_density=True, expand_dim=1, progress=False, n_jobs=None, fit_fraction=1, verbose=False, safe_conversion=False, output_type=None, reduce_degrees=None)

Bases: BaseEstimator, TransformerMixin

Parameters:
  • threshold (float)

  • complex (Literal['alpha', 'rips', 'delaunay'])

  • sparse (float | None)

  • num_collapses (int)

  • kernel (Literal['gaussian', 'exponential', 'exponential_kernel', 'multivariate_gaussian', 'sinc'] | ~collections.abc.Callable)

  • log_density (bool)

  • expand_dim (int)

  • progress (bool)

  • n_jobs (int | None)

  • fit_fraction (float)

  • verbose (bool)

  • safe_conversion (bool)

  • output_type (Literal['slicer', 'simplextree', 'slicer_vine', 'slicer_novine'] | None)

  • reduce_degrees (Iterable[int] | None)

_define_bandwidths(X)
_define_sts()
_get_codensities(x_fit, x_sample)
_get_distance_quantiles_and_threshold(X, qs)
_get_sts_alpha(x, return_alpha=False)
Parameters:

x (ndarray)

_get_sts_delaunay(x)
Parameters:

x (ndarray)

_get_sts_rips(x)
_sklearn_auto_wrap_output_keys = {'transform'}
fit(X, y=None)
Parameters:

X (ndarray | list)

transform(X)
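
Example

A minimal sketch (hedged: the parameter values are illustrative and the data is synthetic):

    import numpy as np
    from multipers.ml.point_clouds import PointCloud2FilteredComplex

    point_clouds = [np.random.uniform(size=(100, 2)) for _ in range(3)]

    transformer = PointCloud2FilteredComplex(bandwidths=[0.2], complex="rips", expand_dim=2)
    filtered_complexes = transformer.fit_transform(point_clouds)
    # One list of filtered complexes per input point cloud
    # (the exact nesting may depend on bandwidths / masses).
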
class multipers.ml.point_clouds.PointCloud2SimplexTree(bandwidths=[], masses=[], threshold=inf, complex='rips', sparse=None, num_collapses=-2, kernel='gaussian', log_density=True, expand_dim=1, progress=False, n_jobs=None, fit_fraction=1, verbose=False, safe_conversion=False, output_type=None, reduce_degrees=None)

Bases: PointCloud2FilteredComplex

Parameters:
  • threshold (float)

  • complex (Literal['alpha', 'rips', 'delaunay'])

  • sparse (float | None)

  • num_collapses (int)

  • kernel (Literal['gaussian', 'exponential', 'exponential_kernel', 'multivariate_gaussian', 'sinc'] | ~collections.abc.Callable)

  • log_density (bool)

  • expand_dim (int)

  • progress (bool)

  • n_jobs (int | None)

  • fit_fraction (float)

  • verbose (bool)

  • safe_conversion (bool)

  • output_type (Literal['slicer', 'simplextree', 'slicer_vine', 'slicer_novine'] | None)

  • reduce_degrees (Iterable[int] | None)

_sklearn_auto_wrap_output_keys = {'transform'}

multipers.ml.signed_betti module

multipers.ml.signed_betti.rank_decomposition_by_rectangles(rank_invariant, threshold=False)
multipers.ml.signed_betti.signed_betti(hilbert_function, threshold=False, sparse=False)
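
Example

A hedged sketch of calling signed_betti on a Hilbert function sampled on a discrete 2-parameter grid (the input array is synthetic; the function is understood here as the Möbius inversion used by the signed-measure pipelines below):

    import numpy as np
    from multipers.ml.signed_betti import signed_betti

    # Hilbert function values on a 4 x 4 grid (toy, staircase-like data).
    hilbert = np.array([
        [0, 1, 1, 1],
        [0, 1, 2, 2],
        [0, 1, 2, 2],
        [0, 1, 2, 3],
    ])
    sb = signed_betti(hilbert)
    # sb has the same shape; its lower-left cumulative sums recover the Hilbert function.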

multipers.ml.signed_measures module

class multipers.ml.signed_measures.DegreeRips2SignedMeasure(degrees, min_rips_value, max_rips_value, max_normalized_degree, min_normalized_degree, grid_granularity, progress=False, n_jobs=1, sparse=False, _möbius_inversion=True, fit_fraction=1)

Bases: BaseEstimator, TransformerMixin

Parameters:
  • degrees (Iterable[int])

  • min_rips_value (float)

  • max_normalized_degree (float)

  • min_normalized_degree (float)

  • grid_granularity (int)

  • progress (bool)

  • sparse (bool)

_sklearn_auto_wrap_output_keys = {'transform'}
_transform1(data)
Parameters:

data (ndarray)

fit(X, y=None)
Parameters:

X (ndarray | list)

transform(X)
class multipers.ml.signed_measures.FilteredComplex2SignedMeasure(degrees=[], rank_degrees=[], filtration_grid=None, progress=False, num_collapses=0, n_jobs=None, resolution=None, plot=False, filtration_quantile=0.0, expand=False, normalize_filtrations=False, grid_strategy='exact', seed=0, fit_fraction=1, out_resolution=None, individual_grid=None, enforce_null_mass=False, flatten=True, backend=None)

Bases: BaseEstimator, TransformerMixin

Input

Iterable[SimplexTreeMulti]

Output

Iterable[ list[signed_measure for degree] ]

A signed measure is either
  • (points : (n x num_parameters) array, weights : (n) int array ) if sparse,

  • else an integer matrix.

Parameters

  • degrees : list of degrees to compute. None corresponds to the Euler characteristic.

  • filtration_grid : the grid on which to compute. If None, the fit will infer it from the data, using the parameters below.

  • fit_fraction : the fraction of the data to consider for the fit; the randomness is controlled by the seed parameter.

  • resolution : the resolution of this grid.

  • filtration_quantile : the quantile of filtration values to ignore.

  • grid_strategy : str : 'regular', 'quantile' or 'exact'.

  • normalize_filtrations : if sparse, normalizes all filtrations.

  • expand : expands the simplextree so that the degree is computed correctly, for flag complexes.

  • invariant : the topological invariant used to produce the signed measure. Choices are "hilbert" or "euler". The rank invariant will be added later.

  • num_collapses : either an int or "full". Collapses the complex before the computation.

  • _möbius_inversion : if False, the Möbius inversion is not applied; the output then has to be a matrix.

  • enforce_null_mass : if True, returns a zero-mass measure by thresholding the module.

_infer_filtration(X)
_input_checks(X)
static _is_filtered_complex(input)
_params_check()
_sklearn_auto_wrap_output_keys = {'transform'}
fit(X, y=None)
transform(X)
transform1(simplextree, ax, thread_id='')
Parameters:

thread_id (str)

Parameters:
  • degrees (list[int | None])

  • rank_degrees (list[int])

  • filtration_grid (Sequence[Sequence[ndarray]] | None)

  • num_collapses (int | str)

  • resolution (Iterable[int] | int | None)

  • plot (bool)

  • filtration_quantile (float)

  • normalize_filtrations (bool)

  • grid_strategy (str)

  • seed (int)

  • out_resolution (Iterable[int] | int | None)

  • individual_grid (bool | None)

  • enforce_null_mass (bool)

  • backend (str | None)
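
Example

A hedged sketch producing signed measures from point clouds (the intermediate complexes come from multipers.ml.point_clouds; parameter values are illustrative):

    import numpy as np
    from multipers.ml.point_clouds import PointCloud2FilteredComplex
    from multipers.ml.signed_measures import FilteredComplex2SignedMeasure

    point_clouds = [np.random.uniform(size=(100, 2)) for _ in range(3)]
    complexes = PointCloud2FilteredComplex(bandwidths=[0.2], complex="rips").fit_transform(point_clouds)

    sm_transformer = FilteredComplex2SignedMeasure(degrees=[0, 1], grid_strategy="regular", resolution=100)
    signed_measures = sm_transformer.fit_transform(complexes)
    # Per the Output description above: for each input, one signed measure (points, weights) per degree.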

class multipers.ml.signed_measures.SignedMeasure2Convolution(filtration_grid=None, kernel='gaussian', bandwidth=1.0, flatten=False, n_jobs=1, resolution=None, grid_strategy='regular', progress=False, backend='pykeops', plot=False, log_density=False, **kde_kwargs)

Bases: BaseEstimator, TransformerMixin

Discrete convolution of a signed measure

Input

(data) x (degree) x (signed measure)

Parameters

  • filtration_grid : Iterable[array]. For each filtration parameter, the filtration values defining the evaluation grid.

  • resolution : int or (num_parameters). If filtration_grid is not given, a grid with this resolution is inferred.

  • grid_strategy : the strategy used to generate the grid. Available ones are regular, quantile, exact.

  • flatten : if True, the output images are flattened.

  • kernel : the kernel used to convolve the images.

  • progress : progress bar if True

  • backend : sklearn, pykeops or numba.

  • plot : Creates a plot Figure.

Output

(data) x (concatenation of imgs of degree)

_plot_imgs(imgs, size=4)
Parameters:

imgs (Iterable[ndarray])

_sklearn_auto_wrap_output_keys = {'transform'}
_sm2smi(signed_measures)
Parameters:

signed_measures (Iterable[ndarray])

_transform_from_sparse(X)
fit(X, y=None)
transform(X)
Parameters:
  • filtration_grid (Iterable[ndarray])

  • kernel (Literal['gaussian', 'exponential', 'exponential_kernel', 'multivariate_gaussian', 'sinc'] | ~collections.abc.Callable)

  • bandwidth (float | Iterable[float])

  • flatten (bool)

  • n_jobs (int)

  • resolution (int | None)

  • grid_strategy (str)

  • progress (bool)

  • backend (str)

  • plot (bool)

  • log_density (bool)
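
Example

A hedged sketch (the signed measure is synthetic, in the (data) x (degree) x (signed measure) format described above):

    import numpy as np
    from multipers.ml.signed_measures import SignedMeasure2Convolution

    pts = np.random.uniform(size=(40, 2))
    weights = np.random.choice([-1, 1], size=40)
    signed_measures = [[(pts, weights)]]            # one datum, one degree

    imgs = SignedMeasure2Convolution(
        bandwidth=0.1, resolution=50, grid_strategy="regular", flatten=True,
    ).fit_transform(signed_measures)
    # imgs[i] concatenates the convolution images of datum i over its degrees (flattened here).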

class multipers.ml.signed_measures.SignedMeasure2SlicedWassersteinDistance(n_jobs=None, num_directions=10, _sliced=True, epsilon=-1, ground_norm=1, progress=False, grid_reconversion=None, scales=None)

Bases: BaseEstimator, TransformerMixin

Transformer from signed measure to distance matrix.

Input

(data) x (degree) x (signed measure)

Format

  • a signed measure : a tuple of arrays, (point positions) : npts x (num_parameters) and (weights) : npts

  • each datum is a list of signed measures (e.g. for multiple degrees)

Output

  • (degree) x (distance matrix)

_sklearn_auto_wrap_output_keys = {'transform'}
fit(X, y=None)
predict(X)
transform(X)
Parameters:
  • num_directions (int)

  • _sliced (bool)

class multipers.ml.signed_measures.SignedMeasureFormatter(filtrations_weights=None, normalize=False, plot=False, unsparse=False, axis=-1, resolution=50, flatten=False, deep_format=False, unrag=True, n_jobs=1, verbose=False, integrate=False, grid_strategy='regular')

Bases: BaseEstimator, TransformerMixin

Input

(data) x (degree) x (signed measure) or (data) x (axis) x (degree) x (signed measure)

Iterable[list[signed_measure_matrix of degree]] or Iterable[previous].

The second form is meant to handle multiple choices of signed measure input, e.g. measures coming from a Rips + density bifiltration computed with several bandwidths. It is controlled by the axis parameter.

Output

Iterable[list[(reweighted)_sparse_signed_measure of degree]]

or (deep format)

Tensor of shape (num_axis*num_degrees, data, max_num_pts, num_parameters)

_check_axis(X)
_check_backend(X)
_check_measures(X)
_check_resolution()
static _check_sm(sm)
Return type:

bool

_check_weights()
_get_filtration_bounds(X, axis)
_infer_grids(X)
static _integrate_measure(sm, filtrations)
_plot_signed_measures(sms, size=4)
Parameters:

sms (Iterable[ndarray])

_print_stats(X)
_rescale_measures(X)
_sklearn_auto_wrap_output_keys = {'transform'}
static deep_format_measure(signed_measure)
fit(X, y=None)
transform(X)
unsparse_signed_measure(sparse_signed_measure)
Parameters:
  • filtrations_weights (Iterable[float] | None)

  • plot (bool)

  • unsparse (bool)

  • axis (int)

  • resolution (int | Iterable[int])

  • flatten (bool)

  • deep_format (bool)

  • unrag (bool)

  • n_jobs (int)

  • verbose (bool)

  • integrate (bool)

class multipers.ml.signed_measures.SignedMeasures2SlicedWassersteinDistances(progress=False, n_jobs=1, scales=None, **kwargs)

Bases: BaseEstimator, TransformerMixin

Transformer from signed measure to distance matrix.

Input

(data) x opt (axis) x (degree) x (signed measure)

Format

  • a signed measure : a tuple of arrays, (point positions) : npts x (num_parameters) and (weights) : npts

  • each datum is a list of signed measures (e.g. for multiple degrees)

Output

  • (axis) x (degree) x (distance matrix)

_sklearn_auto_wrap_output_keys = {'transform'}
fit(X, y=None)
transform(X)
Parameters:
  • n_jobs (int)

  • scales (Iterable[Iterable[float]] | None)

class multipers.ml.signed_measures.SimplexTree2RectangleDecomposition(filtration_grid, degrees, plot=False, reconvert_grid=True, num_collapses=0)

Bases: BaseEstimator, TransformerMixin

Transformer: turns 2-parameter SimplexTrees into their respective rectangle decompositions.

Parameters:
  • filtration_grid (ndarray)

  • degrees (Iterable[int])

  • num_collapses (int)

_sklearn_auto_wrap_output_keys = {'transform'}
fit(X, y=None)

TODO : infer grid from multiple simplextrees

transform(X)
Parameters:

X (Iterable[SimplexTreeMulti_KFi32 | SimplexTreeMulti_Fi32 | SimplexTreeMulti_KFi64 | SimplexTreeMulti_Fi64 | SimplexTreeMulti_KFf32 | SimplexTreeMulti_Ff32 | SimplexTreeMulti_KFf64 | SimplexTreeMulti_Ff64])

class multipers.ml.signed_measures.SimplexTree2SignedMeasure(degrees=[], rank_degrees=[], filtration_grid=None, progress=False, num_collapses=0, n_jobs=None, resolution=None, plot=False, filtration_quantile=0.0, expand=False, normalize_filtrations=False, grid_strategy='exact', seed=0, fit_fraction=1, out_resolution=None, individual_grid=None, enforce_null_mass=False, flatten=True, backend=None)

Bases: FilteredComplex2SignedMeasure

Parameters:
  • degrees (list[int | None])

  • rank_degrees (list[int])

  • filtration_grid (Sequence[Sequence[ndarray]] | None)

  • num_collapses (int | str)

  • resolution (Iterable[int] | int | None)

  • plot (bool)

  • filtration_quantile (float)

  • normalize_filtrations (bool)

  • grid_strategy (str)

  • seed (int)

  • out_resolution (Iterable[int] | int | None)

  • individual_grid (bool | None)

  • enforce_null_mass (bool)

  • backend (str | None)

_sklearn_auto_wrap_output_keys = {'transform'}
multipers.ml.signed_measures._st2ranktensor(st, filtration_grid, degree, plot, reconvert_grid, num_collapse=0)

TODO

Parameters:
  • st (SimplexTreeMulti_KFi32 | SimplexTreeMulti_Fi32 | SimplexTreeMulti_KFi64 | SimplexTreeMulti_Fi64 | SimplexTreeMulti_KFf32 | SimplexTreeMulti_Ff32 | SimplexTreeMulti_KFf64 | SimplexTreeMulti_Ff64)

  • filtration_grid (ndarray)

  • degree (int)

  • plot (bool)

  • reconvert_grid (bool)

  • num_collapse (int | str)

multipers.ml.signed_measures.rescale_sparse_signed_measure(signed_measure, filtration_weights, normalize_scales=None)
multipers.ml.signed_measures.tensor_möbius_inversion(tensor, grid_conversion=None, plot=False, raw=False, num_parameters=None)
Parameters:
  • grid_conversion (Iterable[ndarray] | None)

  • plot (bool)

  • raw (bool)

  • num_parameters (int | None)

multipers.ml.sliced_wasserstein module

class multipers.ml.sliced_wasserstein.SlicedWassersteinDistance(num_directions=10, scales=None, n_jobs=None)

Bases: BaseEstimator, TransformerMixin

This is a class for computing the sliced Wasserstein distance matrix from a list of signed measures. The Sliced Wasserstein distance is computed by projecting the signed measures onto lines, comparing the projections with the 1-norm, and finally integrating over all possible lines. See http://proceedings.mlr.press/v70/carriere17a.html for more details.

_sklearn_auto_wrap_output_keys = {'transform'}
fit(X, y=None)

Fit the SlicedWassersteinDistance class on a list of signed measures: signed measures are projected onto the different lines. The measures themselves are then stored in numpy arrays, called measures_.

Parameters:

X (list of tuples): input signed measures.

y (n x 1 array): signed measure labels (unused).

transform(X)

Compute all sliced Wasserstein distances between the signed measures that were stored after calling the fit() method, and a given list of (possibly different) signed measures.

Parameters:

X (list of tuples): input signed measures.

Returns:

numpy array of shape (number of measures in measures_) x (number of measures in X): matrix of pairwise sliced Wasserstein distances.
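
Example

A hedged sketch on synthetic signed measures (each measure is a (points, weights) tuple, as in the rest of this module):

    import numpy as np
    from multipers.ml.sliced_wasserstein import SlicedWassersteinDistance

    rng = np.random.default_rng(0)
    measures = [(rng.uniform(size=(30, 2)), rng.choice([-1, 1], size=30)) for _ in range(4)]

    swd = SlicedWassersteinDistance(num_directions=20)
    swd.fit(measures)
    D = swd.transform(measures)     # (4, 4) matrix of pairwise sliced Wasserstein distances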

class multipers.ml.sliced_wasserstein.WassersteinDistance(epsilon=1.0, ground_norm=1, n_jobs=None)

Bases: BaseEstimator, TransformerMixin

This is a class for computing the Wasserstein distance matrix from a list of signed measures.

_sklearn_auto_wrap_output_keys = {'transform'}
fit(X, y=None)

Fit the WassersteinDistance class on a list of signed measures. The measures themselves are then stored in numpy arrays, called measures_.

Parameters:

X (list of tuples): input signed measures.

y (n x 1 array): signed measure labels (unused).

transform(X)

Compute all Wasserstein distances between the signed measures that were stored after calling the fit() method, and a given list of (possibly different) signed measures.

Parameters:

X (list of tuples): input signed measures.

Returns:

numpy array of shape (number of measures in measures_) x (number of measures in X): matrix of pairwise Wasserstein distances.

multipers.ml.sliced_wasserstein._compute_signed_measure_parts(X)

This is a function for separating the positive and negative points of a list of signed measures. This function can be used as a preprocessing step in order to speed up the running time for computing all pairwise (sliced) Wasserstein distances on a list of signed measures.

Parameters:

X (list of n tuples): list of signed measures.

Returns:

list of n pairs of numpy arrays of shape (num x dimension): list of positive and negative signed measures.

multipers.ml.sliced_wasserstein._compute_signed_measure_projections(X, num_directions, scales)

This is a function for projecting the points of a list of signed measures onto a fixed number of lines sampled uniformly. This function can be used as a preprocessing step in order to speed up the running time for computing all pairwise sliced Wasserstein distances on a list of signed measures.

Parameters:

X (list of n tuples): list of signed measures.

num_directions (int): number of lines evenly sampled from [-pi/2,pi/2] in order to approximate and speed up the distance computation.

scales (array of shape D): scales associated to the dimensions.

Returns:

list of n pairs of numpy arrays of shape (num x num_directions): list of positive and negative projected signed measures.

multipers.ml.sliced_wasserstein._pairwise(fallback, skipdiag, X, Y, metric, n_jobs)
multipers.ml.sliced_wasserstein._sklearn_wrapper(metric, X, Y, **kwargs)

This function is a wrapper for any metric between two signed measures that takes two numpy arrays of shapes (nxD) and (mxD) as arguments.

multipers.ml.sliced_wasserstein._sliced_wasserstein_distance(meas1, meas2, num_directions, scales=None)

This is a function for computing the sliced Wasserstein distance from two signed measures. The Sliced Wasserstein distance is computed by projecting the signed measures onto lines, comparing the projections with the 1-norm, and finally averaging over the lines. See http://proceedings.mlr.press/v70/carriere17a.html for more details.

Parameters:

meas1: ((n x D), (n)) tuple with numpy.array encoding the (finite points of the) first measure and their multiplicities. Must not contain essential points (i.e. with infinite coordinate).

meas2: ((m x D), (m)) tuple encoding the second measure.

num_directions (int): number of lines evenly sampled from [-pi/2,pi/2] in order to approximate and speed up the distance computation.

scales (array of shape D): scales associated to the dimensions.

Returns:

float: the sliced Wasserstein distance between signed measures.

multipers.ml.sliced_wasserstein._sliced_wasserstein_distance_on_projections(meas1, meas2, scales=None)

This is a function for computing the sliced Wasserstein distance between two signed measures that have already been projected onto some lines. It simply amounts to comparing the sorted projections with the 1-norm, and averaging over the lines. See http://proceedings.mlr.press/v70/carriere17a.html for more details.

Parameters:

meas1: pair of (n x number_of_lines) numpy.arrays containing the projected points of the positive and negative parts of the first measure.

meas2: pair of (m x number_of_lines) numpy.arrays containing the projected points of the positive and negative parts of the second measure.

scales (array of shape D): scales associated to the dimensions.

Returns:

float: the sliced Wasserstein distance between the projected signed measures.

multipers.ml.sliced_wasserstein._wasserstein_distance(meas1, meas2, epsilon, ground_norm)

This is a function for computing the Wasserstein distance from two signed measures.

Parameters:

meas1: ((n x D), (n)) tuple with numpy.array encoding the (finite points of the) first measure and their multiplicities. Must not contain essential points (i.e. with infinite coordinate).

meas2: ((m x D), (m)) tuple encoding the second measure.

epsilon (float): entropy regularization parameter.

ground_norm (int): norm to use for the ground metric cost.

Returns:

float: the Wasserstein distance between signed measures.

multipers.ml.sliced_wasserstein._wasserstein_distance_on_parts(ground_norm=1, epsilon=1.0)

This is a function for computing the Wasserstein distance between two signed measures that have already been separated into their positive and negative parts.

Parameters:

meas1: pair of (n x dimension) numpy.arrays containing the points of the positive and negative parts of the first measure.

meas2: pair of (m x dimension) numpy.arrays containing the points of the positive and negative parts of the second measure.

Returns:

float: the Wasserstein distance between the two signed measures.

multipers.ml.sliced_wasserstein.pairwise_signed_measure_distances(X, Y=None, metric='sliced_wasserstein', n_jobs=None, **kwargs)

This function computes the distance matrix between two lists of signed measures, each measure given as a tuple of numpy arrays (points of shape (n x D) and weights of shape (n)).

Parameters:

X (list of n tuples): first list of signed measures.

Y (list of m tuples): second list of signed measures (optional). If None, pairwise distances are computed from the first list only.

metric: distance to use. It can be either a string ("sliced_wasserstein", "wasserstein") or a function taking two tuples as inputs. If it is a function, make sure that it is symmetric and that it outputs 0 if called on the same two tuples.

n_jobs (int): number of jobs to use for the computation. This uses joblib.Parallel(prefer="threads"), so metrics that do not release the GIL may not scale unless run inside a joblib.parallel_backend block.

**kwargs: optional keyword parameters. Any further parameters are passed directly to the distance function. See the docs of the various distance classes in this module.

Returns:

numpy array of shape (nxm): distance matrix
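
Example

A hedged sketch (synthetic measures; num_directions is forwarded to the sliced Wasserstein computation through **kwargs):

    import numpy as np
    from multipers.ml.sliced_wasserstein import pairwise_signed_measure_distances

    rng = np.random.default_rng(0)
    measures = [(rng.uniform(size=(25, 2)), rng.choice([-1, 1], size=25)) for _ in range(5)]

    D = pairwise_signed_measure_distances(measures, metric="sliced_wasserstein", num_directions=20)
    # D is a (5, 5) symmetric numpy array of pairwise distances.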

multipers.ml.tools module

class multipers.ml.tools.SimplexTreeEdgeCollapser(num_collapses=0, full=False, max_dimension=None, n_jobs=1)

Bases: BaseEstimator, TransformerMixin

Parameters:
  • num_collapses (int)

  • full (bool)

  • max_dimension (int | None)

  • n_jobs (int)

_sklearn_auto_wrap_output_keys = {'transform'}
fit(X, y=None)
Parameters:

X (ndarray | list)

transform(X)
multipers.ml.tools.filtration_grid_to_coordinates(F, return_resolution)
multipers.ml.tools.get_filtration_weights_grid(num_parameters=2, resolution=3, *, min=0, max=20, dtype=<class 'float'>, remove_homothetie=True, weights=None)
Provides a grid of weights for filtration rescaling.
  • num_parameters : the dimension of the grid tensor

  • resolution : the size of each coordinate

  • min : minimum weight

  • max : maximum weight

  • weights : custom weights (instead of linspace between min and max)

  • dtype : the type of the grid values (useful for int weights)

Parameters:
  • num_parameters (int)

  • resolution (int | Iterable[int])

  • min (float)

  • max (float)

  • remove_homothetie (bool)
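
Example

A hedged sketch (the returned container enumerates candidate weight vectors; its exact type is not specified here):

    from multipers.ml.tools import get_filtration_weights_grid

    # Candidate weight vectors for rescaling a 2-parameter filtration,
    # 3 values per parameter between 1 and 10.
    weight_grid = get_filtration_weights_grid(num_parameters=2, resolution=3, min=1, max=10)
    # Typically grid-searched when cross-validating filtration rescaling weights.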

multipers.ml.tools.get_simplex_tree_from_delayed(x)
Return type:

SimplexTreeMulti

multipers.ml.tools.get_simplextree(x)
Return type:

SimplexTreeMulti

Module contents