multipers.ml package
Submodules
multipers.ml.accuracies module
multipers.ml.convolutions module
- class multipers.ml.convolutions.DTM(masses, metric='euclidean', **_kdtree_kwargs)
Bases:
object
Distance To Measure
- Parameters:
metric (str)
- fit(X, sample_weights=None, y=None)
- score_samples(Y, X=None)
Returns the distance-to-measure estimates of each point in Y.
Parameters
- Y : tensor (m, d)
m points with d dimensions for which the distance to measure will be calculated
Returns
the DTMs of Y, for each mass in masses.
- score_samples_diff(Y)
Differentiable version of score_samples: returns the distance-to-measure estimates of each point in Y.
Parameters
- Y : tensor (m, d)
m points with d dimensions for which the distance to measure will be calculated
Returns
the DTMs of Y, for each mass in masses.
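A minimal usage sketch for DTM on a random NumPy point cloud; the mass values are illustrative.

    import numpy as np
    from multipers.ml.convolutions import DTM

    X = np.random.rand(200, 2)            # 200 points in dimension 2
    dtm = DTM(masses=[0.05, 0.1])         # one distance-to-measure estimate per mass
    dtm.fit(X)
    values = dtm.score_samples(X)         # DTM values of X, for each mass in masses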
- class multipers.ml.convolutions.KDE(bandwidth=1, kernel='gaussian', return_log=False)
Bases:
object
Fast, scikit-style, and differentiable kernel density estimation, using PyKeops.
- Parameters:
bandwidth (Any)
kernel (Literal['gaussian', 'exponential', 'exponential_kernel', 'multivariate_gaussian', 'sinc'] | ~collections.abc.Callable)
return_log (bool)
- fit(X, sample_weights=None, y=None)
- score_samples(Y, X=None, return_kernel=False)
Returns the kernel density estimates of each point in Y.
Parameters
- Y : tensor (m, d)
m points with d dimensions for which the probability density will be calculated
- X : tensor (n, d), optional
n points with d dimensions to which the KDE is fitted. Provided to allow batch calculations in log_prob. By default, X is None and all points passed to fit are used.
Returns
- log_probs : tensor (m)
log probability densities for each of the queried points in Y
- static to_lazy(X, Y, x_weights)
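A minimal usage sketch for KDE, assuming PyKeOps is installed; the bandwidth and kernel are illustrative.

    import numpy as np
    from multipers.ml.convolutions import KDE

    X = np.random.rand(200, 2)            # points the density is estimated from
    Y = np.random.rand(50, 2)             # query points
    kde = KDE(bandwidth=0.2, kernel="gaussian", return_log=True)
    kde.fit(X)
    log_density = kde.score_samples(Y)    # one (log-)density value per point of Y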
- multipers.ml.convolutions._kernel(kernel='gaussian')
- Parameters:
kernel (Literal['gaussian', 'exponential', 'exponential_kernel', 'multivariate_gaussian', 'sinc'] | ~collections.abc.Callable)
- multipers.ml.convolutions._pts_convolution_pykeops(pts, pts_weights, grid_iterator, kernel='gaussian', bandwidth=0.1, **more_kde_args)
Convolution of weighted points evaluated on a grid, using PyKeOps.
- Parameters:
pts (ndarray)
pts_weights (ndarray)
kernel (Literal['gaussian', 'exponential', 'exponential_kernel', 'multivariate_gaussian', 'sinc'] | ~collections.abc.Callable)
- multipers.ml.convolutions._pts_convolution_sparse_old(pts, pts_weights, grid_iterator, kernel='gaussian', bandwidth=0.1, **more_kde_args)
Old version of convolution_signed_measures. Scikit-learn’s convolution is slower than the PyKeOps version above.
- Parameters:
pts (ndarray)
pts_weights (ndarray)
kernel (Literal['gaussian', 'exponential', 'exponential_kernel', 'multivariate_gaussian', 'sinc'] | ~collections.abc.Callable)
- multipers.ml.convolutions.batch_signed_measure_convolutions(signed_measures, x, bandwidth, kernel)
Input
signed_measures : unragged array of shape (num_data, num_pts, D+1), where the last coordinate is the weight (0 for dummy points)
x : the points on which to evaluate the convolutions, of shape (num_x, D)
bandwidth : the bandwidths or covariance matrix inverse or … of the kernel
kernel : “gaussian”, “multivariate_gaussian”, “exponential”, or Callable (x_i, y_i, bandwidth)->float
Output
Array of shape (num_convolutions, (num_axis), num_data, max_x_size)
- Parameters:
kernel (Literal['gaussian', 'exponential', 'exponential_kernel', 'multivariate_gaussian', 'sinc'] | ~collections.abc.Callable)
- multipers.ml.convolutions.convolution_signed_measures(iterable_of_signed_measures, filtrations, bandwidth, flatten=True, n_jobs=1, backend='pykeops', kernel='gaussian', **kwargs)
Evaluates the convolution of the signed measures, given as an iterable of (pts, weights) pairs, with a kernel (Gaussian by default) of bandwidth bandwidth, on the grid given by filtrations; see the sketch below.
Parameters
iterable_of_signed_measures : (num_signed_measure) x [ (npts) x (num_parameters), (npts)]
filtrations : (num_parameter) x (filtration values)
flatten : bool
n_jobs : int
Outputs
The concatenated images, for each signed measure (num_signed_measures) x (len(f) for f in filtration_values)
- Parameters:
flatten (bool)
n_jobs (int)
kernel (Literal['gaussian', 'exponential', 'exponential_kernel', 'multivariate_gaussian', 'sinc'] | ~collections.abc.Callable)
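A sketch of a call on a toy signed measure; the exact nesting of the first argument is an assumption to be adapted to the format described above.

    import numpy as np
    from multipers.ml.convolutions import convolution_signed_measures

    pts = np.array([[0.1, 0.2], [0.4, 0.5], [0.7, 0.1]])   # points of the signed measure
    weights = np.array([1, -1, 1])                         # their signed weights
    signed_measures = [[(pts, weights)]]                   # assumed nesting: (data) x (signed measures)
    filtrations = [np.linspace(0, 1, 50), np.linspace(0, 1, 50)]   # one 1d grid per parameter

    imgs = convolution_signed_measures(
        signed_measures, filtrations, bandwidth=0.1, flatten=True, kernel="gaussian"
    )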
- multipers.ml.convolutions.exponential_kernel(x_i, y_j, bandwidth)
- multipers.ml.convolutions.gaussian_kernel(x_i, y_j, bandwidth)
- multipers.ml.convolutions.multivariate_gaussian_kernel(x_i, y_j, covariance_matrix_inverse)
- multipers.ml.convolutions.sinc_kernel(x_i, y_j, bandwidth)
multipers.ml.invariants_with_persistable module
multipers.ml.kernels module
- class multipers.ml.kernels.DistanceList2DistanceMatrix
Bases:
BaseEstimator
,TransformerMixin
- _sklearn_auto_wrap_output_keys = {'transform'}
- fit(X, y=None)
- transform(X)
- class multipers.ml.kernels.DistanceMatrices2DistancesList
Bases:
BaseEstimator
,TransformerMixin
Input: (degree) x (distance matrix) or (axis) x (degree) x (distance matrix D). Output: (D1) x opt (axis) x (degree) x (D2), with indices first.
- _sklearn_auto_wrap_output_keys = {'transform'}
- fit(X, y=None)
- predict(X)
- transform(X)
- class multipers.ml.kernels.DistanceMatrix2DistanceList
Bases:
BaseEstimator
,TransformerMixin
- _sklearn_auto_wrap_output_keys = {'transform'}
- fit(X, y=None)
- transform(X)
- class multipers.ml.kernels.DistanceMatrix2Kernel(sigma=1, axis=None, weights=1)
Bases:
BaseEstimator
,TransformerMixin
Input: (degree) x (distance matrix) or (axis) x (degree) x (distance matrix); in the second case, axis HAS to be specified (meant for cross-validation). Output: a kernel of the same shape as the distance matrix. See the sketch below.
- Parameters:
sigma (float | Iterable[float])
axis (int | None)
weights (Iterable[float] | float)
- _sklearn_auto_wrap_output_keys = {'transform'}
- fit(X, y=None)
- transform(X)
- Return type:
ndarray
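A sketch of DistanceMatrix2Kernel on toy distance matrices, one per degree; the value of sigma is illustrative.

    import numpy as np
    from multipers.ml.kernels import DistanceMatrix2Kernel

    D = np.random.rand(2, 10, 10)           # (degree) x (distance matrix), toy data
    D = (D + D.transpose(0, 2, 1)) / 2      # symmetrize the toy matrices
    K = DistanceMatrix2Kernel(sigma=1.0).fit_transform(D)   # kernel of the same shape as D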
- class multipers.ml.kernels.DistancesLists2DistanceMatrices
Bases:
BaseEstimator
,TransformerMixin
Input: (D1) x opt (axis) x (degree) x (D2, with indices first). Output: opt (axis) x (degree) x (distance matrix (D1, D2)).
- _sklearn_auto_wrap_output_keys = {'transform'}
- fit(X, y=None)
- Parameters:
X (ndarray)
- transform(X)
multipers.ml.mma module
- class multipers.ml.mma.FilteredComplex2MMA(n_jobs=-1, expand_dim=None, prune_degrees_above=None, progress=False, minpres_degrees=None, plot=False, **persistence_kwargs)
Bases:
BaseEstimator
,TransformerMixin
Turns a list of lists of simplex trees or slicers into MMA approximations.
- Parameters:
n_jobs (int)
expand_dim (int | None)
prune_degrees_above (int | None)
minpres_degrees (Iterable[int] | None)
plot (bool)
- _infer_bounding_box(X)
- _input_checks(X)
- static _is_filtered_complex(input)
- _sklearn_auto_wrap_output_keys = {'transform'}
- fit(X, y=None)
- transform(X)
- class multipers.ml.mma.MMA2IMG(degrees, bandwidth=0.1, power=1, normalize=False, resolution=50, plot=False, box=None, n_jobs=-1, flatten=False, progress=False, grid_strategy='regular', kernel='linear', signed=False)
Bases:
BaseEstimator
,TransformerMixin
- Parameters:
degrees (list)
bandwidth (float)
power (float)
normalize (bool)
resolution (list | int)
plot (bool)
signed (bool)
- _sklearn_auto_wrap_output_keys = {'transform'}
- fit(X, y=None)
- transform(X)
- class multipers.ml.mma.MMA2Landscape(resolution=[100, 100], degrees=[0, 1], ks=range(0, 5), phi=<function sum>, box=None, plot=False, n_jobs=-1, filtration_quantile=0.01)
Bases:
BaseEstimator
,TransformerMixin
Turns a list of MMA approximations into landscape vectorisations.
- Parameters:
degrees (list[int] | None)
ks (Iterable[int])
phi (Callable)
plot (bool)
filtration_quantile (float)
- _sklearn_auto_wrap_output_keys = {'transform'}
- fit(X, y=None)
- transform(X)
- Return type:
list[ndarray]
- class multipers.ml.mma.MMAFormatter(degrees=None, axis=None, verbose=False, normalize=False, weights=None, quantiles=None, dump=False, from_dump=False)
Bases:
BaseEstimator
,TransformerMixin
- Parameters:
degrees (list[int] | None)
verbose (bool)
normalize (bool)
- static _get_module_bound(x, degree)
Output format : (2,num_parameters)
- static _infer_axis(X)
- static _infer_bounds(X, degrees=None, axis=[slice(None, None, None)], quantiles=None)
Compute bounds of filtration values of a list of modules.
Output Format
m,M of shape : (num_axis,num_degrees,2,num_parameters)
- _infer_degrees(X)
- static _infer_grid(X, strategy, resolution, degrees=None)
Given a list of PyModules, computes a multiparameter discrete grid, with a given strategy, from the filtration values of the summands of the modules.
- Parameters:
X (List[PyModule_i32 | PyModule_i64 | PyModule_f32 | PyModule_f64])
strategy (str)
resolution (int)
- static _infer_num_parameters(X, ax=slice(None, None, None))
- static _maybe_from_dump(X_in)
- _sklearn_auto_wrap_output_keys = {'transform'}
- static copy_transform(mod, degrees, translation, rescale_factors, new_box)
- fit(X_in, y=None)
- set_fit_request(*, X_in='$UNCHANGED$')
Request metadata passed to the fit method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.
The options for each parameter are:
- True : metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False : metadata is not requested and the meta-estimator will not pass it to fit.
- None : metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str : metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note: this method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
Parameters
- X_in : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for X_in parameter in fit.
Returns
- self : object
The updated object.
- Parameters:
self (MMAFormatter)
X_in (bool | None | str)
- Return type:
MMAFormatter
- set_transform_request(*, X_in='$UNCHANGED$')
Request metadata passed to the transform method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.
The options for each parameter are:
- True : metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.
- False : metadata is not requested and the meta-estimator will not pass it to transform.
- None : metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str : metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note: this method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
Parameters
- X_in : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for X_in parameter in transform.
Returns
- self : object
The updated object.
- Parameters:
self (MMAFormatter)
X_in (bool | None | str)
- Return type:
MMAFormatter
- transform(X_in)
- class multipers.ml.mma.SimplexTree2MMA(n_jobs=-1, expand_dim=None, prune_degrees_above=None, progress=False, minpres_degrees=None, **persistence_kwargs)
Bases:
FilteredComplex2MMA
- Parameters:
n_jobs (int)
expand_dim (int | None)
prune_degrees_above (int | None)
minpres_degrees (Iterable[int] | None)
- _sklearn_auto_wrap_output_keys = {'transform'}
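A sketch of a scikit-learn pipeline going from point clouds to MMA-based vectorizations; the point clouds and all parameter values are illustrative.

    import numpy as np
    from sklearn.pipeline import Pipeline
    from multipers.ml.point_clouds import PointCloud2SimplexTree
    from multipers.ml.mma import SimplexTree2MMA, MMA2IMG

    point_clouds = [np.random.rand(100, 2) for _ in range(5)]
    pipe = Pipeline([
        ("complex", PointCloud2SimplexTree(bandwidths=[0.1], complex="rips")),   # bifiltered simplex trees
        ("mma", SimplexTree2MMA(n_jobs=1)),                                      # MMA approximations
        ("img", MMA2IMG(degrees=[0, 1], bandwidth=0.1, resolution=50, flatten=True)),
    ])
    vectors = pipe.fit_transform(point_clouds)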
multipers.ml.one module
multipers.ml.point_clouds module
- class multipers.ml.point_clouds.PointCloud2FilteredComplex(bandwidths=[], masses=[], threshold=-inf, complex='rips', sparse=None, num_collapses=-2, kernel='gaussian', log_density=True, expand_dim=1, progress=False, n_jobs=None, fit_fraction=1, verbose=False, safe_conversion=False, output_type=None, reduce_degrees=None)
Bases:
BaseEstimator
,TransformerMixin
- Parameters:
threshold (float)
complex (Literal['alpha', 'rips', 'delaunay'])
sparse (float | None)
num_collapses (int)
kernel (Literal['gaussian', 'exponential', 'exponential_kernel', 'multivariate_gaussian', 'sinc'] | ~collections.abc.Callable)
log_density (bool)
expand_dim (int)
progress (bool)
n_jobs (int | None)
fit_fraction (float)
verbose (bool)
safe_conversion (bool)
output_type (Literal['slicer', 'simplextree', 'slicer_vine', 'slicer_novine'] | None)
reduce_degrees (Iterable[int] | None)
- _define_bandwidths(X)
- _define_sts()
- _get_codensities(x_fit, x_sample)
- _get_distance_quantiles_and_threshold(X, qs)
- _get_sts_alpha(x, return_alpha=False)
- Parameters:
x (ndarray)
- _get_sts_delaunay(x)
- Parameters:
x (ndarray)
- _get_sts_rips(x)
- _sklearn_auto_wrap_output_keys = {'transform'}
- fit(X, y=None)
- Parameters:
X (ndarray | list)
- transform(X)
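A sketch of PointCloud2FilteredComplex on a small list of point clouds; the bandwidth and output type are illustrative, and the nesting of the output (per datum, per bandwidth/mass) should be checked against transform.

    import numpy as np
    from multipers.ml.point_clouds import PointCloud2FilteredComplex

    point_clouds = [np.random.rand(100, 2) for _ in range(5)]
    pc2cplx = PointCloud2FilteredComplex(bandwidths=[0.1], complex="rips",
                                         output_type="simplextree")
    filtered_complexes = pc2cplx.fit_transform(point_clouds)   # Rips + codensity bifiltrations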
- class multipers.ml.point_clouds.PointCloud2SimplexTree(bandwidths=[], masses=[], threshold=inf, complex='rips', sparse=None, num_collapses=-2, kernel='gaussian', log_density=True, expand_dim=1, progress=False, n_jobs=None, fit_fraction=1, verbose=False, safe_conversion=False, output_type=None, reduce_degrees=None)
Bases:
PointCloud2FilteredComplex
- Parameters:
threshold (float)
complex (Literal['alpha', 'rips', 'delaunay'])
sparse (float | None)
num_collapses (int)
kernel (Literal['gaussian', 'exponential', 'exponential_kernel', 'multivariate_gaussian', 'sinc'] | ~collections.abc.Callable)
log_density (bool)
expand_dim (int)
progress (bool)
n_jobs (int | None)
fit_fraction (float)
verbose (bool)
safe_conversion (bool)
output_type (Literal['slicer', 'simplextree', 'slicer_vine', 'slicer_novine'] | None)
reduce_degrees (Iterable[int] | None)
- _sklearn_auto_wrap_output_keys = {'transform'}
multipers.ml.signed_betti module
- multipers.ml.signed_betti.rank_decomposition_by_rectangles(rank_invariant, threshold=False)
- multipers.ml.signed_betti.signed_betti(hilbert_function, threshold=False, sparse=False)
multipers.ml.signed_measures module
- class multipers.ml.signed_measures.DegreeRips2SignedMeasure(degrees, min_rips_value, max_rips_value, max_normalized_degree, min_normalized_degree, grid_granularity, progress=False, n_jobs=1, sparse=False, _möbius_inversion=True, fit_fraction=1)
Bases:
BaseEstimator
,TransformerMixin
- Parameters:
degrees (Iterable[int])
min_rips_value (float)
max_normalized_degree (float)
min_normalized_degree (float)
grid_granularity (int)
progress (bool)
sparse (bool)
- _sklearn_auto_wrap_output_keys = {'transform'}
- _transform1(data)
- Parameters:
data (ndarray)
- fit(X, y=None)
- Parameters:
X (ndarray | list)
- transform(X)
- class multipers.ml.signed_measures.FilteredComplex2SignedMeasure(degrees=[], rank_degrees=[], filtration_grid=None, progress=False, num_collapses=0, n_jobs=None, resolution=None, plot=False, filtration_quantile=0.0, expand=False, normalize_filtrations=False, grid_strategy='exact', seed=0, fit_fraction=1, out_resolution=None, individual_grid=None, enforce_null_mass=False, flatten=True, backend=None)
Bases:
BaseEstimator
,TransformerMixin
Input
Iterable[SimplexTreeMulti]
Output
Iterable[ list[signed_measure for degree] ]
- signed measure is either
(points : (n x num_parameters) array, weights : (n) int array ) if sparse,
else an integer matrix.
Parameters
degrees : list of degrees to compute. None corresponds to the Euler characteristic.
filtration_grid : the grid on which to compute. If None, the fit will infer it from:
- fit_fraction : the fraction of data to consider for the fit (the seed is controlled by the seed parameter)
- resolution : the resolution of this grid
- filtration_quantile : quantile of the filtration values to ignore
- grid_strategy : str : 'regular', 'quantile' or 'exact'
normalize_filtrations : if sparse, will normalize all filtrations.
expand : expands the simplextree, so that the degree is computed correctly for flag complexes.
invariant : the topological invariant used to produce the signed measure. Choices are "hilbert" or "euler". The rank invariant will be added later.
num_collapses : either an int or "full". Collapses the complex before doing the computation.
_möbius_inversion : if False, the Möbius inversion is not performed; the output then has to be a matrix.
enforce_null_mass : if True, returns a zero-mass measure by thresholding the module.
- _infer_filtration(X)
- _input_checks(X)
- static _is_filtered_complex(input)
- _params_check()
- _sklearn_auto_wrap_output_keys = {'transform'}
- fit(X, y=None)
- transform(X)
- transform1(simplextree, ax, thread_id='')
- Parameters:
thread_id (str)
- Parameters:
degrees (list[int | None])
rank_degrees (list[int])
filtration_grid (Sequence[Sequence[ndarray]] | None)
num_collapses (int | str)
resolution (Iterable[int] | int | None)
plot (bool)
filtration_quantile (float)
normalize_filtrations (bool)
grid_strategy (str)
seed (int)
out_resolution (Iterable[int] | int | None)
individual_grid (bool | None)
enforce_null_mass (bool)
backend (str | None)
- class multipers.ml.signed_measures.SignedMeasure2Convolution(filtration_grid=None, kernel='gaussian', bandwidth=1.0, flatten=False, n_jobs=1, resolution=None, grid_strategy='regular', progress=False, backend='pykeops', plot=False, log_density=False, **kde_kwargs)
Bases:
BaseEstimator
,TransformerMixin
Discrete convolution of a signed measure
Input
(data) x (degree) x (signed measure)
Parameters
filtration_grid : Iterable[array]. For each filtration parameter, the filtration values on which to evaluate the convolution.
resolution : int or (num_parameters). If filtration_grid is not given, a grid with this resolution will be inferred.
grid_strategy : the strategy used to generate the grid. Available ones are regular, quantile, exact.
kernel : the kernel used to convolve the images.
flatten : if True, the output images are flattened.
progress : shows a progress bar if True.
backend : sklearn, pykeops or numba.
plot : creates a plot figure.
Output
(data) x (concatenation of imgs of degree)
- _plot_imgs(imgs, size=4)
- Parameters:
imgs (Iterable[ndarray])
- _sklearn_auto_wrap_output_keys = {'transform'}
- _sm2smi(signed_measures)
- Parameters:
signed_measures (Iterable[ndarray])
- _transform_from_sparse(X)
- fit(X, y=None)
- transform(X)
- Parameters:
filtration_grid (Iterable[ndarray])
kernel (Literal['gaussian', 'exponential', 'exponential_kernel', 'multivariate_gaussian', 'sinc'] | ~collections.abc.Callable)
bandwidth (float | Iterable[float])
flatten (bool)
n_jobs (int)
resolution (int | None)
grid_strategy (str)
progress (bool)
backend (str)
plot (bool)
log_density (bool)
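A sketch of a pipeline producing convolution images of signed measures from point clouds; all parameter values are illustrative.

    import numpy as np
    from sklearn.pipeline import Pipeline
    from multipers.ml.point_clouds import PointCloud2FilteredComplex
    from multipers.ml.signed_measures import (
        FilteredComplex2SignedMeasure,
        SignedMeasure2Convolution,
    )

    point_clouds = [np.random.rand(100, 2) for _ in range(5)]
    pipe = Pipeline([
        ("complex", PointCloud2FilteredComplex(bandwidths=[0.1], complex="rips")),
        ("sm", FilteredComplex2SignedMeasure(degrees=[0, 1], grid_strategy="regular", resolution=50)),
        ("img", SignedMeasure2Convolution(bandwidth=0.1, resolution=50, flatten=True)),
    ])
    vectors = pipe.fit_transform(point_clouds)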
- class multipers.ml.signed_measures.SignedMeasure2SlicedWassersteinDistance(n_jobs=None, num_directions=10, _sliced=True, epsilon=-1, ground_norm=1, progress=False, grid_reconversion=None, scales=None)
Bases:
BaseEstimator
,TransformerMixin
Transformer from signed measure to distance matrix.
Input
(data) x (degree) x (signed measure)
Format
a signed measure : tuple of arrays. (point positions) : npts x (num_parameters), and weights : npts
each datum is a list of signed measures (e.g. for multiple degrees)
Output
(degree) x (distance matrix)
- _sklearn_auto_wrap_output_keys = {'transform'}
- fit(X, y=None)
- predict(X)
- transform(X)
- Parameters:
num_directions (int)
_sliced (bool)
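A sketch of SignedMeasure2SlicedWassersteinDistance on two toy data points with one signed measure (degree) each; the input nesting follows the format described above.

    import numpy as np
    from multipers.ml.signed_measures import SignedMeasure2SlicedWassersteinDistance

    sm_a = (np.array([[0.1, 0.2], [0.4, 0.3]]), np.array([1, -1]))   # (points, weights)
    sm_b = (np.array([[0.2, 0.2], [0.6, 0.4]]), np.array([1, -1]))
    sms = [[sm_a], [sm_b]]                                           # (data) x (degree) x (signed measure)

    sw = SignedMeasure2SlicedWassersteinDistance(num_directions=10)
    distance_matrices = sw.fit_transform(sms)                        # (degree) x (distance matrix)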
- class multipers.ml.signed_measures.SignedMeasureFormatter(filtrations_weights=None, normalize=False, plot=False, unsparse=False, axis=-1, resolution=50, flatten=False, deep_format=False, unrag=True, n_jobs=1, verbose=False, integrate=False, grid_strategy='regular')
Bases:
BaseEstimator
,TransformerMixin
Input
(data) x (degree) x (signed measure) or (data) x (axis) x (degree) x (signed measure)
Iterable[list[signed_measure_matrix of degree]] or Iterable[previous].
The second format is meant for multiple choices of signed measure input; for example, signed measures coming from a Rips + Density bifiltration with different bandwidths. It is controlled by the axis parameter.
Output
Iterable[list[(reweighted)_sparse_signed_measure of degree]]
or (deep format)
Tensor of shape (num_axis*num_degrees, data, max_num_pts, num_parameters)
- _check_axis(X)
- _check_backend(X)
- _check_measures(X)
- _check_resolution()
- static _check_sm(sm)
- Return type:
bool
- _check_weights()
- _get_filtration_bounds(X, axis)
- _infer_grids(X)
- static _integrate_measure(sm, filtrations)
- _plot_signed_measures(sms, size=4)
- Parameters:
sms (Iterable[ndarray])
- _print_stats(X)
- _rescale_measures(X)
- _sklearn_auto_wrap_output_keys = {'transform'}
- static deep_format_measure(signed_measure)
- fit(X, y=None)
- transform(X)
- unsparse_signed_measure(sparse_signed_measure)
- Parameters:
filtrations_weights (Iterable[float] | None)
plot (bool)
unsparse (bool)
axis (int)
resolution (int | Iterable[int])
flatten (bool)
deep_format (bool)
unrag (bool)
n_jobs (int)
verbose (bool)
integrate (bool)
- class multipers.ml.signed_measures.SignedMeasures2SlicedWassersteinDistances(progress=False, n_jobs=1, scales=None, **kwargs)
Bases:
BaseEstimator
,TransformerMixin
Transformer from signed measures to distance matrices.
Input
(data) x opt (axis) x (degree) x (signed measure)
Format
a signed measure : tuple of arrays. (point positions) : npts x (num_parameters), and weights : npts
each datum is a list of signed measures (e.g. for multiple degrees)
Output
(axis) x (degree) x (distance matrix)
- _sklearn_auto_wrap_output_keys = {'transform'}
- fit(X, y=None)
- transform(X)
- Parameters:
n_jobs (int)
scales (Iterable[Iterable[float]] | None)
- class multipers.ml.signed_measures.SimplexTree2RectangleDecomposition(filtration_grid, degrees, plot=False, reconvert_grid=True, num_collapses=0)
Bases:
BaseEstimator
,TransformerMixin
Transformer turning 2-parameter simplex trees into their respective rectangle decompositions.
- Parameters:
filtration_grid (ndarray)
degrees (Iterable[int])
num_collapses (int)
- _sklearn_auto_wrap_output_keys = {'transform'}
- fit(X, y=None)
TODO : infer grid from multiple simplextrees
- transform(X)
- Parameters:
X (Iterable[SimplexTreeMulti_KFi32 | SimplexTreeMulti_Fi32 | SimplexTreeMulti_KFi64 | SimplexTreeMulti_Fi64 | SimplexTreeMulti_KFf32 | SimplexTreeMulti_Ff32 | SimplexTreeMulti_KFf64 | SimplexTreeMulti_Ff64])
- class multipers.ml.signed_measures.SimplexTree2SignedMeasure(degrees=[], rank_degrees=[], filtration_grid=None, progress=False, num_collapses=0, n_jobs=None, resolution=None, plot=False, filtration_quantile=0.0, expand=False, normalize_filtrations=False, grid_strategy='exact', seed=0, fit_fraction=1, out_resolution=None, individual_grid=None, enforce_null_mass=False, flatten=True, backend=None)
Bases:
FilteredComplex2SignedMeasure
- Parameters:
degrees (list[int | None])
rank_degrees (list[int])
filtration_grid (Sequence[Sequence[ndarray]] | None)
num_collapses (int | str)
resolution (Iterable[int] | int | None)
plot (bool)
filtration_quantile (float)
normalize_filtrations (bool)
grid_strategy (str)
seed (int)
out_resolution (Iterable[int] | int | None)
individual_grid (bool | None)
enforce_null_mass (bool)
backend (str | None)
- _sklearn_auto_wrap_output_keys = {'transform'}
- multipers.ml.signed_measures._st2ranktensor(st, filtration_grid, degree, plot, reconvert_grid, num_collapse=0)
TODO
- Parameters:
st (SimplexTreeMulti_KFi32 | SimplexTreeMulti_Fi32 | SimplexTreeMulti_KFi64 | SimplexTreeMulti_Fi64 | SimplexTreeMulti_KFf32 | SimplexTreeMulti_Ff32 | SimplexTreeMulti_KFf64 | SimplexTreeMulti_Ff64)
filtration_grid (ndarray)
degree (int)
plot (bool)
reconvert_grid (bool)
num_collapse (int | str)
- multipers.ml.signed_measures.rescale_sparse_signed_measure(signed_measure, filtration_weights, normalize_scales=None)
- multipers.ml.signed_measures.tensor_möbius_inversion(tensor, grid_conversion=None, plot=False, raw=False, num_parameters=None)
- Parameters:
grid_conversion (Iterable[ndarray] | None)
plot (bool)
raw (bool)
num_parameters (int | None)
multipers.ml.sliced_wasserstein module
- class multipers.ml.sliced_wasserstein.SlicedWassersteinDistance(num_directions=10, scales=None, n_jobs=None)
Bases:
BaseEstimator
,TransformerMixin
This is a class for computing the sliced Wasserstein distance matrix from a list of signed measures. The Sliced Wasserstein distance is computed by projecting the signed measures onto lines, comparing the projections with the 1-norm, and finally integrating over all possible lines. See http://proceedings.mlr.press/v70/carriere17a.html for more details.
- _sklearn_auto_wrap_output_keys = {'transform'}
- fit(X, y=None)
Fit the SlicedWassersteinDistance class on a list of signed measures: signed measures are projected onto the different lines. The measures themselves are then stored in numpy arrays, called measures_.
- Parameters:
X (list of tuples): input signed measures. y (n x 1 array): signed measure labels (unused).
- transform(X)
Compute all sliced Wasserstein distances between the signed measures that were stored after calling the fit() method, and a given list of (possibly different) signed measures.
- Parameters:
X (list of tuples): input signed measures.
- Returns:
numpy array of shape (number of measures in measures) x (number of measures in X): matrix of pairwise sliced Wasserstein distances.
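A sketch of SlicedWassersteinDistance on two toy signed measures given as (points, weights) tuples; both have zero total mass.

    import numpy as np
    from multipers.ml.sliced_wasserstein import SlicedWassersteinDistance

    mu = (np.array([[0.0, 0.0], [1.0, 1.0]]), np.array([1, -1]))
    nu = (np.array([[0.5, 0.5], [1.5, 0.5]]), np.array([1, -1]))

    swd = SlicedWassersteinDistance(num_directions=10)
    swd.fit([mu, nu])
    D = swd.transform([mu, nu])   # (2, 2) matrix of pairwise sliced Wasserstein distances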
- class multipers.ml.sliced_wasserstein.WassersteinDistance(epsilon=1.0, ground_norm=1, n_jobs=None)
Bases:
BaseEstimator
,TransformerMixin
This is a class for computing the Wasserstein distance matrix from a list of signed measures.
- _sklearn_auto_wrap_output_keys = {'transform'}
- fit(X, y=None)
Fit the WassersteinDistance class on a list of signed measures. The measures themselves are then stored in numpy arrays, called measures_.
- Parameters:
X (list of tuples): input signed measures. y (n x 1 array): signed measure labels (unused).
- transform(X)
Compute all Wasserstein distances between the signed measures that were stored after calling the fit() method, and a given list of (possibly different) signed measures.
- Parameters:
X (list of tuples): input signed measures.
- Returns:
numpy array of shape (number of measures in measures) x (number of measures in X): matrix of pairwise Wasserstein distances.
- multipers.ml.sliced_wasserstein._compute_signed_measure_parts(X)
This is a function for separating the positive and negative points of a list of signed measures. This function can be used as a preprocessing step in order to speed up the running time for computing all pairwise (sliced) Wasserstein distances on a list of signed measures.
- Parameters:
X (list of n tuples): list of signed measures.
- Returns:
list of n pairs of numpy arrays of shape (num x dimension): list of positive and negative signed measures.
- multipers.ml.sliced_wasserstein._compute_signed_measure_projections(X, num_directions, scales)
This is a function for projecting the points of a list of signed measures onto a fixed number of lines sampled uniformly. This function can be used as a preprocessing step in order to speed up the running time for computing all pairwise sliced Wasserstein distances on a list of signed measures.
- Parameters:
X (list of n tuples): list of signed measures. num_directions (int): number of lines evenly sampled from [-pi/2,pi/2] in order to approximate and speed up the distance computation. scales (array of shape D): scales associated to the dimensions.
- Returns:
list of n pairs of numpy arrays of shape (num x num_directions): list of positive and negative projected signed measures.
- multipers.ml.sliced_wasserstein._pairwise(fallback, skipdiag, X, Y, metric, n_jobs)
- multipers.ml.sliced_wasserstein._sklearn_wrapper(metric, X, Y, **kwargs)
This function is a wrapper for any metric between two signed measures that takes two numpy arrays of shapes (nxD) and (mxD) as arguments.
- multipers.ml.sliced_wasserstein._sliced_wasserstein_distance(meas1, meas2, num_directions, scales=None)
This is a function for computing the sliced Wasserstein distance from two signed measures. The Sliced Wasserstein distance is computed by projecting the signed measures onto lines, comparing the projections with the 1-norm, and finally averaging over the lines. See http://proceedings.mlr.press/v70/carriere17a.html for more details.
- Parameters:
meas1: ((n x D), (n)) tuple with numpy.array encoding the (finite points of the) first measure and their multiplicities. Must not contain essential points (i.e. with infinite coordinate). meas2: ((m x D), (m)) tuple encoding the second measure. num_directions (int): number of lines evenly sampled from [-pi/2,pi/2] in order to approximate and speed up the distance computation. scales (array of shape D): scales associated to the dimensions.
- Returns:
float: the sliced Wasserstein distance between signed measures.
- multipers.ml.sliced_wasserstein._sliced_wasserstein_distance_on_projections(meas1, meas2, scales=None)
This is a function for computing the sliced Wasserstein distance between two signed measures that have already been projected onto some lines. It simply amounts to comparing the sorted projections with the 1-norm, and averaging over the lines. See http://proceedings.mlr.press/v70/carriere17a.html for more details.
- Parameters:
meas1: pair of (n x number_of_lines) numpy.arrays containing the projected points of the positive and negative parts of the first measure. meas2: pair of (m x number_of_lines) numpy.arrays containing the projected points of the positive and negative parts of the second measure. scales (array of shape D): scales associated to the dimensions.
- Returns:
float: the sliced Wasserstein distance between the projected signed measures.
- multipers.ml.sliced_wasserstein._wasserstein_distance(meas1, meas2, epsilon, ground_norm)
This is a function for computing the Wasserstein distance from two signed measures.
- Parameters:
meas1: ((n x D), (n)) tuple with numpy.array encoding the (finite points of the) first measure and their multiplicities. Must not contain essential points (i.e. with infinite coordinate). meas2: ((m x D), (m)) tuple encoding the second measure. epsilon (float): entropy regularization parameter. ground_norm (int): norm to use for ground metric cost.
- Returns:
float: the Wasserstein distance between signed measures.
- multipers.ml.sliced_wasserstein._wasserstein_distance_on_parts(ground_norm=1, epsilon=1.0)
This is a function for computing the Wasserstein distance between two signed measures that have already been separated into their positive and negative parts.
- Parameters:
meas1: pair of (n x dimension) numpy.arrays containing the points of the positive and negative parts of the first measure. meas2: pair of (m x dimension) numpy.arrays containing the points of the positive and negative parts of the second measure.
- Returns:
float: the Wasserstein distance between the signed measures.
- multipers.ml.sliced_wasserstein.pairwise_signed_measure_distances(X, Y=None, metric='sliced_wasserstein', n_jobs=None, **kwargs)
This function computes the distance matrix between two lists of signed measures given as numpy arrays of shape (nxD).
- Parameters:
X (list of n tuples): first list of signed measures. Y (list of m tuples): second list of signed measures (optional). If None, pairwise distances are computed from the first list only. metric: distance to use. It can be either a string (“sliced_wasserstein”, “wasserstein”) or a function taking two tuples as inputs. If it is a function, make sure that it is symmetric and that it outputs 0 if called on the same two tuples. n_jobs (int): number of jobs to use for the computation. This uses joblib.Parallel(prefer=”threads”), so metrics that do not release the GIL may not scale unless run inside a joblib.parallel_backend block. **kwargs: optional keyword parameters. Any further parameters are passed directly to the distance function. See the docs of the various distance classes in this module.
- Returns:
numpy array of shape (nxm): distance matrix
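A sketch of pairwise_signed_measure_distances on the same kind of toy measures; num_directions is forwarded to the sliced Wasserstein distance through **kwargs.

    import numpy as np
    from multipers.ml.sliced_wasserstein import pairwise_signed_measure_distances

    mu = (np.array([[0.0, 0.0], [1.0, 1.0]]), np.array([1, -1]))
    nu = (np.array([[0.5, 0.5], [1.5, 0.5]]), np.array([1, -1]))

    D = pairwise_signed_measure_distances([mu, nu], metric="sliced_wasserstein", num_directions=10)
    # D is a numpy array of shape (2, 2)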
multipers.ml.tools module
- class multipers.ml.tools.SimplexTreeEdgeCollapser(num_collapses=0, full=False, max_dimension=None, n_jobs=1)
Bases:
BaseEstimator
,TransformerMixin
- Parameters:
num_collapses (int)
full (bool)
max_dimension (int | None)
n_jobs (int)
- _sklearn_auto_wrap_output_keys = {'transform'}
- fit(X, y=None)
- Parameters:
X (ndarray | list)
- transform(X)
- multipers.ml.tools.filtration_grid_to_coordinates(F, return_resolution)
- multipers.ml.tools.get_filtration_weights_grid(num_parameters=2, resolution=3, *, min=0, max=20, dtype=<class 'float'>, remove_homothetie=True, weights=None)
- Provides a grid of weights for filtration rescaling.
num_parameters : the dimension of the grid tensor
resolution : the size of each coordinate
min : minimum weight
max : maximum weight
weights : custom weights (instead of a linspace between min and max)
dtype : the type of the grid values (useful for integer weights)
- Parameters:
num_parameters (int)
resolution (int | Iterable[int])
min (float)
max (float)
remove_homothetie (bool)
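A sketch of get_filtration_weights_grid for a 2-parameter filtration; the resolution and range are illustrative.

    from multipers.ml.tools import get_filtration_weights_grid

    # candidate weight vectors for rescaling a 2-parameter filtration,
    # e.g. for cross-validating filtration weights
    weights = get_filtration_weights_grid(num_parameters=2, resolution=3, min=1, max=20)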
- multipers.ml.tools.get_simplex_tree_from_delayed(x)
- Return type:
SimplexTreeMulti
- multipers.ml.tools.get_simplextree(x)
- Return type:
SimplexTreeMulti