RipsLowerstar bifiltrations
import multipers as mp
import multipers.filtrations as mpf
import gudhi as gd
import matplotlib.pyplot as plt
import numpy as np
[KeOps] Warning : Cuda libraries were not detected on the system or could not be loaded ; using cpu only mode
The Rips complex
In this example, we show how to extend Gudhi simplextrees (see, e.g., here), to the multiparameter settings.
We will consider the Rips Complex simplex here, which can be defined as follows. Consider \(X = (x_1, \ldots,x_n)\) a point cloud in a metric space, and its pairwise distance matrix \(D=\left(d(x_i,x_j)\right)_{1\le i,j\le n}\)
Now, for any \(t\ge 0\), we consider the so-called simplicial complex $\(\mathrm{Rips}(X)_t:= \left\{ \sigma \subseteq \{x_1, \ldots, x_n\} \mid \mathrm{diam}(\sigma)\le t \right\}, \quad \textnormal{where }\mathrm{diam}(\sigma) = \max_{x,y\in \sigma} d(x,y)\)$
We will see that this mathematical structure is well suited to encode, at any scale, topological information. In the meantime, let’s build it on a simple dataset.
from multipers.data.synthetic import noisy_annulus
np.random.seed(0)
X = noisy_annulus(200, 0)
plt.scatter(*X.T)
plt.gca().set_aspect(1)
data:image/s3,"s3://crabby-images/de2d8/de2d894ac1689702a97a1746668b00be27eb72cf" alt="../_images/2b776b9555f61870fd9a171d40dee3ef15396416045cae04d9d725d60e07d556.png"
Intuitively in this example, for small radii, the points are disconnected and the circle annulus is not preserved, but for large radii, the hole is closed.
Computing the rips for all possible parameters \(t\in \mathbb R_+\) hence should allow us to recover the topological feature here, i.e., the cycle.
The Rips filtration
The Rips complex can easily be built in gudhi, and encoded in a well-suited structure for this: the SimplexTree
.
st = gd.RipsComplex(points = X).create_simplex_tree(max_dimension=2)
Then the persistence of topological features along this time parameter can be encoded in a persistence barcode. We see in the example below that only one feature survives for a “long time”; it corresponds to this hole.
st.compute_persistence()
pers = st.persistence_intervals_in_dimension(1)
gd.plot_persistence_barcode(pers)
<Axes: title={'center': 'Persistence barcode'}>
data:image/s3,"s3://crabby-images/e5a7b/e5a7b297942db438984f6f70cd11616108424c23" alt="../_images/767b6971a953454e4fab48afc18108e25768453198a8ae8925aad53d8f19a5e1.png"
With the noise
Adding (diffuse) noise to this dataset however significantly reduces what we can recover.
np.random.seed(2)
X = noisy_annulus(200, 200)
plt.scatter(*X.T)
plt.gca().set_aspect(1)
data:image/s3,"s3://crabby-images/eee83/eee8315c3845ecbfe6451397ffb073ca9583d2b6" alt="../_images/b728aff8dacf7bb3f1770ebaeccc21407899b44f0048b7dac93bede32fdb5daf.png"
st = gd.RipsComplex(points = X).create_simplex_tree()
st.collapse_edges(2) # Little optimization trick
st.expansion(2)
st.compute_persistence()
pers = st.persistence_intervals_in_dimension(1)
gd.plot_persistence_barcode(pers)
<Axes: title={'center': 'Persistence barcode'}>
data:image/s3,"s3://crabby-images/ead35/ead35ee85185e376f842c6bdb2efdb4a6c75ce25" alt="../_images/a5faa15d6252352f50a4f8614a13be10ac23c7c0086429fb94fa08d7989193c8.png"
The Rips-Codensity bifiltration
A simple-to-implement fix to this issue is to add a parameter to take into account the concentration of the sampling. Indeed, the noise is usually less concentrated than the signal.
Given a density estimation \(f\) of the sampling measure \(\mu\), i.e., \(f \simeq \frac{\mathrm d\mu}{\mathrm dx\mathrm dy}\), we can consider the following bifiltration, whose first parameter is the radius and the second the codensity:
As the second parameter of the filtration only depends of the values of \(f\) on the \(0\)-simplices, i.e., the points of \(X\), this second parameter is a so-called lowerstar filtration. This function accepts any point cloud, or distance matrix, with an array \((\mathrm{num\_points}, \mathrm{num\_additional\_axis})\) for additional lowestar parameters.
from sklearn.neighbors import KernelDensity
f= - KernelDensity(bandwidth=.2).fit(X).score_samples(X) # minus to reverse the order.
plt.scatter(*X.T, c=f, cmap="viridis_r")
plt.gca().set_aspect(1)
st = mpf.RipsLowerstar(points=X, function=f).collapse_edges(-2).expansion(2) # Same trick as previously
data:image/s3,"s3://crabby-images/a2f4b/a2f4bfd2ff7d0ea944f87aa4d56374a2e433584f" alt="../_images/70aa8727f9cf473cb96f2abe026bb4145c5a371b6a94c21dbc7af691c702103e.png"
Note that this can be achieved in one line by calling the specialized function RipsCodensity
.
st = mpf.RipsCodensity(points=X, bandwidth=.1).collapse_edges(-2).expansion(2) # Same trick as previously
Looking at the different pairs of \((\mathrm{radius},\mathrm{density})\) threshold, we see that there is only one shape that is significantly bigger than the other here: our initial circle !
mp.module_approximation(
st
).plot(degree=1)
plt.xlabel("Radius")
plt.ylabel("Codensity")
plt.title("RipsCodensity")
Text(0.5, 1.0, 'RipsCodensity')
data:image/s3,"s3://crabby-images/99621/99621404c1ae3dfa75fb599eab3dc253c6efc32d" alt="../_images/f6a091317854db205f1d05e26876c41ff47a39d589bf8b57448d27baa0cfdad0.png"
Another example (todo)
(maybe a 3-parameter complex ?)