2

From the given point cloud (Fig. 1), I used Scipy-TDA to extract a persistence diagram (Fig. 2). I am interested in extracting the 3 circles: for example, I'd like to know the 3 center points and a label for each point. I'm quite new to topological data analysis. Can anyone help me or guide me through the process?

[Fig. 1: point cloud]

[Fig. 2: persistence diagram]

Edit: To show a challenging case, my data looks like the following, where 12 ellipses should be clustered.

Glorfindel
jakeoung
  • Have you tried K-means clustering? It might be a really easy job for K-means to divide your dataset into three different regions and give you the centroid of each Voronoi cell. K-means is freely available in scikit-learn. – Mithridates the Great Feb 28 '20 at 13:48
  • Thanks for the input. Actually, my real data is more complicated. For example, there can be a circle containing all the 3 circles. In this case, I guess K-means can be a problem. – jakeoung Feb 28 '20 at 18:57
  • I just proposed K-means based on the picture that you attached here. For more complex situations, even if your point cloud is unstructured, there are more advanced clustering techniques such as hierarchical clustering and DBSCAN that might work very well where K-means fails to find the right clusters. But based on your data, even if you have another circle around these three circles, I believe it's still a relatively easy task for K-means; you never know until you try it. – Mithridates the Great Feb 28 '20 at 19:14
  • Are you sure you'll always have circles? If so, you can leverage that prior. Reply here and I'll think about methods. – Richard Feb 29 '20 at 16:32
  • It can have ellipses as well, but as a starter, I want to cluster the circles first. Actually, the real data is quite challenging. For example, I have 12 ellipses and have very sparse points. Any idea is welcome! – jakeoung Feb 29 '20 at 16:43
  • @jakeoung Still easy task for DBSCAN especially. See it here: https://scikit-learn.org/stable/modules/clustering.html – Mithridates the Great Mar 01 '20 at 05:19
  • I actually tried both DBSCAN and spectral clustering, but they are not robust for the challenging data I added. Also, I want to learn more about TDA. – jakeoung Mar 01 '20 at 07:18

2 Answers

1

If you can represent your data as a sparse graph, you can use SciPy's connected_components: https://docs.scipy.org/doc/scipy-0.19.0/reference/generated/scipy.sparse.csgraph.connected_components.html

For the cases you are asking about, I would define a parameter $\varepsilon$ such that two nodes $p, q$ are connected by an edge $e$ if $d(p,q)<\varepsilon$, where $d(\cdot,\cdot)$ is some distance function.
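A minimal sketch of that idea in Python, assuming Euclidean distance and a 2-D point array. The helper names (`sample_circle`, `epsilon_components`), the synthetic three-circle data, and the value of `eps` are illustrative assumptions, not taken from the question's data. Note that `cKDTree.query_pairs` uses $d(p,q)\le\varepsilon$ rather than a strict inequality, and that taking the mean of each component as its "center" is only reasonable when each circle is sampled fairly evenly.

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import connected_components
from scipy.spatial import cKDTree


def sample_circle(center, radius, n):
    """Return n evenly spaced points on a circle (synthetic test data)."""
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    return np.column_stack([center[0] + radius * np.cos(t),
                            center[1] + radius * np.sin(t)])


def epsilon_components(points, eps):
    """Label points by connected component of the eps-neighborhood graph."""
    n = len(points)
    tree = cKDTree(points)
    # All index pairs (i, j), i < j, with Euclidean distance <= eps.
    pairs = tree.query_pairs(r=eps, output_type="ndarray")
    adjacency = coo_matrix(
        (np.ones(len(pairs)), (pairs[:, 0], pairs[:, 1])), shape=(n, n)
    )
    n_components, labels = connected_components(adjacency, directed=False)
    # Assumption: the mean of a component approximates the circle center.
    centers = np.array(
        [points[labels == k].mean(axis=0) for k in range(n_components)]
    )
    return labels, centers


# Hypothetical data: three unit circles, 60 points each.
points = np.vstack([
    sample_circle((0.0, 0.0), 1.0, 60),
    sample_circle((5.0, 0.0), 1.0, 60),
    sample_circle((2.5, 4.0), 1.0, 60),
])
labels, centers = epsilon_components(points, eps=0.3)
print(len(centers), "components")
print(centers)
```

The choice of `eps` matters: it has to be larger than the spacing between neighboring points on one circle and smaller than the gap between different circles, so this sketch may not carry over to sparse or nested data without tuning.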

I have done a similar project, albeit not for topological data analysis, in MATLAB and this approach worked well there. Both MATLAB and SciPy have great graph theory tools.

Abdullah Ali Sivas
0

H0 tells you about zero-dimensional features, i.e. connected components. Each point on the circumference of a circle is connected to every other point on that circle through its neighbors, so the circumference constitutes one connected component. In the figure you provided, the circles are not fully connected, but wherever you can still get from one point to another along the circumference, those points form a single component: your top-right circle is one connected component, your bottom circle (as shown in the figure) is two connected components, and the center-left one is three.

H1 tells you about enclosed empty areas. A full circle (of which there are none in the image) encloses one empty area, so its H1 count is one.

To build a persistence diagram, one places a disc on each point, increases the disc radius incrementally, and records when components form or die. For example, as the discs centered on the points of a circumference grow, each disc comes into contact with the diametrically opposite disc when the disc radius reaches the circle's radius R. Thus you can extract the radii of the circles. I doubt that you can find the actual coordinates of the circles this way, though.
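The H0 part of this description can be checked without a TDA library. When the filtration is indexed by pairwise distance, the scales at which components merge are the edge lengths of a minimum spanning tree (in terms of disc radius, half of those lengths). The sketch below assumes Euclidean distance and synthetic data; the function names `h0_merge_scales` and `count_components` and the circle positions are hypothetical.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform


def h0_merge_scales(points):
    """Sorted pairwise-distance scales at which two components merge."""
    dist = squareform(pdist(points))  # dense n x n distance matrix
    mst = minimum_spanning_tree(dist)  # n - 1 edges for distinct points
    return np.sort(mst.data)


def count_components(points, eps):
    """Number of connected components when points within eps are linked."""
    merges = h0_merge_scales(points)
    # Every tree edge longer than eps is still "unborn", so it separates
    # two components at this scale.
    return int(np.sum(merges > eps)) + 1


# Hypothetical data: three unit circles, 60 evenly spaced points each.
t = np.linspace(0.0, 2.0 * np.pi, 60, endpoint=False)
unit = np.column_stack([np.cos(t), np.sin(t)])
cloud = np.vstack([unit + c for c in ([0.0, 0.0], [5.0, 0.0], [2.5, 4.0])])

merges = h0_merge_scales(cloud)
print("largest merge scales:", merges[-3:])
print("components at eps=0.3:", count_components(cloud, 0.3))
```

A large jump between consecutive merge scales suggests a natural number of clusters; for data with gaps in the circles, as in the figure, the jump may be much less clear.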