Spatial Clustering

I am trying to produce an algorithm that analyzes a set of data points in 2D space to find clusters based on radius.

Here's what I'm thinking in sort of pseudo code


1
2
3
4
5
6
7
8
9
10
11
12
vector<Point2f> data;
vector<vector<int>> clusters;

for (i = 0; i = data.size(); i++) { 
   for (n = (i+1); n = data.size(); n++) {
       if distance between i and n < threshold
           if i already in clusters
                add n to same cluster as i
           else create new cluster in clusters, and add i and n
}
}
search clusters for points that belong to multiple clusters and merge
I would do this in 2 steps.
step 1: find the center of a group. You can do this using the ideas from physics on finding the center of mass of an object, something like that? The center may or may not be a datapoint!

step 2: build the cluster for that group.

if its based on radius, you can't merge. ?? either a point is within X distance of a centrally located position, or it is not.

this should not require brute force. If you sort by x coordinate and then by y coordinate, you should be able to find boundaries and group everything in between?

you may want to allow total outliers that belong to no clusters.

there are almost certainly algorithms on the web for this.
Last edited on
Topic archived. No new replies allowed.