unsupervised clustering in 10 dimensions
I have a set of ~1000 feature vectors in ~10 dimensions and would like to
cluster them in an unsupervised manner. I am expecting some of the vectors
to bunch together in groups, but quite a lot to be outliers that are
nowhere near each other (so ~5 meaningful clusters and 1 cluster which is
just a uniform distribution in all dimensions).
I'm thinking of using a Gaussian mixture model; does that sound
reasonable? Is a GMM suitable for data of this dimensionality, or is
there perhaps a more suitable technique? Do 1000 vectors sound like
enough for 10-dimensional clustering? I am quite new to this, so I am
trying to get a feel for it. Thanks very much for any insight you might
be able to provide! :)
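To make the question concrete, here is a rough sketch of what I have in mind. It assumes scikit-learn's GaussianMixture and uses synthetic data as a stand-in for my real vectors; the cluster sizes, spreads, value ranges, the diagonal covariance choice, and the use of BIC to pick the number of components are all assumptions made for illustration, not something I know to be right for this problem.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Synthetic stand-in for the data described above (all numbers are assumptions):
# 5 tight Gaussian clusters of 120 points each, plus 400 uniform background points.
n_dims, n_clusters, per_cluster, n_background = 10, 5, 120, 400
centers = rng.uniform(0.0, 10.0, size=(n_clusters, n_dims))
clustered = np.vstack([
    c + rng.normal(scale=0.3, size=(per_cluster, n_dims)) for c in centers
])
background = rng.uniform(0.0, 10.0, size=(n_background, n_dims))
X = np.vstack([clustered, background])  # shape (1000, 10)

# Fit GMMs with 1..10 components and keep the one with the lowest BIC.
# Diagonal covariances keep the parameter count small relative to 1000 points.
candidates = range(1, 11)
models = [
    GaussianMixture(
        n_components=k, covariance_type="diag", n_init=3, random_state=0
    ).fit(X)
    for k in candidates
]
bics = [m.bic(X) for m in models]
best = models[int(np.argmin(bics))]

labels = best.predict(X)             # hard cluster assignment per point
log_density = best.score_samples(X)  # per-point log-likelihood under the mixture

print("components chosen by BIC:", best.n_components)
print("cluster sizes:", np.bincount(labels, minlength=best.n_components))
```

My thinking is that points with a very low log_density could be treated as outliers, but I am not sure whether a Gaussian component can sensibly absorb a uniform background like this.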