Which clustering algorithm should be used?

By ahmed jabbar | Aug 07, 2016 10:35PM CEST

Dear supporters

I would like to know which algorithm I should use for my high-dimensional datasets: one has 128 dimensions and the other has 82. The attributes are derived from strings, and the matrix entries are tf-idf values. Which algorithm that is suitable for high-dimensional data can cluster my instances?

Note: these dimensions were produced by string-to-word conversion followed by attribute selection. The resulting data also has one class attribute with 10 labels. I would like to group the instances into clusters and validate the result with cross-validation.




By Thomas | Aug 08, 2016 11:38AM CEST | XLSTAT Agent

Dear Ahmed,

AHC (agglomerative hierarchical clustering) seems suitable for your analysis.
However, for a large dataset, you can run k-means clustering first and then AHC on the result, as described in the linked page below:—-using-k-means-clustering-followed-by-an-ahc?b_id=9283
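
For readers working outside XLSTAT, the two-step idea (k-means first, then AHC on the k-means centroids) can be sketched in plain Python. This is only an illustrative sketch, not XLSTAT's implementation. The function names, the farthest-first initialization, the average linkage, and the scaling of each tf-idf row to unit length are all assumptions made for the example.

```python
import math

def unit_rows(rows):
    """Scale each row to unit length (assumption: a common choice for tf-idf vectors)."""
    scaled = []
    for r in rows:
        norm = math.sqrt(sum(v * v for v in r)) or 1.0  # leave all-zero rows unchanged
        scaled.append([v / norm for v in r])
    return scaled

def mean_vector(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]

def farthest_first(rows, k):
    """Deterministic start: take the first row, then repeatedly the row farthest from all picks."""
    chosen = [rows[0]]
    while len(chosen) < k:
        chosen.append(max(rows, key=lambda r: min(math.dist(r, c) for c in chosen)))
    return [list(c) for c in chosen]

def kmeans(rows, k, max_iter=100):
    """Plain Lloyd iterations; returns (centroids, cluster index per row)."""
    centroids = farthest_first(rows, k)
    assign = None
    for _ in range(max_iter):
        new_assign = [min(range(k), key=lambda c: math.dist(r, centroids[c])) for r in rows]
        if new_assign == assign:  # no row changed cluster: converged
            break
        assign = new_assign
        for c in range(k):
            members = [r for r, a in zip(rows, assign) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = mean_vector(members)
    return centroids, assign

def ahc(points, n_clusters):
    """Average-linkage agglomerative clustering; returns a group index per point."""
    groups = [[i] for i in range(len(points))]

    def linkage(g, h):
        # Mean pairwise Euclidean distance between two groups of points.
        return sum(math.dist(points[i], points[j]) for i in g for j in h) / (len(g) * len(h))

    while len(groups) > n_clusters:
        pairs = ((a, b) for a in range(len(groups)) for b in range(a + 1, len(groups)))
        a, b = min(pairs, key=lambda p: linkage(groups[p[0]], groups[p[1]]))
        groups[a] += groups[b]  # merge the two closest groups
        del groups[b]
    labels = [0] * len(points)
    for group_id, members in enumerate(groups):
        for i in members:
            labels[i] = group_id
    return labels

def two_stage(rows, k_pre, n_final):
    """k-means into k_pre small clusters, then AHC merges their centroids into n_final groups."""
    rows = unit_rows(rows)
    centroids, assign = kmeans(rows, k_pre)
    centroid_group = ahc(centroids, n_final)
    return [centroid_group[a] for a in assign]
```

Calling `two_stage(rows, k_pre, n_final)` on a list of tf-idf rows returns one group index per row. The point of the pre-clustering step is that AHC then only has to compare `k_pre` centroids instead of every instance, which is what makes the approach practical on large datasets.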


