
Which Clustering Algorithm Should Be Used?

By ahmed jabbar | Aug 07, 2016 10:35PM CEST

Dear supporters

I would like to know which algorithm I should use for my high-dimensional datasets. I have two of them: one with 128 dimensions and one with 82. The attributes were derived from string data, and the matrix entries are tf-idf values. Which algorithm is suitable for clustering the instances of such high-dimensional datasets?

Note: these dimensions were produced by a string-to-word conversion followed by an attribute selection process. The resulting data has one class attribute with 10 labels. I would like to group the instances into clusters and validate the result with a cross-validation process.

By Thomas | Aug 08, 2016 11:38AM CEST | XLSTAT Agent

Dear Ahmed,

AHC (agglomerative hierarchical clustering) seems suitable for your analysis.
However, in the case of a large dataset, you can run k-means clustering first and then apply AHC to the result, as described on the page linked below:
https://help.xlstat.com/customer/en/portal/articles/2062322-clustering-big-datasets-with-xlstat---using-k-means-clustering-followed-by-an-ahc?b_id=9283
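
For readers who want to see the two-stage idea outside XLSTAT, here is a minimal Python sketch. It is not taken from the linked tutorial and does not reproduce XLSTAT's implementation. It assumes plain Lloyd's k-means for the first stage and Ward's criterion for the AHC stage (my choice, not something stated above). The function names (`kmeans`, `ahc_ward`, `two_stage`), the parameter `k_pre`, and the toy data are all hypothetical; the toy rows only stand in for real tf-idf vectors.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, iters=50, seed=0):
    """Plain Lloyd's k-means. Returns (centroids, assignment per row)."""
    rng = random.Random(seed)
    centroids = [list(r) for r in rng.sample(rows, k)]
    assign = None
    for _ in range(iters):
        new_assign = [
            min(range(k), key=lambda j: sq_dist(r, centroids[j])) for r in rows
        ]
        if new_assign == assign:  # converged: no row changed cluster
            break
        assign = new_assign
        for j in range(k):
            members = [r for r, a in zip(rows, assign) if a == j]
            if members:  # keep the old centroid if a cluster went empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assign


def ahc_ward(centroids, sizes, n_clusters):
    """Merge size-weighted centroids bottom-up with Ward's criterion.

    Returns a dict: index of input centroid -> final cluster id.
    """
    clusters = {
        i: (list(c), s, [i])
        for i, (c, s) in enumerate(zip(centroids, sizes))
        if s > 0  # ignore pre-clusters that ended up empty
    }
    while len(clusters) > n_clusters:
        keys = list(clusters)
        best = None
        for i, a in enumerate(keys):
            for b in keys[i + 1:]:
                ca, na, _ = clusters[a]
                cb, nb, _ = clusters[b]
                # Increase in within-cluster sum of squares if a and b merge.
                cost = na * nb / (na + nb) * sq_dist(ca, cb)
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        ca, na, ma = clusters[a]
        cb, nb, mb = clusters.pop(b)
        merged = [(na * x + nb * y) / (na + nb) for x, y in zip(ca, cb)]
        clusters[a] = (merged, na + nb, ma + mb)
    mapping = {}
    for final_id, (_, _, members) in enumerate(clusters.values()):
        for m in members:
            mapping[m] = final_id
    return mapping


def two_stage(rows, k_pre, n_clusters, seed=0):
    """Stage 1: k-means into k_pre small groups. Stage 2: AHC on their centroids."""
    centroids, assign = kmeans(rows, k_pre, seed=seed)
    sizes = [assign.count(j) for j in range(k_pre)]
    mapping = ahc_ward(centroids, sizes, n_clusters)
    return [mapping[a] for a in assign]


# Toy data (hypothetical): three groups of 30 rows in 5 dimensions.
rng = random.Random(42)
centers = [[0.0] * 5, [5.0] * 5, [-5.0] * 5]
rows = [[c + rng.gauss(0, 0.3) for c in center]
        for center in centers for _ in range(30)]

labels = two_stage(rows, k_pre=9, n_clusters=3)
print(sorted(set(labels)))
```

The point of the design is that the AHC stage only ever compares `k_pre` centroids rather than every pair of rows, which is what keeps it practical on a large dataset. With real data you would pass the tf-idf rows in place of the toy `rows` and choose `k_pre` well above the number of final clusters you want.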

Regards,
