Clustering for large data

By Dr. Alaa Mahgoub | Oct 19, 2016 01:20AM CEST

I am an ecologist. In my tables, species are in rows and sites are in columns, and a performance percentage is recorded for each species; each site is subdivided into several stands. Is there a way to cluster these species into groups and show, within an AHC structure, which group characterizes one or more sites (for example, group A characterizes sites 1, 8 and 10)? I would then like to analyze the result further to assess the effect of environmental and soil factors on each group, such as clay, loamy or sandy soil, calcium, soil water-holding capacity and so on. Finally, can the source data be used as it is, that is, as percentages or general numeric values, or should it be standardized or normalized first?

Best Answer
By Jean Paul | Oct 19, 2016 12:29PM CEST | XLSTAT Agent

Hello,

Well, you may:
1) Start by clustering your species using a clustering technique (AHC or k-means).
2) Extract the class memberships from your results report.
3) Use the class membership as a subsample identifier in the describing data / descriptive statistics feature. In this feature, fill in the quantitative data with whatever you want to summarize per class: site or environmental variables. You can also do the same using parallel coordinates plots.
4) Alternatively, you may try redundancy analysis or canonical correspondence analysis. Note that both classically work with species in columns and sites in rows, so your table may need to be transposed first. https://help.xlstat.com/customer/en/portal/articles/2062255-canonical-correspondence-analysis-cca-tutorial?b_id=9283
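
For readers who want to see the logic of steps 1 to 3 outside XLSTAT, here is a minimal Python sketch using pandas and SciPy. It is not the XLSTAT procedure itself, only an analogous workflow. The species names, site names and percentages are made up for illustration, and the choices of Ward linkage, Euclidean distance, three classes and unstandardized percentages are assumptions rather than recommendations.

```python
import pandas as pd
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical toy table: species in rows, sites in columns, values in percent.
data = pd.DataFrame(
    {
        "site1": [80.0, 75.0, 5.0, 10.0, 40.0, 45.0],
        "site2": [70.0, 65.0, 10.0, 5.0, 50.0, 35.0],
        "site3": [5.0, 10.0, 90.0, 80.0, 45.0, 50.0],
        "site4": [10.0, 5.0, 85.0, 95.0, 40.0, 45.0],
    },
    index=["sp_a", "sp_b", "sp_c", "sp_d", "sp_e", "sp_f"],
)

# Step 1: agglomerative hierarchical clustering of the species (rows).
# Ward linkage on Euclidean distances is an assumption made for this sketch.
tree = linkage(data.to_numpy(), method="ward")

# Step 2: cut the tree into at most 3 classes and keep the class memberships.
membership = pd.Series(
    fcluster(tree, t=3, criterion="maxclust"),
    index=data.index,
    name="class",
)

# Step 3: use the class membership as a grouping key for descriptive statistics.
# Here: mean percentage per class and site.
class_means = data.groupby(membership).mean()

# Crude indicator of which site each class scores highest on (by mean percentage).
dominant_sites = class_means.idxmax(axis=1)

# Step 4 prerequisite: RDA and CCA classically expect sites in rows and
# species in columns, so the table would be transposed before such an analysis.
sites_by_species = data.T

print(membership)
print(class_means)
print(dominant_sites)
```

The same `groupby` on `membership` could be applied to a table of environmental or soil variables, provided that table is also indexed by species; how to relate site-level soil measurements to species-level classes is a modeling decision this sketch does not settle.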

Best,

