Hi everyone:
One of my clients is interested in applying Principal Component Analysis (PCA) to a dataset.
The data contain 10 million or more records.
If I use XLSTAT to do the PCA, is there a limit on the number of records? If so, what is the maximum?
Is there a limit on the number of variables or factors?
Also, how long would the calculations take for 10 million records in XLSTAT? I am concerned about the speed of the analysis at this scale.
A quick response would be much appreciated.
Thanks and regards,
Michael
By Thierry (XLSTAT Agent) | Apr 08, 2018 10:21AM CEST
Hello Michael,
XLSTAT is currently limited by the size of an Excel worksheet: about 1 million rows (1,048,576) and about 16,000 columns (16,384). I would recommend extracting a subsample of your data for the PCA, since it is very unlikely that you would obtain more information by using the whole dataset. That said, it depends on what your customer wants to do with the results of the PCA.
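As an illustration of the subsampling step only, here is a minimal sketch of one way to draw a uniform random sample of rows from a large CSV export before loading it into Excel. It is not an XLSTAT feature: it assumes the data can be exported as a CSV file with a single header row, it uses reservoir sampling from the Python standard library, and the names `reservoir_sample` and `subsample_csv` are hypothetical helpers chosen for this example.

```python
import csv
import random

# Excel worksheet row limit; the header row counts toward it.
EXCEL_MAX_ROWS = 1_048_576


def reservoir_sample(rows, k, seed=None):
    """Return up to k rows drawn uniformly at random from an iterable.

    Uses reservoir sampling (Algorithm R), so the input is read once
    and only k rows are held in memory at any time.
    """
    rng = random.Random(seed)
    sample = []
    for i, row in enumerate(rows):
        if i < k:
            # Fill the reservoir with the first k rows.
            sample.append(row)
        else:
            # Keep row i with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = row
    return sample


def subsample_csv(src, dst, k, seed=None):
    """Copy the header and a random sample of k data rows from src to dst.

    src and dst are open text file objects (hypothetical CSV export).
    """
    if k + 1 > EXCEL_MAX_ROWS:
        raise ValueError("sample plus header would exceed the Excel row limit")
    reader = csv.reader(src)
    header = next(reader)
    writer = csv.writer(dst)
    writer.writerow(header)
    writer.writerows(reservoir_sample(reader, k, seed))
```

With placeholder file names, this could be called as `subsample_csv(open("records.csv", newline=""), open("sample.csv", "w", newline=""), k=100_000, seed=42)`; the sample size of 100,000 is an arbitrary illustration, not a recommendation from XLSTAT. The sampled rows come out in no particular order, which does not matter for PCA.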
Best,