When your matrix has 156267 columns, the ‘ff’ package needs to preallocate a matrix with 156267 * 156267 cells:

> traceback()

2: ff(vmode = "double", dim = c(NCOL, NCOL))

1: bigcor(X)

The cells are of class double (8 bytes each), so the preallocated matrix takes 156267 * 156267 * 8 bytes = 1.95355E11 bytes, which is 1.95355E11 / 1024 / 1024 / 1024 ≈ 182 GB! I suppose that exceeds your RAM…

Maybe you can split your data somehow…
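The arithmetic above can be checked in a few lines of base R. This is only a rough sketch: the 16 GB budget and the variable names are made up for illustration. It also shows that the cell count overflows R's 32-bit integers, which may be related to the "missing value where TRUE/FALSE needed" error quoted in the question, although that link is an assumption on my side.

```r
n_col <- 156267          # number of columns, as in the question
bytes_per_cell <- 8      # one double takes 8 bytes

# Use doubles for the cell count so the product itself does not overflow
cells <- as.numeric(n_col)^2
gib <- cells * bytes_per_cell / 1024^3
round(gib)                              # 182

# The cell count is larger than the biggest 32-bit integer ...
cells > .Machine$integer.max            # TRUE

# ... so the same product in integer arithmetic overflows to NA
overflowed <- suppressWarnings(156267L * 156267L)
is.na(overflowed)                       # TRUE

# Hypothetical memory budget: how many columns would a square
# matrix of doubles be allowed to have to stay within 16 GB?
budget_gib <- 16
max_cols <- floor(sqrt(budget_gib * 1024^3 / bytes_per_cell))
max_cols
```

With a number like max_cols in hand, one can judge how far the columns would have to be split before a full correlation matrix becomes feasible.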

Cheers,

Andrej

Nice post!

I’ve been trying to run the bigcor function on a matrix with nrow = 144 and ncol = 156267, but I get an error message like the one below:

Error in if (length .Machine$integer.max) stop("length must be between 1 and .Machine$integer.max") : missing value where TRUE/FALSE needed

In addition: Warning message:

In ff(vmode = "single", dim = c(NCOL, NCOL)) :

What can I do?

Unfortunately nlsLM, like any NLS method, aims to minimize the RSS and not the standard errors of the parameters… 😉

If you type vcov(your model), do you see that your two parameters have a high covariance (off-diagonal values)?

I often had this in asymmetric (5-parameter) sigmoidal models, where the asymmetry parameter has a high covariance with the slope parameter, and the converged values were often extremely unrealistic. The only way to avoid this is to set some reasonable bounds on the parameters, using the ‘lower’ and ‘upper’ arguments of nlsLM. However, in most cases, if you set e.g. lower = c(-Inf, 10) and upper = c(Inf, 20) for w in [-Inf, Inf] and a in [10, 20], the iteration will stop when a reaches either 10 or 20. At that point it will have the lowest RSS of all values within the bounds, but not the minimal one over the complete parameter space. Can you define “sensible” boundary values?
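A minimal sketch of both checks, assuming the minpack.lm package is installed. The exponential model, the simulated data and the start values are made up for illustration and are not the model from the question; only the bounds are the ones mentioned above.

```r
library(minpack.lm)

# Hypothetical data from y = a * exp(-w * x) with a = 15, w = 0.3
set.seed(1)
x <- seq(0, 10, length.out = 50)
y <- 15 * exp(-0.3 * x) + rnorm(50, sd = 0.2)
dat <- data.frame(x = x, y = y)

# Bounds are given in the same order as 'start': first w, then a
fit <- nlsLM(y ~ a * exp(-w * x), data = dat,
             start = list(w = 0.5, a = 12),
             lower = c(-Inf, 10), upper = c(Inf, 20))

# Correlation between the estimates; off-diagonal values close to
# +1 or -1 mean the two parameters are hard to separate
cov2cor(vcov(fit))

# Check whether 'a' stopped at one of its bounds
est <- coef(fit)
at_bound <- isTRUE(all.equal(unname(est["a"]), 10)) ||
            isTRUE(all.equal(unname(est["a"]), 20))
at_bound
```

If at_bound is TRUE, the reported value of a is set by the bound rather than by the data, which is the situation described above.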

Cheers,

Andrej

Although in most cases the algorithm returns sensible values, there are a few exceptions. In those exceptions I noticed that the algorithm needs more iterations than average to converge and returns unrealistic values for these parameters, typically higher than normal. I also noticed that the improvement in terms of RSS reduction is marginal. For example, if I were to stop after the 4th iteration, I would get a rather sensible value for “a” (e.g. 1.5) and RSS = 11.5. If I let it converge (e.g. until the 31st iteration), however, I get a small reduction in RSS (11.3 from 11.5) at the cost of an unreasonable value for “a” (e.g. 64!).

What puzzles me even further is that when I check the standard errors, I notice a positive correlation between their size and the number of iterations, as if the s.e. accumulate over iterations. For example, when I specify maxiter = 4, I get s.e. = 5 for “a”, but when I specify maxiter = 6, I get s.e. = 10. This happens even though the RSS decreases. The same holds for w, so it is not a different allocation of errors: both s.e. increase with more iterations.

Any thoughts as to why this happens?

wfct(sum((fitted-observation)^2)/Uncertainty)

Cheers,

Andrej

Thanks a lot for your reply. I didn’t receive any alert by email and only saw your reply now by chance (maybe I forgot to set the notification).

I was wrong: the data set I need is the transposed matrix

dim(dat)

[1] 72 70736

but anyway, the problem was apparently related to RStudio and some permission problems on the server. Now it is working. Thanks a lot!
