When your matrix has 156267 columns, the ‘ff’ package needs to preallocate a matrix with 156267 * 156267 cells:

> traceback()

2: ff(vmode = "double", dim = c(NCOL, NCOL))

1: bigcor(X)

The cells are of type double (8 bytes each), so the preallocated matrix takes 156267 * 156267 * 8 bytes = 1.95355E11 bytes,

which is 1.95355E11 bytes / 1024 / 1024 / 1024 ≈ 182 GB! I suppose that exceeds your RAM…
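The arithmetic above can be reproduced in a few lines of base R. This is only a sketch for checking the numbers, and the variable names are made up for illustration. It also compares the cell count with .Machine$integer.max, the limit named in the error message quoted in this thread.

```r
n_col <- 156267                 # column count from the comment above
cells <- n_col^2                # n_col is a double, so this does not overflow
bytes <- cells * 8              # 8 bytes per double
gib   <- bytes / 1024^3         # same division as /1024/1024/1024

print(gib)                              # roughly 181.9
print(cells > .Machine$integer.max)     # TRUE: more cells than a 32-bit integer can count
```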

Maybe you can split your data somehow…
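One possible way to split the work is sketched here in base R, under the assumption that you only need a summary (here, the largest absolute off-diagonal correlation) rather than the full matrix. It correlates pairs of column blocks and discards each block after use. The toy data, the block width `size`, and all variable names are illustrative and not part of bigcor.

```r
set.seed(1)
X <- matrix(rnorm(20 * 10), nrow = 20)    # toy data: 20 rows, 10 columns
size <- 4                                  # illustrative block width

# Column index ranges: 1:4, 5:8, 9:10
starts <- seq(1, ncol(X), by = size)
blocks <- lapply(starts, function(s) s:min(s + size - 1, ncol(X)))

max_abs <- 0
for (i in seq_along(blocks)) {
  for (j in i:length(blocks)) {
    # Only one small block of correlations is held in memory at a time
    block <- cor(X[, blocks[[i]], drop = FALSE], X[, blocks[[j]], drop = FALSE])
    if (i == j) diag(block) <- NA          # skip self-correlations
    max_abs <- max(max_abs, abs(block), na.rm = TRUE)
  }
}
print(max_abs)
```

On the toy data this gives the same value as computing `cor(X)` in one go, which is a cheap sanity check before scaling up.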

Cheers,

Andrej

Nice post!

I’ve been trying to run the bigcor function on a matrix with nrow=144 and ncol=156267, but I get an error message like the one below:

Error in if (length .Machine$integer.max) stop("length must be between 1 and .Machine$integer.max") :

missing value where TRUE/FALSE needed

In addition: Warning message:

In ff(vmode = "single", dim = c(NCOL, NCOL)) :

What can I do?

Thanks a lot for your reply. I didn't receive any alert by email and only saw your reply just now by chance (maybe I forgot to set the notification).

I was wrong: the data set I actually need is the transposed matrix:

dim(dat)

[1] 72 70736

But anyway, the problem was apparently related to RStudio and some permission problems on the server. Now it is working. Thanks a lot!

Happy New Year!

Is “dat” your complete matrix? If so, you can use the R function “cor” without any problems, because it only has 72 columns:

res <- cor(dat)

If your real matrix is bigger, try setting the "size" parameter smaller than the number of columns, e.g. if you have 2000 columns, use "size = 1000". Otherwise I don't really know. Is class(dat) "matrix"?

Cheers, Andrej

Hi, I would like to ask you something related to bigcor (and apparently ff). Maybe I am wrong, but with a small test matrix everything is OK with bigcor, whereas with my real matrix I get this error:

Loading required package: minpack.lm

Error in if (length .Machine$integer.max) stop("length must be between 1 and .Machine$integer.max") :

missing value where TRUE/FALSE needed

Calls: bigcor -> ff

In addition: Warning message:

In ff(vmode = "double", dim = c(NCOL, NCOL)) :

NAs introduced by coercion to integer range

Execution halted

What does it mean? Am I doing something wrong, or is this some kind of ff constraint?

> dim(dat)

[1] 70736 72

I am sure that the matrix does not have missing values. So I am really lost. Do you understand the warning? Any clue would be great.

Thank you very much,

alex

Thanks so much for your answer. It worked!

In case there are more R beginners reading this, the script I finally used is:

COR <- bigcor(mydata, fun="cor", size=2000, verbose= TRUE)

COR <- as.ffdf(COR)

write.table(COR, file = "C:/myexportfile.csv", sep = ",", qmethod = "double")
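A quick way to convince yourself that such an export can be read back is a round trip on a small stand-in matrix. The sketch below uses only base R. `COR_demo` and the temporary file are placeholders, and `col.names = NA` is an addition (not in the script above) that gives the header an empty first field above the row names.

```r
set.seed(42)
COR_demo <- cor(matrix(rnorm(50 * 4), nrow = 50))   # small stand-in for the real result
out <- tempfile(fileext = ".csv")                   # placeholder path

# col.names = NA writes a blank header cell above the row names
write.table(COR_demo, file = out, sep = ",", qmethod = "double", col.names = NA)

# Read it back, using the first column as row names
back <- as.matrix(read.csv(out, row.names = 1))
print(dim(back))   # 4 4
```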

Kind regards,

Erwan

To look at it in the console, you have to convert it to a matrix:

COR <- bigcor(mydata, fun = "cor", size = 2000, verbose = TRUE)

COR <- COR[1:nrow(COR), 1:ncol(COR)]
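Before pulling everything into RAM this way, it may help to estimate how large the full matrix is and to look at a small corner first. The sketch below uses base R only. `COR_demo` is an in-memory stand-in, on the assumption that the same `[i, j]` indexing used above applies to the ff result.

```r
n_vars <- 8602                      # number of variables mentioned in this thread
mib <- n_vars^2 * 8 / 1024^2        # doubles, 8 bytes each
print(round(mib))                   # 565, i.e. a bit over half a GiB

set.seed(1)
COR_demo <- cor(matrix(rnorm(100 * 6), nrow = 100))   # stand-in, not an ff object
corner <- COR_demo[1:3, 1:3]        # inspect only the top-left 3 x 3 block
print(corner)
```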

However, if your data is too big to view in R, you can convert it to ‘ffdf’ (ff dataframe) and export it to .csv:

COR <- bigcor(mydata, fun = "cor", size = 2000, verbose = TRUE)

COR <- as.ffdf(COR)

write.table(COR, file = "C:\\temp\\test.csv")

Does that work?

Cheers,

Andrej

Thanks for creating this package!

I installed R and loaded “propagate” after I found your website through Google.

I loaded my data file (n=2000; 8602 variables) and got bigcor to run fine and quickly. However, only a few correlations are displayed in the console, not the 8602 x 8602 matrix I was looking for.

I used: bigcor(mydata, fun = "cor", size = 2000, verbose = TRUE)

Is there a way to display the whole correlation matrix in the console? Or to export it in csv (or any other format)?

It might be because I do not understand what ff is doing.

Best regards,

Erwan

Will see what I can do, maybe not a package update but just a function posted here…

Cheers,

Andrej