[Solved] Cannot allocate a vector in R


Simplest answer: Purchase more RAM. If you work in R with large datasets often, it’s worth it. If you don’t have enough memory to load your files, you may not have enough to manipulate them as you want either.

Let’s assume that you could hold this data in RAM and manipulate it as you wish, so that reading it in is your only problem. The comments to your question have provided some ideas. You could also use scan() to read your file instead of read.csv(). It requires a little more setup, because you declare the column types up front, but it is faster and relatively memory-friendly.
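
For instance, here is a minimal sketch, assuming a hypothetical file big.csv with a header row and three columns of known types:

cols <- scan("big.csv",
             what = list(id = integer(), value = numeric(), label = character()),
             sep = ",", skip = 1)   # skip = 1 drops the header row
df <- as.data.frame(cols, stringsAsFactors = FALSE)

Declaring the column types in what is what spares scan() the type guessing and extra copies that read.csv() performs.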

There are several packages available to help people whose data is too large for memory. The one I’m familiar with (though I haven’t used it in years, thanks to falling RAM prices) is filehash. It stores your data on disk instead of in RAM, but you still interact with it the same way. It will be slower, but it will likely resolve your problem.

library(filehash)
library(plyr)                 # ldply() comes from plyr

dbCreate("tmp.db")            # create the on-disk database
dbhandle <- dbInit("tmp.db")
db <- db2env(dbhandle)        # interact with it like an ordinary environment

db$test <- ldply(list.files(path = "C:/rcsv", pattern = "csv", full.names = TRUE),
                 function(filename) {
                   db$dum <- read.csv(filename)  # stored on disk, not in RAM
                   db$dum$filename <- filename   # tag each row with its source file
                   return(db$dum)                # dum lives inside db, not the local scope
                 })

I don’t know whether this approach has been deprecated, but I believe it still works. I’m not sure how much intermediate data ldply() keeps in RAM, though. If this fails, you could instead read each file into the database in a for loop and rbind the pieces together yourself, as in the sketch below.
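
Here is a minimal sketch of that fallback, reusing dbhandle and db from the block above. It stores each file’s data under its own key rather than in a single list (the same idea, slightly simpler), so only one file sits in RAM at a time until the final rbind:

files <- list.files(path = "C:/rcsv", pattern = "csv", full.names = TRUE)
for (i in seq_along(files)) {
  dum <- read.csv(files[i])
  dum$filename <- files[i]
  dbInsert(dbhandle, paste0("piece", i), dum)  # written straight to disk
}
db$test <- do.call(rbind, lapply(seq_along(files),
                                 function(i) dbFetch(dbhandle, paste0("piece", i))))

Note that the final do.call(rbind, ...) still assembles the combined data frame in RAM, so this only helps if the individual files were the bottleneck.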

Another approach is to create an SQLite database and use read.csv.sql() from the sqldf package to load everything into it, binding the files together there; at the end you read the final data frame back out.
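
As a rough sketch, here is the same append-then-read pattern using the RSQLite API directly rather than read.csv.sql() (the file name tmp.sqlite and table name alldata are placeholders I’ve made up):

library(RSQLite)
con <- dbConnect(SQLite(), "tmp.sqlite")
for (f in list.files(path = "C:/rcsv", pattern = "csv", full.names = TRUE)) {
  dum <- read.csv(f)
  dum$filename <- f
  dbWriteTable(con, "alldata", dum, append = TRUE)  # rows accumulate in SQLite
}
result <- dbGetQuery(con, "SELECT * FROM alldata")  # read the bound result out once
dbDisconnect(con)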
