Speed up your R scripts. A cool, optimized way to load, write and store big data frames with the FST package!
A must in our R environment! Unbeaten speed and data frame compression. It’s FST! x100 faster than write.csv()
Are you trying to save and load your DL model or a big dataset in R? Here we show you how to speed up your scripts and reduce disk usage with the FST CRAN package. We are going to benchmark it against base R functions (CSV and RDS formats) and another great package, readr:
library(microbenchmark)
library(readr)
library(fst)
# big_dataset (a data frame) and path (an output directory ending in "/") are assumed to exist already
microbenchmark(
  write.csv(big_dataset, paste0(path, "big_dataset.csv")), # utils
  write_csv(big_dataset, paste0(path, "big_dataset.csv")), # readr
  write_csv(big_dataset, paste0(path, "big_dataset.csv.gz")), # readr GZ
  saveRDS(big_dataset, paste0(path, "big_dataset.RDS")), # base
  write_rds(big_dataset, paste0(path, "big_dataset.RDS")), # readr
  write_fst(big_dataset, paste0(path, "big_dataset.fst")), # fst
  times = 10
)
## Unit: milliseconds
## expr                min         mean       median         max  neval  file_size
## utils (csv)  10943.1161  11232.20073  11098.66610  12011.1538     10     109 MB
## readr (csv)   3140.4450   3442.92772   3388.14280   3768.4109     10     109 MB
## readrGZ       6993.8850   7332.31976   7260.95040   7946.9233     10      23 MB
## base (RDS)    4800.3516   5122.22345   5024.69395   5833.9807     10      15 MB
## readr (RDS)    187.0765    210.74584    211.70760    246.6369     10      46 MB
## fst             60.3065     87.30611     74.94375    154.7718     10      16 MB
Wow! That was cool! The benchmark above times writes only, and there fst gives us an amazing speed plus an incredibly small file!
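The timings above cover writes only. As a minimal sketch of the read side, the snippet below writes a small synthetic data frame with write_fst() and reads it back with read_fst(), first in full and then as a slice of columns and rows. The names sample_df and fst_file are made up for illustration (they stand in for big_dataset and path), and no timings are claimed here.

```r
library(fst)

# Hypothetical stand-in for big_dataset: a small synthetic data frame
set.seed(42)
sample_df <- data.frame(
  id    = 1:1000,
  value = rnorm(1000),
  group = sample(letters, 1000, replace = TRUE),
  stringsAsFactors = FALSE
)

# Write to a temporary file; compress goes from 0 to 100 (50 is the default)
fst_file <- tempfile(fileext = ".fst")
write_fst(sample_df, fst_file, compress = 50)

# Read the whole file back
full_df <- read_fst(fst_file)

# Read only two columns and rows 101 to 200, without loading the rest
subset_df <- read_fst(fst_file, columns = c("id", "value"), from = 101, to = 200)

# The round trip should preserve the data
stopifnot(isTRUE(all.equal(sample_df, full_df)))
stopifnot(nrow(subset_df) == 100)
```

Reading a slice like this is handy when the file on disk is much bigger than the part you actually need in memory.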
By median time, fst is about x3 faster than readr::write_rds() and more than x50 faster than base saveRDS()!
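There is also an incredible gap of more than x100 between fst and utils::write.csv() (about x45 against readr::write_csv()), but the truth is that they are not directly comparable, as they work with quite different file formats: CSV is plain text, while fst is a binary format.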
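To check these ratios yourself, here is a small base R sketch that divides each median time from the benchmark output by the fst median. The vector median_ms is just the median column copied by hand, and its element names are labels chosen for this example.

```r
# Median write times in milliseconds, copied from the benchmark output
median_ms <- c(
  write.csv    = 11098.66610,
  write_csv    = 3388.14280,
  write_csv_gz = 7260.95040,
  saveRDS      = 5024.69395,
  write_rds    = 211.70760,
  write_fst    = 74.94375
)

# How many times slower each writer is than write_fst (1 means equal)
speedup <- round(median_ms / median_ms[["write_fst"]], 1)
print(speedup)
```

Using the mean column instead of the median gives somewhat smaller ratios, because the slowest fst run pulls its mean up.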
Are you going to add FST to your R projects toolbox too?
See related useful tips on TypeThePipe