ffbase2

Next version of ffbase

  • dplyr on ff

Working (with dplyr 0.3), but the documentation is not yet CRAN-ready.

Install

ffbase2 is currently available only from GitHub. To install it, run the following script.

# install.packages("devtools")
devtools::install_github("edwindj/ffbase2")

CRAN-readiness

  • Documentation needs to be completed
  • Add more tests
  • Copy some dplyr internal functions into ffbase2
  • Phase out dependency on ffbase (version 1)

Usage

Creation

Creating a tbl_ffdf from a data.frame creates (or reuses) a temporary ffdf data.frame in the directory set by options("fftempdir").

library(dplyr)    # provides %>%, group_by and summarise
library(ffbase2)
iris_f <- tbl_ffdf(iris)

species <- 
   iris_f %>%
   group_by(Species) %>%
   summarise(petal_width = sum(Petal.Width))
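
Because the temporary files end up in options("fftempdir"), it can be useful to point that option at a directory of your choice before creating the table. The following is a minimal sketch, not the package's prescribed workflow: the `scratch` path is a hypothetical name chosen for illustration, and it assumes ffbase2 picks up the option at the moment tbl_ffdf is called, as described above.

```r
library(ffbase2)

# Hypothetical scratch directory; any writable path would do.
scratch <- file.path(tempdir(), "ff_scratch")
dir.create(scratch, showWarnings = FALSE, recursive = TRUE)

# Tell ff where to put its temporary files.
options(fftempdir = scratch)

# The backing files of this table should now be created under `scratch`.
iris_f <- tbl_ffdf(iris)

# Inspect which files were written to the scratch directory.
list.files(getOption("fftempdir"))
```

Setting the option before the first tbl_ffdf call keeps all temporary ff files in one place, which makes cleanup easier.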

A tbl_ffdf is also an ffdf

iris_f <- tbl_ffdf(iris)
is.ffdf(iris_f)
## [1] TRUE
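
Since a tbl_ffdf is an ffdf, the generic functions that ff provides for ffdf objects should apply to it as well. The sketch below assumes that `dim`, `names` and `as.data.frame` dispatch to the ffdf methods unchanged for a tbl_ffdf; this has not been verified against every ffbase2 version.

```r
library(ffbase2)

iris_f <- tbl_ffdf(iris)

# Basic ffdf inspection: dimensions and column names.
dim(iris_f)
names(iris_f)

# Load the on-disk data back into an ordinary in-memory data.frame.
# Only do this when the data fits in RAM.
iris_df <- as.data.frame(iris_f)
```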

Use src_ffdf for storing your data in a directory

library(ffbase2)
# store an ffdf data.frame in the "./db_ff" directory
cars <- tbl_ffdf(mtcars, src="./db_ff", name="cars")
print(cars, n=2)
## Source:     ffdf ('./db_ff/cars') [32 x 11]
## 
##    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## 1   21   6  160 110  3.9 2.620 16.46  0  1    4    4
## 2   21   6  160 110  3.9 2.875 17.02  0  1    4    4
## .. ... ...  ... ...  ...   ...   ... .. ..  ...  ...

To retrieve tables from an ffdf source, use src_ffdf

src <- src_ffdf("./db_ff")
print(src) 
## src: ffdf ['./db_ff']
## tbls: cars
# what tables are available?
src_tbls(src)
## [1] "cars"
# retrieve table from src
cars <- tbl(src, from="cars") # or equivalently tbl_ffdf(src=src, name="cars")
print(cars, n=2)
## Source:     ffdf ('./db_ff/cars') [32 x 11]
## 
##    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## 1   21   6  160 110  3.9 2.620 16.46  0  1    4    4
## 2   21   6  160 110  3.9 2.875 17.02  0  1    4    4
## .. ... ...  ... ...  ...   ...   ... .. ..  ...  ...
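
A table retrieved from a source can be used with the same dplyr verbs as the iris example in the Creation section. The sketch below is an illustration under assumptions: `db_dir` is a hypothetical directory in the R session's temporary folder (used so the example does not depend on an existing "./db_ff"), and the summary column name `total_hp` is made up for this example.

```r
library(dplyr)
library(ffbase2)

# Hypothetical storage directory for this example.
db_dir <- file.path(tempdir(), "db_ff_example")

# Store mtcars on disk, then retrieve it through the source.
tbl_ffdf(mtcars, src = db_dir, name = "cars")
src  <- src_ffdf(db_dir)
cars <- tbl(src, from = "cars")

# Same verbs as in the iris example: group, then sum a column per group.
hp_by_cyl <-
  cars %>%
  group_by(cyl) %>%
  summarise(total_hp = sum(hp))

print(hp_by_cyl)
```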

Use copy_to to add data to a src_ffdf

src <- src_ffdf("./db_ff")
copy_to(src, iris) # or equivalently tbl_ffdf(iris, src)
## Source:     ffdf ('./db_ff/iris') [150 x 5]
## 
##    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1           5.1         3.5          1.4         0.2  setosa
## 2           4.9         3.0          1.4         0.2  setosa
## 3           4.7         3.2          1.3         0.2  setosa
## 4           4.6         3.1          1.5         0.2  setosa
## 5           5.0         3.6          1.4         0.2  setosa
## 6           5.4         3.9          1.7         0.4  setosa
## 7           4.6         3.4          1.4         0.3  setosa
## 8           5.0         3.4          1.5         0.2  setosa
## 9           4.4         2.9          1.4         0.2  setosa
## 10          4.9         3.1          1.5         0.1  setosa
## ..          ...         ...          ...         ...     ...
src_tbls(src)
## [1] "cars" "iris"
