I am running network simulations, but at this time I’m not certain what sorts of analyses should be run on the resulting networks, so I want to store the network objects for later analysis. I could store each network as an .rda file with the save command, but I feel like that would result in a folder full of thousands, or even millions, of files. That’s unseemly.

It’s not even about speed for me; it’s about cleanly managing the data. Plus I can store different attributes and information about each simulation (the size, the time it took, the parameters used to create it, etc.). Then I can search and recall the results according to different parameters. It would be perfect.

Let’s say this is the object I want to store. It’s a list containing graph objects produced by a Watts-Strogatz model. I could vary the different parameters, but this is just a proof of concept for now. I want to store each graph as a row in a SQLite table.

```r
library(igraph)

# Ten Watts-Strogatz graphs: dimension 1, 100 vertices,
# neighborhood 5, rewiring probability 0.05.
gs <- list()
for (i in 1:10)
  gs[[i]] <- watts.strogatz.game(1, 100, 5, 0.05)
```

#### Blobs

So SQLite is the database of choice here. It stores everything as a single file that I can keep in a Dropbox folder. I like that. And SQLite has a datatype called BLOB, which stores a blob of byte data exactly as it was input.
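The snippets below assume a connection object `con` that never gets created on the page, so here is a minimal setup sketch; the filename `graphs.db` is my own choice, not from the original:

```r
library(RSQLite)

# Open (or create) the single-file SQLite database.
# This is the `con` used in all the queries that follow.
con <- dbConnect(SQLite(), dbname = "graphs.db")
```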

Here I create the table in the database. I only have an _id column for indexing, but I could add other columns that refer to the size of the network, or the different parameters of the WS model.

```r
dbGetQuery(con, 'create table if not exists graphs
                 (_id integer primary key autoincrement,
                  graph blob)')
```
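As a sketch of what those extra columns might look like, here is a hypothetical extended schema; the table name `graphs_meta` and the columns `n`, `nei`, and `p` are my own illustration, not part of the original:

```r
# Hypothetical: store the WS parameters alongside the blob
# so results can be filtered by parameter later.
dbGetQuery(con, 'create table if not exists graphs_meta
                 (_id integer primary key autoincrement,
                  n integer,    -- number of vertices
                  nei integer,  -- neighborhood size
                  p real,       -- rewiring probability
                  graph blob)')

# e.g. recall only the runs with small rewiring probabilities:
# dbGetQuery(con, 'select _id from graphs_meta where p <= 0.05')
```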

#### Serialize

I referred to the unit tests for RSQLite to see how to do a blob insert. We need the R object condensed into a single item we can insert into the database. There are apparently many ways to accomplish this. One is dump, which writes out the structure of an R object. It usually dumps to a file that can be sourced, but I believe it can also write to a character string. I attempted that, without luck; it also spawned warnings about an incomplete deparse, or something along those lines.

But then I learned of serialize, which does exactly what I want: convert an R object to a vector of raw bytes. The line below converts the list into a data.frame in which each row is a raw vector holding one graph object. The I function tells data.frame to take the list as-is, so each entry keeps its whole raw vector intact.

```r
df <- data.frame(g = I(lapply(gs, function(x) serialize(x, NULL))))
```
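Before inserting, a quick sanity check (my addition, not in the original) that the serialize round trip reproduces the graph:

```r
# serialize followed by unserialize should give back an equivalent graph;
# both should report 100 vertices here.
g_back <- unserialize(serialize(gs[[1]], NULL))
vcount(g_back) == vcount(gs[[1]])
```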

```r
# And insert it
dbGetPreparedQuery(con, 'insert into graphs (graph) values (:g)', bind.data = df)
```

#### Retrieve the result

Now we can select the data out of the database and unserialize it. It’s pretty simple.

```r
df2 <- dbGetQuery(con, "select * from graphs")
gs2 <- lapply(df2$graph, unserialize)
```

And now the compulsory network image:

```r
g <- gs2[[1]]
V(g)$size <- log(betweenness(g)) + 1
V(g)$color <- "#66c2a4"
V(g)$frame.color <- "#238b45"
plot(g, vertex.label = NA)
```
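One last check I like to do (my addition): confirm that every graph survived the trip through the database, by comparing the order and size of each retrieved graph against the originals.

```r
# Each stored graph should come back with the same
# number of vertices and edges it went in with.
all(sapply(gs2, vcount) == sapply(gs, vcount))
all(sapply(gs2, ecount) == sapply(gs, ecount))
```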