#### Renjin: put some Java in your R

##### Quick look at using R with some Java 'under the hood'

R is a great, highly flexible language for statistical computing, but it can suffer from serious performance issues, particularly in loop-heavy code. As I’ve steadily increased my use of R, I quickly became aware that I would one day have to learn to integrate R with a faster programming language, the main choice being C++. The Rcpp framework (and R package) was created for exactly this purpose: it allows parts of the R code in a package or project to be rewritten in C++ and easily integrated with R. Using Rcpp brings great performance advantages; however, it obviously requires that one learn C++.

I was about to devote a great deal of time to doing so when, fortuitously, I came across the rather new Renjin project. Renjin is a new (in-development) interpreter for the R language that runs on the Java Virtual Machine (JVM), with the aim of improving on GNU R’s performance. The idea seems to be that it can eventually serve as a drop-in replacement for GNU R. It also seems that the renjin R package can provide performance gains from within GNU R by interfacing with the JVM, just by wrapping standard R code.

## Minimal example

For now, I just thought I would try the example from the renjin R package documentation; more involved examples might be added to this post later or come in separate blog posts of their own. Here we go:

Let’s make sure we have the newest version of Renjin:

if (!require(renjin)) {
  # renjin is not installed yet: install it first, following the Renjin documentation
}
## Loading required package: renjin
library(renjin)

Let’s define a function to simply add by iteration:

bigsum <- function(n) {
  sum <- 0
  for (i in seq(from = 1, to = n)) {
    sum <- sum + i
  }
  sum
}
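Before timing anything, it is worth confirming that the loop computes what we expect. The sketch below is not part of the original example: it repeats `bigsum` so that it runs on its own, and compares the result against the closed form n(n + 1)/2 using a helper called `closed_form`, a name chosen here purely for illustration.

```r
# The loop-based sum from the post, repeated so this block is self-contained.
bigsum <- function(n) {
  sum <- 0
  for (i in seq(from = 1, to = n)) {
    sum <- sum + i
  }
  sum
}

# Closed form for 1 + 2 + ... + n (illustrative helper, not from the post).
closed_form <- function(n) n * (n + 1) / 2

# Both approaches should agree; stopifnot() raises an error otherwise.
n <- 1e5
stopifnot(isTRUE(all.equal(bigsum(n), closed_form(n))))
```

If the check passes silently, the loop and the formula agree for this `n`.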

We can improve the speed of this function by pre-compiling it to bytecode using R’s native bytecode compiler. We’d expect this to save us some time relative to the naive implementation.

bigsumc <- compiler::cmpfun(bigsum) # GNU R's byte code compiler
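As a small sketch (again, not from the renjin documentation), we can check that the compiled function returns the same value as the original, and ask R what its current just-in-time (JIT) compilation level is. Calling `compiler::enableJIT()` with a negative argument only reports the level without changing it. On R 3.4.0 and later the JIT is enabled by default, which matters when comparing a "naive" function against an explicitly compiled one.

```r
library(compiler)

# Same function as in the post, repeated so this block runs on its own.
bigsum <- function(n) {
  sum <- 0
  for (i in seq(from = 1, to = n)) {
    sum <- sum + i
  }
  sum
}
bigsumc <- cmpfun(bigsum)

# Byte-compiling should not change the result.
stopifnot(identical(bigsum(1000), bigsumc(1000)))

# A negative argument queries the current JIT level (0 = off, 3 = highest)
# without modifying it.
jit_level <- enableJIT(-1)
print(jit_level)
```

A non-zero level means R may byte-compile `bigsum` on its own after the first call or two, so the two versions can end up running the same bytecode.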

Alright, now we’re ready to compare the performances of the naive and bytecode-compiled implementations:

time_norm <- system.time(bigsum(1e7))
time_comp <- system.time(bigsumc(1e7))

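Perhaps surprisingly, R’s native bytecode compiler makes essentially no difference to our bigsum function here: in terms of the time spent on the computation, we save about 0 seconds, a factor of roughly 1 (see the table below). One likely explanation is that R 3.4.0 and later enable just-in-time bytecode compilation by default, so the “naive” version gets compiled automatically as well. Maybe renjin can help us out more?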

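time_renjin <- system.time(renjin(bigsum(1e7)))
# Collect the user, system, and elapsed ("total") times into one table
timings <- rbind(naive = time_norm, cmpfun = time_comp, renjin = time_renjin)[, 1:3]
colnames(timings) <- c("user", "system", "total")
print(timings)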
##         user system total
## naive  0.422  0.014 0.441
## cmpfun 0.430  0.008 0.444
## renjin 0.430  0.024 0.241
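The relative speed-ups can be worked out directly from the values reported in the table. The sketch below simply re-enters those numbers by hand (the object names `timings` and `speedup` are my own) and divides the naive timings by each row, so values above 1 mean "faster than naive".

```r
# Timings in seconds, copied from the table above.
timings <- rbind(
  naive  = c(user = 0.422, system = 0.014, total = 0.441),
  cmpfun = c(user = 0.430, system = 0.008, total = 0.444),
  renjin = c(user = 0.430, system = 0.024, total = 0.241)
)

# Divide the naive row by every row, column by column.
speedup <- sweep(timings, 2, timings["naive", ], function(x, ref) ref / x)
print(round(speedup, 2))
```

For the total (elapsed) column this gives about 0.99 for cmpfun and about 1.83 for renjin, while the user-time column stays close to 1 for both.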

Wow. Using renjin, even just as a wrapper, cuts the total (elapsed) time from roughly 0.44 seconds to 0.24 seconds: a factor of about 1.8 relative to both the naive implementation and the bytecode-compiled version of our function. (The user time is essentially unchanged, and the system time is actually slightly higher.) This was just a simple example timed with a single run, but we were able to save a good deal of computational time just by naively calling renjin, and it took just a few extra characters to call it as a wrapper.
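Since a single `system.time()` call can be noisy, here is a minimal sketch of how one might repeat each measurement and compare medians using only base R. The helper `median_elapsed` is hypothetical (my own name, not part of renjin or the compiler package), and the block leaves renjin out so that it runs without the package installed.

```r
library(compiler)

# Same function as in the post, repeated so this block runs on its own.
bigsum <- function(n) {
  sum <- 0
  for (i in seq(from = 1, to = n)) {
    sum <- sum + i
  }
  sum
}
bigsumc <- cmpfun(bigsum)

# Hypothetical helper: call f(n) `reps` times and return the median
# elapsed time in seconds.
median_elapsed <- function(f, n, reps = 5) {
  times <- vapply(
    seq_len(reps),
    function(i) system.time(f(n))[["elapsed"]],
    numeric(1)
  )
  median(times)
}

n <- 1e6
res <- c(
  naive  = median_elapsed(bigsum, n),
  cmpfun = median_elapsed(bigsumc, n)
)
print(res)

# With the renjin package loaded, a third entry could presumably be added as
# median_elapsed(function(n) renjin(bigsum(n)), n), mirroring the call above.
```

The median is used rather than the mean so that one slow run (for example, due to garbage collection) does not dominate the comparison.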

Although Renjin is still in its infancy, I can’t help but be excited for the future of R, and statistical computing in general, given how well it’s already performing. We’re going to be able to (try to) do great things with these new tools ✨
