#[extendr]
fn add(x: i32, y: i32) -> i32 {
+ y
x }
extendr is a project that provides an interface between R and Rust. In the last post, I explained about how to create an R package with extendr briefly. This time, we’ll walk though how to handle various R types.
Vector
Let’s start with the last example in the last post.
While this works perfectly fine with a single value, this fails when the length is more than one.
add(1:2, 2:3)
Error in add(1:2, 2:3): Input must be of length 1. Vector of length >1 given.
This is very easy to fix. In Rust, we can use Vec<T>
to represent a vector of values of type T
.
// I don't explain much about the Rust code this time, but, for now, please don't
// worry if you can't understand what it does at the moment. Probably it's not
// very important to understand this post. Move forward.
#[extendr]
fn add2(x: Vec<i32>, y: Vec<i32>) -> Vec<i32> {
.iter().enumerate().map(|(i, x)| x + y[i]).collect()
x}
add2(1:2, 2:3)
[1] 3 5
Easy!
Wait, didn’t you say we can’t do this…!?
Some of you might remember, in this post, I wrote
We cannot simply pass a variable length of vector
from R to Rust.
Yeah, it’s true it was too difficult because I was struggling to do it via FFI! There’s no metadata available about the length or the structure of the data by default. But, with extendr, we can seamlessly access these metadata via R’s C API. So, in short, extendr is the game changer.
&[T]
If you are already familiar with Rust, you might feel using Vec<T>
as arguments looks a bit weird. In fact, the document of Vec<T>
says:
In Rust, it’s more common to pass slices as arguments rather than vectors when you just want to provide read access. The same goes for
String
and&str
.
(https://doc.rust-lang.org/std/vec/struct.Vec.html#slicing)
Yes, you can use &[T]
instead of Vec<T>
, and this seems to matter on the performance slightly. If you are familiar with Rust to the extent that you know the difference between &[T]
and Vec<T>
(confession: I’m not!), you can should use &[T]
instead. Otherwise, Vec<T>
just works.
#[extendr]
fn add2_slice(x: &[i32], y: &[i32]) -> Vec<i32> {
.iter().enumerate().map(|(i, x)| x + y[i]).collect()
x}
add2_slice(1:2, 2:3)
[1] 3 5
Please note that this isn’t the reference to the original R object, just that to the copied values. If you really want no copying, you should use the “proxy” types, which I’ll cover in the next post.
NA
One more caveat about add()
is that this cannot handle a missing value, NA
.
add(1L, NA)
Error in add(1L, NA): unable to convert R object to primitive
In Rust, we can use Option<T>
to represent an optional, or possibly missing, value.
// pattern match is one of the most powerful things in Rust, btw!
#[extendr]
fn add3(x: Option<i32>, y: Option<i32>) -> Option<i32> {
match (x, y) {
Some(x), Some(y)) => Some(x + y),
(=> NA_INTEGER
_ }
}
This function can handle NA
.
add3(1L, 2L)
[1] 3
add3(1L, NA)
[1] NA
It might be safe to always use Option
since there’s always possibility that R value can be NA
by nature. But, we might want to choose non-Option
version to avoid the overhead (c.f. How much overhead is there with Options and Results? - The Rust Programming Language Forum), so it depends.
Primitive types
Okay, let’s learn about the primitive types at last. Here’s the corresponding table of R types and Rust types. We don’t have the direct equivalent of factor
and complex
here, but let’s talk about it later.
R | Rust |
---|---|
integer |
i32 |
numeric |
f64 |
logical |
bool |
character |
String &str |
factor |
- |
complex |
- |
integer
and numeric
integer
and numeric
can mainly be converted into i32
and f64
respectively. I used “mainly” because it’s not that strict. They both can be converted into either of:
u8
u16
u32
u64
i8
i16
i32
i64
f32
f64
So, in other words, if you don’t want to prevent from numeric values are coerced into integers, you’ll need to check the types by yourself.
logical
logical
is translated from/into bool
. That’s all.
character
character
is a bit tricky in that you can convert it to either of String
and &str
. You’ll probably have to scratch your head to understand the concept of “lifetime” to choose the proper one (confession: I still don’t understand it). But, in short,
String
: choose this when you modify the content strings&str
: choose this (probably with'static
lifetime) when you only reference the strings
If you are not familiar with Rust yet, I recommend you to start with String
. String
is copied around so you might have unnecessary overhead, but it’s generally easier to handle because we need to think about the lifetimes less frequently.
factor
To put things simpler, until this point, I deliberately chose the cases when we have the corresponding types in Rust’s side. But, factor
isn’t the case. It cannot be directly converted into a simple Rust type (at least at the moment). Instead, it can be cast into StrItr
. StrItr
is a “proxy” to the underlying data on R’s side.
I’ll try explaining this in another post, but keep in mind that extendr provides that “proxy”-type of interface as well as the simple conversion to Rust’s primitive types.
list
The corresponding Rust class for list
is List
. A List
can be converted into HashMap<&str, Robj>
. Be careful that R’s list
can be a different data structure than Hashmap
; it can have duplicated elements and unnamed elements.
use std::collections::HashMap;
#[extendr]
fn print_a(x: List) {
let x_hashmap: HashMap<&str, Robj> = x.into_hashmap();
println!("{:?}", x_hashmap.get("a"));
}
print_a(list(a = 1, b = 2))
print_a(list(b = 2))
r!
is a macro to create an R object from a Rust expression, by the way.
Robj
?
As a sneak peak of the next post, let’s take a look at the usage of Robj
.
So far, I created only functions that accepts just one type. What if we want to create a function that accepts multiple types of arguments? In this case, we can create a function that takes Robj
as its argument and convert it by ourselves. Robj
has many methods as_XXX()
to convert to (or, more precisely, extract and copy the value of R object, and turn it into) a type. Here, let’s use as_integer()
to generate Option<i32>
.
#[extendr]
fn int(x: Robj) -> Option<i32> {
.as_integer()
x}
# integer
int(1L)
[1] 1
# not integer-ish
int("foo")
[1] NA
What’s next?
In this post, I focused mainly the Rust’s side of the type ecosystem. Next, I probably need to write about more R-ish things like Function
or Symbol
, which I need some time to understand correctly. Stay tuned…