Previously in this series, I covered how the plugin system could be implemented from scratch. This is a lot of work if you’re dealing with a relatively large codebase, and therefore a complex interface in your plugin system, so let’s see how we can make our lives easier. Since the beginning, I’ve been wanting to try abi_stable for this, as it was created specifically for plugins. But we aren’t really locked to that crate, so I’ll show other alternatives as well, which can even be combined to your liking.
1. Handy tools for our Plugin System
1.1. Async with the C ABI
In a previous post I mentioned that async was not supported in abi_stable. While this is true, because there is no FFI-safe Future in the crate, it’s certainly possible, and it might be of interest later on.
Matthias recently let me know about the async_ffi crate, which lets us do exactly that. It exports the type FfiFuture<T>, which provides the same functionality as Box<dyn Future<Output = T> + Send>:
// This is how regular async works: the first function is practically equivalent
// to the second.
async fn example() -> String {
    read_file().await
}
fn example() -> impl Future<Output = String> {
    async {
        read_file().await
    }
}
// For FFI-safe interfaces there can't be generics involved, so the future is a
// concrete type instead of a trait. This conversion from `Future` to
// `FfiFuture` can be done with `into_ffi`.
fn example() -> FfiFuture<String> {
    async move {
        read_file().await
    }
    .into_ffi()
}
// `FfiFuture<T>` implements `Future<Output = T>`, so it can be awaited as usual
async fn user() {
    let contents: String = example().await;
    println!("{}", contents);
}
Someone asked for this feature in abi_stable back in 2019, but no one seemed interested enough to implement it at the time, so maybe in the future.
1.2. LCCC
The Lightning Creations Compiler Collection provides a set of frontends and backends with a uniform intermediate representation for multiple programming languages, including Rust.
This means that they’ve written their own standard library with the C ABI, which is exactly what we need. It’s much simpler than Rust’s standard library, but it includes the most popular types your library may use: HashMap, Vec, String, Box, etc. The source code is quite nice to read in comparison to std, which often includes lots of procedural macros and various forms of astral magic.
It’s not too popular right now, and it’s still a work in progress, but it serves as an example of what we’re looking for in this article. We just want to simplify our lives by having a #[repr(C)]-compatible standard library so that we don’t have to write it ourselves. If all you need is something simple like LCCC, consider this library or a similar one.
1.3. Safer FFI
If you don’t like any of the solutions listed in this article, and you’re going to end up writing the plugin interfaces by hand, you might be interested in safer_ffi.
All this crate provides is a set of procedural macros to make FFI interfacing an easier and safer task. With it, you’ll be able to get rid of lots of extern "C" and unsafe instances in your code, which can get out of hand in larger codebases. Its documentation is excellent; you can check out its book for more information.
1.4. CGlue
In response to my last post, the cglue crate was brought to my attention by its own creator. It takes a very interesting approach, achieving ABI stability through opaque types.
An opaque type is simply one whose concrete layout you don’t know. There’s no #[repr(C)] needed at all, because one can only interact with it via void pointers and its associated vtables.
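To make the idea of an opaque type more concrete, here is a minimal hand-written sketch using only the standard library. It is not how cglue is implemented; the names `Info`, `InfoVtable` and `OpaqueInfo` are made up for illustration. The concrete type has no `#[repr(C)]`, and only the pointer-plus-vtable pair would cross the boundary.

```rust
use std::marker::PhantomData;
use std::os::raw::c_void;

/// The concrete type. It needs no `#[repr(C)]` because its layout is never
/// exposed: callers only ever see a void pointer to it.
pub struct Info {
    pub value: usize,
}

/// The vtable is the only way to interact with the hidden type.
#[repr(C)]
pub struct InfoVtable {
    pub get_value: extern "C" fn(this: *const c_void) -> usize,
}

extern "C" fn info_get_value(this: *const c_void) -> usize {
    // SAFETY: this function is only stored in `INFO_VTABLE`, which is only
    // paired with pointers created from a live `&Info` in `OpaqueInfo::new`.
    unsafe { (*(this as *const Info)).value }
}

static INFO_VTABLE: InfoVtable = InfoVtable {
    get_value: info_get_value,
};

/// Hypothetical FFI-safe handle: a void pointer plus its vtable.
#[repr(C)]
pub struct OpaqueInfo<'a> {
    this: *const c_void,
    vtable: &'static InfoVtable,
    // Ties the handle to the borrow of the original `Info`.
    _borrow: PhantomData<&'a Info>,
}

impl<'a> OpaqueInfo<'a> {
    pub fn new(info: &'a Info) -> Self {
        OpaqueInfo {
            this: info as *const Info as *const c_void,
            vtable: &INFO_VTABLE,
            _borrow: PhantomData,
        }
    }

    /// Calls through the vtable without knowing the concrete layout.
    pub fn value(&self) -> usize {
        (self.vtable.get_value)(self.this)
    }
}
```

Writing this by hand for every trait gets tedious quickly, which is the boilerplate that crates like cglue generate for you.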
cglue's README showcases the following snippet of code, and the repo even includes an example of a plugin system.
use cglue::*;
// One annotation for the trait.
#[cglue_trait]
pub trait InfoPrinter {
    fn print_info(&self);
}
struct Info {
    value: usize
}
impl InfoPrinter for Info {
    fn print_info(&self) {
        println!("Info struct: {}", self.value);
    }
}
fn use_info_printer(printer: &impl InfoPrinter) {
    println!("Printing info:");
    printer.print_info();
}
fn main() -> () {
    let mut info = Info {
        value: 5
    };
    // Here, the object is fully opaque, and is FFI and ABI safe.
    let obj = trait_obj!(&mut info as InfoPrinter);
    use_info_printer(&obj);
}
cglue is limited to just generating FFI-safe trait objects, trying to make the whole process as straightforward as possible. You could say that cglue covers just a subset of what abi_stable does, because most of this is also available in abi_stable through the sabi_trait procedural macro, which I’ll explain later. It’s possible to combine both crates, which is something cglue plans to do in the future. cglue offers the following benefits over sabi_trait [1]:
It’s possible to generate bindings for C/C++, which means that plugins can be written in languages other than Rust.
You can define trait groups, even with optional traits.
Neither of these is particularly useful for my use-case, but if any of these features interests you, definitely take a deeper look. It’s actively maintained and constantly being improved; the documentation is great and the author frequently uploads updates to his personal blog.
1.5. Miri
Miri is an interpreter for Rust’s mid-level intermediate representation. This doesn’t help us with the plugin system per se, but since it’s very likely that we’re going to end up writing unsafe code, it’s good to know about it. That’s exactly what Miri is used for: detecting undefined behavior, such as using uninitialized data or use-after-frees.
I was going to use Miri from the beginning, but since I’ll be using abi_stable for now, there will be no unsafe code involved. If I end up having to resort to it, I’ll try to add Miri to Tremor’s workflow (mainly their Continuous Integration).
1.6. cbindgen
For the first steps with dynamic loading I think the C/C++ binding generator cbindgen will help us understand what’s going on under the hood. We can take a look at the generated headers and see how it works internally. Unfortunately, it fails to run for the abi_stable crate:
(...)
WARN: Skip abi_stable::CONST - (...)
thread 'main' panicked at 'RResult has 2 params but is being instantiated with 1 values', src/bindgen/ir/enumeration.rs:596:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
This probably has to do with the following warning found in cbindgen's documentation:
NOTE: A major limitation of cbindgen is that it does not understand Rust’s module system or namespacing. This means that if cbindgen sees that it needs the definition for MyType and there exists two things in your project with the type name MyType, it won’t know what to do. Currently, cbindgen’s behaviour is unspecified if this happens. However, this may be ok if they have different cfgs.
If you’re using something else like cglue, this will work without issues. But after letting the maintainers of abi_stable know about this in an issue, they pointed out that this was expected and that they don’t plan on supporting cbindgen because it would take too much effort. Understandable, so let’s move on.
2. Working with abi_stable
I will personally use abi_stable because it seems like the easiest choice for now, and the one that meets my needs best. Not only does it provide a standard library defined with the C ABI, but also lots of other macros and utilities especially useful for plugin systems. With it, I won’t need a line of unsafe, and I’ll avoid reinventing the wheel in many instances.
Once the plugin system is fully functional with abi_stable, I might consider using something more hand-crafted. This switch won’t be too complicated, since our interface will already be #[repr(C)], which is the most troublesome part. All we’d have to do is remove a few procedural macros, switch the abi_stable types, and load the plugins manually with something like libloading. The only thing I want right now is a plugin system that works, and then we can maybe focus on trying to make it available in other languages, making it more performant, or whatever.
So let’s start comparing abi_stable with my experiments in the previous post using raw dynamic linking. I’ve created the abi-stable-simple directory in the pdk-experiments repository. I’ll be taking a look at the already implemented examples for abi_stable in order to make the learning experience smoother. The base structure for a plugin system with abi_stable is the same as always: a crate for the plugin, another for the runtime, and common, with the shared interface.
3. Versioning
abi_stable states this regarding versioning:
This library ensures that the loaded libraries are safe to use through these mechanisms:
The abi_stable ABI of the library is checked, Each 0.y.0 version and x.0.0 version of abi_stable defines its own ABI which is incompatible with previous versions.
Types are recursively checked when the dynamic library is loaded, before any function can be called.
In summary, abi_stable itself is far from being permanently backward compatible, but it automatically makes sure that its versions are compatible when running the plugin. While it doesn’t exactly stick to semantic versioning, it’s good enough for us.
The version checking for the entire common crate is already implemented, i.e., we can’t try to mix different versions that aren’t compatible. We could still add a version string for each kind of plugin if more fine-grained control is needed, as described in the previous post.
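The rule quoted above can be sketched as a small compatibility check. This is a simplified reading of the quote, not abi_stable's actual implementation, and `Version` and `is_compatible` are hypothetical names: two versions are treated as compatible when they share the same x in x.y.z, or the same y when x is 0.

```rust
/// Hypothetical version triple, as in `major.minor.patch`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Version {
    pub major: u32,
    pub minor: u32,
    pub patch: u32,
}

/// Simplified rule: every `x.0.0` and every `0.y.0` release starts a new,
/// incompatible ABI. Patch differences are always accepted here.
pub fn is_compatible(expected: Version, found: Version) -> bool {
    if expected.major != found.major {
        return false;
    }
    if expected.major == 0 {
        // Before 1.0, the minor number acts as the breaking component.
        expected.minor == found.minor
    } else {
        true
    }
}
```

With this rule, a runtime built against 0.2.0 rejects a plugin built against 0.1.0, while 0.2.0 and 0.2.5 can be mixed.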
4. Loading plugins
abi_stable plugins are structured in modules, which can help us split up our functionality into smaller independent pieces. There must always be a root module that initializes the entire library and provides metadata such as the name or the version strings. Then, we can have submodules to organize the functions exported by the library nicely.
Furthermore, the StableAbi trait in abi_stable indicates that a type is FFI-safe. It contains information about the layout of the type, and it can be derived automatically. Each item in abi_stable's standard library (RStr, RSlice<T>, RArc<T>, etc.) implements this trait, and it’s used to make sure the types are compatible when loading the plugin.
This also introduces the concept of prefix types. When a type derives StableAbi and has the #[sabi(kind(Prefix(…)))] attribute, two more types are generated:
<name>_Prefix, which contains all the fields up to the #[sabi(last_prefix_field)] attribute in the original type.
<name>_Ref, which is a pointer to <name>_Prefix that can actually be passed through the FFI barrier safely.
Prefix types are needed to guarantee some kind of individual versioning to avoid breakage in future patches. They let us add more fields to the module after the last_prefix_field attribute in patch (0.0.x) updates. Moving this attribute requires a backward-incompatible version bump. Prefix types are often used for modules and vtables.
For now, I’ll just have a single root module and call it MinMod, exporting the min function:
// Using the stable C ABI
#[repr(C)]
// Deriving the `StableAbi` trait, which defines the layout of the struct at
// compile-time:
// https://docs.rs/abi_stable/0.10.2/abi_stable/derive.StableAbi.html
#[derive(StableAbi)]
// Marking the struct as a prefix-type:
// https://docs.rs/abi_stable/0.10.2/abi_stable/docs/prefix_types/index.html
#[sabi(kind(Prefix))]
pub struct MinMod {
    /// Initializes the state, which will be passed to the functions in this
    /// module. I'll explain more about the state later on.
    pub new: extern "C" fn() -> State,
    /// Calculates the minimum between two integers. This is the last defined
    /// field for the current version. If we try to load fields after this, all
    /// of them will be an `Option`.
    #[sabi(last_prefix_field)]
    pub min: extern "C" fn(&mut State, i32, i32) -> i32,
}
Most of the loading functionality is already handled by abi_stable. The module we’re exporting implements the RootModule trait, which includes functions to load the plugin, such as RootModule::load_from_file or RootModule::load_from_directory:
// Marking `MinMod` as the main module in this plugin. Note that `MinMod_Ref` is
// a pointer to the prefix of `MinMod`.
impl RootModule for MinMod_Ref {
    // The name of the dynamic library
    const BASE_NAME: &'static str = "min";
    // The name of the library for logging and similar purposes
    const NAME: &'static str = "min";
    // The version of this plugin's crate
    const VERSION_STRINGS: VersionStrings = package_version_strings!();
    // Implements the `RootModule::root_module_statics` function, which is the
    // only required implementation for the `RootModule` trait.
    declare_root_module_statics!{MinMod_Ref}
}
When loading directories, it makes the following decisions by default (though we could change them if we wanted to):
It does so non-recursively, i.e., only checking the immediate files in the given directory.
The name of the library must be the RootModule::BASE_NAME in lowercase, according to the Operating System’s defaults. For example, on Linux our plugin would be libmin.so, and on Windows it’d be min.dll.
This means that we should add the following parameter to the plugin’s Cargo.toml file:
[lib]
# This way, the shared object will be saved as `abi_stable` prefers, for example
# `libmin.so`.
name = "min"
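To see where names like libmin.so and min.dll come from, here is a small standard-library sketch of the naming convention described above. It only mirrors the rule as stated in this post; `expected_library_path` is a hypothetical helper, and abi_stable's real lookup logic may differ in its details.

```rust
use std::env::consts::{DLL_PREFIX, DLL_SUFFIX};
use std::path::{Path, PathBuf};

/// Builds the file name a loader would look for, given a base name.
/// `DLL_PREFIX` and `DLL_SUFFIX` depend on the target platform, e.g. `lib`
/// and `.so` on Linux, or an empty prefix and `.dll` on Windows.
pub fn expected_library_path(directory: &Path, base_name: &str) -> PathBuf {
    let file_name = format!(
        "{}{}{}",
        DLL_PREFIX,
        base_name.to_lowercase(),
        DLL_SUFFIX
    );
    // Only the immediate directory is considered: no recursion.
    directory.join(file_name)
}
```

Calling `expected_library_path(Path::new("plugins"), "min")` on Linux yields `plugins/libmin.so`, which is why the `[lib]` name in Cargo.toml has to match the base name.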
Finally, this is what the runtime may look like:
pub fn run_plugin(path: &str) -> Result<()> {
    let plugin = MinMod_Ref::load_from_directory(path.as_ref())?;
    println!("Loading plugin {}", MinMod_Ref::NAME);
    // First we obtain the function pointer. This is not an `Option` because
    // `new` is defined before `min`, the last prefix field.
    let new_fn = plugin.new();
    // We initialize the plugin, obtaining a state.
    let mut state = new_fn();
    // Same for the `min` function
    let min_fn = plugin.min();
    println!("initial state: {:?}", state);
    println!(" min(1, 2): {}", min_fn(&mut state, 1, 2));
    println!(" min(-10, 10): {}", min_fn(&mut state, -10, 10));
    println!(" min(2000, 2000): {}", min_fn(&mut state, 2000, 2000));
    println!("final state: {:?}", state);
    Ok(())
}
Executing the plugin-sample implementation:
$ make debug-sample
Loading plugin min
initial state: State { counter: 0 }
min(1, 2): 1
min(-10, 10): -10
min(2000, 2000): 2000
final state: State { counter: 3 }
5. Handling state
5.1. Regular Rust
As we saw in the previous example, we need some kind of generic State type that each plugin can implement with their own data. In regular Rust, we’d do as follows:
use std::fmt::Debug;
trait State: Debug {}
// Remember that we can't use generics, so we need `dyn`, either by itself as a
// reference, or in a box.
type StateBox = Box<dyn State>;
fn usage(state: &mut StateBox) {
    println!("state debug: {:?}", state);
}
5.2. Interface types
Unfortunately, we already know that regular dyn is not FFI-safe. I covered how it’s possible to work around it with pointers, but here we can resort to abi_stable's safer and more convenient alternatives. Here’s one of them:
#[repr(C)]
#[derive(StableAbi)]
// An `InterfaceType` describes which traits are required when constructing
// `StateBox` and are then usable afterwards.
#[sabi(impl_InterfaceType(Debug, PartialEq))]
struct State;
// A trait object for `State`
type StateBox = DynTrait<'static, RBox<()>, State>;
// It can then be used easily like this
fn usage(state: &mut StateBox) {
    println!("state debug: {:?}", state);
}
Here we first declare a State interface type. Note that even though it’s defined as a struct, this is a translation of the previous snippet of code, so it acts as the empty “trait”. But all it does is establish Debug and PartialEq as its supertraits and give access to them; you can’t really add custom methods to the trait.
Unlike dyn, this even works with supertraits that aren’t object-safe. Thus, we can use something like PartialEq. Its main disadvantage is that it’s limited to a set of 21 hardcoded traits, so it might not be enough for us.
5.3. Trait objects
If we want something more akin to traits in Rust, we can use #[sabi_trait]. The trait has to be object-safe, and by default there’s no support for PartialEq in the list of supertraits, so I’ll remove it.
#[sabi_trait]
pub trait State: Debug {
    fn counter(&self) -> i32;
}
// A trait object for the `State` Trait Object
pub type StateBox = State_TO<'static, RBox<()>>;
// It can then be used easily like this
pub fn usage(state: &mut StateBox) {
    println!("state debug: {:?}", state);
    println!("state counter: {:?}", state.counter());
}
As its documentation explains, this still has a limited number of possible supertraits, but at least it lets us require functions as usual, and it even works with default implementations.
6. Error handling
abi_stable
is just a wrapper over libloading
after all. It
doesn’t include a sandbox, so if the plugin developer was a malicious actor,
they’d have full access to the computer the runtime is being executed on. Other
popular plugin systems such as
nginx’s or
apache’s suffer from the same issues,
for reference.
However, I think it’s not so bad to assume that no bad actors will be involved here. A sandbox would be mandatory if we were working on something like Solana (one of the main users of eBPF in Rust), which basically executes random code from the internet. But with Tremor we can assume that the plugins come from trusted sources because they’re installed and configured manually by the user.
There are some additional security measures that could be implemented in the future, like checking the integrity of the plugins and verifying they come from a trusted source before loading them. Of course, if we could afford to have a sandbox it’d definitely be the best way to do it, but we’ve already seen in this series that it’s currently not really viable for this use-case.
Still, we trust that the plugin developer has good intentions, but not necessarily that they know what they’re doing. We should make fatal errors as hard as possible to happen so that Tremor isn’t constantly crashing. The fewer pitfalls, the better.
The full source for the example that’s supposed to work is here. Let’s see a few ways in which the plugin could go wrong:
6.1. Version mismatch
The versions of the common library are checked automatically. In case there’s a mismatch in those considered incompatible (changes in x.0.0 or 0.x.0), this is what will show up:
$ make debug-versionmismatch
Error when running the plugin:
(...)
Error:incompatible package versions
Expected:
0.2.0
Found:
0.1.0
We can absolutely catch this error gracefully and continue with the execution of the runtime, just like with raw dynamic loading. It’s even easier because it works out of the box.
6.2. Missing fields and wrong types
The layout of every type is recursively checked before trying to use them to make sure they are compatible. Unlike raw dynamic loading, these errors can be caught gracefully, which is a huge plus (it used to segfault):
$ make debug-wrongtype
Error when running the plugin:
Compared <this>:
--- Type Layout ---
type:PrefixRef<'a, MinMod>
(...)
To <other>:
--- Type Layout ---
type:PrefixRef<'a, MinMod>
(...)
0 error(s).
0 error(s)inside:
<other>
(...)
Layout of expected type:
--- Type Layout ---
type:MinMod
(...)
Layout of found type:
--- Type Layout ---
type:MinMod
(...)
(...)
The error message is way too long to show here, but it basically shows the entire layout tree of the types that don’t match for each of its versions (runtime vs plugin). For this example, I changed the State trait to use a boolean instead of an integer counter, and the message describes it perfectly: their sizes, alignments, and types differ in the trait’s methods.
6.3. Panicking
Panicking through the FFI boundary is undefined behaviour; we aren’t guaranteed that the plugin will abort. It may just continue its execution in a completely invalid state, which is scary. But it turns out abi_stable properly handles this for us! It will use what it calls an AbortBomb to even print out the line and file where it happened. This is publicly available through the macro extern_fn_panic_handling.
$ make debug-panic
Loading plugin min
initial state: State { counter: 0 }
thread '<unnamed>' panicked at 'This will crash everything', src/lib.rs:26:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
file:src/lib.rs
line:24
Attempted to panic across the ffi boundary.
Aborting to handle the panic...
If we panic in the plugin it won’t be undefined behaviour anymore because abi_stable already makes sure the panic doesn’t reach the FFI boundary.
7. Panicking and FFI
As we’ve already seen, plugins cannot panic across the FFI boundary under any circumstance [2]. If we aren’t using something like abi_stable, every single function we export in the plugin should wrap its contents in catch_unwind in order to be able to panic.
Unwinding is a process in which all local objects are destroyed, properly calling the destructors in the thread in order to continue execution safely [3] [4]. Knowing this is something taken for granted when taking a look at documentation about exceptions in Rust, but it wasn’t so clear to me at the beginning.
For example, the following snippet will panic after creating the vector. If panics were configured to abort, the contents of the vector wouldn’t be freed at all; the program would just end abruptly, and the cleaning up would be left to the Operating System. But if it unwinds, Rust will call Vec's destructor, freeing its allocated memory properly, making it possible to continue the execution of the program.
{
    let data = vec![1, 2, 3];
    panic!("oh no!");
    println!("My data: {:?}", data); // Unreachable
}
In typical Rust usage, a panic usually means that your program writes some scary message to stderr and then ends. This is because unwinding is propagated and it may end up finishing the execution of the program if it’s not stopped. But that’s exactly what catch_unwind is for:
use std::panic;
let result = panic::catch_unwind(|| {
    let data = vec![1, 2, 3];
    panic!("oh no!");
    println!("My data: {:?}", data); // Unreachable
});
// This will run just fine and print out `true`
println!("Did it panic? {}", result.is_err());
Rust makes it very clear that catch_unwind is not intended for regular error handling (you have Result for that). But in our case we are almost forced to use it in order to not invoke undefined behaviour when panicking through the FFI boundary. Every single function in the FFI interface that has a possibility of panicking should use it so that the panic doesn’t try to propagate. And this is quite tricky because even things like addition may cause a panic (overflow in debug mode).
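As a minimal sketch of what this looks like for an exported function, the example below wraps the body of an `extern "C"` function in catch_unwind and reports the panic through a status code. The function `plugin_divide` and the `STATUS_*` constants are made up for illustration; a real interface would pick its own error convention.

```rust
use std::panic;

/// Hypothetical status codes shared between the plugin and the runtime.
pub const STATUS_OK: i32 = 0;
pub const STATUS_PANICKED: i32 = 1;

/// Divides `a` by `b`, writing the result to `out`.
///
/// Integer division panics on a zero divisor, so the body runs inside
/// `catch_unwind` and the panic never unwinds past this function.
pub extern "C" fn plugin_divide(a: i32, b: i32, out: &mut i32) -> i32 {
    match panic::catch_unwind(|| a / b) {
        Ok(value) => {
            *out = value;
            STATUS_OK
        }
        // The panic payload is dropped; the panic hook has already printed
        // the message by the time we get here.
        Err(_) => STATUS_PANICKED,
    }
}
```

The caller then checks the returned status before reading `out`, which is left untouched when the function panicked. Note that this only works when the plugin is compiled with unwinding panics.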
Let’s see what else we can do about panicking:
7.1. Aborting
The simplest way to do it would be to just configure plugins to abort on panic instead of unwinding. This is possible with the panic = "abort" option in the plugin’s Cargo.toml. It will still show the panic message, but the execution will be completely stopped by an abort:
$ cargo r -q
thread 'main' panicked at 'Oh no!', src/main.rs:2:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
zsh: abort (core dumped) cargo r -q
This is sound because the entire program’s execution ends before reaching the FFI boundary. The problem is that cleaning up will never happen, and that although there’s a hack you can use in your common library to make sure the plugin is compiled with panic = "abort", it’s only available on nightly until this is merged.
7.2. C-unwind
This problem is something the Rust devs are aware of, and that they’re trying to fix. It has been proposed under the "C-unwind" ABI string. Just like how you currently use extern "C", if we used extern "C-unwind", we’d get more guarantees about what happens when a thread panics.
The most relevant things this feature offers us are:
Support for unwinding through the FFI boundary.
A guarantee that even with extern "C", panicking is not undefined behavior; it’ll just abort (except for some very specific cases). Switching between "abort" and "unwind" for the panic option in Cargo.toml is always sound.
Unfortunately, it’s moving somewhat slowly, and I’m not quite sure when this will be ready. In the meantime, we’ll need to use something else to ensure no undefined behaviour occurs in our plugin system.
7.3. AbortBomb
abi_stable does this in a pretty clever way: it creates an AbortBomb struct at the beginning of the function, which contains its filename and line of code. If something panics and unwinds, AbortBomb's destructor will be called, which aborts the program. Otherwise, mem::forget is called for the AbortBomb at the end of the function, which will avoid calling its destructor and the function will be able to end successfully.
Note that even though mem::forget is called, no memory is actually being leaked, because the filename is a &'static str (which lives for the entirety of the program) and the line number is an integer, which will be on the stack and doesn’t need fancy destructors.
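The mechanism can be sketched with a small drop guard. This is not abi_stable's code: `Bomb`, `guarded` and `bomb_went_off` are hypothetical names, and instead of aborting the process (which would make the example impossible to observe) the destructor sets a flag, with an outer catch_unwind used only to inspect the result.

```rust
use std::mem;
use std::panic;
use std::sync::atomic::{AtomicBool, Ordering};

/// Stand-in for `AbortBomb`: its destructor only runs if the guarded code
/// unwinds, because on the success path the bomb is forgotten.
struct Bomb<'a> {
    file: &'static str,
    line: u32,
    // A real bomb would call `std::process::abort()`; we record a flag.
    exploded: &'a AtomicBool,
}

impl<'a> Drop for Bomb<'a> {
    fn drop(&mut self) {
        eprintln!("file:{}\nline:{}", self.file, self.line);
        self.exploded.store(true, Ordering::SeqCst);
    }
}

/// Runs `body` with a bomb armed for the duration of the call.
fn guarded<T, F: FnOnce() -> T>(exploded: &AtomicBool, body: F) -> T {
    let bomb = Bomb { file: file!(), line: line!(), exploded };
    let value = body();
    // No panic happened: defuse the bomb by skipping its destructor.
    mem::forget(bomb);
    value
}

/// Test helper: reports whether the bomb's destructor ran for `body`.
pub fn bomb_went_off<F: FnOnce()>(body: F) -> bool {
    let exploded = AtomicBool::new(false);
    let _ = panic::catch_unwind(panic::AssertUnwindSafe(|| {
        guarded(&exploded, body)
    }));
    exploded.load(Ordering::SeqCst)
}
```

Replacing the flag with `std::process::abort()` gives the behaviour described above: the panic never gets the chance to cross the FFI boundary.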
This approach is completely fine and works great, but it aborts the whole plugin system, so you can’t recover from it at all. In the case of Tremor, if a plugin panics, from a logical standpoint it doesn’t make much sense to continue the execution because there’s a piece missing in the pipeline. It couldn’t continue anyway… Right? Well, we could actually load the plugin that panicked again and use that instead for the remainder of the program. But since our plugin system doesn’t support unloading, we’d be leaking memory, and if the plugin keeps panicking it’d eventually crash.
Recovering from a plugin panicking is definitely viable, and it might be an interesting feature for the future. Unfortunately, it’s a lot of work to make sure it works properly, and it’s not really an objective for the first implementation, so for now I’ll just use abi_stable's solution.
7.4. Recovering with catch_unwind
As I explained in the beginning, catch_unwind can be used to detect and stop unwinding panics. One way to notify the runtime that a plugin has panicked so that it can act accordingly would be to use an enum equivalent to Option<T>:
#[repr(C)]
#[derive(Debug, StableAbi)]
pub enum MayPanic<T> {
    Panic,
    NoPanic(T)
}
MayPanic is a type that only returns the original value if the function finished without panicking. Since the contents returned by catch_unwind are just dyn Any and don’t provide much value for us, they’re discarded and the Panic variant is empty. The panicking information will be printed automatically as output anyway (or whatever is configured with the panic hook). We will use it in FFI contexts, so it also implements StableAbi and it’s #[repr(C)].
I didn’t want to use Result for this because panic errors should be treated differently from a regular error. Apart from the fact that the error returned by panic::catch_unwind is a Box<dyn Any + Send>, which doesn’t implement Error, panics happen when the plugin reaches an unrecoverable state and cannot continue. We really have to make sure this is handled differently from a regular error, so having the type safety of a different type can help.
It implements From<thread::Result<T>>, so it can simply be used like this:
fn plugin_stuff() -> MayPanic<Whatever> {
    panic::catch_unwind(|| {
        // Code goes here
    })
    .into()
}
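The From implementation itself isn't shown above, so here is a minimal sketch of what it could look like, using only the standard library (the StableAbi derive is left out so that it compiles on its own). The function `checked_index` is a hypothetical example of a plugin function that may panic.

```rust
use std::panic;
use std::thread;

#[repr(C)]
#[derive(Debug, PartialEq)]
pub enum MayPanic<T> {
    Panic,
    NoPanic(T),
}

// `thread::Result<T>` is `Result<T, Box<dyn Any + Send + 'static>>`, which
// is exactly what `panic::catch_unwind` returns.
impl<T> From<thread::Result<T>> for MayPanic<T> {
    fn from(result: thread::Result<T>) -> Self {
        match result {
            Ok(value) => MayPanic::NoPanic(value),
            // The payload is discarded: the panic hook already reported it.
            Err(_payload) => MayPanic::Panic,
        }
    }
}

/// Indexing out of bounds panics, which ends up as `MayPanic::Panic`.
pub fn checked_index(items: &[i32], index: usize) -> MayPanic<i32> {
    panic::catch_unwind(|| items[index]).into()
}
```

The runtime can then match on the result and decide what to do with a plugin that returned `MayPanic::Panic`.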
Ideally, MayPanic could be accompanied by a #[may_panic] procedural macro that adds this boilerplate automatically to the function it’s attached to. Additionally, it could come with a #[may_not_panic] variant that attaches the #[no_panic] macro from the no-panic crate to make sure the statement is true at compile time. However, no-panic isn’t too reliable, so perhaps it could be opt-in with something like #[may_not_panic(enforce)].
Something that complicates this whole thing considerably is the concept of exception safety. Unfortunately, catch_unwind isn’t as easy to use as just slapping your code into its closure/function, as there are some types that aren’t considered unwind safe. You can read more about that here, but I won’t get into more details because we aren’t going to use MayPanic in our own plugin system anyway.
8. Type conversions
It’s important to know the complexity of conversions from and to abi_stable types. If Vec<T> → RVec<T> wasn’t \(O(1)\) it might be worth avoiding it altogether.
This means that I should spend at least a bit of my time on understanding how the abi_stable types are implemented and making sure this isn’t the case. In std, the definition of Vec is actually quite simple if we remove most of the noise:
// A non-null pointer to `T` that indicates ownership.
pub struct Unique<T: ?Sized> {
    pointer: *const T, // The data itself
    _marker: PhantomData<T>, // Indicating that we own a `T`
}
// Low level type related to allocation
pub struct RawVec<T> {
    ptr: Unique<T>,
    cap: usize,
}
pub struct Vec<T> {
    buf: RawVec<T>,
    len: usize,
}
It’s mostly self-explanatory; a Vec<T> is a pointer to T with a set capacity and length. What about abi_stable's implementation?
#[repr(C)] // Notice this, so that it's FFI-safe
#[derive(StableAbi)] // This trait marks `RVec` as FFI-safe, with info about its layout
pub struct RVec<T> {
    pub(super) buffer: *mut T,
    pub(super) length: usize,
    capacity: usize,
    vtable: VecVTable_Ref<T>,
    _marker: PhantomData<T>,
}
Yup, basically the same, but packed inside a single struct. The single difference is that we have a field with the vtable. The conversion between these types is written with a macro, but if expanded, it looks like this:
impl<T> From<Vec<T>> for RVec<T> {
    fn from(this: Vec<T>) -> RVec<T> {
        let mut this = std::mem::ManuallyDrop::new(this);
        RVec {
            vtable: VTableGetter::<T>::LIB_VTABLE,
            buffer: this.as_mut_ptr(),
            length: this.len(),
            capacity: this.capacity(),
            _marker: PhantomData,
        }
    }
}
The only “weird” part is the usage of std::mem::ManuallyDrop, which is simply a wrapper that tells Rust not to call the destructor of its contents automatically. In this case it’s basically a less error-prone std::mem::forget, as its docs explain. Thanks to it, the memory from the Vec won’t be dropped when this function ends, and its pointer ownership can be safely moved into RVec, with no copying.
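To convince myself that no elements are copied, the same conversion can be reproduced with a stripped-down type and checked by comparing pointers. `FfiVec` below is a hypothetical simplification, not abi_stable's RVec: it has no vtable, so it assumes both sides of the conversion share the same allocator, and it leaks its buffer if it is dropped without calling `into_vec`.

```rust
use std::mem::ManuallyDrop;

/// Simplified FFI-safe vector: just the three raw parts of a `Vec<T>`.
#[repr(C)]
pub struct FfiVec<T> {
    buffer: *mut T,
    length: usize,
    capacity: usize,
}

impl<T> From<Vec<T>> for FfiVec<T> {
    fn from(vec: Vec<T>) -> Self {
        // Prevent `Vec`'s destructor from freeing the buffer we now own.
        let mut vec = ManuallyDrop::new(vec);
        FfiVec {
            buffer: vec.as_mut_ptr(),
            length: vec.len(),
            capacity: vec.capacity(),
        }
    }
}

impl<T> FfiVec<T> {
    pub fn as_ptr(&self) -> *const T {
        self.buffer
    }

    /// Converts back, again without touching the elements.
    pub fn into_vec(self) -> Vec<T> {
        // SAFETY: the three fields were taken from a `Vec<T>` in `from`, and
        // `self` is consumed, so the buffer is reclaimed exactly once.
        unsafe { Vec::from_raw_parts(self.buffer, self.length, self.capacity) }
    }
}
```

Since only a pointer and two integers are moved, the cost doesn't depend on the number of elements, and the buffer address is the same before and after the conversion.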
This happens for every type I checked in abi_stable, including RSlice<T>, which contains a reference to a slice, RStr, which is just an RSlice<u8>, and RString, which is just an RVec.
9. Thread safety
abi_stable uses libloading, whose error-handling is not fully thread-safe on some platforms, such as dlerror on FreeBSD [5] [6]. It’s fully thread-safe on Linux [7], macOS [8], and Windows [9], so for Tremor specifically we don’t have to worry about this. But if your program supports other Operating Systems, you might want to check their manuals one by one in order to make sure.
However, for the first version of our system this won’t be a problem at all. For simplicity’s sake, loading plugins after the startup will not be implemented yet, and we’ll do it sequentially. But it’s good to know it for the future.
10. Performance
I first tried to write these benchmarks with cargo nightly’s implementation. However, since it’s so basic, not updated regularly, and requires nightly, I moved to criterion, which I quite liked after using it for another post.
First, we can take a look at already implemented plugin systems in order to have an idea of the performance hit we’ll experience in Tremor. This is what we should expect once our system is polished and ready for deployment:
nginx reports 20% slower startup times and up to a 5% slowdown in their execution times [10].
This article explains that the only performance difference is saving the resolved address of the symbol in a table the first time, and then it’s just a couple more instructions to access it. Also, obviously, the fact that the compiler can’t optimize parts of the code (e.g., inline function calls).
These are the results of the benchmarks I wrote, on my not-so-fast laptop:
dynamic setup time: [652.53 ns 654.72 ns 657.34 ns]
Found 7 outliers among 100 measurements (7.00%)
3 (3.00%) high mild
4 (4.00%) high severe
abi_stable setup time: [30.386 ns 30.477 ns 30.575 ns]
Found 9 outliers among 100 measurements (9.00%)
7 (7.00%) high mild
2 (2.00%) high severe
dynamic runtime time: [1.8814 ns 1.8878 ns 1.8947 ns]
Found 5 outliers among 100 measurements (5.00%)
1 (1.00%) low mild
2 (2.00%) high mild
2 (2.00%) high severe
abi_stable runtime time: [3.2155 ns 3.2325 ns 3.2494 ns]
Found 3 outliers among 100 measurements (3.00%)
1 (1.00%) low mild
2 (2.00%) high mild
native runtime time: [817.39 ps 819.33 ps 821.38 ps]
Found 6 outliers among 100 measurements (6.00%)
3 (3.00%) high mild
3 (3.00%) high severe
Note that the benchmarks still don’t represent a real usage of Tremor; it’s just using the plugin I described in this post with the min function. But we can more or less analyze the performance differences between abi_stable and raw dynamic loading. I doubt it’s worth implementing the final version with both methods just to run some benchmarks.
The loading times aren’t so important for performance because they only happen once at the beginning of the program. But abi_stable's way of recursively checking the types in the plugins is not free; the difference with raw dynamic loading should be quite noticeable. But somehow, in my benchmarks abi_stable was way faster. What??
It turns out that abi_stable just leaks the library when it’s loaded to prevent a use-after-free. And since it won’t be unloaded anyway, it’s not much of a problem in terms of leaking memory. The library will be saved into a static variable (of type LateStaticRef), and the next times it’s loaded the initial value will be reused. So in my benchmarks for abi_stable, loading only actually happens once, and for dynamic loading it happens for every iteration.
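The effect on the benchmark is easy to reproduce with the standard library's `OnceLock`, which I'm using here as a rough stand-in for LateStaticRef (the real type works differently internally). `Library`, `expensive_load` and `load_cached` are hypothetical names; the counter shows how many times the expensive part actually runs.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Placeholder for a loaded dynamic library.
pub struct Library {
    pub name: &'static str,
}

// Counts how many times the expensive load really happens.
static ACTUAL_LOADS: AtomicUsize = AtomicUsize::new(0);
// The loaded library lives in a static for the rest of the program.
static LOADED: OnceLock<Library> = OnceLock::new();

fn expensive_load() -> Library {
    ACTUAL_LOADS.fetch_add(1, Ordering::SeqCst);
    Library { name: "min" }
}

/// Loads the library the first time, and reuses the stored value afterwards.
pub fn load_cached() -> &'static Library {
    LOADED.get_or_init(expensive_load)
}

pub fn actual_loads() -> usize {
    ACTUAL_LOADS.load(Ordering::SeqCst)
}
```

A benchmark that calls `load_cached` in a loop mostly measures the cost of reading an already initialized static, which would explain the surprisingly low setup time.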
Once the library is loaded, it seems that using dynamic loading versus static linking is quite bad, being more than twice as slow. This is understandable; the problem with the native benchmark was, and most likely still is, that the Rust compiler is too smart. If I called min with fixed parameters (say 10.min(3)), it was optimized away, so I had to write a more intricate example that was different for each loop. Furthermore, using tools like sabi_trait instead of a void* almost doubles the execution time again.
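For reference, this is roughly the shape such a benchmark body can take. It isn't the code from my benchmarks: `min` and `bench_body` are written here for illustration, and the sketch uses `std::hint::black_box` as one additional way of discouraging the compiler from folding the calls away, on top of varying the inputs on each iteration.

```rust
use std::hint::black_box;

/// The native version of the plugin's function.
fn min(a: i32, b: i32) -> i32 {
    if a < b {
        a
    } else {
        b
    }
}

/// Calls `min` with inputs that change on every iteration, so the result
/// can't be computed once at compile time.
pub fn bench_body(iterations: i32) -> i64 {
    let mut total = 0i64;
    for i in 0..iterations {
        // `black_box` is only a hint, but in practice it hides the values
        // from the optimizer.
        let result = min(black_box(i), black_box(iterations - i));
        total += i64::from(black_box(result));
    }
    total
}
```

Even with these precautions the native call can still be inlined, so the comparison with the dynamically loaded version should be taken with a grain of salt.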
11. Conclusion
We’ve learned a lot about abi_stable and the overall state of dynamic loading in Rust. We’ll definitely avoid a lot of work thanks to these dependencies. It’s not as bad as I thought; there are plenty of tools for each use-case, though most are admittedly only in early stages.
Hopefully, the performance degradations we’ve found won’t be as noticeable in the final version of the system. We’ll use sabi_trait only when loading the library instead of for each call. And having a more complex use-case will probably avoid such incredible optimizations in the native code. You can find the full statistical reports in the criterion-reports directory of the repository.
In the next article, I’ll cover the different caveats I’m finding as I try to actually implement the plugin system on Tremor, and the different ways in which they can be approached.