Dynamic Casting for Traits

In Rust, traits are a powerful tool for polymorphism, both static and dynamic. I’m going to skip the basics of traits and just link to another blog post with a good explanation of static and dynamic dispatch in Rust: Traits and Trait Objects in Rust.

Instead, I would like to try an experiment: making dynamic dispatch even more dynamic! Like in Java¹.

How Dynamic is a Dynamic Dispatch?

Rust dynamic dispatch works via a special reference type, a trait object, which provides a way to dispatch a call based on the concrete type. Again, I am going to skip the details, but the idea is that a trait object is represented internally by two pointers. The first pointer points to the data itself. The second pointer points to a virtual table where each “slot” holds the address of a function.

Every time the Rust compiler generates code to invoke a function on a trait object, the generated code destructures the trait object into two pointers, the data pointer and the virtual table pointer. Then, it looks up the address of the function to call in the virtual table (each function is assigned a unique slot in the table, so this is a simple indexing operation). Finally, the code calls that function, providing the pointer to the data as the first parameter (which becomes &self in the function).
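To make the two-pointer picture concrete, here is a minimal hand-rolled sketch of what such a call boils down to. The names Shape, ShapeVtable, RawShape and the helper functions are hypothetical and exist only for this illustration; the real compiler-generated tables have more slots and an unspecified layout.

```rust
use std::mem::size_of;

trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

/// Hand-written "virtual table" with a single slot (hypothetical name).
struct ShapeVtable {
    area: unsafe fn(*const ()) -> f64,
}

/// Hand-written "trait object": a data pointer plus a vtable pointer.
struct RawShape {
    data: *const (),
    vtable: &'static ShapeVtable,
}

/// Slot implementation for `Square`: recover the concrete type, then call it.
unsafe fn square_area(data: *const ()) -> f64 {
    unsafe { (*(data as *const Square)).area() }
}

static SQUARE_VTABLE: ShapeVtable = ShapeVtable { area: square_area };

fn raw_from_square(square: &Square) -> RawShape {
    RawShape {
        data: square as *const Square as *const (),
        vtable: &SQUARE_VTABLE,
    }
}

/// A dynamic call: pick the slot from the table, pass the data pointer.
fn call_area(obj: &RawShape) -> f64 {
    unsafe { (obj.vtable.area)(obj.data) }
}

/// On current compilers a `&dyn Trait` reference is two pointers wide.
fn fat_pointer_is_two_words() -> bool {
    size_of::<&dyn Shape>() == 2 * size_of::<usize>()
}
```

The built-in &dyn Shape does the same thing for us, which is why the rest of this post needs a vtable pointer and not just a data pointer.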

However, what if we want to dispatch a call based on both the type of the interface we want to dispatch on and the concrete type? What does that mean, and why would we even want it?

Is it a Failure?

One example where this could be useful (it looks reasonable to me, but could be a totally terrible idea!) is the failure crate. This crate provides an error handling abstraction which, among other things, lets you build chains of errors and iterate over these chains.

This is convenient in higher-level error handling code, for example, to collect all the information down the error chain to present it to the end user or the calling system. Each error, in this case, will be hidden behind a trait object of the Fail trait. However, there is not much you can do with that Fail trait. Basically, you can go down the chain or convert an error to its string representation.

To help with that, the Fail trait has built-in downcasting functionality. There is a function on the Fail trait, downcast_ref, which, given the target type, tries to “cast” the referenced Fail implementor to that type. If the type matches (for example, you have a MyError type implementing the Fail trait and you are casting a Fail trait object to that MyError type), it returns a reference to the concrete type.

This way you can cast an error to something more concrete and retrieve additional information from it (for example, it could be column and line information).

One limitation, though, is that you can only cast to a concrete type. You cannot cast to another trait. Therefore, if you have different types of errors that carry similar “extra” information (like line and column information, or any other additional detail about the error), the code which processes the chain of errors needs to know all the possible types.
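The same limitation can be reproduced without the failure crate. The sketch below swaps in std::error::Error, whose trait object offers a similar downcast_ref and a source chain; ParseError, ConfigError and find_location are hypothetical names made up for the illustration.

```rust
use std::error::Error;
use std::fmt;

#[derive(Debug)]
struct ParseError {
    line: u32,
    column: u32,
}

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "parse error at {}:{}", self.line, self.column)
    }
}

impl Error for ParseError {}

#[derive(Debug)]
struct ConfigError {
    cause: ParseError,
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "invalid configuration")
    }
}

impl Error for ConfigError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.cause)
    }
}

/// Walks the chain and extracts a location, but only from the one concrete
/// type it knows about. Every new error type with a location would need
/// another `downcast_ref` branch here.
fn find_location(top: &(dyn Error + 'static)) -> Option<(u32, u32)> {
    let mut current = Some(top);
    while let Some(err) = current {
        if let Some(parse) = err.downcast_ref::<ParseError>() {
            return Some((parse.line, parse.column));
        }
        current = err.source();
    }
    None
}
```

The loop has to name ParseError explicitly, which is exactly the coupling a cast to a trait would remove.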

Not very convenient! Can we do better?

What if we define our own trait, ExtraInfo, like this:

trait ExtraInfo {
    /// Globally unique error code (like "ERR000243").
    fn code(&self) -> &str;
    /// Location where the error happened.
    fn location(&self) -> String;
}

Then we could somehow “cross-cast” a trait object of Fail to a trait object of ExtraInfo and call the code and location functions on that ExtraInfo trait object to collect additional information. Like casting one interface to another in Java!

Plugin Registry

Another example would be a plugin registry. Let’s say, we want to keep a global map of plugins where each plugin is known by its base Plugin trait. However, internally, each plugin could implement additional “interfaces” and we want a mechanism where we can query each plugin for additional functionality.

We could have separate maps for different plugin interfaces. Every time we would need, let’s say, TimerPlugin we could go into timer_plugins map and look for a plugin there. However, let’s assume we want to have a single place where we register plugins (maybe, we have a mechanism to build this map based on a dynamic configuration of some sort!).

Let’s try to build this plugin registry.

Building the Registry

For this experiment, we will start with two specific plugin interfaces. Both are just regular Rust traits:

/// Generate a greeting message for the given name.
trait Greeter {
    fn greet(&self, name: &str) -> String;
}

/// This is a formal version, which uses a first name and a last name.
trait FormalGreeter {
    fn greet_formal(&self, first_name: &str, last_name: &str) -> String;
}

We could have merged both traits into one, with both functions, greet and greet_formal. If a certain function is “not supported”, we could have used a Result returning an error (as one of the options).

It would be a runtime error if we tried to invoke missing functionality, but the same is true for dynamic dispatch: if functionality is not supported, we only find out at runtime.
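For comparison, a rough sketch of that merged alternative might look like the following. The names AnyGreeter, Unsupported, FormalOnly and rsvp_merged are hypothetical; they are not part of the registry built in the rest of this post.

```rust
/// Hypothetical error returned by functions a plugin does not support.
#[derive(Debug, PartialEq)]
pub struct Unsupported;

/// Single merged trait: every function defaults to "not supported".
pub trait AnyGreeter {
    fn greet(&self, _name: &str) -> Result<String, Unsupported> {
        Err(Unsupported)
    }

    fn greet_formal(&self, _first_name: &str, _last_name: &str) -> Result<String, Unsupported> {
        Err(Unsupported)
    }
}

/// A plugin that only overrides the formal greeting.
pub struct FormalOnly;

impl AnyGreeter for FormalOnly {
    fn greet_formal(&self, first_name: &str, last_name: &str) -> Result<String, Unsupported> {
        Ok(format!("Greetings, {} {}.", last_name, first_name))
    }
}

/// Tries the informal greeting first and falls back to the formal one.
pub fn rsvp_merged(first_name: &str, last_name: &str, plugin: &dyn AnyGreeter) -> String {
    plugin
        .greet(first_name)
        .or_else(|_| plugin.greet_formal(first_name, last_name))
        .unwrap_or_else(|_| "Hello?".to_string())
}
```

This works, but the merged trait has to list every function of every interface up front, which is what the separate-traits design avoids.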

Let’s create two implementations of these plugins. One implementation will support both the simple greeting interface and the formal one. Another implementation will count greetings and will only support the formal greeting interface².

use std::sync::atomic::{AtomicUsize, Ordering};

/// Simple greeter, implements both `Greeter` and `FormalGreeter` traits
pub struct SimpleGreeter(String);

impl Greeter for SimpleGreeter {
    fn greet(&self, name: &str) -> String {
        format!("{}, {}!", self.0, name)
    }
}

impl FormalGreeter for SimpleGreeter {
    fn greet_formal(&self, first_name: &str, last_name: &str) -> String {
        format!("{}, {} {}!", self.0, first_name, last_name)
    }
}

/// Counting greeter, only implements `FormalGreeter` trait
pub struct CountingGreeter(AtomicUsize);

impl FormalGreeter for CountingGreeter {
    fn greet_formal(&self, first_name: &str, last_name: &str) -> String {
        self.0.fetch_add(1, Ordering::Relaxed);
        format!("Greetings, {} {}.", last_name, first_name)
    }
}

Now we want to implement a function that tries to use the informal greeter interface, if it is supported by the provided plugin. If the plugin does not support the informal interface, the function will try the formal interface instead. The function will use dynamic dispatch since, according to the story, we don’t know the real implementation until runtime.

pub fn rsvp(first_name: &str, last_name: &str, plugin: &???) -> String {
    // what do we write here?...
}

First, what should we specify as the plugin parameter type so that it basically means “any implementation whatsoever”? It is like dynamic typing, where you might have a reference to “any object”. Any ideas?

Any

There’s a trait in Rust, std::any::Any which is said to emulate dynamic typing. Let’s try to use it here.

As a first step, let’s see whether std::any::Any can help us in one way or another. We will put it in place of the question marks above and try to downcast to our traits.

use std::any::Any;

pub fn rsvp(first_name: &str, last_name: &str, plugin: &Any) -> String {
    if let Some(gt) = plugin.downcast_ref::<Greeter>() {
        return gt.greet(first_name);
    }
    if let Some(gt) = plugin.downcast_ref::<FormalGreeter>() {
        return gt.greet_formal(first_name, last_name);
    }
    "Hello?".to_string()
}

It doesn’t work, however:

error: the `downcast_ref` method cannot be invoked on a trait object
   --> src/lib.rs:162:30
    |
162 |     if let Some(gt) = plugin.downcast_ref::<Greeter>() {
    |                              ^^^^^^^^^^^^

It’s not surprising, as the documentation explicitly says “cannot be used to test whether a type implements a trait”. OK. Let’s take a step back here and use the concrete types. After all, we know that only SimpleGreeter implements the Greeter trait, and CountingGreeter only implements the formal greeting.

Let’s quickly replace the downcasts with plugin.downcast_ref::<SimpleGreeter>() and plugin.downcast_ref::<CountingGreeter>(), then run the following test:

#[test]
fn test() {
    let simple = SimpleGreeter("Hi".to_string());
    let formal = CountingGreeter(Default::default());
    assert_eq!("Hi, Andrew!", rsvp("Andrew", "Baker", &simple));
    assert_eq!("Greetings, Baker Andrew.", rsvp("Andrew", "Baker", &formal));
    assert_eq!(1, formal.0.load(Ordering::Relaxed));
}

The test passes. That’s encouraging, but that’s not exactly what we want – our function should not know about concrete data types. We are modeling an open plugin registry, after all.
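For reference, the concrete-type variant of rsvp described above can be sketched as follows. It is written with the current &dyn Any spelling rather than the older bare &Any used in the listings of this post, and it repeats the type definitions so that it stands on its own.

```rust
use std::any::Any;
use std::sync::atomic::{AtomicUsize, Ordering};

trait Greeter {
    fn greet(&self, name: &str) -> String;
}

trait FormalGreeter {
    fn greet_formal(&self, first_name: &str, last_name: &str) -> String;
}

pub struct SimpleGreeter(String);

impl Greeter for SimpleGreeter {
    fn greet(&self, name: &str) -> String {
        format!("{}, {}!", self.0, name)
    }
}

pub struct CountingGreeter(AtomicUsize);

impl FormalGreeter for CountingGreeter {
    fn greet_formal(&self, first_name: &str, last_name: &str) -> String {
        self.0.fetch_add(1, Ordering::Relaxed);
        format!("Greetings, {} {}.", last_name, first_name)
    }
}

/// Works, but only because the function names every concrete type it accepts.
pub fn rsvp(first_name: &str, last_name: &str, plugin: &dyn Any) -> String {
    if let Some(gt) = plugin.downcast_ref::<SimpleGreeter>() {
        return gt.greet(first_name);
    }
    if let Some(gt) = plugin.downcast_ref::<CountingGreeter>() {
        return gt.greet_formal(first_name, last_name);
    }
    "Hello?".to_string()
}
```

Any value of another type, even one implementing Greeter, falls through to the "Hello?" branch.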

Let’s dig deeper into the Any trait to see how it works and why it doesn’t support traits.

Digging into Any

Let’s take a look at the downcast_ref function of the Any trait, which can be seen here.

The downcast_ref function on the Any trait checks if self is of type T and, if so, does an unsafe cast to type T. This is problem number one. If T is a trait, we are supposed to return a trait object. However, a trait object needs two pointers: a pointer to the data itself and a pointer to the virtual table. We have a data pointer (via self), but we don’t have a pointer to the virtual table!

Problem number two is the implementation of the is function. It doesn’t do much: it gets some magic “type id” value for T and compares it to the value reported by the data type itself. There is a get_type_id function on the Any trait (named type_id in current stable Rust) which can be invoked via a trait object. The implementation of this function for each concrete type knows the type id (by using the same TypeId machinery).
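The two steps can be mimicked in a few lines. This is a sketch that mirrors the shape of the standard library code rather than a copy of it; is_type and downcast_sketch are hypothetical names, and type_id is the stable spelling of the function.

```rust
use std::any::{Any, TypeId};

/// Compare the id of the requested type with the id reported by the value.
fn is_type<T: Any>(value: &dyn Any) -> bool {
    TypeId::of::<T>() == value.type_id()
}

/// If the ids match, keep only the data pointer and reinterpret it as `&T`.
/// This is why `T` must be a concrete type here: there is no vtable pointer
/// to put into a trait object.
fn downcast_sketch<T: Any>(value: &dyn Any) -> Option<&T> {
    if is_type::<T>(value) {
        Some(unsafe { &*(value as *const dyn Any as *const T) })
    } else {
        None
    }
}
```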

Can we extend this mechanism to work with traits?

An Idea

First, let’s draft the downcasting function we are looking for. It should be something that is very similar to the downcast_ref of Any, but should allow for traits in its signature:

pub fn downcast_ref<T: ?Sized>(&self) -> Option<&T> { ... }

The difference here is that we are lifting the implicit Sized requirement with the relaxed bound ?Sized, so we can substitute a trait name for the type variable T. If T is substituted with a trait name, the return type is an Option of a trait object.

What about the internal implementation? We need a way to somehow get a pointer to the virtual table. This virtual table must correspond to the implementation of the trait substituted for T. The idea here is that by using dynamic dispatch on a common interface, Plugin, we can “ask” the type itself to provide its implementation for the given trait T.

The plan is to add a function to the trait that, given the target trait type T, returns a trait object &T. I was about to call this function QueryInterface³, but decided to use a different name:

pub trait Plugin {
    fn __downcast_ref(&self, target: ???) -> Option<???>;
}

However, what should we take as the target parameter and what should we return from this function? The first answer is straightforward: we can use the same TypeId mechanism to get the unique identifier for the trait.
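A quick sketch shows that TypeId does give distinct identifiers for trait object types. The helper interface_id is a hypothetical name, and the traits are written as dyn Greeter here, the current spelling of what the listings in this post write as a bare Greeter.

```rust
use std::any::TypeId;

trait Greeter {
    fn greet(&self, name: &str) -> String;
}

trait FormalGreeter {
    fn greet_formal(&self, first_name: &str, last_name: &str) -> String;
}

/// The identifier we would pass as the `target` argument.
/// `?Sized` is what allows `T` to be a trait object type.
fn interface_id<T: ?Sized + 'static>() -> TypeId {
    TypeId::of::<T>()
}
```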

The second one is a little bit tricky. We cannot return a trait object &T, as this function cannot have type parameters⁴. If we cannot return a trait object, can we return… a TraitObject?

TraitObject is a type in the standard library which has the same memory representation as a trait object. It is a nightly-only struct, but it can be re-created in user code (with an assumption that layout won’t change⁵). However, for this exercise, we would enable the nightly feature.
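Re-creating the struct in user code could look roughly like this. The sketch relies on the assumption named above, that a trait object reference is laid out as a data pointer followed by a vtable pointer; the language does not guarantee this, so treat it as an experiment only. The helpers to_raw and from_raw are hypothetical.

```rust
use std::mem;

/// User-defined stand-in for the nightly `std::raw::TraitObject`.
/// ASSUMPTION: `&dyn Trait` is (data pointer, vtable pointer), in that order.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct TraitObject {
    pub data: *mut (),
    pub vtable: *mut (),
}

trait Greeter {
    fn greet(&self, name: &str) -> String;
}

struct SimpleGreeter(String);

impl Greeter for SimpleGreeter {
    fn greet(&self, name: &str) -> String {
        format!("{}, {}!", self.0, name)
    }
}

/// Split a trait object reference into its two raw pointers.
fn to_raw(greeter: &dyn Greeter) -> TraitObject {
    unsafe { mem::transmute::<&dyn Greeter, TraitObject>(greeter) }
}

/// Reassemble a trait object reference. The caller must guarantee that `raw`
/// came from `to_raw` and that the referenced value is still alive.
unsafe fn from_raw<'a>(raw: TraitObject) -> &'a dyn Greeter {
    unsafe { mem::transmute::<TraitObject, &'a dyn Greeter>(raw) }
}
```

Checking that the data field equals the address of the original value is a cheap runtime sanity test for the layout assumption.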

So, the Plugin trait and its implementation for SimpleGreeter will look like this:

#![feature(raw)]
use std::any::TypeId;
use std::raw::TraitObject;

pub trait Plugin: 'static {
    fn __downcast_ref(&self, _target: TypeId) -> Option<TraitObject> {
        None
    }
}

impl Plugin for SimpleGreeter {
    fn __downcast_ref(&self, target: TypeId) -> Option<TraitObject> {
        unsafe {
            if target == TypeId::of::<Greeter>() {
                return Some(std::mem::transmute(self as &Greeter));
            }
            // other interfaces
            // ...
        }
        None
    }
}

In the implementation, we check the target trait against all traits implemented by the concrete type. If there is a match, we generate a trait object by casting self (which is of type &SimpleGreeter) to a reference to the target trait. Then, we use an unsafe std::mem::transmute call to transmute it to a TraitObject.

Finally, let’s look at the public API for the downcasting, the downcast_ref function on the Plugin trait. Now that we have a way to retrieve TraitObject for the given trait type T, we can simply cast it back:

impl Plugin {
    pub fn downcast_ref<T: ?Sized + 'static>(&self) -> Option<&T> {
        unsafe {
            if let Some(obj) = self.__downcast_ref(TypeId::of::<T>()) {
                Some(*(&obj as *const TraitObject as *const &T))
            } else {
                None
            }
        }
    }
}

Minor Inconvenience

Note that we cannot transmute TraitObject directly to the type &T using Some(std::mem::transmute(obj)), since we would get an error:

   |
53 |                 Some(std::mem::transmute(obj))
   |                      ^^^^^^^^^^^^^^^^^^^
   |
   = note: source type: std::raw::TraitObject (128 bits)
   = note: target type: &T (pointer to T)

The reason is that we cannot require T to be a trait. If T is substituted with a concrete type name, the size of the reference is the size of one pointer, and it would not be possible to transmute. Instead, we cast the pointer to the TraitObject into a pointer to &T and read the value through that pointer.

Test!

Finally, let’s rewrite our function to use Plugin instead of Any and change the test code:

pub fn rsvp(first_name: &str, last_name: &str, plugin: &Plugin) -> String {
    if let Some(gt) = plugin.downcast_ref::<Greeter>() {
        return gt.greet(first_name);
    }
    if let Some(gt) = plugin.downcast_ref::<FormalGreeter>() {
        return gt.greet_formal(first_name, last_name);
    }
    "Hello?".to_string()
}

#[test]
fn test() {
    let simple: &Plugin = &SimpleGreeter("Hi".to_string()) as &Plugin;
    let formal: &Plugin = &CountingGreeter(Default::default()) as &Plugin;
    assert_eq!("Hi, Andrew!", rsvp("Andrew", "Baker", simple));
    assert_eq!("Greetings, Baker Andrew.", rsvp("Andrew", "Baker", formal));
}

It compiles and the test passes!
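The listings above need a nightly compiler and the pre-dyn trait object syntax. As a rough sketch, the same experiment can be tried on stable Rust with a user-defined TraitObject, under the loud assumption that &dyn Trait is laid out as (data pointer, vtable pointer). This variant only handles trait targets and is an illustration, not the code from the linked repository.

```rust
use std::any::TypeId;
use std::mem;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Stand-in for `std::raw::TraitObject`.
/// ASSUMPTION: `&dyn Trait` is laid out as (data pointer, vtable pointer).
#[repr(C)]
#[derive(Clone, Copy)]
#[allow(dead_code)]
pub struct TraitObject {
    data: *const (),
    vtable: *const (),
}

pub trait Greeter {
    fn greet(&self, name: &str) -> String;
}

pub trait FormalGreeter {
    fn greet_formal(&self, first_name: &str, last_name: &str) -> String;
}

pub trait Plugin: 'static {
    fn __downcast_ref(&self, _target: TypeId) -> Option<TraitObject> {
        None
    }
}

impl dyn Plugin {
    pub fn downcast_ref<T: ?Sized + 'static>(&self) -> Option<&T> {
        let obj = self.__downcast_ref(TypeId::of::<T>())?;
        // This sketch only supports trait targets (two-pointer references).
        if mem::size_of::<&T>() != mem::size_of::<TraitObject>() {
            return None;
        }
        Some(unsafe { mem::transmute_copy::<TraitObject, &T>(&obj) })
    }
}

pub struct SimpleGreeter(String);

impl Greeter for SimpleGreeter {
    fn greet(&self, name: &str) -> String {
        format!("{}, {}!", self.0, name)
    }
}

impl Plugin for SimpleGreeter {
    fn __downcast_ref(&self, target: TypeId) -> Option<TraitObject> {
        if target == TypeId::of::<dyn Greeter>() {
            let object: &dyn Greeter = self;
            return Some(unsafe { mem::transmute::<&dyn Greeter, TraitObject>(object) });
        }
        None
    }
}

pub struct CountingGreeter(AtomicUsize);

impl FormalGreeter for CountingGreeter {
    fn greet_formal(&self, first_name: &str, last_name: &str) -> String {
        self.0.fetch_add(1, Ordering::Relaxed);
        format!("Greetings, {} {}.", last_name, first_name)
    }
}

impl Plugin for CountingGreeter {
    fn __downcast_ref(&self, target: TypeId) -> Option<TraitObject> {
        if target == TypeId::of::<dyn FormalGreeter>() {
            let object: &dyn FormalGreeter = self;
            return Some(unsafe { mem::transmute::<&dyn FormalGreeter, TraitObject>(object) });
        }
        None
    }
}

pub fn rsvp(first_name: &str, last_name: &str, plugin: &dyn Plugin) -> String {
    if let Some(gt) = plugin.downcast_ref::<dyn Greeter>() {
        return gt.greet(first_name);
    }
    if let Some(gt) = plugin.downcast_ref::<dyn FormalGreeter>() {
        return gt.greet_formal(first_name, last_name);
    }
    "Hello?".to_string()
}
```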

A Few Improvements

Casting to the Concrete Type

We made our Plugin trait support casting to other traits. However, compared to Any, we lost the ability to cast to the concrete type. Can we add it back?

Let’s consider the following alteration to the __downcast_ref of SimpleGreeter (and the similar block of code for the CountingGreeter):

if target == TypeId::of::<SimpleGreeter>() {
    return Some(TraitObject {
        data: self as *const _ as *mut (),
        vtable: std::ptr::null_mut(),
    })
}

If the target is the concrete type itself, we construct a TraitObject with a valid data pointer, but a null pointer for the virtual table. Remember that minor inconvenience with casting TraitObject into &T? It turns out that the way we did that cast is actually beneficial! If T is a concrete type, the reference to T is the size of one pointer. Therefore, we only read the first field, data, from the TraitObject, making our cast work for the concrete type itself, too!

Let’s add the following piece to the test to see if that works:

// cast back to the concrete type
let simple = simple.downcast_ref::<SimpleGreeter>().unwrap();
assert_eq!(simple.0, "Hi");

// cast back to the concrete type
let formal = formal.downcast_ref::<CountingGreeter>().unwrap();
assert_eq!(1, formal.0.load(Ordering::Relaxed));

And all the tests pass! With that addition, we can now cast to the concrete data types, too!

Creating a Macro

One unfortunate property of this approach is the need to write this implementation of __downcast_ref manually. Let’s create a macro to do that:

#[macro_export]
macro_rules! declare_interfaces (
    ( $typ: ident, [ $( $iface: ident ),* ]) => {
        impl Plugin for $typ {
            fn __downcast_ref(&self, target: ::std::any::TypeId) -> Option<::std::raw::TraitObject> {
                unsafe {
                    $(
                    if target == ::std::any::TypeId::of::<$iface>() {
                        return Some(::std::mem::transmute(self as &$iface));
                    }
                    )*
                }
                if target == ::std::any::TypeId::of::<$typ>() {
                    Some(::std::raw::TraitObject {
                        data: self as *const _ as *mut (),
                        vtable: ::std::ptr::null_mut(),
                    })
                } else {
                    None
                }
            }
        }
    }
);

Now, instead of implementing the function manually, we can write:

declare_interfaces!(SimpleGreeter, [Greeter, FormalGreeter]);
declare_interfaces!(CountingGreeter, [FormalGreeter]);
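For readers without a nightly toolchain, here is a hedged sketch of how the macro and the concrete-type cast might be combined on stable Rust. It uses a user-defined TraitObject and therefore assumes the (data pointer, vtable pointer) layout, which the language does not guarantee; it is an illustration rather than the code from the linked repository.

```rust
use std::any::TypeId;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Stand-in for `std::raw::TraitObject` (the layout is an assumption).
#[repr(C)]
#[derive(Clone, Copy)]
pub struct TraitObject {
    pub data: *const (),
    pub vtable: *const (),
}

pub trait Plugin: 'static {
    fn __downcast_ref(&self, _target: TypeId) -> Option<TraitObject> {
        None
    }
}

impl dyn Plugin {
    pub fn downcast_ref<T: ?Sized + 'static>(&self) -> Option<&T> {
        let obj = self.__downcast_ref(TypeId::of::<T>())?;
        // Reads one pointer for a concrete `T`, two pointers for a trait `T`.
        Some(unsafe { *(&obj as *const TraitObject as *const &T) })
    }
}

macro_rules! declare_interfaces {
    ($typ:ident, [$($iface:ident),*]) => {
        impl Plugin for $typ {
            fn __downcast_ref(&self, target: ::std::any::TypeId) -> Option<TraitObject> {
                $(
                    if target == ::std::any::TypeId::of::<dyn $iface>() {
                        let object: &dyn $iface = self;
                        return Some(unsafe {
                            ::std::mem::transmute::<&dyn $iface, TraitObject>(object)
                        });
                    }
                )*
                if target == ::std::any::TypeId::of::<$typ>() {
                    // Concrete type: valid data pointer, no vtable needed.
                    return Some(TraitObject {
                        data: self as *const $typ as *const (),
                        vtable: ::std::ptr::null(),
                    });
                }
                None
            }
        }
    };
}

pub trait Greeter {
    fn greet(&self, name: &str) -> String;
}

pub trait FormalGreeter {
    fn greet_formal(&self, first_name: &str, last_name: &str) -> String;
}

pub struct SimpleGreeter(String);

impl Greeter for SimpleGreeter {
    fn greet(&self, name: &str) -> String {
        format!("{}, {}!", self.0, name)
    }
}

impl FormalGreeter for SimpleGreeter {
    fn greet_formal(&self, first_name: &str, last_name: &str) -> String {
        format!("{}, {} {}!", self.0, first_name, last_name)
    }
}

pub struct CountingGreeter(AtomicUsize);

impl FormalGreeter for CountingGreeter {
    fn greet_formal(&self, first_name: &str, last_name: &str) -> String {
        self.0.fetch_add(1, Ordering::Relaxed);
        format!("Greetings, {} {}.", last_name, first_name)
    }
}

declare_interfaces!(SimpleGreeter, [Greeter, FormalGreeter]);
declare_interfaces!(CountingGreeter, [FormalGreeter]);
```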

Cross-casting

Finally, there is one small annoyance: it is not possible to cross-cast from one plugin interface to another:

let simple = simple.downcast_ref::<Greeter>().unwrap();
// Nope! Compilation error!
let simple = simple.downcast_ref::<FormalGreeter>().unwrap();

It wouldn’t be an issue if it were possible to upcast from a Greeter or FormalGreeter trait object into a Plugin trait object, but it is not⁶. This could be solved, though, by adding another trait, PluginBase (alternatively, the function could be added to the Plugin trait itself and the declare_interfaces! macro changed to implement it):

pub trait PluginBase {
    fn as_plugin(&self) -> &(Plugin + 'static);
}

impl<T: Plugin> PluginBase for T {
    fn as_plugin(&self) -> &Plugin {
        self
    }
}

Then we make Plugin require PluginBase (so every &Plugin gets the as_plugin function):

pub trait Plugin: PluginBase + 'static { ... }

Finally, we make each plugin trait require the Plugin bound (which wasn’t required before, but we need it to get that as_plugin function):

trait Greeter: Plugin { ... }
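Put together, the upcasting helper can be sketched on its own, independent of the downcasting machinery. The name method on Plugin and the function plugin_name_of are hypothetical and exist only so the example has something to call. Note that Rust 1.86 and later can also coerce &dyn Greeter to &dyn Plugin directly when Plugin is a supertrait, so this helper matters mostly for older compilers.

```rust
pub trait PluginBase {
    fn as_plugin(&self) -> &dyn Plugin;
}

/// Blanket implementation: every concrete plugin can hand out `&dyn Plugin`.
impl<T: Plugin> PluginBase for T {
    fn as_plugin(&self) -> &dyn Plugin {
        self
    }
}

pub trait Plugin: PluginBase + 'static {
    /// Hypothetical method, present only for the illustration.
    fn name(&self) -> String;
}

pub trait Greeter: Plugin {
    fn greet(&self, name: &str) -> String;
}

pub struct SimpleGreeter(String);

impl Plugin for SimpleGreeter {
    fn name(&self) -> String {
        "simple".to_string()
    }
}

impl Greeter for SimpleGreeter {
    fn greet(&self, name: &str) -> String {
        format!("{}, {}!", self.0, name)
    }
}

/// Goes from a `Greeter` trait object back to the base `Plugin` trait object.
pub fn plugin_name_of(greeter: &dyn Greeter) -> String {
    greeter.as_plugin().name()
}
```

Once a &dyn Plugin is available again, a downcast_ref to any other interface can follow, which gives the cross-cast.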

Final Thoughts

So, this is one way you can do casting between traits! One limitation, however, is that it requires a 'static bound on the type.

Another limitation is that all “implemented” interfaces need to be declared upfront (because in this implementation we need to generate a “dispatching” function that knows all supported interfaces). This shortcoming could be lifted by implementing a dynamic registry for the implemented traits. In the end, all we need is a mapping from two type identifiers (the TypeId of the concrete type and the TypeId of a trait) to the virtual table corresponding to that combination. It could be a shared map that is somehow dynamically built and then looked up during the cast. Initializing such a map, however, could be tricky: the beauty of the implementation above is that it does not require any initialization.
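As a very rough sketch of such a registry, the map below is keyed by the pair of type identifiers. To stay on stable Rust it stores a caster function per pair instead of a raw virtual table pointer, which is a deliberate substitution. CastRegistry, Caster and simple_as_greeter are hypothetical names, and how the map gets populated at startup is left open.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

pub trait Greeter {
    fn greet(&self, name: &str) -> String;
}

pub struct SimpleGreeter(String);

impl Greeter for SimpleGreeter {
    fn greet(&self, name: &str) -> String {
        format!("{}, {}!", self.0, name)
    }
}

/// Wrapper around a function that turns `&dyn Any` into `&T` if it can.
struct Caster<T: ?Sized + 'static>(fn(&dyn Any) -> Option<&T>);

/// Map from (concrete type id, trait id) to a type-erased `Caster`.
#[derive(Default)]
pub struct CastRegistry {
    casters: HashMap<(TypeId, TypeId), Box<dyn Any>>,
}

impl CastRegistry {
    /// Registers a caster from the concrete type `C` to the target `T`.
    pub fn register<C: 'static, T: ?Sized + 'static>(
        &mut self,
        caster: fn(&dyn Any) -> Option<&T>,
    ) {
        let key = (TypeId::of::<C>(), TypeId::of::<T>());
        self.casters.insert(key, Box::new(Caster(caster)));
    }

    /// Looks up the caster for (type of `obj`, `T`) and applies it.
    pub fn cast<'a, T: ?Sized + 'static>(&self, obj: &'a dyn Any) -> Option<&'a T> {
        let key = (obj.type_id(), TypeId::of::<T>());
        let caster = self.casters.get(&key)?.downcast_ref::<Caster<T>>()?;
        (caster.0)(obj)
    }
}

/// Caster for the (SimpleGreeter, dyn Greeter) pair.
fn simple_as_greeter(obj: &dyn Any) -> Option<&(dyn Greeter + 'static)> {
    let greeter = obj.downcast_ref::<SimpleGreeter>()?;
    Some(greeter as &(dyn Greeter + 'static))
}
```

A registration call would then look like registry.register::<SimpleGreeter, dyn Greeter>(simple_as_greeter), followed by registry.cast::<dyn Greeter>(&value) at the use site.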

The full source code is here.

P.S. There is a library on crates.io implementing this idea. It also implements more features, like dynamic registration of interfaces and support for reference-counted pointers (Rc and Arc).

References

Discussion on /r/rust.

¹ Full disclaimer: I did use Java a lot, so be warned!
² I’m using an AtomicUsize for the interior mutability to avoid dealing with mutable references (all references to trait objects will be shared references).
³ This is indeed a reference to the Component Object Model, which works similarly (the way I remember it)!
⁴ Object Safety Is Required for Trait Objects
⁵ It might, though
⁶ Why doesn’t Rust support trait object upcasting?