Personal collection of Rust hacks
Rust is a great language. However, I have the (probably bad) habit of pushing everything I get my hands on to its very limits. It’s like an ideal gas ‒ it expands until it hits some walls. Not that I do it on purpose, it just somehow happens. So it’s no surprise I’ve hit things that looked impossible or weren’t ready yet even in Rust… but in many cases I’ve beaten on the problem long enough that it gave up and I came up with an ugly hack to work around it.
I’ve decided to share the ones I can remember off the top of my head in case someone ends up in a similar situation. None of them requires enabling any nightly features, but in one case the current stable compiler is unable to crunch through it yet (beta can; it’s probably some bug that got fixed).
Also note I’m typing this without compiling the examples, so you may need to fix typos or add some type hints ‒ but the idea itself should hold.
Poor Man’s try block
While I haven’t paid attention to the discussion about the final syntax, a try block is being planned. It will allow limiting an early return by a question mark to the scope of a block instead of a whole function, something like this:
fn load_cache(path: impl AsRef<Path>) -> Result<Cache, Box<dyn Error + Send + Sync>> {
    let data = try {
        // Question marks here exit only the try block, not the function
        let mut f = File::open(path)?;
        let mut bytes = Vec::new();
        f.read_to_end(&mut bytes)?;
        bytes
    };
    let cache = match data {
        // But this question mark exits the function
        Ok(data) => Cache::from_data(data)?,
        Err(_) => {
            warn!("Couldn't read cache, using an empty one instead");
            Cache::default()
        }
    };
    Ok(cache)
}
The try block is a matter of convenience. It doesn’t bring anything we couldn’t do without it, but grouping some error sites together and handling them in one place is handy, and doing it with .and_then or other similar methods often feels awkward.
But the try block is not there yet. So, let’s roll our own:
use std::error::Error;
use std::fs::File;
use std::io::Read;
use std::path::Path;

macro_rules! pmtb {
    ($($body: tt)*) => {
        // A closure that is called right away; `?` inside returns from it
        (|| -> Result<_, Box<dyn Error + Send + Sync>> {
            Ok({
                $($body)*
            })
        })()
    }
}

fn load_cache(path: impl AsRef<Path>) -> Result<Cache, Box<dyn Error + Send + Sync>> {
    let data = pmtb! {
        // Question marks here exit only the try block, not the function
        let mut f = File::open(path)?;
        let mut bytes = Vec::new();
        f.read_to_end(&mut bytes)?;
        bytes
    };
    unimplemented!("The rest of the stuff...");
}
What does that do? The basic idea is to create a closure (which works as the barrier for ? and for whatever return happens to be inside) and execute it right away. AFAIK this differs from the proposed try block in two ways: it doesn’t work for all Try types, only for results (fixing that is left as an exercise for the reader O:-)), and a return inside a real try block would exit the outer function, not only the block (that probably can’t be made to work, but who cares about that?).
This variant is with „OK wrapping“, one without would be trivial to construct too.
By the way, I’ve been told this is used in JavaScript and actually has a name ‒ but I forgot the name.
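The variant without OK wrapping could look roughly like the sketch below. The names pmtb_raw and sum_or_zero are made up for illustration; the only real difference from pmtb is that the body has to evaluate to a Result on its own.

```rust
use std::error::Error;

// Hypothetical name: the same trick as `pmtb!`, but without wrapping the
// body in `Ok`, so the body itself must end with a `Result`.
macro_rules! pmtb_raw {
    ($($body:tt)*) => {
        (|| -> Result<_, Box<dyn Error + Send + Sync>> {
            $($body)*
        })()
    };
}

// Hypothetical helper: adds two numbers given as text, falling back to 0
// if either of them fails to parse.
fn sum_or_zero(a: &str, b: &str) -> i32 {
    let sum = pmtb_raw! {
        // A failing `?` here leaves only the closure, not `sum_or_zero`.
        let x: i32 = a.parse()?;
        let y: i32 = b.parse()?;
        Ok(x + y)
    };
    // Both error sites are handled together, in one place.
    sum.unwrap_or(0)
}
```

Calling sum_or_zero("2", "3") goes through the happy path, while sum_or_zero("2", "x") hits the second question mark and ends up in the fallback.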
Poor Man’s FnBox
We have the nice Box<dyn Fn()> and Box<dyn FnMut()>. You can create a Box<dyn FnOnce()> too, but there’s no way to call it (yet), because the call consumes the FnOnce. For that, Rust needs to move it onto the stack, and for that it needs to know how large it is ‒ which is precisely the thing we don’t want to care about when boxing trait objects. This problem is not specific to FnOnce; it affects any method that consumes self, and a similar technique would work there too.
The standard library has a workaround called FnBox, but it is unstable. Eventually, Box<dyn FnOnce()> will start working (out of the box?). Until then, let’s do it this way:
fn set_cback<F: FnOnce() + 'static>(&mut self, f: F) {
    // The Option lets us move the FnOnce out from behind a &mut
    let mut f = Some(f);
    // self.cback is Box<dyn FnMut()>
    self.cback = Box::new(move || {
        (f.take().expect("FnOnce called more than once!"))()
    });
}
The trick here is that we wrap the FnOnce into a new FnMut. As we create a new FnMut for each F that is passed to us, this FnMut does know the proper size and can call the FnOnce. And boxing an FnMut works. Needless to say, if you try to call it more than once, this FnMut will panic.
If you worry about the panic, you can wrap this into a type that hides these details ‒ it holds the Box<dyn FnMut()> privately and its .call method consumes the type, bringing the compile-time check back. But it’s a bit sad this type can’t be called directly with ().
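Such a wrapper type might be sketched as follows. OnceCallback is a made-up name, not something from the standard library, and the 'static bound is an assumption needed because the box doesn’t carry a lifetime.

```rust
// Hypothetical wrapper type; the name is for illustration only.
pub struct OnceCallback(Box<dyn FnMut()>);

impl OnceCallback {
    pub fn new<F: FnOnce() + 'static>(f: F) -> Self {
        let mut f = Some(f);
        OnceCallback(Box::new(move || {
            // The panic can't be reached through the public API,
            // because `call` consumes the wrapper.
            (f.take().expect("FnOnce called more than once!"))()
        }))
    }

    // Taking `self` by value makes a second call a compile-time error.
    pub fn call(mut self) {
        (self.0)()
    }
}
```

A closure that moves a captured value out (and is therefore only FnOnce) can be passed to OnceCallback::new, and trying to use the wrapper again after .call() is rejected by the borrow checker as a use of a moved value.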
Anonymous Associated Types
In Rust, some types are so evil that we can’t even write their names. There are some places that let you talk about such types, such as generic parameters or impl Trait. But other places don’t ‒ specifically, associated types in traits. There’s this existential type thing coming… but it’s not here yet.
Often, one needs to put such a type there because the trait somehow produces that type. And there’s one thing in Rust with superpowers that actually can have an anonymous associated type ‒ the Fn trait (and friends). We are going to abuse that. How? Through blanket implementations, like this:
use std::fmt::Debug;

pub trait ProduceStuff {
    type Product: Debug;
    fn produce_stuff(&self, how_much: usize) -> Self::Product;
}

struct ProduceFn<F>(F);

impl<F, P> ProduceStuff for ProduceFn<F>
where
    F: Fn(usize) -> P, // Here is where we steal the associated type
    P: Debug,
{
    type Product = P;
    fn produce_stuff(&self, how_much: usize) -> P {
        (self.0)(how_much)
    }
}

pub fn make_producer() -> impl ProduceStuff {
    ProduceFn(|how_much: usize| unimplemented!())
}
You can get this very far… probably as far as your stomach for trait bound length and ugly error messages can keep up.
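To see it with a product type that really can’t be named, here is a small self-contained sketch. The choice of a mapped range as the product is mine, purely for illustration; its type contains a closure, so there’s no way to write it into type Product = … by hand.

```rust
use std::fmt::Debug;

pub trait ProduceStuff {
    type Product: Debug;
    fn produce_stuff(&self, how_much: usize) -> Self::Product;
}

struct ProduceFn<F>(F);

impl<F, P> ProduceStuff for ProduceFn<F>
where
    F: Fn(usize) -> P, // The closure's return type becomes the Product
    P: Debug,
{
    type Product = P;
    fn produce_stuff(&self, how_much: usize) -> P {
        (self.0)(how_much)
    }
}

// Illustrative producer: the product is a `Map<Range<usize>, {closure}>`,
// which has no name we could type, yet it ends up as the associated type.
pub fn make_producer() -> impl ProduceStuff {
    ProduceFn(|how_much: usize| (0..how_much).map(|i| i * i))
}
```

The caller only knows that the product implements Debug, so formatting make_producer().produce_stuff(3) with {:?} is about all that can be done with it; adding more bounds to Product opens up more.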
HRTB and Associated Types
Now, this one is the ugly abomination on which the stable rustc’s brain explodes.
Let’s say we have some lifetime-parametric traits, like this:
trait Extractor<'a, Source: 'a> {
    type Fragment: 'a;
    fn extract(&self, source: &'a Source) -> Self::Fragment;
}

trait Cruncher<Input> {
    type Report;
    fn crunch(&self, fragment: Input) -> Self::Report;
}
We would like to feed the Extractor some big data type, get a smaller one out and crunch the data. Of course, we would like to borrow the data on the way, so Fragment will be something containing a reference or references ‒ but maybe not just &Vec<_>, maybe something like (&PartA, &PartB) (so we can’t just type -> &Self::Fragment). We want the Report at the end to be an owned type (e.g. 'static, no references there).
And now, we would like to build a pipeline ‒ something that produces a report every time it is fed with the big data. This, obviously, needs to work with all possible lifetimes of the data we feed it, so it calls for the HRTB thing.
where
    E: for<'a> Extractor<'a, Source>,
    C: for<'a> Cruncher<<E as Extractor<'a, Source>>::Fragment>,
This has one downside, though: we just asked for an infinite number of Extractor and Cruncher trait bounds. Each one has a (potentially) different associated type ‒ <E as Extractor<'a, Source>>::Fragment is some other type than <E as Extractor<'b, Source>>::Fragment. So, what is the Report type we want to eventually return? Well, because the types on the way differ only in lifetimes and the Report is owned, there can be just one that is common to all the traits. We know that. But Rust doesn’t. So it complains that it can’t pick one of these infinitely many types if we ask for it.
There’s a way out. First, we need to get our hands on an arbitrary one of these infinitely many but, as we know, equivalent types. We do that by plugging a concrete lifetime into the type equations above ‒ and there’s only one concrete lifetime we have, that’s 'static. So, the return type will be:
-> <C as Cruncher<<E as Extractor<'static, Source>>::Fragment>>::Report
However, if we try to write the body of such a function, we’ll be using a different lifetime and the resulting Report might (in theory) be a different type. So, how do we tell the compiler to accept only types where the Report happens to be the same for all the other lifetimes? Well, just by translating this sentence to Rust. Literally:
where
    ...
    C: for<'a> Cruncher<
        <E as Extractor<'a, Source>>::Fragment,
        // Here we add infinitely many type-equality constraints on associated
        // types of similar-but-not-the-same traits.
        Report = <C as Cruncher<<E as Extractor<'static, Source>>::Fragment>>::Report
    >,
I’m not sure I should be doing this, though. The stable compiler is able to resolve this only if the Fragment is also an owned type (it fully works with beta and newer). I’ve seen two different compiler crashes (both are reported, one is already fixed) along the way when debugging it. When something doesn’t line up in the types (well, my actual use case has a few extra trait bounds), it complains that a trait bound is not satisfied, but the additional information needed to actually track down what doesn’t line up is missing (which I work around by providing a companion function that does nothing at all and has a smaller subset of the bounds, so one can ask for a better error message).
Are there other hacks?
I’m pretty sure there are and that people have their favorites. I’ll be happy to see them shared too, both because one might need them and because they are interesting. Or as horror stories for the campfire.