Announcing the contrie ‒ concurrent maps and sets
This is partly an announcement of a new crate folks might find useful, partly a call for participation and help and partly a journal like story how the crate came to being. Read on (or not) or skip to the parts that seem interesting to you.
And I know, I’m bad at naming libraries, the name is silly.
What’s in the crate
The goal of the contrie
crate is to provide a concurrent lock-free map and
set, based on a concurrent hash trie data structure. That is, something with
similar interface as HashMap
or HashSet
, but allowing multiple threads
to do stuff to it at the same time, without locking.
What good is that? Well, let’s say you’re writing a chat server ‒ maybe a
new implementation of an IRC daemon. And you want it to be asynchronous and
multi-threaded. You need to keep certain data structures around, for example
a list of currently connected users. Sometimes, you need to send a message to a
particular user (look up in that data structure), sometimes you need to send a
message to all of them (iterate through it), and also people join and leave all
the time. You could certainly use something like RwLock<HashMap<UserName,
User>>
. But, when a user is being added to the map, other threads get blocked
if they want to iterate or lookup in there. That might not be a big deal, but it
certainly doesn’t sound great either. And from certain load that lock becomes a
choke point of performance, no matter how many cores the server has.
So, this crate provides a data structure offering the functionality without any blocking ‒ you can iterate and lookup and modify from as many threads at once as you like. There can be some contention in there, but mostly only if two threads try to add or remove the same key.
Under the hood, it is a modified and simplified version of the data structure from this Wikipedia entry.
Maturity
I’ve just released the 0.1
version on crates.io. And I like the philosophy
of „release early, release often“. In other words, this definitely is not
finished and highly polished library yet. I do intend to continue working on it
during my limited spare time though, and I’d like the library to reach that
state eventually, not abandon it. But it is also an opportunity for you to
both help with it and influence its further direction of development.
Currently, there are mainly two things available:
- The
Raw
type. This one holds the actual implementation of the data structure, all theunsafe
code and similar. The interface is somewhat minimal and closer to what is comfortable for the implementation that the use. But the aim here is to separate theunsafe
code from building a reasonable API and to reuse the same difficult code between the map and the set (or, possibly multiple flavours of these). - The
ConMap
type wraps theRaw
one into somewhat more convenient API to implement a map. This one holds elements behindArc
s, to work around some technical limitations. There are other possible options, so this is where other flavours could come in.
Things that are definitely missing are:
- A set type.
- A feature-gated support for the
rayon
crate. - I’d like the
Raw
to be a bit more flexible ‒ under the hood it is a trie, but currently it is restricted to being a trie of 64bit hashes. It should be possible to extend it to allow arbitrary bit keys, or at least arbitrary fixed-sized ones.
Also, I’m not exactly satisfied with the API of ConMap
. I’m certainly better
at implementing data structures than designing their APIs.
Help wanted
If you search crates.io for a concurrent map, you’re likely to find several immature and abandoned ones. I’d really like this one to be different and to grow into a mature and usable one eventually. If you, too, would find one useful, I could use some help with it.
So, what would really help:
Provide feedback
If you have a use case, please try it out if it works for you. If it gives you pain while using it, I want to know what is cumbersome. If you have ideas on how to improve it, even better.
Or at least read the documentation and think if it could work for you or where you think the pain points would be.
Code review + try to break it
I’ve tried to read the code multiple times. I’ve written comments, arguing why the unsafe code should in fact be safe. But, well, bugs happen. So, if you enjoy reading unsafe code, please have a look at it.
Alternatively, try to break the thing. Failing tests (that are correct themselves) in form of pull requests or bugreports with code snippets are great :-).
I’m always happy to learn what I did wrong (well, my ego of course prefers learning what I did right, but I try to learn from mistakes).
Help writing the code
The crate is split into the hard parts, where unsafe code lives, and the easier parts, where it is just modeling a nice API around it. Documentation and embedded code examples are always a good place for improvements.
So, there are opportunities for both easy and hard coding tasks and I’ll be happy to help out, mentor or discuss ideas on all the levels.
It lives in a github repository. I haven’t created issues for everything that needs doing yet, though there’s at least a file with a TODO list in there.
Eventually, I’ll probably get around to creating the issues and try to get them listed in the Call for Participation in This Week in Rust.
Experiment
As mentioned, implementing the data structure is only one thing of what makes a
good library. A nice and convenient API is very much needed too and that will
probably take some experimentation (after years of writing C++ code, my
intuition for what is a good API design suffered a bit ‒ I can differentiate
between several levels of „obviously terrible“, but once I get to „that’s
OK-ish“, I no longer see much difference, and I guess there are people who can
see the difference between mere „OK-ish“ and „brilliant“). If you have an idea
to try out, don’t hesitate. The existence of the Raw
type should provide the
building element, without having to invest in understanding all the details of
the data structure and the unsafe code.
It is fine to add a different flavour of a map or a set and let multiple of them live in the crate, at least for now. Eventually it would be nice to choose only the best ones instead of having a dozen of slightly different map types, but that means having something to choose from.
How to join in
The project is too small to have some specific rules in that regard. In other words, you can create issues, send pull requests or reach out through email, as you like.
Behind the scenes
This is the story why the crate came into existence.
Some time ago there was this thread about not having concurrent maps. Well, that one actually talks about why they are not in the standard library, nevertheless the gist is there are no mature ones outside of it either. And, as with the motivation for ArcSwap, it felt wrong that Java could have some goodies Rust doesn’t. But that didn’t trigger any action from me yet.
But then, I’ve stumbled on this Wikipedia entry. That thing looked interesting. It looked like the kind of challenge I like ‒ a hard to implement data structure, nevertheless with well defined purpose and boundaries and relatively small so it’s possible to actually implement during occasional free day or evening (yes, some people have a weird definition of relaxation, but I just need to stretch the mind to the limits sometimes).
However, Wikipedia contains just enough information to grasp the basic idea of the data structure and the general properties, but not all the details needed for actual implementation. Luckily, the original article can be downloaded from the Internet, so I’ve started to study that one too.
I’m not going into too many details (for them you can study both the article
and my source code in the repository), but I have to admit that the data
structure described there seemed very complex at first. In particular, there are
4 different kinds of nodes in the trie. Between each neighboring levels of
branching nodes there’s a single-pointer walk-through node (which they call
an I
-node) that helps with avoiding some kinds of race conditions and with
implementing snapshots of the data structure for consistent iteration through
it. And all that complexity is in the context of a lock-free code, which is in
general error prone and very hard to test in all the corner cases.
In other words, I felt like there was no chance to get anywhere near to correct implementation during the first attempt and that I have to simplify it. Also, storing the single-pointer nodes felt wrong ‒ that doubles the number of memory loads to reach the element and allocating a lot of small structures has high overhead for the memory allocator. But the practical implementability was the bigger reason.
So, the current code doesn’t contain the I
-nodes. It doesn’t offer the feature
of consistent iteration ‒ if you iterate through the data structure, the
iteration will reflect some changes done during the iteration (which is
probably fine for most use cases ‒ in the IRC example, the message of the day
may or may not be sent to the user that just joined during the iteration). The
branching nodes in the article use a bitmap field + only the non-null pointers
(therefore, if the node is only half-full, it allocates only half of the size a
full one would). Mine are always the full size, containing the null pointers.
In return for that, I hope I got a slightly faster implementation (to be sure
I’d have to write the other version too, because comparing it to a Java
implementation wouldn’t really work). But more importantly, I have much more
confidence that what I wrote is actually sane.
To do that, I had to rethink and re-design some of the corner-case details. For example, the process of pruning dead branches after removal of element is somewhat different. Their approach however provided the inspiration ‒ I do it differently, but in the general direction of their idea.
And then I had to actually spend a lot of the free evenings, adding
functionality piece by piece, writing tests, trying to argue correctness and
figuring out how to fit all that into Rust ownership model (crossbeam-epoch
helps a lot with managing the memory, but not with wrapping it behind an API
wall).
I’m not ruling out trying to implement the original version one day. It might be another level of the challenge and being able to compare them would be useful. But not today. However, after implementing the simpler version I feel like I finally understand what they do in the article.
The future
It’s always hard to make predictions. But as I said above, with some help it should be possible to bring this up to a usable shape. The split between the core algorithm and API layers should make researching suitable interface easier.
I’m not sure if this should stay a separate crate in the long term ‒ it might be
easier to find it if I eventually „donated“ it to the crossbeam
project. But
I have no idea if the project would like such donations and I feel like at this
immature state, the crate is better on its own.
All that depends on if someone actually starts using it or if it goes down as another exercise project.