In my previous article on how to write CRaP Rust code, I warned you against overusing generics. That is still good advice for a binary crate or for the initial version of any code.

However, when designing the API of a Rust library crate, you can often use generics to good effect: looser input requirements give callers the opportunity to avoid some allocations or to pick a representation of their input data that suits them better.

In this guide, we'll demonstrate how to loosen up a Rust library API without losing any functionality. But before we begin, let's take a look at the potential downsides.

First, generic functions give the type system less information about what is what. Where the concrete type was known before, the compiler now only sees an impl Trait, so it has a harder time inferring the type of each expression (and will fail to do so more often). This may require your users to add more type annotations to get their code to compile, leading to arguably worse ergonomics.
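
As a tiny illustration of the inference issue (hypothetical functions, not from the original article): with a concrete String parameter the compiler knows the target type of .into(), but with an impl Into<String> parameter it no longer does.

fn takes_concrete(s: String) {
    let _ = s;
}

fn takes_generic(s: impl Into<String>) {
    let _s: String = s.into();
}

fn main() {
    takes_concrete("hi".into()); // fine: the target type (String) is known
    takes_generic("hi");         // fine: &str implements Into<String>
    // takes_generic("hi".into()); // does not compile: type annotations needed,
    //                             // many types could convert into String
}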

In addition, with a concrete type we get exactly one version of our function compiled into the resulting binary. With generics, we either pay the runtime cost of dynamic dispatch or choose monomorphization, at the risk of ballooning our binary with multiple versions of the code; in Rust terms, this is the choice between dyn Trait and impl Trait.

Which one you choose depends largely on the use case. Note that dynamic dispatch has some runtime cost, but code bloat can also increase cache misses and therefore hurt performance. As always: measure twice, code once.

That said, there are some rules of thumb you can follow for all public methods.

Take a slice

If at all possible, take a slice (&[T]) instead of a &Vec<T> (which derefs to a slice anyway). Your caller may be using a [VecDeque](https://doc.rust-lang.org/std/collections/struct.VecDeque.html) (which has a [make_contiguous()](https://doc.rust-lang.org/std/collections/struct.VecDeque.html#method.make_contiguous) method that returns a &mut [T]) rather than a [Vec](https://doc.rust-lang.org/std/vec/struct.Vec.html), or an array.
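
A minimal sketch of the difference (hypothetical functions):

use std::collections::VecDeque;

// Taking &Vec<String> forces callers to have an actual Vec.
fn print_all_strict(lines: &Vec<String>) {
    for line in lines {
        println!("{line}");
    }
}

// Taking a slice works with a Vec, an array, or a VecDeque made contiguous.
fn print_all(lines: &[String]) {
    for line in lines {
        println!("{line}");
    }
}

fn main() {
    print_all_strict(&vec!["y".to_string()]);
    print_all(&["x".to_string()]);       // an array works, too
    let mut deque = VecDeque::from([String::from("a"), String::from("b")]);
    print_all(deque.make_contiguous());  // &mut [String] coerces to &[String]
}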

If it works for your use case, you could even take two slices; then [VecDeque::as_slices](https://doc.rust-lang.org/std/collections/struct.VecDeque.html#method.as_slices) can work for your users without them needing to move any values. Of course, you still need to know your use case to decide whether that's worth it.
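
For example, a two-slice interface could look like this (a sketch with a hypothetical function):

use std::collections::VecDeque;

// Accepting two slices lets VecDeque users pass both halves of the ring
// buffer directly, with no need to make the storage contiguous first.
fn sum(front: &[u32], back: &[u32]) -> u32 {
    front.iter().chain(back).sum()
}

fn main() {
    let mut deque: VecDeque<u32> = (1..=5).collect();
    deque.rotate_left(2); // may make the contents wrap around internally
    let (front, back) = deque.as_slices();
    assert_eq!(sum(front, back), 15);
}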

If you only dereference your slice's elements, you can take a &[impl Deref<Target = T>] instead. Note that besides [Deref](https://doc.rust-lang.org/std/ops/trait.Deref.html), there is also [AsRef](https://doc.rust-lang.org/std/convert/trait.AsRef.html), which is often used with paths, because the std methods take an impl AsRef<Path> for cheap conversion to a reference.
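
A small sketch of what that buys callers (hypothetical function):

use std::ops::Deref;
use std::rc::Rc;

// Any slice whose elements deref to str works here.
fn total_len(items: &[impl Deref<Target = str>]) -> usize {
    items.iter().map(|item| item.len()).sum()
}

fn main() {
    let owned = vec![String::from("foo"), String::from("barbaz")];
    let shared: Vec<Rc<str>> = vec![Rc::from("qux")];
    assert_eq!(total_len(&owned), 9);
    assert_eq!(total_len(&shared), 3);
}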

For example, if you want to take a set of file paths, a &[impl AsRef<Path>] will work with many more types than a &[String].

fn run_tests(
    config: &compiletest::Config,
    filters: &[String],
    mut tests: Vec<tester::TestDescAndFn>,
) -> Result<bool, io::Error> {
    // much code omitted for brevity
    for filter in filters {
        if dir_path.ends_with(&*filter) {
            // etc.
        }
    }
    // ..
}

The above could instead be written as:

fn run_tests(
    config: &compiletest::Config,
    filters: &[impl std::convert::AsRef<Path>],
    mut tests: Vec<tester::TestDescAndFn>,
) -> Result<bool, io::Error> { 
    // ..
}

The filters can now be a slice of String, of &str, or even of Cow<'_, OsStr>. For mutable types, there is [AsMut<T>](https://doc.rust-lang.org/std/convert/trait.AsMut.html). Similarly, if we require that any reference to T behaves the same as T itself with respect to equality, ordering, and hashing, we can use [Borrow<T>](https://doc.rust-lang.org/std/borrow/trait.Borrow.html)/[BorrowMut<T>](https://doc.rust-lang.org/std/borrow/trait.BorrowMut.html) instead.

What exactly does that mean? It means that a type implementing Borrow<T> must guarantee that a.borrow() == b.borrow(), a.borrow() < b.borrow(), and a.borrow().hash() return the same results as a == b, a < b, and a.hash(), provided the type implements [Eq](https://doc.rust-lang.org/std/cmp/trait.Eq.html), [Ord](https://doc.rust-lang.org/std/cmp/trait.Ord.html), and [Hash](https://doc.rust-lang.org/std/hash/trait.Hash.html).
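
Here is a small sketch of a function that relies on exactly that contract (a hypothetical helper, not from the original article):

use std::borrow::Borrow;

// Works for a slice of String, Box<str>, Cow<str>, etc., because all of
// them implement Borrow<str>, and str's Eq agrees with theirs.
fn position_of<T, Q>(haystack: &[Q], needle: &T) -> Option<usize>
where
    T: Eq + ?Sized,
    Q: Borrow<T>,
{
    haystack.iter().position(|item| item.borrow() == needle)
}

fn main() {
    let names = vec![String::from("anna"), String::from("bob")];
    assert_eq!(position_of(&names, "bob"), Some(1));
}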

Let’s iterate again

Similarly, if you only iterate over the bytes of a string slice, you can simply take an impl AsRef<[u8]> argument, unless your code somehow needs the UTF-8 validity guaranteed by str and String to work correctly.
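
For instance, a byte-counting sketch (hypothetical function):

// Counts ASCII spaces in anything that can be viewed as bytes.
fn count_spaces(data: impl AsRef<[u8]>) -> usize {
    data.as_ref().iter().filter(|&&b| b == b' ').count()
}

fn main() {
    assert_eq!(count_spaces("a b c"), 2);                // &str
    assert_eq!(count_spaces(String::from("a b")), 1);    // String
    assert_eq!(count_spaces(vec![b'a', b' ', b'b']), 1); // Vec<u8>
}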

In general, if you only iterate once, you can take an [Iterator<Item = T>](https://doc.rust-lang.org/std/iter/trait.Iterator.html). This allows your users to supply their own iterators, which may use non-contiguous memory, interleave other operations with your code, or even compute your input on the fly. Often you don't even need to make the item type generic, because iterators can usually produce a T cheaply if needed.

In fact, if your code only iterates once, you can take an impl Iterator<Item = impl Deref<Target = T>>; if you need the items more than once, take one or two slices instead. If your iterator returns owned items, as with the recently added IntoIterator implementation for arrays, you can drop the impl Deref and take an impl Iterator<Item = T>.
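
A quick sketch of what an iterator-taking interface lets callers do (hypothetical function):

// Takes any iterator of u64s; callers decide where the values come from.
fn sum_large(values: impl Iterator<Item = u64>) -> u64 {
    values.filter(|&v| v > 10).sum()
}

fn main() {
    let stored = vec![5u64, 20, 30];
    assert_eq!(sum_large(stored.iter().copied()), 50); // from a slice
    assert_eq!(sum_large(11..=13), 36);                // computed on the fly
    let text = "4 400 40";
    // interleaved with other work: parse lazily while summing
    assert_eq!(sum_large(text.split(' ').map(|s| s.parse().unwrap())), 440);
}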

Unfortunately, IntoIterator's into_iter consumes self, so there is no general way to take an iterator we can iterate over multiple times, except perhaps to take an impl Iterator<Item = T> + Clone; but that [Clone](https://doc.rust-lang.org/std/clone/trait.Clone.html) operation can be very expensive, so I don't recommend it.

Into the woods

Unrelated to performance, but also often popular, is taking an impl Into<_> argument to allow implicit conversions. This can make an API feel rather magical, but beware: an [Into](https://doc.rust-lang.org/std/convert/trait.Into.html) conversion may be expensive.

Still, there are a few tricks that can score usability points. For example, taking an impl Into<Option<T>> instead of an [Option<T>](https://doc.rust-lang.org/std/option/enum.Option.html) lets your users omit the Some. For example:

use std::collections::HashMap;

fn with_optional_args<'a>(
    _foo: u32,
    bar: impl Into<Option<&'a str>>,
    baz: impl Into<Option<HashMap<String, u32>>>
) {
    let _bar = bar.into();
    let _baz = baz.into();
    // etc.
}

// we can call this in various ways:
with_optional_args(1, "this works", None);
with_optional_args(2, None, HashMap::from([("boo".into(), 0)]));
with_optional_args(3, None, None);


Also, there may be other types that implement Into<Option<T>>, and your users can pass those directly as well.

Control code bloat

Rust monomorphizes generic code. This means that for each unique set of types your function is called with, a version of all of its code is generated and optimized for those particular types.

The upside is that this enables inlining and other optimizations that give Rust the great performance we know and love. The downside is that a lot of code can be generated.

As a possibly extreme example, consider the following function:

use std::fmt::Display;

fn frobnicate_array<T: Display, const N: usize>(array: [T; N]) {
    for elem in array {
        // ... 2kb of generated machine code
    }
}

This function gets instantiated for every item type and array length it is called with, even if all we do is iterate. Unfortunately, there is no way to avoid this code bloat short of copying or cloning, because arrays carry their length in their type.

If we can take the items by reference, we can iterate over a slice instead, which does not carry its length in its type:

use std::fmt::Display;

fn frobnicate_slice<T: Display>(slice: &[T]) {
    for elem in slice {
        // ... 2kb of generated machine code
    }
}

This at least leaves us with one version per item type. Even so, suppose we only use the array or slice to iterate; we can then factor out a frobnicate_item method that depends only on the item type. Even better, we get to decide whether to use static or dynamic dispatch:

use std::fmt::Display;

/// This gets instantiated for each type it's called with
fn frobnicate_with_static_dispatch(_item: impl Display) {
    todo!()
}

/// This gets instantiated once, but adds some overhead for dynamic dispatch;
/// also, we need to go through a pointer
fn frobnicate_with_dynamic_dispatch(_item: &dyn Display) {
    todo!()
}

The outer frobnicate_array function now contains only the loop and a method call, so there is far less code to instantiate. Code bloat averted!
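
To make that concrete, here is a sketch (not verbatim from the original) of how the outer function could forward to the dynamically dispatched helper:

use std::fmt::Display;

// Only the loop and a call get instantiated per array type; the bulky
// per-item code lives behind a single &dyn Display function.
fn frobnicate_array<T: Display, const N: usize>(array: [T; N]) {
    for elem in &array {
        frobnicate_with_dynamic_dispatch(elem);
    }
}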

In general, it's a good idea to take a hard look at your methods' interfaces to see where generic values are used and where they are dropped. In both cases, there is a natural boundary at which we can factor out a function to remove the generics.

If you don't want to write all of those forwarding functions yourself and can live with a small increase in compile time, you can use my momo crate to de-generify arguments of well-known traits like AsRef or Into.
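
For reference, the hand-written pattern that this kind of de-generification follows (and which momo roughly automates) looks like this, sketched with a hypothetical function:

// A thin generic shim converts the argument up front, then forwards to a
// non-generic inner function that holds the bulk of the code and is
// therefore compiled only once.
fn frobnicate(input: impl AsRef<str>) {
    fn inner(input: &str) {
        // ... the actual implementation, instantiated a single time
        println!("frobnicating {input}");
    }
    inner(input.as_ref());
}

fn main() {
    frobnicate("borrowed");
    frobnicate(String::from("owned"));
}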

What’s wrong with code bloat?

For some background, code bloat has an unfortunate consequence: today's CPUs employ a hierarchy of caches. While these caches make working on local data very fast, they make performance highly nonlinear. If your code takes up more of the cache, it can make the rest of the code slower, so when memory is involved, Amdahl's law no longer helps you find the places worth optimizing.

First, this means that optimizing one part of your code in isolation, guided by microbenchmarks, can be counterproductive (because the program as a whole may actually get slower). Second, when writing library code, an "optimization" in your library can pessimize your users' code, and neither you nor they would learn that from microbenchmarks.

So how do we decide when to use dynamic dispatch and when to monomorphize? There is no clear-cut rule here, but I do note that dynamic dispatch is definitely underused in Rust! First, it is perceived as slow (which is not entirely wrong, considering that vtable lookups do add some overhead). Second, it is often unclear how to use it while avoiding allocations.

Even so, if measurements show that moving from dynamic to static dispatch is beneficial, Rust makes the change easy enough. And since dynamic dispatch can save a lot of compile time, I recommend starting with dynamic dispatch where possible and monomorphizing only where measurements show it to be faster. This gives us a quicker turnaround, which leaves more time to improve performance elsewhere. It is also better to have an actual application to measure than a microbenchmark.

That concludes my rant on how to use generics effectively in Rust code. Now go forth and Rust merrily!

The post Improving overconstrained Rust library APIs appeared first on LogRocket Blog.