Recently I've been reading Effective Modern C++ by Scott Meyers. It's a great book that contains tons of practical advice, as well as horror stories to astound your friends and confuse your enemies. Since Rust shares many core ideas with modern C++, I thought I'd describe how some of the C++ advice translates to Rust, or doesn't.
This is not a general-purpose Rust / C++ comparison. Honestly, it might not make a lot of sense if you haven't read the book I'm referencing. There are a number of C++ features missing in Rust, for example integer template arguments and advanced template metaprogramming. I'll say no more about those because they aren't new to modern C++.
I may have a clear bias here because I think Rust is a better language for most new development. However, I massively respect the effort the C++ designers have put into modernizing the language, and I think it's still the best choice for many tasks.
There's a common theme that I'll avoid repeating: most of the C++ pitfalls that result in undefined behavior will produce compiler or occasionally runtime errors in Rust.
Chapters 1 & 2: Deducing Types / auto
This is what Rust and many other languages call "type inference". C++ has always had it for calls to function templates, but it became much more powerful in C++11 with the auto keyword.
Rust's type inference seems to be a lot simpler. I think the biggest reason is that Rust treats references as just another type, rather than the weird quasi-transparent things that they are in C++. Also, Rust doesn't require the auto keyword: whenever you want type inference, you just don't write the type. Rust also lacks std::initializer_list, which simplifies the rules further.
The main disadvantage in Rust is that there's no support to infer return types for fn functions, only for lambdas. Mostly I think it's good style to write out those types anyway; GHC Haskell warns when you don't. But it does mean that returning a closure without boxing is impossible, and returning a complex iterator chain without boxing is extremely painful. Rust is starting to improve the situation with -> impl Trait.
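Here's a minimal sketch of what -> impl Trait buys you, assuming a compiler new enough to support it in return position. The functions make_adder and even_squares are hypothetical names I made up for illustration.

```rust
// Hypothetical helper: returns a closure without boxing it.
fn make_adder(n: i32) -> impl Fn(i32) -> i32 {
    move |x| x + n
}

// Hypothetical helper: returns an iterator chain without spelling out its type.
fn even_squares(limit: u32) -> impl Iterator<Item = u32> {
    (0..limit).filter(|x| x % 2 == 0).map(|x| x * x)
}

fn main() {
    let add_five = make_adder(5);
    println!("{}", add_five(1));
    let squares: Vec<u32> = even_squares(7).collect();
    println!("{:?}", squares);
}
```

The caller only learns that it gets "some Fn" or "some Iterator"; the concrete type stays hidden but is still statically dispatched.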
Rust lacks decltype and this is certainly a limitation. Some of the uses of decltype are covered by trait associated types. For example,
template<typename Container, typename Index>
auto get(Container& c, Index i)
-> decltype(c[i])
{ … }
becomes
fn get<Container, Index, Output>(c: &Container, i: Index) -> &Output
where Container: ops::Index<Index, Output=Output>
{ … }
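To make that concrete, here is one way the elided body could be filled in. This is only a sketch: the body &c[i] and the Vec used in main are my own assumptions, not something taken from the book.

```rust
use std::ops;

// The associated type `Output` plays the role of decltype(c[i]).
fn get<Container, Index, Output>(c: &Container, i: Index) -> &Output
where
    Container: ops::Index<Index, Output = Output>,
{
    &c[i]
}

fn main() {
    let v = vec![10, 20, 30];
    println!("{}", get(&v, 1usize));
}
```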
The advice to see inferred types by intentionally producing a type error applies equally well in Rust.
Chapter 3: Moving to Modern C++
Initializing values in Rust is much simpler. Constructors are just static methods named by convention, and they take arguments in the ordinary way. For good or for ill, there's no std::initializer_list.
nullptr is not an issue in Rust. &T and &mut T can't be null, and you can make null raw pointers with ptr::null() or ptr::null_mut(). There are no implicit conversions between pointers and integral types.
Regarding aliases vs. typedefs, Rust also supports two syntaxes:
use foo::Bar as Baz;
type Baz = foo::Bar;
type is a lot more common, and it supports type parameters.
Rust enums are always strongly typed. They are scoped unless you explicitly import the variants with use MyEnum::*;. A C-like enum (one with no data fields) can be cast to an integral type.
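A small sketch of both points, using a made-up Color enum and a hypothetical code function:

```rust
// A C-like enum: no data fields, explicit discriminants.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Color {
    Red = 1,
    Green = 2,
    Blue = 4,
}

// Casting to an integer must be spelled out with `as`.
fn code(c: Color) -> u32 {
    c as u32
}

fn main() {
    // Variants are scoped by default...
    println!("{}", code(Color::Blue));
    // ...unless you import them explicitly.
    use Color::*;
    println!("{} {}", code(Red), code(Green));
}
```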
f() = delete; has no equivalent in Rust, because Rust doesn't implicitly define functions for you in the first place.
Similar to the C++ override keyword, Rust requires a default keyword to enable trait specialization. Unlike in C++, it's mandatory.
As in C++, Rust methods can be declared to take self either by reference or by move. Unlike in C++, you can't easily overload the same method to allow either.
Rust supports const iterators smoothly. It's up to the iterator whether it yields T, &T, or &mut T (or even something else entirely).
The IntoIterator trait takes the place of functions like std::begin that produce an iterator from any collection.
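As a rough illustration, a function can accept "anything iterable" by bounding on IntoIterator. The function total below is a hypothetical example of mine.

```rust
// Accepts any collection or iterator that yields i32 by value.
fn total<I>(items: I) -> i32
where
    I: IntoIterator<Item = i32>,
{
    items.into_iter().sum()
}

fn main() {
    let v = vec![1, 2, 3];
    // Iterate by shared reference (&i32), copying each element out.
    println!("{}", total(v.iter().copied()));
    // Consume the vector, yielding i32 by value.
    println!("{}", total(v));
    // Ranges are iterators too, so they also work.
    println!("{}", total(1..=4));
}
```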
Rust has no equivalent to noexcept. Any function can panic, unless panics are disabled globally. This is pretty unfortunate when writing unsafe code to implement data types that have to be exception-safe. However, recoverable errors in Rust use Result, which is part of the function's type.
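For instance, a fallible function advertises its failure mode in the signature. parse_port is a hypothetical function I'm using purely to illustrate.

```rust
use std::num::ParseIntError;

// The possibility of failure is visible in the return type.
fn parse_port(s: &str) -> Result<u16, ParseIntError> {
    s.trim().parse::<u16>()
}

fn main() {
    match parse_port("8080") {
        Ok(port) => println!("port {}", port),
        Err(e) => println!("bad port: {}", e),
    }
}
```

A caller can't forget that parsing may fail, because there is no u16 to use until the Result has been unpacked.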
Rust supports a limited form of compile-time evaluation, but it's not yet nearly as powerful as C++14 constexpr. This is set to improve with the introduction of miri.
In Rust you mostly don't have to worry about "making const member functions thread safe". If something is shared between threads, the compiler will ensure it's free of thread-related undefined behavior. (This to me is one of the coolest features of Rust!) However, you might run into higher-level issues such as deadlocks that Rust's type system can't prevent.
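Here's a minimal sketch of sharing mutable state between threads. The function parallel_count is a made-up example; the point is that removing the Mutex, or swapping Arc for Rc, makes this fail to compile rather than race at runtime.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Increment a shared counter from several threads.
fn parallel_count(threads: usize, per_thread: u64) -> u64 {
    let counter = Arc::new(Mutex::new(0u64));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", parallel_count(4, 1000));
}
```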
There are no special member functions in Rust, e.g. copy constructors. If you want your type to be Clone or Copy, you have to opt in with a derive or a manual impl.
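Both ways of opting in look roughly like this. Point and Buffer are hypothetical types for illustration only.

```rust
// Opt in to both Clone and Copy with a derive.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// Not Copy (it owns a heap allocation); Clone is written by hand here,
// although a derive would generate the same thing.
#[derive(Debug, PartialEq)]
struct Buffer {
    bytes: Vec<u8>,
}

impl Clone for Buffer {
    fn clone(&self) -> Self {
        Buffer { bytes: self.bytes.clone() }
    }
}

fn main() {
    let p = Point { x: 1, y: 2 };
    let q = p; // implicit bitwise copy; `p` is still usable
    println!("{:?} {:?}", p, q);

    let b = Buffer { bytes: vec![1, 2, 3] };
    let c = b.clone(); // deep copies must be requested explicitly
    println!("{:?} {:?}", b, c);
}
```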
Chapter 4: Smart Pointers
Smart pointers are very important in Rust, as in modern C++. Much of the advice in this chapter applies directly to Rust.
std::unique_ptr corresponds directly to Rust's Box type. However, Box doesn't support custom deallocation code. If you need that, you have to either make it part of impl Drop on the underlying type, or write your own smart pointer. Box also does not support custom allocators.
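A sketch of the impl Drop route, with a hypothetical Handle type that just counts how many times its cleanup code ran:

```rust
use std::cell::Cell;
use std::rc::Rc;

// Hypothetical resource whose cleanup logic lives in Drop,
// standing in for a unique_ptr custom deleter.
struct Handle {
    released: Rc<Cell<u32>>,
}

impl Drop for Handle {
    fn drop(&mut self) {
        // Custom "deallocation" code goes here.
        self.released.set(self.released.get() + 1);
    }
}

fn release_count_after_drop() -> u32 {
    let released = Rc::new(Cell::new(0));
    {
        let _boxed = Box::new(Handle { released: Rc::clone(&released) });
        // `_boxed` goes out of scope here: Drop runs, then the Box frees its memory.
    }
    released.get()
}

fn main() {
    println!("{}", release_count_after_drop());
}
```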
std::shared_ptr corresponds to Rust's Arc type. Both provide thread-safe reference counting. Rust also supports much faster thread-local refcounting with the Rc type. Don't worry, the compiler will complain if you try to send an Rc between threads.
C++ standard libraries usually implement shared_ptr as a "fat pointer" containing both a pointer to the underlying value and a pointer to a refcount struct. Rust's Rc and Arc store the refcounts directly before the value in memory. This means that Rc and Arc are half the size of shared_ptr, and may perform better due to fewer indirections. On the downside, it means you can't upgrade Box to Rc/Arc without a reallocation and copy. It could also introduce performance problems on certain workloads, due to cache line sharing between the refcounts and the data. (I would love to hear from anyone who has run into this!) Boost supports intrusive_ptr which should perform very similarly to Rust's Arc.
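You can check the size claim yourself. This is a small sketch with made-up function names; it relies on the current standard library layout, where an Rc or Arc to a sized type is a single pointer.

```rust
use std::mem::size_of;
use std::rc::Rc;
use std::sync::Arc;

// Sizes in bytes of Rc<u64>, Arc<u64>, and a plain machine word.
fn pointer_sizes() -> (usize, usize, usize) {
    (size_of::<Rc<u64>>(), size_of::<Arc<u64>>(), size_of::<usize>())
}

// Converting a Box into an Rc moves the value into a fresh allocation
// that has room for the refcounts.
fn box_to_rc(b: Box<String>) -> Rc<String> {
    Rc::from(b)
}

fn main() {
    println!("{:?}", pointer_sizes());
    println!("{}", box_to_rc(Box::new(String::from("hello"))));
}
```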
Like Box, Rc and Arc don't support custom deleters or allocators.
Rust supports weak pointer variants of both Rc and Arc. Rather than panicking or returning NULL, the "upgrade" operation returns None, as you'd expect in Rust.
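A quick sketch of upgrade before and after the last strong reference goes away (upgrade_after_drop is a hypothetical name):

```rust
use std::rc::{Rc, Weak};

// Returns what `upgrade` yields while the value is alive, and after it is gone.
fn upgrade_after_drop() -> (Option<i32>, Option<i32>) {
    let strong = Rc::new(42);
    let weak: Weak<i32> = Rc::downgrade(&strong);

    let before = weak.upgrade().map(|rc| *rc);
    drop(strong); // last strong reference goes away
    let after = weak.upgrade().map(|rc| *rc);

    (before, after)
}

fn main() {
    println!("{:?}", upgrade_after_drop());
}
```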
Chapter 5: Rvalue References, Move Semantics, and Perfect Forwarding
This is a big one. Move semantics are rare among programming languages, but they're key in both Rust and C++. However, the two languages take very different approaches, owing to the fact that Rust was designed around moves whereas they're a late addition to C++.
There's no std::move
in Rust. Moves are the default for non-Copy
types. The behavior of a move or copy is always a shallow bit-wise copy; there is no way to override it. This can greatly improve performance. For example, when a Rust Vec
changes address due to resizing, it will use a highly optimized memcpy
. In comparison, C++'s std::vector
has to call the move constructor on every element, or the copy constructor if there's no noexcept
move constructor.
However the inability to hook moves and the difficulty of creating immovable types is an obstacle for certain kinds of advanced memory management, such as intrusive pointers and interacting with external garbage collectors.
Moves in C++ leave the source value in an unspecified but valid state, for example an empty vector or a NULL unique pointer. This has several weird consequences:
- A move counts as mutating a source variable, so "Move requests on const objects are silently transformed into copy operations". This is a surprising performance leak.
- The moved-out-of variable can still be used after the move, and you don't necessarily know what you'll get.
- The destructor will still run and must take care not to invoke undefined behavior.
The first two points don't apply in Rust. You can move out of a non-mut variable. The value isn't considered mutated, it's considered gone. And the compiler will complain if you try to use it after the move.
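In code, that looks something like the following sketch (consume is a made-up function). The commented-out line is the one the compiler would reject.

```rust
// Takes ownership of the vector.
fn consume(v: Vec<i32>) -> usize {
    v.len()
}

fn main() {
    let data = vec![1, 2, 3]; // not declared `mut`
    let n = consume(data); // `data` is moved here, with no mutation involved
    // println!("{:?}", data); // would not compile: error[E0382], use of a moved value
    println!("{}", n);
}
```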
The third point is somewhat similar to old Rust, where types with a destructor would contain an implicit "drop flag" indicating whether they had already been moved from. As of Rust 1.12 (September 2016), these hidden struct fields are gone, and good riddance! If a variable has been moved from, the compiler simply omits a call to its destructor. In the situations where a value may or may not have been moved (e.g. move in an if branch), Rust tracks this with hidden flags kept in local variables on the stack.
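Here's a sketch of the "maybe moved" case. Noisy, sink and run are hypothetical names; the log just records the order in which things happen, so you can see that the destructor runs exactly once either way.

```rust
use std::cell::RefCell;

// Records its name in a shared log when dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// Takes ownership; the argument is dropped when this function returns.
fn sink(_n: Noisy<'_>) {}

fn run(move_it: bool) -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let a = Noisy { name: "a", log: &log };
        if move_it {
            sink(a); // destructor runs inside `sink`
        }
        log.borrow_mut().push("end of scope");
        // `a` is dropped here only if it was not moved above.
    }
    log.into_inner()
}

fn main() {
    println!("{:?}", run(true));
    println!("{:?}", run(false));
}
```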
Rust doesn't have a feature for perfect forwarding. There's no need to treat references specially, as they're just another type. Because there are no rvalue references in Rust, there's also no need for universal / forwarding references, and no std::forward.
However, Rust lacks variadic generics, so you can't do things like "factory function that forwards all arguments to constructor".
Item 29 says "Assume that move operations are not present, not cheap, and not used". I find this quite dispiriting! There are so many ways in C++ to think that you're moving a value when you're actually calling an expensive copy constructor — and compilers won't even warn you!
In Rust, moves are always available, always as cheap as memcpy, and always used when passing by value. Copy types don't have move semantics, but they act the same at runtime. The only difference is whether the static checks allow you to use the source location afterwards.
All in all, moves in Rust are more ergonomic and less surprising. Rust's treatment of moves should also perform better, because there's no need to leave the source object in a valid state, and there's no need to call move constructors on individual elements of a collection. (But can we benchmark this?)
There's a bunch of other stuff in this chapter that doesn't apply to Rust. For example, "The interaction among perfect-forwarding constructors and compiler-generated copy and move operations develops even more wrinkles when inheritance enters the picture." This is the kind of sentence that will make me run away screaming. Rust doesn't have any of those features, gets by fine without them, and thus avoids such bizarre interactions.
Chapter 6: Lambda Expressions
C++ allows closures to be copied; Rust doesn't.
In C++ you can specify whether a lambda expression's captures are taken into the closure by reference or by value, either individually or for all captures at once. In Rust this is mostly inferred by how you use the captures: whether they are mutated, and whether they are moved from. However, you can prefix the move keyword to force all captures to be taken by value. This is useful when the closure itself will outlive its environment, common when spawning threads for example.
Rust uses this inference for another purpose: determining which Fn* traits a closure will implement. If the lambda body moves out of a capture, it can only implement FnOnce, whose "call" operator takes self by value. If it doesn't move but does mutate captures, it will implement FnOnce and FnMut, whose "call" takes &mut self. And if it neither moves nor mutates, it will implement all of FnOnce, FnMut, and Fn. C++ doesn't have traits (yet) and doesn't distinguish these cases. If your lambda moves from a capture, you can call it again and you'll see whatever "empty" value was left behind by the move constructor.
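The three cases can be sketched like this. The helper functions call_fn, call_fn_mut, call_fn_once and demo are hypothetical; each one demands a particular Fn* trait from its argument.

```rust
fn call_fn<F: Fn() -> usize>(f: F) -> usize {
    f() + f() // callable any number of times through a shared reference
}

fn call_fn_mut<F: FnMut() -> usize>(mut f: F) -> usize {
    f(); // each call needs &mut access to the closure
    f()
}

fn call_fn_once<F: FnOnce() -> Vec<i32>>(f: F) -> Vec<i32> {
    f() // consumes the closure, so it can be called only once
}

fn demo() -> (usize, usize, Vec<i32>) {
    let data = vec![1, 2, 3];

    // Only reads a capture: implements Fn, FnMut, and FnOnce.
    let reads = || data.len();
    let a = call_fn(reads);

    // Mutates a capture: implements FnMut and FnOnce, but not Fn.
    let mut count = 0;
    let bump = || {
        count += 1;
        count
    };
    let b = call_fn_mut(bump);

    // Moves a capture out of the body: implements only FnOnce.
    let gives = move || data;
    let c = call_fn_once(gives);

    (a, b, c)
}

fn main() {
    println!("{:?}", demo());
}
```

Passing bump to call_fn, or gives to call_fn_mut, is rejected at compile time.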
Rust doesn't support init capture; however, move capture is supported natively. You can do whatever init you like outside the lambda and then move the result in.
Like C++, Rust allows inference of closure parameter types. Unlike C++, an individual closure cannot be generic.
Chapter 7: The Concurrency API
Rust doesn't have futures in the standard library; they're part of an external library maintained by a core Rust developer. They're also used for async I/O.
In C++, dropping a std::thread that is still running terminates the program, which certainly seems un-fun to me. The behavior is justified by the possibility that the thread captures by reference something from its spawning context. If the thread then outlived that context, it would result in undefined behavior. In Rust, this can't happen because thread::spawn(f) has a 'static bound on the type of f. So, when a Rust JoinHandle falls out of scope, the thread is safely detached and continues to run.
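A minimal sketch of what the 'static bound means in practice: the closure has to own everything it touches, which is what move gives you. sum_in_thread is a made-up function.

```rust
use std::thread;

// The closure owns `data`, so it satisfies the 'static bound on thread::spawn.
fn sum_in_thread(data: Vec<u64>) -> u64 {
    let handle = thread::spawn(move || data.iter().sum::<u64>());
    // Dropping `handle` instead of joining would simply detach the thread.
    handle.join().unwrap()
}

fn main() {
    println!("{}", sum_in_thread(vec![1, 2, 3]));
}
```

Without move, the closure would borrow data from the spawning function's stack frame, and the compiler would reject it.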
The other possibility, in either language, is to join threads on drop, waiting for the thread to finish. However this has surprising performance implications and still isn't enough to allow threads to safely borrow from their spawning environment. Such "scoped threads" are provided by libraries in Rust and use a different technique to ensure safety.
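For a flavor of what scoped threads look like, here's a sketch using std::thread::scope. Note the assumption: this API was added to the standard library (in Rust 1.63) after the library versions I'm describing, so treat it as one possible spelling of the idea. The function sum_halves is hypothetical.

```rust
use std::thread;

// Both threads borrow from `data`, which lives in the caller's stack frame.
fn sum_halves(data: &[u64]) -> u64 {
    let (left, right) = data.split_at(data.len() / 2);
    thread::scope(|s| {
        let a = s.spawn(|| left.iter().sum::<u64>());
        let b = s.spawn(|| right.iter().sum::<u64>());
        a.join().unwrap() + b.join().unwrap()
    })
    // Every thread spawned inside the scope has finished by this point.
}

fn main() {
    println!("{}", sum_halves(&[1, 2, 3, 4]));
}
```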
C++ and Rust both provide atomic variables. In C++ they support standard operations such as assignment, ++, and atomic reads by conversion to the underlying type. These all use the "sequentially consistent" memory ordering, which provides the strongest guarantees. Rust is more explicit, using dedicated methods like fetch_add which also specify the memory ordering. (This kind of API is also available in C++.)
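Something like this sketch, where every atomic operation names its ordering. atomic_count is a hypothetical function, and the choice of Relaxed for the increments is mine; it's enough here because the joins synchronize before the final load.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

fn atomic_count(threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // No `++` operator: the ordering is always spelled out.
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    counter.load(Ordering::SeqCst)
}

fn main() {
    println!("{}", atomic_count(4, 1000));
}
```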
This chapter also talks about the C++ type qualifier volatile, even though it has to do with stuff like memory-mapped I/O and not threads. Rust doesn't have volatile types; instead, a volatile read or write is done using an intrinsic function.
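In stable Rust these are exposed as std::ptr::read_volatile and std::ptr::write_volatile. The sketch below uses an ordinary local variable as a stand-in for a device register, which is purely an assumption for the sake of a runnable example.

```rust
use std::ptr;

// Writes and then reads a value through volatile accesses.
fn volatile_roundtrip() -> u32 {
    let mut reg: u32 = 0; // stand-in for a memory-mapped register
    let p: *mut u32 = &mut reg;
    unsafe {
        // Volatile accesses are not elided or reordered relative to each other.
        ptr::write_volatile(p, 0xABCD);
        ptr::read_volatile(p)
    }
}

fn main() {
    println!("{:#x}", volatile_roundtrip());
}
```

Volatility is a property of each access, not of the type, so nothing stops you from mixing volatile and ordinary accesses to the same location.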
Chapter 8: Tweaks
Rust containers don't have methods like emplace_back. You can however use the experimental placement-new feature.
Conclusions
Rust and C++ share many features, allowing a detailed comparison between them. Rust is a much newer design that isn't burdened with 20 years of backwards compatibility. This I think is why Rust's versions of these core features tend to be simpler and easier to reason about. On the other hand, Rust gains some complexity by enforcing strong static guarantees.
There are of course some differences of principle, not just historical quirks. C++ has an object system based on classes and inheritance, even allowing multiple inheritance. There's no equivalent in Rust. Rust also prefers simple and explicit semantics, while C++ allows a huge amount of implicit behavior. You see this for example with implicit copy construction, implicit conversions, ad-hoc function overloading, quasi-transparent references, and the operators on atomic values. There are still some implicit behaviors in Rust, but they're carefully constrained. Personally I prefer Rust's explicit style; I find there are too many cases where C++ doesn't "do what I mean". But other programmers may disagree, and that's fine.
I hope and expect that C++ and Rust will converge on similar feature-sets. C++ is scheduled to get a proper module system, a "concepts" system similar to traits, and a subset with statically-checkable memory safety. Rust will eventually have integer generics, variadic generics, and more powerful const fn. It's an exciting time for both languages :)
The bonus for me is that Rust does not have exceptions. C++ has exceptions but the way they are specified makes me prefer Golang or Rust.
A few questions/comments:
* You say that Rust's type inference "seems to be a lot simpler". Do you mean from a language-use perspective, an implementation/rules perspective, or both? Having spent a decent chunk of time talking to C++ proponents, I initially read "simpler" to mean "less powerful", which is a pretty standard way for (some) C++ proponents to dismiss other languages, but of course "simpler" often *doesn't* mean "less powerful". Judging from the rest of that section, I'm guessing you mean from a language-use perspective, which is probably worth clarifying. It may well be that Rust's Hindley-Milner-ish (but not exactly HM proper) type inference actually *is* simpler to implement than C++'s imposing collection of ad-hoc rules, and it is almost certainly the case that the question of the relative "power" of each type inference system is independent of the issue of simplicity, but I'm not sure either concern is actually very relevant to most language users.
* Are you sure that moving a `std::vector` moves the *individual elements* contained by that vector? That does not sound correct to me. I believe the underlying data storage is not affected at all when a `std::vector` is moved; this is in fact the point of the move operation, and the reason it's an efficiency gain!
Nominated for Quote of the Week!