Rust Expanded My Mind: Constructors

Should constructors really be special?

You’ll often hear new Rust developers say “Rust isn’t object oriented”. This is false, but why is it such a pervasive myth? The Rust book notes that some consider inheritance a necessary feature for a language to be object oriented, but it avoids taking a stance on this. I’ve encountered this argument, but there’s another that I encounter far more often and that’s less discussed.

Rust has no constructors.

At first I didn’t think much of this. I mostly just viewed it as a consequence of Rust not supporting function overloading and trying to appeal to the fancies of C developers. That was until I went to write some C++ again and realized that there were multiple classes of potential bugs in something as simple as a constructor that I never needed to consider in Rust but had always subconsciously thought about in other languages.

What is a constructor?

A constructor is used to initialize an object. That is, given allocated memory, it writes the initial state of the object into it. In C++, you aren’t guaranteed to initialize all members. In other languages (or C++ with extra tools) this may be enforced by the compiler or a linter.

When constructing an object, it undergoes these steps:

  1. Memory is allocated.
  2. The constructor is invoked, setting state and/or throwing an exception.
  3. Control is returned to the caller, implicitly “giving” them the object.

Problems

The biggest problem this creates is that it’s possible to create invalid objects. You might have uninitialized memory, an object half filled with nonsensical defaults, or other state that violates business rules. This complicates everything, because I can’t know that the object I’m holding is valid. If we were to make it impossible to create invalid objects, users would get an extremely powerful invariant for simplifying their code.

As an example, this is how Rust’s smart pointers work. A Box is guaranteed to be non-null. This is because the underlying pointer isn’t publicly accessible, and the factory function Box::new() will never return a Box with an underlying null. This invariant means you never need to check for a null pointer. This reduces code and complexity. Because it can’t happen, you never need to consider if it might happen in a specific block of code.
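To make the payoff of that invariant concrete, here is a minimal sketch. The function names `read_age` and `read_optional_age` are hypothetical and exist only for this illustration. The first function dereferences a `Box` with no null check, and the second shows that when "no value" is a legitimate state it has to be spelled out in the type as an `Option`.

```rust
/// Reads the value behind a `Box`. No null check is needed, because a
/// `Box<u32>` always points at a valid, initialized `u32`.
fn read_age(age: &Box<u32>) -> u32 {
    **age
}

/// When absence is a legitimate state, it appears in the type and the
/// compiler forces the caller to handle it.
fn read_optional_age(age: &Option<Box<u32>>) -> u32 {
    match age {
        Some(boxed) => **boxed,
        None => 0, // hypothetical fallback chosen for this sketch
    }
}
```

Only the second function has a branch for the missing case, and that branch is visible in its signature rather than being something every caller must remember.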

To make this more clear, let’s consider these constructors:

class Foo {
public:
    Foo();
    Foo(int age);
    Foo(const Foo& other);

private:
    int age = 0;
};

Foo::Foo() = default;
Foo::Foo(int age) : age(age) {}
Foo::Foo(const Foo& other) { ... }

This is misleading. Whether or not constructors are “methods” comes down to your definitions, but they’re certainly special case functions. Namely, there’s no return type. If the compiler didn’t have a special case syntax for us, the function would really look like this:

class Foo {
    // Pseudocode: `this` is spelled out as an explicit output parameter
    static void Foo(Foo& this) { this.age = 0; }
    static void Foo(Foo& this, int age) { this.age = age; }
    static void Foo(Foo& this, const Foo& other) { this.age = other.age; }
};

Yikes. With the ugly truth revealed, the problems are quite clear. In no particular order:

  1. Terrible function names. What’s the intent of each overload?
  2. Where did the passed this come from? We’re initializing here, so that means we somehow pre-created it in who knows what state.
  3. Output parameters are bad.
  4. Failures can only be (reasonably) communicated by throwing here. This is a terrible context to throw in. As with all exceptions in C++ and many other languages, the caller doesn’t know whether it can throw. An exception is super heavy-handed here and will cause extra code to be generated.

What does Rust do instead?

I can’t speak to the official motivations for why constructors aren’t supported, but it is easy to see that, regardless of what the language creators wanted to do, it was simply impractical to allow them. As we covered above, errors are handled in constructors by throwing. In Rust there are no exceptions, so you would either have to require that constructors be infallible or add an object-specific success flag as a member that the caller needs to implicitly know to check. Both of these are clearly untenable.

Even if it were possible, we’ll see that not using them is better. In fact, I recommend that you don’t use constructors wherever possible in other languages.

The best way to describe what Rust does is that it’s honest and transparent. There’s no special case for constructors. Where you would have a constructor you just write a normal factory function:

struct Foo {
    age: u32,
}

impl Foo {
    // Equivalent to Foo::Foo(int age)
    pub fn new(age: u32) -> Self {
        Foo { age }
    }
}

impl Default for Foo {
    // Equivalent to Foo::Foo()
    fn default() -> Self {
        Foo { age: 0 }
    }
}

impl Clone for Foo {
    // Equivalent to Foo::Foo(const Foo& other)
    fn clone(&self) -> Self {
        Foo { age: self.age }
    }
}

More will be said on why the other “constructors” are trait (i.e. interface) implementations instead of functions associated with the Foo struct directly. For now just accept that as is. We’ve solved all of the problems we discussed before. Since we have a return type, we can return errors easily and without ever exposing invalid objects. If the constructor is fallible, it just becomes:

impl Foo {
    pub fn new(age: u32) -> Result<Self, ErrorTypeHere> {
        // Validation would go here; on success, wrap the value in `Ok`.
        Ok(Foo { age })
    }
}
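A fuller sketch of the fallible case follows. The error type `AgeError`, the constant `MAX_AGE`, and the rule that an age above 150 is invalid are all assumptions invented for this illustration; the post itself leaves the error type unnamed. The point is that the caller receives either a valid `Foo` or an error, and never a half-built object.

```rust
use std::fmt;

/// Assumed business rule, invented for illustration.
const MAX_AGE: u32 = 150;

/// Hypothetical error type standing in for the unnamed one in the post.
#[derive(Debug, PartialEq)]
pub enum AgeError {
    TooOld(u32),
}

impl fmt::Display for AgeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            AgeError::TooOld(age) => {
                write!(f, "age {} exceeds the maximum of {}", age, MAX_AGE)
            }
        }
    }
}

impl std::error::Error for AgeError {}

#[derive(Debug, PartialEq)]
pub struct Foo {
    // Private, so the only way to obtain a `Foo` is through `Foo::new`.
    age: u32,
}

impl Foo {
    /// Returns a valid `Foo`, or an error describing why none was built.
    pub fn new(age: u32) -> Result<Self, AgeError> {
        if age > MAX_AGE {
            return Err(AgeError::TooOld(age));
        }
        Ok(Foo { age })
    }

    pub fn age(&self) -> u32 {
        self.age
    }
}
```

Because `new` is an ordinary function, the possibility of failure shows up in its return type, and callers can handle it with `match` or the `?` operator like any other `Result`.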

Nothing is missing (in Rust)

In Rust there are “special” patterns, not syntax. Some are baked into the language, some are just by convention. In the same way that constructors are special, so are destructors. When objects go out of scope in Rust, the compiler implicitly and inherently inserts a call to <Foo as Drop>::drop() if an implementation exists (whereas in other languages an implicit call is made to Foo::~Foo()).
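As a small sketch of that implicit call, the hypothetical `Guard` type below counts how many times its `drop` runs. The type and the `drops_after_scope` helper are made up for this example; only the `Drop` trait itself comes from the standard library.

```rust
use std::cell::Cell;

/// Hypothetical guard type that records when it is destroyed.
struct Guard<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Creates a guard in an inner scope and returns the number of drops seen.
fn drops_after_scope() -> u32 {
    let drops = Cell::new(0);
    {
        let _guard = Guard { drops: &drops };
        // `_guard` is still alive here, so nothing has been dropped yet.
        assert_eq!(drops.get(), 0);
    } // the compiler inserts the call to `drop` at the end of this scope
    drops.get()
}
```

Nothing in `drops_after_scope` calls `drop` explicitly; the counter changes only because the compiler inserted the call when `_guard` went out of scope.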

This is an example of a “Canonical Trait”. Instead of special hard-coded syntax, a pattern is expressed as a normal interface and a dependency is taken on it. Other common patterns are default constructors and copy constructors. Those same concepts are “present” in Rust; it’s just that they’re expressed by the Default and Clone traits respectively. (Copy is a separate marker trait for types that can be duplicated implicitly with a plain bitwise copy.)
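For a simple struct, the default and copy "constructors" written by hand earlier can usually be derived instead. This is a sketch of the same `Foo` using `#[derive]`, where `Default` plays the role of the default constructor and `Clone` the role of the copy constructor; `Debug` and `PartialEq` are added only so the values can be compared and printed.

```rust
#[derive(Debug, Default, Clone, PartialEq)]
struct Foo {
    age: u32,
}

/// Hypothetical helper showing both trait calls side by side.
fn default_and_copy() -> (Foo, Foo) {
    let original = Foo::default(); // like Foo::Foo()
    let copy = original.clone(); // like Foo::Foo(const Foo& other)
    (original, copy)
}
```

The derived `Default` fills every field with that field's own default, so `age` starts at zero just as in the hand-written version.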

If you want to write generic code, you may need to create default instances of an arbitrary object and/or make copies of it (e.g. if you’re writing a vector). In most languages this is expressed by the special default and copy constructors. This is one reason why the constructor’s name has to match the name of the type: you need a clear token to use as a placeholder for these function calls in your template.

At the end of the day though, this is just an interface. The cool thing that Rust does is it expresses these as actual interfaces in the standard library. Now you can just take a dependency on the trait that defines the platonic ideal of the operation you wish to require for compatibility with your generic or other code.
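Taking a dependency on those traits might look like the sketch below. The function `filled_with_default` is a hypothetical helper, not something from the standard library; its trait bounds state exactly which "constructors" the generic code relies on.

```rust
/// Builds a vector of `len` elements, each a clone of `T::default()`.
/// The bounds say that `T` must be default-constructible and cloneable.
fn filled_with_default<T: Default + Clone>(len: usize) -> Vec<T> {
    // `vec![value; len]` clones `value`, which is why `Clone` is required.
    vec![T::default(); len]
}
```

Calling `filled_with_default::<u32>(3)` yields three zeros, and a type that lacks either trait is rejected at compile time with an error that names the missing bound.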

This is the one area where you’ll run into trouble in other languages. For “non-canonical” constructors, I strongly encourage you to use a factory function instead. However, if you want your object to be usable in certain contexts, you will still need to define default and/or copy constructors.

All rights reserved by the author.