Semantics, Implementation and Delimitation Against Pointers

Using std::optional<T> for Optional or Uninitialized Values in C++

boost::optional<T> has been around for a while as a solution to the problem of uninitialized values. Now std::optional has been accepted into the C++17 standard. In this article, we look at the problem that optional solves and analyse its semantics, usage and implementation, as well as how it differs from pointers.

The Problem

Magic values in the code: sometimes they are used as an ill-advised solution to the problem of not being able to initialize a variable. Functional programmers may ask: “If you cannot initialize it, why did you put it in this scope at all?” Well, the duteous procedural programmer ignores this question and shivers when encountering magic values during code review. Code like the following snippet silently produces bugs waiting to happen™ and, with increasing complexity, becomes harder and harder to comprehend.

int integerNumber = -1;
if (someCondition())
    integerNumber = function();
if (integerNumber != -1)
    anotherFunction();

Suboptimal Solutions

A baroque solution to this problem is to use a pointer. In modern C++, the value must be owned somewhere as a std::unique_ptr, but we pass it to functions as a raw pointer:

#include <iostream>
#include <string>

void function(const int* arg)
{
    // a null pointer stands for "not initialized"
    std::cout << "your value is "
              << (arg ? std::to_string(*arg) : "not initialized");
}

That is a solution we can work with, since now we have an additional state that represents an uninitialized value. To be concrete, a null pointer (0 or nullptr) is now distinct from any value of the pointee. But the elephant in the room is: to achieve this, we require heap allocation (for things that may be as small as a char).

An alternative, std::pair<T,bool>, can do the same for us without requiring the heap. Once more, uninitialized values can be expressed explicitly. But for this solution to work we need T to be default-constructible, which is not always the case (apart from std::pair<T,bool> being quite cumbersome).
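
To make the default-constructibility limitation concrete, here is a minimal sketch. The names firstEven and NoDefault are hypothetical and only serve as illustration; they are not part of any library.

```cpp
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical lookup that reports "nothing found" through the bool
// instead of through a magic value.
std::pair<int, bool> firstEven(const std::vector<int>& values)
{
    for (int v : values)
        if (v % 2 == 0)
            return {v, true};
    return {int{}, false}; // a dummy int has to be created to fill the slot
}

// Hypothetical type without a default constructor.
struct NoDefault
{
    explicit NoDefault(std::string n) : name(std::move(n)) {}
    std::string name;
};

// An "empty" pair cannot be created for such a type without inventing
// a dummy NoDefault first.
static_assert(!std::is_default_constructible<std::pair<NoDefault, bool>>::value,
              "pair<NoDefault, bool> has no usable default constructor");
```

For int the dummy value is harmless, but for a type like NoDefault the caller would have to make up a placeholder object just to say "there is nothing here".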

The Solution

If Boost or C++17 is an option for your code, optional<T> is the construct of choice. It is an idiomatic solution at last, which is a striking advantage on its own. Below are some illustrations of how to use it in practice. The code cuts out some non-interesting parts. Since optional has only just been accepted into the C++17 standard, you need g++-6 to compile it.

  1. The example below illustrates in-place construction of a member variable. Of course, we do not need to initialize it in the constructor (which is the entire point of optional). Thus, optional is a good way of implementing two-step initialization, although my stance is to avoid two-step initialization whenever you can. In the Boost world, you can achieve the same using in-place factories.

  2. Naturally, we can use optional in conjunction with default arguments (see function1). What is nullopt in the new standard corresponds to boost::none in Boost.

  3. We see optional as a return value of function2. Only use this if the non-value nullopt actually represents a meaningful return value. Do not use it to keep dead code consistent (i.e., returning nullopt under the default: label in a switch statement although you have listed all possible values).

  4. Then there is emplace(), which constructs a value in place, replacing the current value if there is one. Note that this is more efficient than *_heavyObject = HeavyObject(...), which has to construct a temporary, move-assign from it and then destroy the temporary.

#include <experimental/optional>
#include <iostream>

using std::experimental::optional; // will be std::optional later
using std::experimental::in_place;
using std::experimental::nullopt;

// HeavyObject is one of the parts cut out of this listing
class Object
{
    optional<HeavyObject> _heavyObject; // use it as a member

public:
    // --------------------------------------------------------------------
    Object() // in-place construction
        : _heavyObject(in_place, "hello", 1ll, 2, true)
    {}
    // --------------------------------------------------------------------
    // use it as an argument with default argument
    auto function1(const optional<HeavyObject>& arg = nullopt) -> void;
    // --------------------------------------------------------------------
    auto function2() -> optional<HeavyObject> // use it as return value
    {
        if (_heavyObject)
        {
            // efficient in-place construction, replaces the current value
            _heavyObject.emplace("goodbye", 2ll, 3, false);
        }

        // pointer semantics
        std::cout << _heavyObject->callFunction() << "\n";
        // emplace would be more efficient than the assignment below
        *_heavyObject = HeavyObject("a", 3ll, 4, true);

        return _heavyObject;
    }
};
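
The cost difference between emplace() and assignment from a temporary can be made visible with a small counting type. The following is only a sketch under assumptions: it uses the C++17 std::optional rather than the experimental header, and Counted, Counts, replaceWithEmplace and replaceWithAssignment are hypothetical names standing in for HeavyObject and its usage.

```cpp
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for HeavyObject that counts special member calls.
struct Counted
{
    static inline int constructions = 0;
    static inline int moveAssignments = 0;
    static inline int destructions = 0;

    explicit Counted(std::string s) : text(std::move(s)) { ++constructions; }
    Counted(Counted&& other) noexcept : text(std::move(other.text)) { ++constructions; }
    Counted& operator=(Counted&& other) noexcept
    {
        text = std::move(other.text);
        ++moveAssignments;
        return *this;
    }
    ~Counted() { ++destructions; }

    static void reset() { constructions = moveAssignments = destructions = 0; }

    std::string text;
};

struct Counts
{
    int constructions;
    int moveAssignments;
    int destructions;
};

// Replaces the contained value with emplace(): the old value is destroyed
// and the new one is constructed directly inside the optional.
Counts replaceWithEmplace()
{
    std::optional<Counted> value(std::in_place, "hello");
    Counted::reset();
    value.emplace("goodbye");
    return {Counted::constructions, Counted::moveAssignments, Counted::destructions};
}

// Replaces the contained value by assignment: a temporary is constructed,
// move-assigned into the contained value and then destroyed.
Counts replaceWithAssignment()
{
    std::optional<Counted> value(std::in_place, "hello");
    Counted::reset();
    *value = Counted("goodbye");
    return {Counted::constructions, Counted::moveAssignments, Counted::destructions};
}
```

Comparing the two results, the assignment variant reports one move assignment that the emplace() variant does not need. For a type whose move is cheap this hardly matters; for a genuinely heavy object it can.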

Pointers versus optional

In the example above, you probably noticed that optional<T> is mostly used the same way as a pointer. The operators operator bool(), operator* and operator-> are implemented such that optional<T> can be used pretty much like a pointer.
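
As a small illustration of the three operators, here is a sketch that assumes the C++17 std::optional; the function names lengthOrZero, copyOrDefault and valueThrowsWhenEmpty are made up for this example.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// operator bool and operator->, used exactly as with a pointer.
std::size_t lengthOrZero(const std::optional<std::string>& text)
{
    if (!text)           // operator bool: is a value present?
        return 0;
    return text->size(); // operator->: member access on the value
}

// operator*: access to the contained value itself.
std::string copyOrDefault(const std::optional<std::string>& text)
{
    return text ? *text : std::string("default");
}

// Like dereferencing a null pointer, applying * or -> to an empty optional
// is undefined behaviour. The checked alternative is value(), which throws.
bool valueThrowsWhenEmpty()
{
    std::optional<std::string> empty;
    try
    {
        (void)empty.value();
    }
    catch (const std::bad_optional_access&)
    {
        return true;
    }
    return false;
}
```

The checked accessor value() has no counterpart for raw pointers, so it is worth considering wherever the presence of a value has not been tested just before the access.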

In stark contrast to pointers, however, the value of an optional<T> is not allocated on the heap. Typically, optional<T> is implemented as a light-weight discriminated union:

  • there is a member bool (to indicate whether a value is present)

  • and a union of T and a dummy char (so that T does not need to be default-constructible).

So while we can use it like a pointer, it actually has value semantics, and copy and move operations for optional<T> are as expensive as for the underlying type T. Consequently, sizeof(optional<T>) is sizeof(T) + 1 plus padding bytes. Keep that in mind: it means 16 bytes for optional<double> on my machine.
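
Both points, the size overhead and the value semantics, can be checked with a few lines. This is a sketch assuming the C++17 std::optional; the exact size is implementation-specific, so the code only checks that the optional is larger than its payload. The function names are hypothetical.

```cpp
#include <cstddef>
#include <optional>

// The presence flag (plus padding) makes the optional larger than a bare
// double on typical implementations; the exact number is not guaranteed.
std::size_t optionalDoubleSize()
{
    return sizeof(std::optional<double>);
}

// Value semantics: copying an optional copies the contained value,
// so changing the copy leaves the original untouched.
bool copyIsIndependent()
{
    std::optional<int> original = 1;
    std::optional<int> copy = original; // copies the int, no shared state
    *copy = 2;
    return *original == 1 && *copy == 2;
}
```

With two raw pointers to the same int, the write through the second pointer would have been visible through the first one. That is exactly the sharing that optional does not offer.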

“All of this sounds wonderful!”, you might think. “Why would I ever want to use a pointer at all?” Well, as the value is stored inside the optional object itself, T cannot be an abstract type. Furthermore, because of value semantics, shared ownership of the value is not possible. Thus, optional<T> takes over a domain for which we previously needed to abuse pointers. There is no gray area between the two, and it should always be clearly decidable when to use which. Without deep meditation on the topic, I would argue that even shared_ptr<optional<T>> and unique_ptr<optional<T>> can be considered good practice, as they keep the heap allocation/ownership aspect and the optional aspect separate (given that both aspects are really needed).

As a closing note, be careful and considerate when using optional<bool>: converting it to bool tells you whether a value is present, not whether that value is true.
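
The pitfall is easiest to see side by side. The two helper functions below are hypothetical and assume the C++17 std::optional.

```cpp
#include <optional>

// Tests presence only: yields true for an optional holding false as well.
bool isSet(const std::optional<bool>& flag)
{
    return static_cast<bool>(flag); // same as flag.has_value()
}

// Tests the contained value and treats "not set" as false.
bool isSetAndTrue(const std::optional<bool>& flag)
{
    return flag.value_or(false);
}
```

An if (flag) on an optional<bool> behaves like isSet, which is rarely what a reader of the code expects. Spelling out has_value() or value_or() makes the intent explicit.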
