ACCU 2011 Nuggets: 2. Move semantics

Or: “This is C++: everything is a lie”

This nugget is from Scott Meyers’s talk on Perfect Forwarding (slides here).

I should point out that I’m pretty late to the scene here; the original proposal is more than four years old, and far sharper people closer to the source have already written concisely and exhaustively on this topic.

Essentially, the problem that needed solving is the following:

  vector<string> split(string const & to_be_split, string const & splitter) { 
    vector<string> result;
    // how can we return a vector here without lots of copying
    // or an annoying out parameter polluting our method signature?
    // or by having to bind the result to a const &?
    return result;
  }

N.B. The original version of this post changed the return type of split to an rvalue reference. That is out of date, and also very dangerous: it would hand back a reference to a local variable that is destroyed when the function returns. In fact, if you’re using an up-to-date compiler and you implement split with the by-value signature above, the returned vector is move constructed (or the copy is elided altogether), so no copying occurs.

  // Here, splitByComma would be move constructed from the result of
  // split. Even in older C++ versions, you still might get lucky and find
  // the copy is elided by return value optimization
  vector<string> const splitByComma(split(string("foo,bar,baz"), string(",")));
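To make that concrete, here is a minimal sketch of how split might be filled in. The body is my own assumption (the post only needs the signature), as is the decision to treat an empty splitter as "don't split at all". The thing to notice is that result is simply returned by value, with no out parameter in sight.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical implementation: only the signature comes from the post.
std::vector<std::string> split(std::string const & to_be_split,
                               std::string const & splitter) {
  std::vector<std::string> result;
  if (splitter.empty()) {  // assumption: an empty splitter means "do not split"
    result.push_back(to_be_split);
    return result;
  }
  std::size_t start = 0;
  std::size_t found = to_be_split.find(splitter, start);
  while (found != std::string::npos) {
    result.push_back(to_be_split.substr(start, found - start));
    start = found + splitter.size();
    found = to_be_split.find(splitter, start);
  }
  result.push_back(to_be_split.substr(start));  // whatever follows the last splitter
  // Returned by value: the compiler either elides the copy or
  // move constructs the caller's vector from result.
  return result;
}

// Usage sketch mirroring the splitByComma line from the post.
void demonstrate_split() {
  std::vector<std::string> const splitByComma(
      split(std::string("foo,bar,baz"), std::string(",")));
  assert(splitByComma.size() == 3);
  assert(splitByComma[0] == "foo");
  assert(splitByComma[1] == "bar");
  assert(splitByComma[2] == "baz");
}
```

Calling demonstrate_split() from main is enough to exercise it. Either way the caller ends up owning the strings that were built inside split, without an element-by-element copy.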

Further (and better) motivations and examples of move semantics are detailed here (I highly recommend reading some of this before progressing).

So, we want to write code that will take advantage of this new move support in C++11. In order to do this, we need an understanding of what is safely movable from! It turns out that C++ already has concepts within it to allow us to reason about this without completely losing our sanity: lvalues, rvalues, and lvalue references.

Scott provides the following handy guide for telling lvalues, lvalue references and rvalues apart.

lvalues: things you can take the address of

  • string const foo("asdf") // foo is an lvalue
  • int i, *pInt // all of i, pInt and *pInt are lvalues

lvalue references: what you think of as references now (both of our arguments to split from earlier)

rvalues: things you cannot take the address of (unnamed temporaries)

  • split("foo, bar, baz", ","); // the temporary strings built from both literals are rvalues
  • function return values (the result of split)
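One way to see these categories in action is to let overload resolution do the classifying. In the sketch below, category and make_greeting are hypothetical helpers invented for illustration. The const & overload accepts lvalues, and the && overload is preferred whenever the argument is an rvalue.

```cpp
#include <cassert>
#include <string>

// Hypothetical helpers: overload resolution picks one of these
// depending on whether the argument is an lvalue or an rvalue.
std::string category(std::string const &) { return "lvalue"; }
std::string category(std::string &&)      { return "rvalue"; }

// Hypothetical function returning by value, so a call to it is an rvalue.
std::string make_greeting() { return std::string("hello"); }

void demonstrate_categories() {
  std::string const foo("asdf");
  std::string bar("qwer");

  assert(category(foo) == "lvalue");  // named and addressable
  assert(category(bar) == "lvalue");  // a non-const lvalue cannot bind to &&
  assert(category(std::string("temp")) == "rvalue");  // unnamed temporary
  assert(category(make_greeting()) == "rvalue");      // function return value
  assert(category("literal") == "rvalue");  // temporary string built from the literal
}
```

Calling demonstrate_categories() from main runs the checks. The last line is the same situation as passing character strings straight to split: the temporary std::string is the rvalue.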

In general:

  • You can’t move from an lvalue (other people can still use it – what happens when they do?). Never do this. OK, almost never (see alternative title).
  • You can move from an rvalue (no-one can get at it – you’re safe!). One might say you should move from an rvalue, unless the cost of copying is irrelevant.
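As for "what happens when they do?", here is a small sketch of moving from an lvalue anyway. It uses std::move (explained further down) to force the move, and std::unique_ptr because it is one type whose moved-from state is pinned down: it is left null. The function and variable names are mine. For most other types the moved-from object is merely valid but unspecified, which is exactly why this is a bad habit.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

void demonstrate_moving_from_an_lvalue() {
  std::unique_ptr<std::string> owner(new std::string("still needed?"));

  // owner is an lvalue, so we have to ask for the move explicitly.
  std::unique_ptr<std::string> thief(std::move(owner));

  assert(thief != nullptr);
  assert(*thief == "still needed?");
  // Anyone still holding on to owner now finds it empty.
  assert(owner == nullptr);
}
```

Any code that dereferences owner after this point has a bug, even though owner is still a perfectly good name in scope.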

These rules are handy, but there’s one more major gotcha to go. Let’s assume that someone has already made std::string compatible with move semantics (so it has a move constructor: std::string(std::string &&)).

  class StringMoverPuzzle {
  public:
    StringMoverPuzzle(string && to_be_moved) :
      moved_string_(to_be_moved) // do we copy or move here?
    {}
  private:
    string const moved_string_;
  };

The answer is annoying – in fact, the copy constructor is called! Outrage! We labelled our parameter with &&, it should be an rvalue, damnit! Look further though: it has a name (to_be_moved), and, if we were in the body of the constructor, we could take its address. Shock, horror, it’s an lvalue!

So when we try to construct our moved_string_, the compiler looks for a constructor that takes an lvalue rather than the rvalue we were hoping for, and we end up copying (curses!).
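If you don’t believe it, the copy is easy to observe. This sketch swaps std::string for a hypothetical Tracker type that records which constructor built it. Tracker, TrackerMoverPuzzle and the origin() accessor are all made up for the demonstration, and the constructor has the same shape as the puzzle above.

```cpp
#include <cassert>

// Hypothetical stand-in for std::string that remembers how it was constructed.
struct Tracker {
  enum Origin { Fresh, Copied, Moved };
  Origin origin;
  Tracker() : origin(Fresh) {}
  Tracker(Tracker const &) : origin(Copied) {}
  Tracker(Tracker &&) : origin(Moved) {}
};

// Same shape as StringMoverPuzzle, with Tracker in place of string.
class TrackerMoverPuzzle {
public:
  TrackerMoverPuzzle(Tracker && to_be_moved) :
    moved_(to_be_moved)  // to_be_moved has a name, so it is an lvalue here
  {}
  Tracker::Origin origin() const { return moved_.origin; }
private:
  Tracker const moved_;
};

void demonstrate_puzzle() {
  // Tracker{} is an rvalue at the call site...
  TrackerMoverPuzzle puzzle{Tracker{}};
  // ...yet the member was built by the copy constructor.
  assert(puzzle.origin() == Tracker::Copied);
}
```

Calling demonstrate_puzzle() from main passes: the rvalue-ness of the argument is lost the moment it gets a name.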

The solution?

    FixedStringMover(string && to_be_moved) :
      moved_string_(std::move(to_be_moved))
    {}

That handy call to std::move is actually just a cast – we’re casting away the lvalueness of to_be_moved so that overload resolution picks the move constructor, string(string &&), instead of the copy constructor.
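Here is a sketch that tries to show both halves of that claim: std::move on its own changes nothing, and the move only happens once a move constructor receives the result. The constructor from the fix is wrapped in a complete class, and the value() accessor and the demonstration function are my additions for illustration.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// The fixed constructor, wrapped in a complete class.
// value() is an addition for illustration only.
class FixedStringMover {
public:
  FixedStringMover(std::string && to_be_moved) :
    moved_string_(std::move(to_be_moved))  // selects string(string &&)
  {}
  std::string const & value() const { return moved_string_; }
private:
  std::string const moved_string_;
};

void demonstrate_move_is_a_cast() {
  std::string text("some text worth keeping");

  // The result of std::move(text) is typed as an rvalue reference.
  static_assert(std::is_same<decltype(std::move(text)), std::string &&>::value,
                "std::move yields an rvalue reference");

  // Binding that result to a reference moves nothing: same object, same contents.
  std::string && still_text = std::move(text);
  assert(&still_text == &text);
  assert(text == "some text worth keeping");

  // The move happens only when a move constructor receives the rvalue.
  FixedStringMover mover(std::move(text));
  assert(mover.value() == "some text worth keeping");
  // text is now valid but unspecified; don't rely on its contents.
}
```

Calling demonstrate_move_is_a_cast() from main runs the checks. Note that still_text is itself a named rvalue reference, so it is an lvalue too, which is the same trap all over again.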

This isn’t anything like the end of the move story (and I haven’t got anywhere near ‘perfect’ forwarding), but I thought the mind break of rvalue reference becoming lvalue was worth a post of its own.

For further material, I highly recommend Scott’s slides, or the video of the session here.