C++ Object Lifetimes

Some of the most surprising bugs I have come across happened when someone (often enough myself) accessed an object outside of its lifetime. There are some pitfalls, common misunderstandings and lesser known facts about object lifetimes that I want to address here. 

What is an object?

In C++ standardese, the term “object” does not only refer to instances of a class or struct. It also refers to instances of built-in types such as `int`. Pointers, values of enumeration types, booleans, doubles and arrays are objects, too. Functions and classes aren’t. In other words, an object is a piece of memory with a type, but functions don’t count even if they occupy storage.

Every object has a type. Objects that are instances of classes or structs are called “objects of class type”. Those objects can have subobjects, which are themselves objects, of course.
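The distinction can be checked at compile time. The following is a small sketch using `std::is_object_v` from `<type_traits>` (C++17). The types `Color`, `Point` and `Callback` are made up for illustration. References and `void` are not mentioned above, but they are not object types either.

```cpp
#include <type_traits>

enum class Color { red, green };
struct Point { int x; int y; };
using Callback = void(int);   // a function type

// Object types: instances of these are objects.
static_assert(std::is_object_v<int>, "built-in types");
static_assert(std::is_object_v<int*>, "pointers");
static_assert(std::is_object_v<bool>, "booleans");
static_assert(std::is_object_v<double[4]>, "arrays");
static_assert(std::is_object_v<Color>, "enumeration types");
static_assert(std::is_object_v<Point>, "class types");

// Not object types: functions, references and void.
static_assert(!std::is_object_v<Callback>, "functions");
static_assert(!std::is_object_v<int&>, "references");
static_assert(!std::is_object_v<void>, "void");

// A pointer to a function is an object again.
static_assert(std::is_object_v<Callback*>, "function pointers");
```

If any of these claims were wrong, the file would simply not compile.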

Storage duration

Before we come to object lifetimes, there is another important concept named storage duration. I’ll just quote the standard here:

“Storage duration is the property of an object that defines the minimum potential lifetime of the storage
containing the object. The storage duration is determined by the construct used to create the object and is
one of the following:

  • static storage duration
  • thread storage duration
  • automatic storage duration
  • dynamic storage duration”

The standard definitions for these are somewhat lengthy, especially for dynamic storage duration. To sum it up they’re roughly the following: Static storage exists from program start to program end. Thread storage exists from thread start to thread end for each thread. Automatic storage exists from the point of definition of a variable to the end of the surrounding scope. Dynamic storage exists from the allocation until the deallocation.
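As a rough illustration, here is a sketch that puts one variable into each of the four categories. All names (`program_wide`, `per_thread`, `storage_demo`) are made up for this example.

```cpp
#include <cassert>
#include <memory>

int program_wide = 0;               // static storage duration
thread_local int per_thread = 0;    // thread storage duration: one copy per thread

int storage_demo() {
    static int calls = 0;           // static storage duration as well, only the name is local
    int local = 1;                  // automatic storage duration
    auto heap = std::make_unique<int>(2);  // the int lives in dynamic storage;
                                           // `heap` itself is an automatic variable
    assert(heap != nullptr);
    ++calls;
    ++program_wide;
    ++per_thread;
    return local + *heap + calls;
}   // `local` and `heap` go away here; the unique_ptr releases the dynamic storage
```

Because `calls` survives between invocations, the first call returns 4 and the second returns 5, while `local` and the heap-allocated `int` are fresh every time.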

The storage duration of subobjects is that of their containing object. This already is a hint that the lifetime of an object is not always the same as the storage duration, because two subobjects of the same containing object will not always come to life at the same time. Obviously, if there is no storage, there is no object, therefore we can say `object lifetime <= object storage duration`.

Object lifetime

Start

So when does the object actually start to exist? The answer is pretty intuitive: when it is complete and ready to roll. In other words, when it is initialized – as far as initialization for the object in question goes. So what does that mean in detail?

If the object is of built-in type and the definition has no initializer, no initialization takes place and the start of the object’s lifetime is the same as the start of its storage duration. For automatic and dynamic storage it will hold an indeterminate (garbage) value, which is especially dangerous if it is a pointer. You can assign to it right away, but reading it before that is, with few exceptions, undefined behavior. If there is an initializer, object lifetime starts immediately after the object has been initialized with that value, which effectively means at the start of the storage duration as well.

It gets more interesting for compound objects, i.e. arrays and objects of class type. Their lifetime starts when the lifetime of each subobject has started and, if present, the constructor has completed normally. This can well take some time, so the start of the storage duration, the start of the lifetimes of each subobject and the start of the lifetime of the enclosing object itself can all be different points in time.
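A short sketch of the safe ways to deal with built-in types. The function names are hypothetical, and the zero value of `zeroed` relies on the rule that variables with static storage duration are zero-initialized.

```cpp
#include <cassert>

int zeroed;                       // static storage duration: zero-initialized

int assigned_before_read() {
    int n;                        // no initializer: the value is indeterminate
    n = 7;                        // writing to it is fine
    assert(n == 7);               // reading is fine only after the write
    return n;
}

bool value_initialized() {
    int n{};                      // empty braces give 0
    int* p{};                     // and a null pointer instead of a garbage address
    return n == 0 && p == nullptr;
}
```

Empty braces cost nothing to type and remove the garbage value altogether, which is why I would prefer them whenever there is no meaningful initial value at hand.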
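To make the different points in time visible, here is a minimal sketch that logs what happens during construction. `Part`, `Whole`, `events` and `build` are invented for this example. The second run throws from the constructor body, so the `Whole` object never comes to life and its destructor is never called.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

std::vector<std::string> events;  // hypothetical log, only for this sketch

struct Part {
    explicit Part(const char* name) { events.push_back(std::string(name) + " constructed"); }
};

struct Whole {
    explicit Whole(bool fail) : first("first"), second("second") {
        // Both members are alive here, the Whole object is not yet.
        events.push_back("Whole constructor body");
        if (fail) throw std::runtime_error("construction failed");
    }   // the lifetime of Whole starts when the constructor returns normally
    ~Whole() { events.push_back("Whole destructor"); }

    Part first;
    Part second;
};

std::vector<std::string> build(bool fail) {
    events.clear();
    try {
        Whole w(fail);
        events.push_back("Whole is alive");
    } catch (const std::runtime_error&) {
        events.push_back("Whole never came to life");
    }
    return events;
}

void check_build() {
    using Log = std::vector<std::string>;
    assert((build(false) == Log{"first constructed", "second constructed",
                                "Whole constructor body", "Whole is alive",
                                "Whole destructor"}));
    assert((build(true) == Log{"first constructed", "second constructed",
                               "Whole constructor body", "Whole never came to life"}));
}
```

Note that "Whole destructor" is missing from the second log: there is no object whose lifetime could end.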

End

The end of an object’s lifetime is determined symmetrically to its start: if there is no destructor or if the destructor is trivial, the object’s lifetime ends with its storage duration. Pretty boring stuff. However, if there is a non-trivial destructor, the lifetime of the object ends as soon as the destructor call starts. After the destructor body has run, the subobjects are destroyed one after the other, in reverse order of their initialization, and their lifetime ends as soon as their destruction begins.
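The order of events during destruction can be sketched the same way. `Member`, `Owner`, `trace` and `destroy_owner` are hypothetical names. The point to watch is that the destructor body of `Owner` can still use both members, and that they are destroyed in reverse order afterwards.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

std::vector<std::string> trace;   // hypothetical log, only for this sketch

struct Member {
    explicit Member(std::string n) : name(std::move(n)) {}
    ~Member() { trace.push_back("~Member " + name); }
    std::string name;
};

struct Owner {
    // The lifetime of Owner is already over in here, but the members are still alive.
    ~Owner() { trace.push_back("~Owner body sees " + a.name + " and " + b.name); }
    Member a{"a"};
    Member b{"b"};
};

static_assert(std::is_trivially_destructible_v<int>, "lifetime ends with the storage");
static_assert(!std::is_trivially_destructible_v<Owner>, "lifetime ends when ~Owner starts");

std::vector<std::string> destroy_owner() {
    trace.clear();
    { Owner owner; }   // destroyed at the closing brace
    return trace;
}

void check_destroy() {
    const std::vector<std::string> log = destroy_owner();
    assert(log.size() == 3);
    assert(log[0] == "~Owner body sees a and b");
    assert(log[1] == "~Member b");   // declared last, destroyed first
    assert(log[2] == "~Member a");
}
```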

Why do we care?

Object lifetime is a useful concept when reasoning about program semantics and correctness. When the lifetime of an object has not yet begun or has already ended, there is no object. It may be that subobjects exist, e.g. during the execution of constructors and destructors, but the object in question itself does not exist. If there is no object, it can have no state, and no invariants can be met.

That means that we have to be careful when we call member functions in a constructor or destructor, because that member function may rely on an invariant that has not yet been established or has already been destroyed. It also means that the cleanup we perform in a destructor should not be able to cause too much trouble: we can’t fix a broken object that does not exist any more.
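One place where the language itself follows this reasoning is virtual dispatch: a virtual call made from a base class constructor does not reach the derived class, because the derived object does not exist yet. A minimal sketch, with `Base`, `Derived` and their members made up for illustration:

```cpp
#include <cassert>
#include <string>

struct Base {
    Base() { seen_by_constructor = describe(); }  // the Derived part is not alive yet
    virtual ~Base() = default;
    virtual std::string describe() const { return "Base"; }
    std::string seen_by_constructor;
};

struct Derived : Base {
    std::string describe() const override { return "Derived " + label; }
    std::string label = "ready";
};

void check_dispatch() {
    Derived d;
    assert(d.seen_by_constructor == "Base");   // not "Derived ready"
    assert(d.describe() == "Derived ready");   // after construction the override is used
}
```

Had the call reached `Derived::describe`, it would have read `label` before that member was initialized.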

Another consideration is the lifetime of subobjects. Member subobjects are initialized in the order in which they are declared in the class definition, and before that, base class subobjects are initialized in the order in which the base classes appear in the inheritance list. In particular, the lifetime of members starts after the lifetime of base classes. We can pass a pointer to a class member to any base class constructor, because the member’s storage duration has already started, but if we actually dereference it in the base class constructor, we get into trouble because the member does not exist yet.
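Here is a sketch of the harmless variant of that pattern: the base class only stores the pointer during construction and dereferences it later. The classes and the `length` function are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

class Base {
public:
    explicit Base(const std::string* text) : text_(text) {
        // Storing the address is fine. Calling text->size() here would touch
        // a std::string whose lifetime has not started yet.
    }
    std::size_t length() const { return text_->size(); }  // safe once construction is complete
private:
    const std::string* text_;
};

struct Derived : Base {
    Derived() : Base(&text), text("hello") {}  // Base is initialized first, then text
    std::string text;
};

void check_order() {
    Derived d;
    assert(d.length() == 5);
}
```

The order in the initializer list does not matter here; the base class always comes first, whatever we write.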

The evil changeling

Consider this little example:

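#include <iostream>
#include <new>

struct Changeling {
  Changeling(int n) : number{n} {}
  void foo(int n) { std::cout << "foo(" << n << ")\n"; } //placeholder body so the example links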
  int number;
  ~Changeling() { foo(number); }
};

int main() {
  Changeling changeling{ 42 };
  Changeling* pc = &changeling;
  int* pn = &changeling.number;

  pc->~Changeling(); //destroy it...
  new(pc) Changeling{ 43 }; //and create it again in the same place

  pc->foo(81);
  std::cout << *pn << '\n';
}

What do you think will happen here? How many Changeling objects are there?

It will probably work as you expect: do whatever `foo(81)` does and print 43. However, it is fragile, and quite honestly, it is plain evil in a few ways. By manually destroying the first object, we end the lifetime of Changeling No. 42. After that, `pc` and `pn` are only addresses of memory where nothing is living.

After that, we create a new Changeling in the same place. The comment is misleading: it is not created again, it is a different object with its own lifetime. `pc` and `pn` referred to the first Changeling. The standard lets old pointers, references and names refer to the new object only under a narrow set of conditions listed in [basic.life]: among others, the new object must have the same type and occupy exactly the same storage, and the original must not have been a const object. Changeling happens to meet those conditions, but a small change to the class or to the type we construct can silently turn the same code into undefined behavior. The robust version uses the pointer that placement new returns and derives the member pointer from it:

  pc = new(pc) Changeling{ 43 }; //and create it again in the same place
  pn = &pc->number;
  pc->foo(81);
  std::cout << *pn << '\n';

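However, there is a last issue that you can’t fix in this scenario: the implicit destructor call the compiler will insert at the end of the function. It is written for the variable `changeling`; imagine it as `changeling.~Changeling();`. The standard requires that an object of the original type lives in that storage when the call happens. Here one does, so it will do the right thing, but if the constructor of the replacement had thrown, or if we had put an object of a different type there, the behavior would be undefined.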
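If you really need to end and restart lifetimes by hand, one way to avoid the implicit destructor call is to not have an automatic Changeling at all and to work in raw storage instead. The following is only a sketch under that assumption. It uses a simplified `Changeling` without `foo`, and the `alive` counter and `replace_in_buffer` are made up for illustration.

```cpp
#include <cassert>
#include <new>

int alive = 0;   // hypothetical counter of living Changeling objects

struct Changeling {
    explicit Changeling(int n) : number{n} { ++alive; }
    ~Changeling() { --alive; }
    int number;
};

int replace_in_buffer() {
    // Raw storage with automatic storage duration. No Changeling is declared here,
    // so the compiler inserts no destructor call at the end of the function.
    alignas(Changeling) unsigned char storage[sizeof(Changeling)];

    Changeling* pc = new (storage) Changeling{42};  // first object
    pc->~Changeling();                               // its lifetime ends, the storage stays

    pc = new (storage) Changeling{43};              // second object: keep the returned pointer
    int* pn = &pc->number;                           // and derive member pointers from it
    assert(alive == 1);
    const int result = *pn;

    pc->~Changeling();                               // ended by us, exactly once
    return result;
}
```

Every lifetime in this function starts and ends explicitly, so there is no destructor call that was "meant for" a different object. The price is that forgetting the last destructor call is now our problem.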

Conclusion

Keep object lifetimes in mind, and don’t mess with them. Apart from being error prone and often outright illegal, it will confuse readers of your code.

1 Comment

  1. David Haim

    Lifetime is something many developers who come from managed languages get wrong without properly understanding it.
    It is worth mentioning that lifetime applies not only to class objects but to primitives as well. Returning a reference to a local integer is an example of a lifetime problem where no class objects are involved.

    I think a simple scheme for where to declare an object, which works 85% of the time, goes as follows:

    Should the whole program know about a certain variable, and is this variable alive throughout the running time of the program? It should be global or static (static storage duration).

    Is the variable temporary in the sense that it is used throughout some chain of functions and then disappears? It should be local (automatic storage duration). This is the case for maybe 90% of the variables we use.

    Should a variable be known throughout the thread’s lifetime, but not between threads? It should be allocated in thread-local storage. It is not common to allocate huge memory blocks in thread-local storage.

    Now we get to dynamic allocation. The common uses of dynamic allocation are:
    1. when the size of the block can change
    2. for dynamic data structures like linked lists, dynamic arrays, etc.
    3. when the needed size of the allocation is huge (bigger than a few kilobytes). A common use of “huge” memory blocks is I/O, like reading from files, sockets, HTTP responses, etc.
    4. to achieve polymorphism, when we need an assignment like Base* base = new Derived
    5. when we need a variable to be shared safely between threads for a specific amount of time.
    In any case of heap allocation it is best to use smart pointers to express ownership.
