Contents

I am going to tell you a story about an issue that brought me a rather painful debugging session, and the lessons that can be learned from it.

A few years back I was restructuring the libraries of a larger project. Restructuring means I shoved around some compilation units that were in different libs before and put them in the same lib, because they belonged together. Compilation and linking worked without problems, and since I had not touched any code, I was very sure that everything should work as before. A basic functionality test suite did not show any failures, so I checked everything in and went home.

The end.

Not quite.

The next morning I started the application and it almost immediately crashed. I started the debugger to find out where the crash originated from, in order to tell the developer who was responsible for that part.

Hours of debugging and swearing later, I had found the source of the crash. A pointer that could not possibly be touched by anything legal got altered and accessing it resulted in the crash. A data breakpoint told me it was a std::string constructor that altered the pointer. More precisely it was the construction of the third string member of an object that had only two strings and nothing else. What the…?

Examining where that constructor call originated from, I got to a code location that looked like this:

//SomeFooBarClass.cpp

class NamingHelper {
  string name_;
  string suffix_;
  string info_;
public:
  NamingHelper(string const& name, string const& suffix)
    : name_(name), suffix_(suffix), info_("default-info")  //<======!!!====
  {}
  //...
};

void SomeFooBarClass::doSomeNaming() {
  NamingHelper helper("meow", "-fix");
  //...
}

In the initialization list, the construction of info_ was what seemed to wreak havoc with some memory that did belong to another object. When I looked one call back in the stack trace, I got this:

//SomeFooBazClass.cpp

class NamingHelper {
  string name_;
  string suffix_;
public:
  NamingHelper(string const& name, string const& suffix)
    : name_(name), suffix_(suffix)  
  {}
  //...
};

void SomeFooBazClass::doSomeNaming() {
  NamingHelper helper("meow", "-fix");          //<======!!!====
  //...
}

See the difference? NamingHelper had only two members. Seconds before it had three. It took me a while to realize that I was in a different source file. The surrounding code was almost identical, as were the class names.

Cause of the Problem

Sometime in the past, there had been only SomeFooBarClass, with a two-element NamingHelper. When the very similar SomeFooBazClass was needed, someone just copy-pasted the whole class, made some minor changes and checked in, violating several rules of clean coding, like the DRY principle. He should have factored out the common behavior or generalize the existing class by adding a parametrization for the differing behavior.

Months later, someone else made a minor change to SomeFooBarClass, adding the info_ member to the NamingHelper of that class’ implementation. The two classes were in different libraries at that time, otherwise, the crash or a similar problem should have occurred back then already.

When I put both compilation units into the same lib, I unknowingly violated the One Definition Rule: I had two differing class definitions with the same name, in the same namespace, in two different compilation units. The linker does not need to warn about that, it simply can assume the class definitions are the same. Having the same class definition in different translation units happens all the time if you include the same header in more than one source file.

Having two instances of the constructor NamingHelper::NamingHelper(string const&, string const&) is not an issue for the linker, either. Both constructors have been defined inside the class definition, making them implicitly inline, and having the same inline function defined in several translation units is a linker’s daily business. It can assume that each definition is the same because the ODR says they have to be, and then picks whichever definition it wants.

In this case, it picked the constructor definition of the three-element NamingHelper. The compiler on the other hand, while compiling SomeFooBazClass.cpp did know only a two-element NamingHelper and therefore reserved only enough space for the two strings on the stack. When the three-element constructor was executed, the third string got constructed in another object’s memory, a typical case of undefined behavior.

Lessons Learned

The first thing to notice is that this error can not only occur because someone carelessly copy-pastes some code that should be refactored instead. NamingHelper is a very generic name for a helper class, and it is not very unlikely that two developers come up with the same names for their helper classes. That is not necessarily bad, although too generic names or poor names, in general, tend to obfuscate the meaning of code.

Take care when naming things. When using short, generic names make sure they don’t reach too far.

In other words, i may be OK for a loop variable, but it certainly is not for something that can be accessed globally. In this case, the generic helper class names reached out of their respective translation units, which should have been prevented. This can be achieved easily by using anonymous namespaces, which I used to solve the problem in the end.

Consider giving internal linkage to helper classes and functions, e.g. by putting them in an unnamed namespace.

In hindsight, I should have done what the copy-paste guy hadn’t done: Refactoring the two classes. Of course, one should approach refactoring in a pragmatic way, not in a dogmatic way, i.e. don’t start a huge refactoring session after you changed just two or three lines of code. But in this case, a very ugly code smell had caused me several hours of debugging, so I should have spent an hour or two to set things right. I just had not read about “Clean Code” and heard of the “Boy Scout Rule” yet.

Instead of just fixing bad code, improve it.

Last but not least there was another error I had made: I had made false assumptions. By only running a basic test suite before checking in I exposed others to an error I could have found running all unit tests.

Don’t just think that nothing can go wrong, prove it. Run all unit tests before checking in.

Yes, I mean all unit tests. But I mean only real unit tests, not integration tests or full system tests that are labeled as “unit tests”. There is a huge difference between the two, which I will write about in a future post.

Arne Mertz
10 years ago Permalink

By parametrizing the class I meant the original SomeFooBarClass, i.e. a class that, depending on some parameter would either behave the “Bar” way or the “Baz” way. I don’t remember what the two classes did exactly and what the difference was, so I have to be a bit vague. However, code duplication never is a good sign.

As for your options: The NameHelper classes were very lightweight classes, not much more than data structures. The author was right to put it in the source file of SomeFooBarClass and not make it a full grown class with its own header and source, because it was only a little implementation detail. So the inheritance option is not applicable. The same applies for your option 3, putting a wrapper around it would have made it just a fat little class without much benefits.

I would kind of go with option 2: If you don’t plainly copy SomeFooBarClass but parametrize/generalize it, then there is only one doSomeNaming method and only one NameHelper is needed. In order to deal with the Bar/Baz parametrization the doSomeNaming and the NameHelper would have been generalized a bit for the two possible options, and there would have been only one NameHelper, capable of dealing with both cases.

6 Comments

Roger Orr
9 years ago Permalink

If you’re running on a recent enough version of gcc then compiler warning -Wodr, in conjunction with -flto, can find this sort of problem at link time.

Timothy Onogu
10 years ago Permalink

Ouch… That’s so annoying!
Thanks for the lesson.

By the way, I have a question for the paragraph if “Cause of the Problem”
You said, the copy-paste programmer should have parametized the existing class.
In you own suggestion, which option would have been best? And can you please comment on these options/scenerio.

I have thought of few options including:
(1) Simply Inheriting from the NamingHelper and do his Addition. //Note: No need for vtables here since there arent virtual functions and the NamingHelper2 wouldn’t ever be manipulated from base class. Either at compiletime or runtime

(2) Liase with the maintainer of NameHelper to add an additional std::string member to his class. So that I can use it. This of cause is, far as the performance/usage requirements of NameHelper permits it.

(3) If NameHelper was a class I just use as a client (i.e I don’t care about a bulk of its implementation details, I am only sticking to contracts). Should I wrap NameHelper in another ‘light’ class that just adds the behaviour I need?

regards,
Timothy

1. Timothy Onogu
  10 years ago Permalink
  
  Forgive my typos please….
  
2. Arne Mertz
  10 years ago Permalink
  
  By parametrizing the class I meant the original SomeFooBarClass, i.e. a class that, depending on some parameter would either behave the “Bar” way or the “Baz” way. I don’t remember what the two classes did exactly and what the difference was, so I have to be a bit vague. However, code duplication never is a good sign.
  
  As for your options: The NameHelper classes were very lightweight classes, not much more than data structures. The author was right to put it in the source file of SomeFooBarClass and not make it a full grown class with its own header and source, because it was only a little implementation detail. So the inheritance option is not applicable. The same applies for your option 3, putting a wrapper around it would have made it just a fat little class without much benefits.
  
  I would kind of go with option 2: If you don’t plainly copy SomeFooBarClass but parametrize/generalize it, then there is only one doSomeNaming method and only one NameHelper is needed. In order to deal with the Bar/Baz parametrization the doSomeNaming and the NameHelper would have been generalized a bit for the two possible options, and there would have been only one NameHelper, capable of dealing with both cases.
  
  1. Timothy Onogu
    10 years ago Permalink
    
    Ohh. Ok I get, Thanks for clarifying.
    With respect to your post,
    https://arne-mertz.de/2015/02/adherence-to-design-patterns/
    
    Originaly,
    
    NameHelper
    
    should be as simple as it was. Fine.
    
    Now that we needed same interface but slightly different behaviour, can we then adopt a design pattern? 🙂
    
    (i.e A minimal Policy-based design? Optionally supply an to change the behaviour? )
    
    Thanks.
    
    1. Arne Mertz
      10 years ago Permalink
      
      It depends. As far as I remember the difference in this case was only the info attribute, so it probably could be just made an empty string for the original case.
      would usually think twice (at least) before considering a design pattern for helper classes. helper classes usually provide only a small part of a functionality. using a design pattern usually implies having several classes providing a complete functionality, i.e. more than just a helper class.

Write clean and maintainable C++

Helper Classes Deserve Some Care, Too

The end.

Not quite.

Cause of the Problem

Lessons Learned

6 Comments

Leave a Reply Cancel reply