Use Stronger Types!

Arne Mertz, November 2, 2016

Is everything we use a string for really just a bunch of characters? Is everything we use an int for really just a number? Probably not. We can have stronger types than that.

Imagine we are programming a role-playing game. We’ll need something to store our character’s data, like the name, the current level, experience points, attributes like stamina and strength, and the amount of gold we own. The usual stuff. It’s simple:

typedef std::tuple<
  std::string, //name
  int, //level
  int, //XP
  int, //stamina
  int, //strength
  int //gold
> Character;

Okay, that’s too simple. Nobody would do that. Almost nobody. We hope. Let’s be realistic:

class Character {
  std::string name;
  int level;
  int xp; 
  int stamina;
  int strength;
  int gold;
};

That’s more like it. Obviously, that class is missing some methods. But let’s concentrate on the variables for now.

Simple types, simple problems

As it stands, we could have a character with 4678285 gold, level 772999566, negative XP and the telling name “meh 56%&8450p&jntr \n gr?==) Bobby Tables”.

If you already know little Bobby Tables from the xkcd comic, you know where I am going with this: whenever we create a new character, we have to check that the values we assign to those attributes make sense. XP are usually not negative. A name usually does not contain special characters.

While we’re at it, character creation is not the only time we can mess up those attributes. Add a big negative number to the XP and we get in trouble, too.

Of course, this can be fixed easily: xp should be an unsigned instead of an int, so it can’t be negative. The name should be const because a character cannot change its name, and then it only needs to be checked during character creation.

Except that this will fix only very few of all the problems we can run into. Unsigned arithmetic wraps around instead of going negative, giving impossibly large amounts of XP. The level probably can only go as far as 70 or 80 or so (70 was the limit when I last played World of Warcraft), and that’s not a limit any built-in type can give us.
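To make the wrap-around concrete, here is a minimal sketch. The function names applyPenalty and applyPenaltyClamped are made up for this illustration and do not belong to the character class above.

```cpp
#include <limits>

// Hypothetical helper: subtracts a penalty from an unsigned XP counter.
// If penalty > xp, the result wraps around modulo 2^N instead of going negative.
inline unsigned applyPenalty(unsigned xp, unsigned penalty) {
  return xp - penalty;
}

// One possible guard (an assumption about the desired game rule): clamp at zero.
inline unsigned applyPenaltyClamped(unsigned xp, unsigned penalty) {
  return penalty > xp ? 0u : xp - penalty;
}
```

With 10 XP and a penalty of 20, the first function yields a value just below the maximum of unsigned, which is exactly the kind of impossible amount we wanted to rule out.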

We can left-shift an int – but what does that mean if we calculate character.stamina << 5? It does not make any sense – so we’d better not be able to make errors like that in the first place.

Now let’s have a look at one of the methods:

void Character::killMonster(Monster const& monster) {
  gold += monster.loot();
  level += monster.bonusXP();
}

This doesn’t look right – the bonus XP granted by killing the monster probably should be added to the character’s XP, not to its level. The additional gold looks about right unless the loot is calculated in some other monetary unit that has to be converted first.

Simple problems, simple solutions: Use stronger types

The first problem we observed above is that we assigned very general types to variables that had additional semantics. The second was that we used the same general types for variables that have different, incompatible semantics.

A std::string is just a bunch of characters, but a name that has been sanitized to be suitable for an RPG character is much more (and, in some ways, less) than that. An int is just a number, while a monetary amount, points and levels are more than just that.

Strong typedef

The solution to the exchangeability problem is to use what is commonly called a strong typedef. With a normal C++ typedef, a Level type introduced by typedef int Level still is int – it’s just another name for the same type.
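A small sketch shows how little a normal typedef buys us. The function addBonus is a hypothetical example, not part of the character class.

```cpp
#include <type_traits>

typedef int Level;
typedef int ExperiencePoints;

// Both names refer to the very same type, so the compiler cannot tell them apart.
static_assert(std::is_same<Level, int>::value, "Level is just int");
static_assert(std::is_same<Level, ExperiencePoints>::value,
              "Level and ExperiencePoints are interchangeable");

// Hypothetical function: compiles without complaint although it mixes
// two unrelated concepts.
inline Level addBonus(Level level, ExperiencePoints bonusXP) {
  return level + bonusXP;
}
```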

A strong typedef is a completely different type that simply happens to behave the same as its base type, in this case, the int. Strong typedefs are simple wrappers around a variable of their base type.

Thanks to optimizing compilers, those wrappers usually have the same performance as the base types. They do not change the runtime code, but they can prevent a lot of errors at compile time.
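As a rough sketch of the idea, a strong typedef can be written as a small class template with a tag parameter. The names StrongTypedef, LevelTag and GoldTag are assumptions made for this example. Real libraries offer far more control over which operations are available.

```cpp
#include <type_traits>

// Minimal strong typedef sketch. Tag is never used for storage; it only
// makes every instantiation a distinct type.
template <typename T, typename Tag>
class StrongTypedef {
public:
  explicit constexpr StrongTypedef(T value) : value_(value) {}
  constexpr T get() const { return value_; }

  // Only values of the same strong type can be added.
  StrongTypedef& operator+=(StrongTypedef const& other) {
    value_ += other.value_;
    return *this;
  }

private:
  T value_;
};

struct LevelTag {};
struct GoldTag {};
using Level = StrongTypedef<int, LevelTag>;
using Gold = StrongTypedef<int, GoldTag>;

static_assert(!std::is_same<Level, Gold>::value, "distinct types");
static_assert(!std::is_convertible<int, Level>::value,
              "no implicit conversion from int");
static_assert(!std::is_convertible<Gold, Level>::value,
              "no conversion between strong types");
// Holds on mainstream compilers: the wrapper adds no storage.
static_assert(sizeof(Level) == sizeof(int), "same size as the base type");
```

With these definitions, an expression like level += gold is rejected by the compiler, while gold += Gold{5} works as expected.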

Other restrictions

It is relatively simple to write classes that can contain only certain values and provide only operations that do not invalidate them again. For example, a class for a valid character name would need some way to construct such a name from a plain std::string. If we do not allow the insertion of arbitrary chars into a Name and can only assign valid Name objects, that constructor would be the only point where we need to check the validity of a name.
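Such a name class might look like the following sketch. The validation rule (1 to 20 letters) is purely an assumption for illustration, and throwing std::invalid_argument is only one of several ways to reject bad input.

```cpp
#include <cctype>
#include <stdexcept>
#include <string>
#include <utility>

class CharacterName {
public:
  // The only way to get from raw text to a CharacterName, so this is the
  // only place where validation has to happen.
  explicit CharacterName(std::string raw) : value_(std::move(raw)) {
    if (!isValid(value_)) {
      throw std::invalid_argument("invalid character name");
    }
  }

  // Assumed rule set: 1 to 20 characters, letters only
  // (as classified by std::isalpha in the current C locale).
  static bool isValid(std::string const& s) {
    if (s.empty() || s.size() > 20) return false;
    for (unsigned char c : s) {
      if (!std::isalpha(c)) return false;
    }
    return true;
  }

  // Read-only access: no way to insert arbitrary characters afterwards.
  std::string const& str() const { return value_; }

private:
  std::string value_;
};
```

Because isValid is public and static, callers that deal with untrusted input can check a string first and avoid the exception altogether.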

For our XP we could use something like a strong typedef that does not provide subtraction (unless we can actually lose XP) and does not allow bit shifting and other stuff that is nonsense for experience points.
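A possible shape for such an XP type is sketched below. The choice of std::uint32_t as the underlying type is an assumption, and the sketch does not guard against overflow on addition.

```cpp
#include <cstdint>

class ExperiencePoints {
public:
  explicit constexpr ExperiencePoints(std::uint32_t value) : value_(value) {}
  constexpr std::uint32_t get() const { return value_; }

  // Gaining XP is the only arithmetic we allow.
  ExperiencePoints& operator+=(ExperiencePoints const& rhs) {
    value_ += rhs.value_;
    return *this;
  }

  // Deliberately absent: operator-=, operator<<, operator* and friends.

private:
  std::uint32_t value_;
};

inline ExperiencePoints operator+(ExperiencePoints lhs,
                                  ExperiencePoints const& rhs) {
  lhs += rhs;
  return lhs;
}

inline bool operator==(ExperiencePoints const& a, ExperiencePoints const& b) {
  return a.get() == b.get();
}
```

Since subtraction and shifting are simply not declared, an expression like xp << 5 or xp - ExperiencePoints{10} fails to compile.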

In the end, our character class could look something like this:

class Character {
  CharacterName name;
  Level level;
  ExperiencePoints xp;
  Attribute stamina;
  Attribute strength;
  Gold gold;
// ...
  void killMonster(Monster const& monster) {
    gold += monster.loot();
    // level += monster.bonusXP(); //ERROR - no matching operator+= for (Level, ExperiencePoints)
    xp += monster.bonusXP();
  }  
};

In addition to the added safety, the explicit type names make the code even easier to read. Compare this to the tuple<std::string, int, int, int...>. Of course, this last example is an extreme we probably never go to, but it may be worth exploring the possibilities between that and the lazy way using only built-in types.

Conclusion

If we really look into the things we model in our program, there are many things that are not “just a number” or “just a string”. While it can be a lot of work to define separate, stronger types for each of these different things, it can also prevent a whole class of bugs.

Luckily, there are libraries that can help with the boilerplate involved in defining those types. Examples are “Foonathan”’s type_safe library, Boost Strong typedef (which is only part of a library), PhysUnits/quantity and Boost.Units.

The net cost will be some implementation time and a little compilation time (those classes tend to be not very complex), but usually little or no runtime cost (if in doubt, use a profiler!).

Thanks to Björn Fahller aka “Rollbear” for inspiring me to write this post.

23 Comments

Jonathan Boccara
9 years ago Permalink

How would you handle data that don’t make sense? For instance, in your example of a class representing a name where all the checks are contained within the constructor, how would you choose to handle a failed check in this constructor?
Also, I noted that strong types are most useful for making interfaces more difficult to use incorrectly, by clarifying the interface’s intentions and by preventing passing parameters in the wrong order.
I feel that strong typing is a very popular topic in C++ at the moment, but I haven’t heard of any native features being planned for it in the language. Would you know if anything is being officially planned for the language to cover this?

Reply
1. Arne Mertz
  9 years ago Permalink
  
  I have written somewhere that objects should not violate their invariants. If we say that one of the invariants is that an object may not represent data that don’t make sense, then such an object may never exist. Therefore, trying to construct an object of nonsensical data would either require the constructor to correct the data (which might involve some guesswork) or fail by throwing an exception.
  If the data you are using to construct an object come from an untrusted source, like a user typing nonsense on his keyboard, then throwing an exception might be too harsh. After all, it’s not exceptional to make typos. In those cases, check the data for validity before you try to construct an object from them.
  
  Reply
paul
9 years ago Permalink

One thing missing is the equivalent of Haskell’s newtype deriving. I.e. you can define a new Haskell type like “Gold” that wraps an integer, but also automatically generates arithmetic operations, so you can add two quantities of gold together, etc. In C++ if you wrap an integer in a class, you have to explicitly define methods for operator+ and so on. I’m not sure how Ada handles this.

Ada has types for integer ranges, which is apparently very useful. It’s done with runtime checking, though you can turn it off for speed if you don’t mind giving up some safety. I don’t know of other languages that support that, though it would be useful.

Reply
Kamil Kisiel
9 years ago Permalink

When dealing with values that have a limited range of options I’ve found “enum class” to be quite helpful in adding some much needed type checking. It also helps ensure all values are accounted for when using switch statements.
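A minimal sketch of that idea, with a hypothetical CharacterClass enumeration and made-up stamina values:

```cpp
#include <type_traits>

enum class CharacterClass { Warrior, Mage, Rogue };

// Unlike a plain enum, an enum class does not silently convert to int.
static_assert(!std::is_convertible<CharacterClass, int>::value,
              "no implicit conversion to int");

// No default label: GCC and Clang can then warn (-Wswitch) when a newly
// added enumerator is not handled here.
inline int baseStamina(CharacterClass c) {
  switch (c) {
    case CharacterClass::Warrior: return 12;
    case CharacterClass::Mage:    return 6;
    case CharacterClass::Rogue:   return 8;
  }
  return 0;  // not reached for the enumerators listed above
}
```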

Reply
Richard
9 years ago Permalink

One way to arrive at this sort of code is to do an exercise where you observe the “No Naked Primitives” coding constraint. Namely, primitive types (int, std::string, etc.) can only appear as arguments to constructors and everything else has to be a type that expresses some idea in the problem domain.

Reply
drivebycommenter
9 years ago Permalink

Ada was a lot like that. The Ada community used to (possibly still does) think that it is very funny that C++ is thought of as a strongly typed language.

Reply
Clément
9 years ago Permalink

This is very nice — but it’s still pretty limited: for example, you can’t specify a range of valid levels easily. The problem boils down to type checking becoming undecidable if your types are too powerful (“dependent”), which means that languages with strong dependent types (Coq, Agda, …) end up doubling as theorem provers.

Reply
Christian Hausknecht
9 years ago Permalink

Item 9 from “Effective Modern C++”: Prefer alias declarations to typedefs!

😉

Reply
1. Arne Mertz
  9 years ago Permalink
  
  Yes, definitely. I have thought about using alias declarations, but I used typedef in order to not lose the connection to the term “strong typedef” which is no typedef nor alias declaration at all. Maybe it’s time to re-coin the term as “strong type alias”.
  
  Reply
  1. Christian Hausknecht
    9 years ago Permalink
    
    Maybe it’s time to re-coin the term as “strong type alias”.
    
    Sounds kinda nice! 🙂
    
    Reply
xtofl
9 years ago Permalink

Great idea!

And in this CppCon talk, Ben Deane shows how you can even ‘declare’ your state machine with strong types!

https://www.youtube.com/watch?v=ojZbFIQSdl8

Reply
Gavin Lock
9 years ago Permalink

A project I worked on years ago had many functions such as
AuthoriseDriver(int DriverID, int VehicleID),
AddVehicle(int VehicleID, int DriverID),
AddTag(int TagID, int DriverID).

When we refactored to use strong types for vehicle, driver and tag IDs, a bunch of run-time bugs (some that QA hadn’t discovered yet) became compile-time failures.

We spent a day or two implementing the strong types and refactoring, and saved ourselves a few hundred man hours in testing and bug-fixing.
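What such a refactoring could look like, as a sketch only: the Id template, the tag types and the function body are assumptions for illustration and are not taken from the project described.

```cpp
#include <type_traits>

// One small template gives every kind of ID its own type.
template <typename Tag>
class Id {
public:
  explicit constexpr Id(int value) : value_(value) {}
  constexpr int get() const { return value_; }

private:
  int value_;
};

struct DriverTag {};
struct VehicleTag {};
using DriverID = Id<DriverTag>;
using VehicleID = Id<VehicleTag>;

static_assert(!std::is_convertible<VehicleID, DriverID>::value,
              "IDs are not interchangeable");

// Hypothetical stand-in for the real function; it returns true for
// positive IDs just so that it can be called.
inline bool AuthoriseDriver(DriverID driver, VehicleID vehicle) {
  return driver.get() > 0 && vehicle.get() > 0;
}

// AuthoriseDriver(VehicleID{7}, DriverID{42});  // does not compile: swapped
```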

Reply
Marc Clifton
9 years ago Permalink

Welcome to semantic computing, something I’ve written a few articles about. 😉

Reply
Rob Grainger
9 years ago Permalink

I completely agree. I wish this area was better supported in the standard – both language and library. While foonathan’s efforts for strong typedef are commendable (and highly recommended), I can’t help but feel that the barrier to these approaches may well put off less experienced developers.

Another area that can help is user-defined literals, which can nicely complement strong typedef approaches to yield code which is both safer and more readable. It’s just a shame that authoring such types can be arcane.
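For instance, a literal suffix for a gold amount could be sketched like this. The Gold class and the _gold suffix are hypothetical names chosen for this example.

```cpp
class Gold {
public:
  explicit constexpr Gold(unsigned long long amount) : amount_(amount) {}
  constexpr unsigned long long get() const { return amount_; }

private:
  unsigned long long amount_;
};

// User-defined suffixes must start with an underscore; integer literal
// operators take unsigned long long.
constexpr Gold operator""_gold(unsigned long long amount) {
  return Gold{amount};
}

// The literal reads like a unit and yields the strong type directly.
constexpr Gold startingGold = 25_gold;
```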

Reply
Kyle Markley
9 years ago Permalink

Please see also my opaque typedefs library, originally announced at CppCon 2015:
https://sourceforge.net/p/opaque-typedef/wiki/Home/
I designed this library with particular attention to making it easy to define the allowed and forbidden interactions between an opaque typedef and other types. And it does have an (experimental) facility for opaque string typedefs.

Reply
fenbf
9 years ago Permalink

Often you start with a simple solution but it evolves into something complicated. With stronger types it’s easier to maintain the code. Instead of typedef you can create a separate class and implement some part of logic there.
Better than working just on ints…

Reply
1. Arne Mertz
  9 years ago Permalink
  
  Yes, strong typedefs often are just a start towards more complex classes.
  
  Reply
  1. fenbf
    9 years ago Permalink
    
    I sometimes start with a really basic class, wondering if it’s even worth putting this code in a new class… but then it turns out my simple class becomes a monster 🙂
    
    Reply
    1. Henri Tuhola
      9 years ago Permalink
      
      If you had never started creating a class for everything, maybe the monster wouldn’t have been born?
      
      Reply
