Is everything we use a string for really just a bunch of characters? Is everything we use an int for really just a number? Probably not. We can have stronger types than that.
Imagine we are programming a role-playing game. We’ll need something to store our character’s data, like the name, the current level, experience points, attributes like stamina and strength, and the amount of gold we own. The usual stuff. It’s simple:
typedef std::tuple<
    std::string, // name
    int,         // level
    int,         // XP
    int,         // stamina
    int,         // strength
    int          // gold
> Character;
Okay, that’s too simple. Nobody would do that. Almost nobody. We hope. Let’s be realistic:
class Character {
    std::string name;
    int level;
    int xp;
    int stamina;
    int strength;
    int gold;
};
That’s more like it. Obviously, that class is missing some methods. But let’s concentrate on the variables for now.
Simple types, simple problems
As it stands, we could have a character with 4678285 gold, level 772999566, negative XP and the telling name “meh 56%&8450p&jntr \n gr?==) Bobby Tables”.
If you already know little Bobby Tables or clicked the link, you know where I am going with this: whenever we create a new character, we’ll have to check that the values we assign to those attributes make sense. XP are usually not negative. A name usually does not contain special characters.
While we’re at it, character creation is not the only time we can mess up those attributes. Add a big negative number to the XP and we get in trouble, too.
Of course, this can be fixed easily: xp should be an unsigned instead of an int, so it can’t be negative. The name should be const because a character cannot change its name, and then it only needs to be checked during character creation.
Except that this will fix only very few of the problems we can run into. An unsigned can underflow and wrap around, giving impossibly large amounts of XP. The level probably can only go as far as 70 or 80 or so (70 was the limit when I last played World of Warcraft), and that’s not a limit any built-in type can give us.
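To see the underflow problem in action, here is a minimal sketch. The function name applyPenalty is made up for this illustration and is not part of the Character class above.

```cpp
#include <limits>

// Hypothetical helper, invented for this illustration:
// subtract a penalty from an unsigned XP counter.
constexpr unsigned applyPenalty(unsigned xp, unsigned penalty) {
    // Unsigned arithmetic is modular: if penalty > xp, the result
    // wraps around to a huge value instead of becoming negative.
    return xp - penalty;
}

static_assert(applyPenalty(10u, 3u) == 7u, "the harmless case");
static_assert(applyPenalty(10u, 11u) == std::numeric_limits<unsigned>::max(),
              "one point too many wraps around to the largest unsigned value");
```

So switching to unsigned does not remove the nonsensical state, it only turns "negative XP" into "absurdly many XP".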
We can left-shift an int, but what does that mean if we calculate character.stamina << 5? It does not make any sense, so we’d better not be able to make errors like that in the first place.
Now let’s have a look at one of the methods:
void Character::killMonster(Monster const& monster) {
    gold += monster.loot();
    level += monster.bonusXP();
}
This doesn’t look right – the bonus XP granted by killing the monster probably should be added to the character’s XP, not to its level. The additional gold looks about right unless the loot is calculated in some other monetary unit that has to be converted first.
Simple problems, simple solutions: Use stronger types
The first problem we observed above is that we assigned very general types to variables that had additional semantics. The second was that we used the same general types for variables that have different, incompatible semantics.
A std::string is just a bunch of characters, but a name that has been sanitized to be suitable for an RPG character is much more (and, in some ways, less) than that. An int is just a number, while a monetary amount, points and levels are more than just that.
Strong typedef
The solution to the exchangeability problem is to use what is commonly called a strong typedef. With a normal C++ typedef, a Level type introduced by typedef int Level still is int. It’s just another name for the same type.
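A small sketch shows how little a normal typedef protects us. The second alias and the function addBonus are hypothetical names, chosen only for this example.

```cpp
#include <type_traits>

typedef int Level;             // the weak alias from the text
typedef int ExperiencePoints;  // hypothetical second alias for illustration

// Both aliases name the very same type as int.
static_assert(std::is_same<Level, int>::value, "Level is just int");
static_assert(std::is_same<Level, ExperiencePoints>::value,
              "the two aliases are interchangeable");

// Compiles without complaint although it mixes up two different concepts.
constexpr Level addBonus(Level level, ExperiencePoints bonus) {
    return level + bonus;
}
```

The compiler happily accepts addBonus(3, 250) and returns level 253, which is exactly the kind of mix-up we saw in killMonster.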
A strong typedef is a completely different type that simply happens to behave the same as its base type, in this case, the int. Strong typedefs are simple wrappers around a variable of their base type.
Thanks to optimizing compilers, those wrappers usually have the same performance as the base types. They do not change the runtime code, but they can prevent a lot of errors at compile time.
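One possible way to write such a wrapper by hand is a small class template with a tag parameter. This is only a sketch: the names StrongTypedef, LevelTag and ExperiencePointsTag are assumptions made for this example, and the libraries mentioned in the conclusion offer far more complete solutions.

```cpp
#include <type_traits>

// Hypothetical minimal strong typedef: a thin wrapper around a value of
// type T. The Tag parameter is never used except to make types distinct.
template <typename T, typename Tag>
class StrongTypedef {
public:
    constexpr explicit StrongTypedef(T value) : value_(value) {}
    constexpr T get() const { return value_; }

    // Only values of the very same strong type can be added.
    StrongTypedef& operator+=(StrongTypedef other) {
        value_ += other.value_;
        return *this;
    }
    friend constexpr StrongTypedef operator+(StrongTypedef a, StrongTypedef b) {
        return StrongTypedef(a.value_ + b.value_);
    }

private:
    T value_;
};

struct LevelTag {};
struct ExperiencePointsTag {};
using Level = StrongTypedef<int, LevelTag>;
using ExperiencePoints = StrongTypedef<int, ExperiencePointsTag>;

static_assert(!std::is_same<Level, ExperiencePoints>::value,
              "same base type, but distinct types");
static_assert(!std::is_convertible<int, Level>::value,
              "a plain int does not silently become a Level");
static_assert(!std::is_convertible<ExperiencePoints, Level>::value,
              "XP cannot be passed where a Level is expected");
static_assert((Level(3) + Level(1)).get() == 4,
              "arithmetic within one strong type still works");
```

With these definitions, an expression like level += xp no longer compiles, because there is no operator+= that takes a Level on the left and ExperiencePoints on the right.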
Other restrictions
It is relatively simple to write classes that can contain only certain values and provide only operations that do not invalidate them again. For example, a class for a valid character name would need some way to construct such a name from a plain std::string. If we do not allow the insertion of arbitrary chars into a Name and can only assign valid Name objects, that constructor would be the only point where we need to check the validity of a name.
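As a sketch, such a name class could look like the following. The validity rule (1 to 20 characters, letters only) and the member names isValid and str are assumptions made for this example. What counts as a valid name depends on the game.

```cpp
#include <cctype>
#include <stdexcept>
#include <string>
#include <utility>

class CharacterName {
public:
    // The only way to create a name from a plain string,
    // so this is the only place where validation happens.
    explicit CharacterName(std::string name) : name_(std::move(name)) {
        if (!isValid(name_)) {
            throw std::invalid_argument("invalid character name");
        }
    }

    // Read-only access: callers cannot insert arbitrary chars afterwards.
    std::string const& str() const { return name_; }

    // Assumed rule for this sketch: 1 to 20 characters, letters only
    // (as classified by std::isalpha in the current C locale).
    static bool isValid(std::string const& candidate) {
        if (candidate.empty() || candidate.size() > 20) {
            return false;
        }
        for (unsigned char c : candidate) {
            if (!std::isalpha(c)) {
                return false;
            }
        }
        return true;
    }

private:
    std::string name_;
};
```

Copying or assigning one CharacterName to another is still possible, but since both sides were validated on construction, the invariant holds without further checks.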
For our XP we could use something like a strong typedef that does not provide subtraction (unless we can actually lose XP) and does not allow bit shifting and other stuff that is nonsense for experience points.
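A hand-written version of such a restricted XP type might look like this. It is a sketch under the assumption that XP can only be gained. The CanSubtract helper is a hypothetical detection trait that merely demonstrates at compile time that subtraction is unavailable.

```cpp
#include <type_traits>
#include <utility>

class ExperiencePoints {
public:
    constexpr explicit ExperiencePoints(unsigned value) : value_(value) {}
    constexpr unsigned value() const { return value_; }

    // Gaining XP is the only arithmetic this type allows.
    ExperiencePoints& operator+=(ExperiencePoints gained) {
        value_ += gained.value_;
        return *this;
    }
    // Deliberately not provided: operator-, operator-=, operator<<, ...

private:
    unsigned value_;
};

// Hypothetical detection helper: true if "T - T" is a valid expression.
template <typename T, typename = void>
struct CanSubtract : std::false_type {};

template <typename T>
struct CanSubtract<T, decltype(void(std::declval<T>() - std::declval<T>()))>
    : std::true_type {};

static_assert(CanSubtract<unsigned>::value,
              "a plain unsigned can be subtracted");
static_assert(!CanSubtract<ExperiencePoints>::value,
              "ExperiencePoints offers no subtraction");
```

Note that this sketch does not guard against overflow when adding very large amounts. A real implementation would have to decide how to handle that as well.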
In the end, our character class could look something like this:
class Character {
    CharacterName name;
    Level level;
    ExperiencePoints xp;
    Attribute stamina;
    Attribute strength;
    Gold gold;
    // ...
    void killMonster(Monster const& monster) {
        gold += monster.loot();
        // level += monster.bonusXP(); // ERROR - no matching operator+= for Level and ExperiencePoints
        xp += monster.bonusXP();
    }
};
In addition to the added safety, the explicit type names make the code even easier to read. Compare this to the tuple<std::string, int, int, int...>. Of course, this last example is an extreme we probably never go to, but it may be worth exploring the possibilities between that and the lazy way using only built-in types.
Conclusion
If we really look into the things we model in our program, there are many things that are not “just a number” or “just a string”. While it can be a lot of work to define separate, stronger types for each of these different things, it can also prevent a whole class of bugs.
Luckily there are libraries that can help with the boilerplate involved in defining those types. Examples are “Foonathan”’s type_safe library, Boost Strong typedef (which is only part of a library), PhysUnits/quantity and Boost.Units.
The net cost will be some implementation time and a little compilation time (those classes tend to be not very complex), but usually little or no runtime cost (if in doubt, use a profiler!).
Thanks to Björn Fahller aka. “Rollbear” for inspiring me to write this post.
Permalink
How would you handle data that don’t make sense? For instance, in your example of a class representing a name where all the checks are contained within the constructor, how would you choose to handle a failed check in this constructor?
Also, I noted that strong types were also most useful to make interfaces more difficult to use incorrectly, by clarifying the interface’s intentions and by preventing passing parameters in the wrong order.
I feel that strong typing is a very popular topic in C++ at the moment, but I haven’t heard of any native features being planned for it in the language. Would you know if anything is being officially planned for the language to cover this?
Permalink
I have written somewhere that objects should not violate their invariants. If we say that one of the invariants is that an object may not represent data that don’t make sense, then such an object may never exist. Therefore, trying to construct an object of nonsensical data would either require the constructor to correct the data (which might involve some guesswork) or fail by throwing an exception.
If the data you are using to construct an object come from an untrusted source, like a user typing nonsense on his keyboard, then throwing an exception might be too harsh. After all, it’s not exceptional to make typos. In those cases, check the data for validity before you try to construct an object from them.
Permalink
One thing missing is the equivalent of Haskell’s newtype deriving. I.e. you can define a new Haskell type like “Gold” that wraps an integer, but also automatically generates arithmetic operations, so you can add two quantities of gold together, etc. In C++ if you wrap an integer in a class, you have to explicitly define methods for operator+ and so on. I’m not sure how Ada handles this.
Ada has types for integer ranges, which is apparently very useful. It’s done with runtime checking, though you can turn it off for speed if you don’t mind giving up some safety. I don’t know of other languages that support that, though it would be useful.
Permalink
When dealing with values that have a limited range of options I’ve found “enum class” to be quite helpful in adding some much needed type checking. It also helps ensure all values are accounted for when using switch statements.
Permalink
One way to arrive at this sort of code is to do an exercise where you observe the “No Naked Primitives” coding constraint. Namely, primitive types (int, std::string, etc.) can only appear as arguments to constructors and everything else has to be a type that expresses some idea in the problem domain.
Permalink
Ada was a lot like that. The Ada community used to (possibly still does) think that it is very funny that C++ is thought of as a strongly typed language.
Permalink
This is very nice — but it’s still pretty limited: for example, you can’t specify a range of valid levels easily. The problem boils down to type checking becoming undecidable if your types are too powerful (“dependent”), which means that languages with strong dependent types (Coq, Agda, …) end up doubling as theorem provers.
Permalink
Item 9 from “Effective Modern C++”: Prefer alias declarations to typedefs!
😉
Permalink
Yes, definitely. I have thought about using alias declarations, but I used typedef in order to not lose the connection to the term “strong typedef” which is no typedef nor alias declaration at all. Maybe it’s time to re-coin the term as “strong type alias”.
Permalink
Sounds kinda nice! 🙂
Permalink
Great idea!
And in this Cppcon talk, Ben Deane shows how you can even ‘declare’ your state machine with strong types!
https://www.youtube.com/watch?v=ojZbFIQSdl8
Permalink
A project I worked on years ago had many functions such as
AuthoriseDriver(int DriverID, int VehicleID),
AddVehicle(int VehicleID, int DriverID),
AddTag(int TagID, int DriverID).
When we refactored to use strong types for vehicle, driver and tag IDs, a bunch of run-time bugs (some that QA hadn’t discovered yet) became compile-time failures.
We spent a day or two implementing the strong types and refactoring, and saved ourselves a few hundred man hours in testing and bug-fixing.
Permalink
Welcome to semantic computing, something I’ve written a few articles about. 😉
Permalink
I completely agree. I wish this area was better supported in the standard, both language and library. While foonathan’s efforts for strong typedef are commendable (and highly recommended), I can’t help but feel that the barrier to these approaches may well put off less experienced developers.
Another area that can help is user-defined literals, which can nicely complement strong typedef approaches to yield code which is both safer and more readable. It’s just a shame that authoring such types can be arcane.
Permalink
Please see also my opaque typedefs library, originally announced at CppCon 2015:
https://sourceforge.net/p/opaque-typedef/wiki/Home/
I designed this library with particular attention to making it easy to define the allowed and forbidden interactions between an opaque typedef and other types. And it does have an (experimental) facility for opaque string typedefs.
Permalink
Often you start with a simple solution but it evolves into something complicated. With stronger types it’s easier to maintain the code. Instead of typedef you can create a separate class and implement some part of logic there.
Better than working just on ints…
Permalink
Yes, strong typedefs often are just a start towards more complex classes.
Permalink
I sometimes start with a really basic class, wondering if it’s even worth putting this code in a new class… but then it turns out my simple class becomes a monster 🙂
Permalink
If you had never started creating a class for everything, maybe the monster wouldn’t have been born?