Contents
When an enum controls the behavior of a class, that behavior can sometimes be expressed by class hierarchies.
Last week I wrote about replacing a fixed set of strings with an enum. I rightfully got responses that instead of enums one can often also use a class hierarchy instead.
An example
Let’s consider a rudimentary class for moving objects in a game:
class MovingGameObject { enum ObjectType { HUMAN, BIRD, MONSTER }; ObjectType const objectType; public: void moveTowardsTarget() { calculateDirection(); switch(objectType) { case HUMAN: runOnTwoFeet(); break; case MONSTER: gallopOnAllFour(); break; case BIRD: flyFlappingWings(); break; default: thrown UnknownMOBTypeError(); } } void calculateDirection(); private: void flyFlappingWings(); void gallopOnAllFour(); void runOnTwoFeet(); };
I think it’s pretty obvious that this is not a very good design. A `MovingGameObject` that is a `HUMAN` would still theoretically be able to fly flapping its wings, which is ridiculous.
If there were other behavioral differences between the types, we would need to have more of the switch/case statements. We would also have more specialized behaviors in the class accessible for types that don’t care about it.
Last but not least, we might be adding new types of objects, e.g. `FISH`. This would cause us to alter every single of those behavioral functions to add another case. Missing a switch case statement could lead to all kinds of errors. Imagine we had created a flying fish by accident!
Refactoring
If you find a class’ behavior to be dependent on an enum, consider to extract that behavior in subclasses instead.
This is the general advice, and here is how it’s done in our example case:
Move each switch/case statements into its own function. Make those functions virtual and private.
class MovingGameObject { enum ObjectType { /* ... */ }; ObjectType const objectType; public: void moveTowardsTarget() { calculateDirection(); move(); } private: virtual void move() { switch(objectType) //... } } void calculateDirection(); };
For each enumerator, create a class that derives from the class you are refactoring. For each switch/case function, move each case into an overriding function in the derived class that corresponds to the enumerator.
If there is a sensible default case, leave it in the base class function. If the default was to throw an exception, just erase it and make the base class function pure virtual.
class MovingGameObject { public: void moveTowardsTarget() { calculateDirection(); move(); } void takeDamage(int damage); private: virtual void move() = 0; void calculateDirection(); }; class Bird : public MovingGameObject { private: void move() override { flyFlappingWings(); } void flyFlappingWings(); }; class Monster: public MovingGameObject { /* ... */ }; class Human : public MovingGameObject { /* ... */ };
By this, we have moved each entity with different behavior into its own class. The `moveTowardsTarget` method in the base class now is a template method which implements only the common behavior and delegates the specialized behavior to the new classes.
In case you wonder: Even if there is no common behavior, the separation into a public nonvirtual and a private virtual method often makes sense. It facilitates later refactoring if such general behavior is added and is called “Non-Virtual Interface Idiom”.
Considering other examples
It is understandable that whenever we see an enum that mandates behavior we could be tempted to refactor it into its own class hierarchy. Probably we even have given it a name that ends with “Type”.
An example would be the `MessageType` enum I use in the blog post about the string-to-enum refactoring. If we properly wrap it together with the message text we get a small structure:
struct Message { MessageType messageType; string messageText; };
In that post, I also briefly mentioned a function to print the message to the console. With our new struct it could look like this:
void printMessage(Message const& msg) { switch (msg.messageType) { case WARNING: std::cout << "WARN: "; //... } std::cout << msg.messageText; }
We clearly see the switch/case statement selecting the different behaviors. So isn’t it time to refactor `Message` into subclasses that implement these different behaviors?
Behavior of other classes
As it turns out, the printing of a message is not necessarily behavior of the message itself. It can very well be the behavior of some console UI. Imagine a GUI that can display those messages on screen, displaying different kinds of icons for the different message types.
Then there could be a logger that is configured to log only messages of a certain severity. The distinction of the different message types and the decision whether to log them or not would definitely part of the logger behavior, not of the message behavior.
Visitor pattern?
Usually when it comes to behavior of other classes that depends on the type, we use the visitor pattern. In all its full-fledged object oriented glory, it would look like this:
class MessageVisitor; class Message { string messageText; public: virtual void accept(MessageVisitor& visitor) const = 0; string const& text() const { return messageText; } }; class InfoMessage; class WarningMessage; class ErrorMessage; class MessageVisitor { public: virtual void visit(InfoMessage const& infoMessage) = 0; virtual void visit(WarningMessage const& warningMessage) = 0; virtual void visit(ErrorMessage const& errorMessage) = 0; }; class WarningMessage : public Message { public: void accept(MessageVisitor& visitor) const final override { visitor.visit(*this); //overload resolution -> visit(WarningMessage&) } }; //... class ConsoleUI : public MessageVisitor { public: void printMessage(Message const& message) { message.accept(*this); std::cout << message.text(); } void visit(WarningMessage const&) final override { std::cout << "WARN: "; } //... };
This is a lot of code, and it’s not nearly all of it. Just to get rid of determining the behavior by the enum we introduced a whole bunch of additional classes and functions. They do nothing but dispatching calls back and forth to find the right function that does what once was one line in a simple switch/case statement.
Keep it simple
The KISS principle demands that we do not make our code more complicated than it needs to be. In the firs example the enum-to-classes refactoring gave us a clean separation of the different types’ behavior. In this case it only introduced an unnecessarily hard to grasp madness of classes.
Luckily, in C++ we don’t have to force everything in some class hierarchy. Therefore in this case we should refrain from overengineering and just leave our `Message` as it was. The `Type` suffix on the enum may have mislead us, because we clearly don’t need different types here. So let’s just give it a better name, like `MessageCategory` or `MessageLevel`.
Conclusion
While “string to enum” is a good first refactoring step, it need not be the last. Moving from enums to a class hierarchy to separate different behaviors can be a next step if it brings the desired benefit. This step is however not always sensible, especially if the dependent behavior is not part of the class.
Since I touched the visitor pattern today, I will have a closer look at it next week.
Permalink
I was trying to do this when I found your posting… there is a wrinkle that has me scratching my head.
The new class hierarchy seems to want to ‘become’ the class that used to just ‘have’ the enum.
The enum could have been defined outside of the class, and be used in multiple classes, or structs, or just variables running around. If the enum and the associated switch statements are coded as a hierarchy of (data-member-free) classes, the original parent class does not naturally ‘have’ one of these new classes the way it would ‘have’ an enum field. It’s like the parent object is supposed to ‘have’ one of the classes, not any particular object in one of those classes. Instead, as in your example, it is the original parent class that should be subclassed.
Instead of a vector of structs defined like this (where INT and TEXT8 are enums):
const vector
<
section> section_vector = {
{“Intro”, 6, 4, INT },
{“CityName”, 128, 8, TEXT8 },
};
I think I will land up with a vector of objects that have to be defined like this:
const vector section_vector = {
blockInt {“Intro”, 6, 4},
blockText8{“CityName”, 128, 8},
};
Am I on the right track?
Permalink
Good article! There are tradeoffs as you point out and although OO proponents will sometimes tell you than switches (or if) should always be replaced by OO inheritance, there are plenty of cases in which it does not improve the code. What is especially interesting is that some paradigms will actually encourage you to use pattern matching, which is closer to switches than to inheritance. So clearly, it is not as simple as forcing everything into type hierarchies.
Permalink
Two quick thoughts of mine:
1) don’t forget making the destructor virtual if the class has virtual function unless the developer is 100% sure he won’t make base-class-pointer-toderived-class -object structure
2) as a side note, I would’t go to the other side of extreme with class hierarchies. complex class hierarchies make the program hard to understand. sometimes enum-like options are ok, sometimes some sort of “traits” is enough to configurate the object. sometimes even templates can entirely repace an inheritance.
Permalink
Hi David, thanks for your thoughts!
You have a point of course, in both matters. I did not include the destructors because they would not have added value to the example, and the classes are obviously incomplete anyways.