Visitor Pattern Part 1- the Object Oriented Way

If you have read the “Gang of Four” book about design patterns or just have been long enough in software development, you will have heard of the Visitor pattern. In its fully object oriented manifestation, this pattern can be rather complex.

Example

Let’s consider a simple parser for a small external DSL, e.g. for some mathematical expressions. Among other things, the parser may have classes to represent all kinds of expressions in an abstract syntax tree (AST).

Those expressions can be numbers, unary operators applied to a subexpression, and binary operators applied to subexpressions. Here is a selection of a few simple AST classes:

class Expression {
public:
  virtual ~Expression() = 0;
};
inline Expression::~Expression() = default;
using ExpressionPtr = std::unique_ptr<Expression>;

class BinaryExpression : public Expression {
  ExpressionPtr lhs;
  ExpressionPtr rhs;
public:
  BinaryExpression(ExpressionPtr left, ExpressionPtr right) 
    : lhs(move(left)), rhs(move(right))
  { assert(lhs && rhs); }
  
  Expression& left() { return *lhs; }
  Expression& right() { return *rhs; }
};
  
class AddExpression : public BinaryExpression {
public:
  using BinaryExpression::BinaryExpression;
};
    
class MultiplyExpression : public BinaryExpression {
public:
  using BinaryExpression::BinaryExpression;
};

class NumberExpression : public Expression {
  double number;
public:
  NumberExpression(double d) : number(d) {}
  double getNumber() const { return number; }
};

A snippet of that DSL could look like `3 + 4 * 6`. It’s AST could then be created like this:

auto expression = std::make_unique<AddExpression>(
  std::make_unique<NumberExpression>(3),
  std::make_unique<MultiplyExpression>(
    std::make_unique<NumberExpression>(4),    
    std::make_unique<NumberExpression>(6)
  )
);

Visitor pattern – the object oriented way

This is all pretty straight forward. However, we already see that the `AddExpression` and `MultiplyExpression` are essentially the same, as would be a `SubtractExpression`, `DivideExpression`, `LogicalOrExpression`, `GreaterExpression`, and so on.

Now imagine we would like to work with the AST. There usually are a bunch of different things we could do with it: Print the expression, print or otherwise display the tree structure itself, calculate the result of our expression, and so on.

All those actions are not part of the tree’s behavior. The tree is merely a data structure, and the behavior belongs to an expression printer, a tree display and a calculator.

This is a classical example for the visitor pattern: Whenever you have a hierarchy of classes and a set of actions that do belong to external classes, it is a hint that the visitor pattern should be applied. More so if the classes are less likely to change than the external actions.

The basic idea

The basic idea of the visitor pattern is to have a `Visitor` base class that visits a bunch of objects of the class hierarchy (i.e. the `Expression`s) in question. It calls an `accept` or `acceptVisitor` method on each object.

class ExpressionVisitor;

class Expression {
  //...
public:
  virtual void accept(ExpressionVisitor&) = 0;
};

This method in turn is implemented in each class of the hierarchy. Its responsibility is to call back a `visit` method on the visitor specific to the visited object’s class. In our case those could be named `visitAdd`, `visitMultiply`, `visitNumber` etc.

class ExpressionVisitor {
public:
 virtual void visitAdd(AddExpression&) = 0;
 virtual void visitMultiply(MultiplyExpression&) = 0;
 virtual void visitNumber(NumberExpression&) = 0;
 //...
};
class AddExpression : public BinaryExpression {
  //...
public:
  void accept(ExpressionVisitor& visitor) override {
    visitor.visitAdd(*this);  
  }
};
// repeat for all Expression subclasses

Now we can derive a special visitor for each external action from the visitor base class and implement these class specific `visit` methods.

class ExpressionPrinter : public ExpressionVisitor {
  std::ostream& os;
  
  void visitBinaryExpression(BinaryExpression& binExpr, std::string const& infix) {
    binExpr.left().accept(*this);
    os << infix;
    binExpr.right().accept(*this);
  }
  
public:
  ExpressionPrinter(std::ostream& ostream) : os(ostream) {}
  void print(Expression& expr) {
    expr.accept(*this);
    os << '\n';
  }

  void visitAdd(AddExpression& addExpr) override {
    visitBinaryExpression(addExpr, " + ");  
  }
  void visitMultiply(MultiplyExpression& mulExpr) override {
    visitBinaryExpression(mulExpr, " * ");  
  }
  void visitNumber(NumberExpression& numExpr) override {
    os << numExpr.getNumber();
  }
};

You can see the full code for the current state on this revision of my GitHub repository.

Taking stock

Let’s gather the number of classes and methods we have now: We have one abstract visitor base class and one concrete visitor for each external action. Let’s call that latter number of actions A.

We also have a number of abstract classes for the expression class hierarchy, and one concrete class for each different flavor of expression (Add, Multiply, …) I’ll call the number of concrete expression classes E.

Each concrete expression class has to implement the accept method which is a trivial one-liner – but it has to be done, E times. Each concrete visitor has to implement the visit method for each concrete expression, which makes a total of E × A visit methods.

If we have A different actions that really do different things for each expression flavor, there is no way around the E × A complexity. However, if we look at the expression classes, we have lots of repetition.

Except for the getters and constructors there is only one single function in each expression class that actually does something: The `accept` method.

Conclusion

You see, if we really stick to the book, we come out with an implementation that is rather complex for this otherwise simple example.

Next week I will pick up at this point and show an alternative implementation that does have less impact on the side of the expression classes.

Previous Post
Next Post

11 Comments


  1. when you are building the hierarchy of vistitor classes, why you are using different functions like :
    VisitMultiply
    VisitAdd
    instead of overloading ?
    just only one Visit with different signatures – because these take a different visited classes

    Reply

    1. Hi Ernst,
      thanks for that question. You normally can use both versions. There may however be situations where overloading a general visit method can make problems: Imagine a not so simple visitor hierarchy where you do not only call visit in the dispatch methods of the Expression classes but also call them explicitly in some of the most derived visitors. If you also happen to overload only some of the visit methods in those most derived visitors, the methods you did not overload are hidden in that class unless you explicitly import them with using directives from the base class.

      It also is a more secure and readable way if you migrate to the enum based visitor pattern I write about in the next post. There you also have a visit method for the base class, and just calling visit on an arbitrary Expression object does not make it clear whether we are calling the dispatch method visit(Expression) or simply visit(AddExpression).

      Another reason why I picked up the habit of writing the action explicitly instead of overloading a general visit method is that some compilers warn about the hidden overloads I mentioned earlier. The really bad and bugged ones warn even if the other overloads are not hidden because you have a using directive.

      Reply

  2. Nice post as usual.
    I guess we can simplify the usage of multiple operations inside visitor at the expense of exposing hierarchy structure to visitor sub-classes. An example is given below,

    class Visitor;

    // We want to add new behaviours to Expression class structure.

    // Interface of hierarchy to which we want to add operations on demand.
    class Expression {
    public:
    virtual void Accept(Visitor &) = 0;
    };

    // Concrete Element
    class ExpressionAdd : public Expression {
    public:
    void Accept(Visitor &v);
    };

    // Concrete Element
    class ExpressionMultiply : public Expression {
    public:
    void Accept(Visitor &v);
    };

    // Visitor Base
    class Visitor {
    public:

    virtual void Visit(Expression &e) = 0;
    };

    // Operation Print (Concrete Visitor)
    class OpearationPrint : public Visitor {
    public:
    void Visit(Expression &e);
    };

    // Definitions
    void ExpressionAdd::Accept(Visitor &v) {
    cout << "Add expression visited" << endl;
    v.Visit(*this);
    }

    void ExpressionMultiply::Accept(Visitor &v) {
    cout << "Multiply visited" << endl;
    v.Visit(*this);
    }

    void OpearationPrint::Visit(Expression &e) {
    // Here OpearationPrint should be friend of each class of Expression hierarchy
    cout << "Printing the expression" << endl;
    }

    // Client
    void Print(Expression &e) {
    OpearationPrint x;

    e.Accept(x);
    }

    void Client() {
    ExpressionAdd oa;
    ExpressionMultiply ob;

    Expression &a = oa;
    Expression &b = ob;

    Print(a);
    Print(b);
    }

    int main() {
    Client();

    return 0;
    }

    Please comment on downsides of the above implementation.

    Reply

    1. Sorry to spam. Got the issue, each visitor subclass’s visit() implementation needs to know the type of passed Expression object in order to print the concrete expression. It violates OCP.

      Reply

      1. Thanks for you comment!
        Visitor and derivates will have to be extended anyways if there ever is a new Expression class. I will show a simpler implementation that goes even further in the direction you point out next Wednesday.

        Reply

  3. Thanks for post. Since yesterday i was trying to fit new solution into existing code. Just figured that out. Not only Visitors are good for composite, even better when data is organized into Graphs{DAG}.
    Graph-Iterators-visitors and Surprisingly code looks better and solves too many complications by separating my graph-walk and graph-data. Cheers!

    @Arne Mertz Any pitfalls this design?

    Reply

  4. Do you see that pattern often at work/projects you’ve been? To be honest, I haven’t seen that pattern so far.

    I’ve used the pattern in my projects while I was studying – I had some visitors that run on my render tree in OpenGL apps.

    Reply

    1. I have. I have worked with a project where a large configuration tree in the GUI was mirrored by a corresponding model tree. It was visitors all the way.
      I then wrote a DSL for configuration analysis – and the interpreter basically had to walk the AST and the configuration tree simultaneously. Double visitor, yay 😉 The AST visitor wasn’t full OOP though, I’ll write about that next time.

      Reply

      1. ok, so visitors do exist in real projects! 🙂 even double visitors… sounds scary.

        ok, so I look forward to reading the next post about visitors.

        Reply

    2. Well I have seen it on a larger project where we had a huge configuration tree in the GUI, mirrored by another tree in the model layer. It was visitors all the time.
      Later I wrote a DSL to be able to analyse that configuration, and the interpreter had to walk both the configuration tree and the DSL AST. Double visitors, yay 😉
      The AST visitor wasn’t a full OO visitor though, but a simplified implementation I’ll write about next week.

      Reply

    3. Visitors in combination with variants (like Boost.Variant) are a real alternative to classical OO polymorphism and we often use it in the real world.

      Reply

Leave a Reply

Your email address will not be published. Required fields are marked *