visitor
A behavioral pattern
Define new operations without changing the classes of the elements on which they operate.
The Visitor Design Pattern

The Visitor pattern is one of the classic design patterns originally described by the Gang of Four. While you might not need it that often in your own code, chances are high that you will find it in a third-party library you're using (see, for example, the Roslyn compiler).

The pattern can be handy when you have to add functionality to large or complex object structures but don't want to (or can't) edit the types of the objects within those structures. The reason it's not used more often might be that it relies on a technique called double dispatch (more on this later), which can be a bit confusing at first sight.

A small example

But to take a more in-depth look into this pattern's architecture, let's start with a small example:

Let's assume we want to model the employment structure in a software development company. We take a very simplified approach and say that our company's only employees are developers, who are either juniors or seniors.

A developer class and two subclasses, for junior and senior developers.
Our company has developers who can either be juniors or seniors.

While it would also make sense to use an abstract base class or an interface for the Developer base class, we skip this here for simplicity.

Please note that this design is only for demonstration and would not be a good fit for a real-life scenario. But for our example, let's assume that it's sufficient.
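As a rough TypeScript sketch, the hierarchy described above might look like the following. The `name` property is an assumption (the text only says that developers have a name); everything else is kept deliberately empty.

```typescript
// Hypothetical sketch of the class hierarchy.
// Only `name` is assumed as a member; the real classes may carry more.
class Developer {
    constructor(public readonly name: string) {}
}

// The subclasses add nothing yet; they only mark the experience level.
class JuniorDeveloper extends Developer {}

class SeniorDeveloper extends Developer {}
```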

Separation of Concerns

The first task we get from our product owner is to create a list with the names of all developers in our company.
We could simply add a print method directly to the Developer class. Still, to separate our concerns, we prefer to put our printing logic into an external class. In that way, our Developer class contains only the relevant domain logic (which, I have to admit, is currently relatively sparse).

Since our external class has to visit each Developer object and request its name in order to print it, we will call it PrintingVisitor.

The PrintingVisitor is an external class that prints the name of developers.
The PrintingVisitor is an external class that handles the printing logic.
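A minimal sketch of such a class, assuming a Developer with a `name` property, could look like this. The optional `out` parameter is not part of the original design; it is added here only so that the output can be redirected, and it defaults to printing on the console.

```typescript
// Assumed minimal domain class (see the hierarchy described earlier).
class Developer {
    constructor(public readonly name: string) {}
}

// External class that holds the printing logic, keeping it out of Developer.
class PrintingVisitor {
    // `out` is a hypothetical output sink; by default each line goes to the console.
    constructor(
        private readonly out: (line: string) => void = (line) => console.log(line),
    ) {}

    // Visits a single developer and prints its name.
    visit(developer: Developer): void {
        this.out(developer.name);
    }
}
```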

By doing this, we've already successfully implemented the first part of our Visitor pattern: We extended our existing objects with new functionality without modifying their types. This is especially useful if the classes to which we want to add functionality are part of an external library and we can't modify or extend their implementation.

To finally print our developers' names, we just have to organize them in a simple list and call our visit method on each object.

const developers: Developer[] = [
    new JuniorDeveloper('Bob'), 
    new SeniorDeveloper('Lisa')
];

const visitor = new PrintingVisitor(); 
developers.forEach(d => visitor.visit(d));

But we are not finished yet, as there is another problem we have to solve...

Implementing Type-Specific Functionality

Let's assume that we get a new requirement from our PO: The experience level (junior or senior) should be printed right beside the developer's name.

Again, this seems like an easy task. We just have to add some method overloads to our PrintingVisitor, each overload handling a specific developer subtype.

A PrintingVisitor class with several overloads for the visit method.
Our new PrintingVisitor implementation uses one overload per concrete developer type.

Note that at compile time, the type of the object we're visiting is always Developer, as this is also the element type of our array (see the earlier code snippet). Still, shouldn't our runtime environment be able to choose the correct overload, depending on the actual type of our object (either junior or senior)?

While that sounds good, it does not work that way, at least not for most common OOP languages...


No matter which concrete type our Developer object has at runtime, we always end up with a call to the overload that accepts the base type: visit(developer: Developer).

The reason is that most languages (Java, C#, TypeScript, Python, to name only a few) implement a so-called single dispatch approach: During a method dispatch, the choice of which method to call depends only on a single object, the one which implements the operation (or receives the message, to stay in strict OOP terms). In our case, the receiver is always the PrintingVisitor, which is the only visitor class we have, so there is not much to choose from.

But this also means that the concrete parameter type will not be resolved during the dispatch operation. Since, at compile time, our parameter is of the type Developer, the dispatcher will only look for the method with this exact signature. Whether the actual type during execution might be Junior Developer or Senior Developer does not have any effect on the final choice.

While this might sound like a substantial limitation, it helps to keep the dispatcher's implementation efficient and transparent. For most cases, this single dispatch approach is also sufficient. But there are cases, like the one described earlier in our example, where choosing the correct method depending on its receiver and its parameter might be helpful. And luckily, there is also a solution for that - it's called Double Dispatch.

Double Dispatch

Our problem is that only the compile-time parameter type is considered when choosing the correct method implementation. Therefore, if we want to ensure that the dispatcher calls the correct method, we must provide it with the proper parameter type, and we must do so already at compile time.

To achieve this, we use a small trick: In our Developer class, we add a new method, accept, that takes the visitor as its input parameter. Inside this method, we call the visit method with our this pointer, and voilà: We have ensured that visit is called with the correct type.

If you look at these invocations, it becomes obvious why this approach is called double dispatch: We use two method calls to narrow down the types involved in the process:

  1. First dispatch: developer.accept(visitor); and then
  2. Second dispatch: visitor.visit(this);

What's important: For this double-dispatch to work, we must override the accept method in every subtype of our Developer class. Otherwise, we would still be doing a single dispatch.

Please check the following diagram for the complete design. Each subtype of the developer class implements its own accept method:

The Developer classes implement accept methods to provide the visitor with the correct type.
Using an accept method to implement double-dispatch.
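The design can be sketched in TypeScript roughly as follows. One caveat: TypeScript overloads share a single implementation body, so this sketch uses distinct method names (`visitDeveloper`, `visitJunior`, `visitSenior`) in place of the `visit` overloads; in a language such as Java or C# they could all be called `visit`. The `DeveloperVisitor` interface and the optional `out` sink are assumptions made for illustration.

```typescript
// Hypothetical visitor interface: one method per concrete developer type.
interface DeveloperVisitor {
    visitDeveloper(developer: Developer): void;
    visitJunior(developer: JuniorDeveloper): void;
    visitSenior(developer: SeniorDeveloper): void;
}

class Developer {
    constructor(public readonly name: string) {}

    // First dispatch: which accept runs depends on the runtime class of the receiver.
    accept(visitor: DeveloperVisitor): void {
        visitor.visitDeveloper(this);
    }
}

class JuniorDeveloper extends Developer {
    accept(visitor: DeveloperVisitor): void {
        // Second dispatch: inside this class, `this` is statically a JuniorDeveloper.
        visitor.visitJunior(this);
    }
}

class SeniorDeveloper extends Developer {
    accept(visitor: DeveloperVisitor): void {
        visitor.visitSenior(this);
    }
}

class PrintingVisitor implements DeveloperVisitor {
    // `out` is an assumed output sink; by default each line goes to the console.
    constructor(
        private readonly out: (line: string) => void = (line) => console.log(line),
    ) {}

    visitDeveloper(developer: Developer): void {
        this.out(developer.name);
    }

    visitJunior(developer: JuniorDeveloper): void {
        this.out(`${developer.name} (junior)`);
    }

    visitSenior(developer: SeniorDeveloper): void {
        this.out(`${developer.name} (senior)`);
    }
}
```

Note that each subclass overrides accept even though the bodies look almost identical: the override is what fixes the static type of `this` for the second call.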

If we apply this design to our implementation from the beginning, it would produce the expected results:

const developers: Developer[] = [
    new JuniorDeveloper('Bob'), 
    new SeniorDeveloper('Lisa')
];

const visitor = new PrintingVisitor(); 
developers.forEach(d => d.accept(visitor));
// prints:
// Bob (junior)
// Lisa (senior)

As mentioned, this approach might look slightly confusing as it changes the original invocation direction. In our initial design, the visitor first called the developer, but now, the developer calls the visitor instead.

Traversing Internal Object Structures

We've seen that adding the accept method allows the dispatcher to choose the correct method overload. But that's not all. This additional indirection has another advantage: We can now traverse even complex object structures easily.

In our previous example, our object structure was flat (a simple array of developers), and each element was easily accessible. However, in most real-world scenarios, chances are high for us to deal with more complex object structures, like trees or other hierarchical constructs.

To reflect this, we also update our example and add a new class, the Department Head. Each head supervises several developers, which we organize in a subordinates list.

A 'Department Head' class has a list of subordinates.
A more complex object structure: Department Heads have subordinates.

Please note that the subordinates list is defined as private (signaled by the - in front of the name) and thus not accessible from outside.

But this restriction is not a big issue thanks to our accept method. We can override the operation in the Department Head class and implement custom logic that lets the visitor first visit the head itself, then traverses the list of subordinates and lets each of them accept the visitor in turn. In that way, we can keep the internals of our object structure private while each element can still be reached by our visitors.
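The traversal just described might be sketched as follows. This is an illustration under several assumptions: the `EmployeeVisitor` interface, the `NameCollector` visitor, and the name 'Ann' are hypothetical, DepartmentHead is modeled as a standalone class, and the junior/senior subtypes are left out to keep the sketch short.

```typescript
// Hypothetical visitor interface covering both kinds of elements.
interface EmployeeVisitor {
    visitDeveloper(developer: Developer): void;
    visitHead(head: DepartmentHead): void;
}

class Developer {
    constructor(public readonly name: string) {}

    accept(visitor: EmployeeVisitor): void {
        visitor.visitDeveloper(this);
    }
}

class DepartmentHead {
    // Private: visitors never get access to the list itself.
    private readonly subordinates: Developer[];

    constructor(public readonly name: string, subordinates: Developer[]) {
        this.subordinates = [...subordinates];
    }

    accept(visitor: EmployeeVisitor): void {
        // Visit the head first ...
        visitor.visitHead(this);
        // ... then let each subordinate accept the same visitor.
        this.subordinates.forEach(d => d.accept(visitor));
    }
}

// Example visitor that simply records the names in visiting order.
class NameCollector implements EmployeeVisitor {
    readonly names: string[] = [];

    visitDeveloper(developer: Developer): void {
        this.names.push(developer.name);
    }

    visitHead(head: DepartmentHead): void {
        this.names.push(`${head.name} (head)`);
    }
}

const departmentHead = new DepartmentHead('Ann', [new Developer('Bob'), new Developer('Lisa')]);
const collector = new NameCollector();
departmentHead.accept(collector);
console.log(collector.names.join(', ')); // Ann (head), Bob, Lisa
```

The caller only ever invokes accept on the root object; the order of traversal is decided inside DepartmentHead.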

Conclusion

That's basically all there is to know about the Visitor pattern. To sum it up:

  1. It allows us to add new functionality to existing objects without changing their implementation. Instead, we can factor out this logic into separate Visitor classes. This is especially handy when you have to add functionality to external libraries. All the library objects have to do is provide an accept method for your custom visitors.
  2. By using a double-dispatch mechanism, we can execute type-specific visiting logic without the need for explicit type casts or reflection/introspection. This feature might be the most confusing part of the pattern as it turns the invocation direction around. Still, it also makes the Visitor pattern very flexible and is one of its key concepts.
  3. We can traverse even complex object structures without exposing their internal layout. The order and control of how the objects are traversed stays within the objects.

Let's now take a brief look at some potential alternatives and things to consider when implementing this pattern.

FAQ

This double-dispatching is quite confusing. Isn't there an easier way?

While most OOP languages support only single dispatching, there are some constructs in modern languages to work around this limitation. Let's look at some of them:

Using runtime type information. This might be one of the most prominent approaches. We could simply use a single visit method where we use a technique like reflection or introspection to check the concrete type of the object we're visiting. Consider, for example, the following implementation:

// C#-style sketch: a single visit method that checks the runtime type itself
public void visit(Element element) {
    if (element is ConcreteElementA) {
        // visiting logic for ConcreteElementA
    } else if (element is ConcreteElementB) {
        // visiting logic for ConcreteElementB
    } else {
        // default case
    }
}

This approach and similar ones (for instance, using pattern matching) don't require double dispatch, making an additional accept method unnecessary. However, one drawback of these techniques is that the logic for all types is concentrated in a single visit method, making it harder to maintain in the long run. In addition, we've seen that the accept method can contain additional logic (such as traversal) that we would otherwise have to place inside our visitor. Also, the reflection-based versions require an additional cast, which could affect performance, especially when traversing very large object structures. Nevertheless, these approaches can still be a viable solution, especially for smaller use cases.

Using dynamic dispatching. Languages like C# allow switching from the default static binding to dynamic binding via the Dynamic Language Runtime (DLR). Using the DLR, the concrete argument types are resolved at runtime, so the dispatcher can pick the most specific method overload. This also removes the need for the additional accept method that double dispatch requires. To activate dynamic dispatching in C#, all we have to do is cast the argument of our visit call to dynamic:

visitor.visit((dynamic)element);

While this approach does work, opinions differ somewhat on whether this is a good solution. On the one hand, it reduces the overall code required for the Visitor; on the other hand, dynamic dispatching can slightly affect performance (see this benchmark for a comparison). Also, some people argue that using the DLR in a situation where it's technically not required can harm the readability of your code.

How do I know whether this pattern is a good fit for my concrete use case?

While this might depend on several factors, there are some indicators that you might consider:

Do I have an object structure that can accept visitors? This one is relatively obvious: For double-dispatch to work correctly, you need some kind of accept method implemented by the objects of your structure to call your visitors back. If such a mechanism is available (or could be easily added by you), your system might be a potential candidate for implementing the Visitor. If this is not the case, but you still want to use this pattern, you could try one of the approaches mentioned earlier that doesn't require double-dispatching.

Is the type hierarchy of my object structure stable? If you often find yourself adding new subtypes to the hierarchy of objects you're visiting, then using the Visitor pattern might be problematic. Each new subtype requires you to update all your existing visitors with an additional visit method for this new type. Depending on the number of Visitor classes you have, this can become a considerable implementation effort. Having a default implementation might help reduce this effort. However, such an implementation is not always possible. In that case, adding the functionality directly to the object structure might be a better solution.
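One possible shape for such a default implementation, sketched in TypeScript: a base visitor whose type-specific methods all fall back to a general one, so that concrete visitors override only what they care about. The class names `BaseVisitor` and `CountingVisitor` are hypothetical, and the distinct method names stand in for `visit` overloads, which TypeScript cannot give separate bodies.

```typescript
class Developer {
    constructor(public readonly name: string) {}

    accept(visitor: BaseVisitor): void {
        visitor.visitDeveloper(this);
    }
}

class JuniorDeveloper extends Developer {
    accept(visitor: BaseVisitor): void {
        visitor.visitJunior(this);
    }
}

class SeniorDeveloper extends Developer {
    accept(visitor: BaseVisitor): void {
        visitor.visitSenior(this);
    }
}

// Hypothetical base visitor: every type-specific method falls back to visitDeveloper.
class BaseVisitor {
    visitDeveloper(_developer: Developer): void {
        // Default: do nothing.
    }

    visitJunior(developer: JuniorDeveloper): void {
        this.visitDeveloper(developer);
    }

    visitSenior(developer: SeniorDeveloper): void {
        this.visitDeveloper(developer);
    }
}

// Overrides only the fallback. If a new subtype is added together with a
// matching default method in BaseVisitor, this class needs no change.
class CountingVisitor extends BaseVisitor {
    count = 0;

    visitDeveloper(_developer: Developer): void {
        this.count += 1;
    }
}
```

The trade-off is that a visitor which should treat the new subtype differently will silently use the fallback until someone remembers to override the new method.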

Can my visitors access all relevant information? Note that when using the Visitor pattern, you move implementation out of your object structure into your visitors. That means that the visited elements must make all required data publicly available and accessible to your visitors. Otherwise, you cannot externalize this logic. So if the public interface of the objects you want to visit is insufficient, it is better to extend the objects themselves, unless your language supports constructs like the friend keyword in C++ that allow you to access otherwise private or protected members.