Visitor Pattern
In this article I will show what the Visitor Pattern is and how it can help you write robust and SOLID code when facing said adversaries. The example application will be written in TypeScript.
Every developer working with compositional data structures (Hello, !) will eventually come across the Visitor Pattern, especially if their data is very heterogeneous (e.g. different types of tree nodes) and needs to work with an open set of operations on these.
In this article I will show what the Visitor Pattern is and how it can help you write robust and SOLID code when facing said adversaries.
I will exemplify the pattern with the help of a real-world example written in TypeScript. The example will be a simple notebook app. We will start with a naive implementation of basic features and see where our code needs improving.
Along the way you will learn about the following core concepts of software development in general, and Object-Oriented Programming specifically:
Compositional data types: trees and nodes
KISS principle: Keep It Simple, Stupid!
Polymorphism and polymorphic behavior in OOP
SOLID Code I: Single-Responsibility Principle (SRP)
SOLID Code II: Open-Closed Principle (OCP)
Double Dispatch
Visitor Pattern
You can get the .
Imagine writing a notebook app for your fellow developers. These notebooks are hierarchical in nature: A notebook contains pages, a page contains cells. A cell is generally any kind of specific content that can be displayed and interacted with on a page, e.g. text, source code, images.
If these features sound familiar, you are absolutely right: We are talking about here.
Given the key players just mentioned, we can draw a class diagram like the following:
As can be seen from the diagram, all players involved here inherit from the base type Node, which is the key element of our tree structure. The tree is a compositional data type (every Node instance can have multiple child nodes and, hence, a parent node) with heterogeneous data items (nodes of different types). Node subtypes are Notebook, Page, and Cell. The Cell type is an abstract type that the different content cells inherit from, namely TextCell (for text), SourceCodeCell (representing source code), and ImageCell (for images). It is more than likely that other content cells will be added in the future.
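The hierarchy just described might be sketched in TypeScript roughly as follows. Only the class names come from the description above; the properties (parent, children, title, text, code, url) and the add() helper are assumptions made for illustration.

```typescript
export {}; // keeps this file a module, so the local Node class cannot clash with the DOM's global Node type

// Base type of the tree: every node can have child nodes and, hence, a parent.
abstract class Node {
  parent: Node | null = null;
  readonly children: Node[] = [];

  // Hypothetical helper: attaches a child node and returns it.
  add<T extends Node>(child: T): T {
    child.parent = this;
    this.children.push(child);
    return child;
  }
}

class Notebook extends Node {
  title: string;
  constructor(title: string) { super(); this.title = title; }
}

class Page extends Node {
  title: string;
  constructor(title: string) { super(); this.title = title; }
}

// Abstract base for all content cells.
abstract class Cell extends Node {}

class TextCell extends Cell {
  text: string;
  constructor(text: string) { super(); this.text = text; }
}

class SourceCodeCell extends Cell {
  code: string;
  constructor(code: string) { super(); this.code = code; }
}

class ImageCell extends Cell {
  url: string;
  constructor(url: string) { super(); this.url = url; }
}
```

Making Cell abstract mirrors the description: a cell on its own carries no content, only its concrete subtypes do.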
So far, this looks all fine and dandy. Now, imagine your colleagues having already written the entire stack for your notebook application: The backend is running smoothly and the frontend lets your notebooks shine in a modern JavaScript-based web app. Now comes your part!
Your boss enters the stage and asks you to implement a new feature. Users are very happy with the app but feel a bit locked in. They would very much like to be able to export their notebooks to Markdown, XML, and HTML (for starters). Keep in mind that this is a potentially open-ended feature: other export formats can and will be added in the future.
So, how would you implement the export feature?
To keep things simple, you feel pretty confident that you can just add an exportToHtml() method to the Node type:
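A minimal sketch of that first attempt could look like this. The generic div wrapper and the way children are concatenated are assumptions; the point is only that a single method on Node serves every node type.

```typescript
export {}; // keeps this file a module, so the local Node class cannot clash with the DOM's global Node type

class Node {
  readonly children: Node[] = [];

  // Naive first attempt: one generic HTML export shared by every node type.
  exportToHtml(): string {
    const inner = this.children.map((child) => child.exportToHtml()).join("");
    return `<div>${inner}</div>`;
  }
}

class Notebook extends Node {}
class Page extends Node {}

const notebook = new Notebook();
notebook.children.push(new Page());
console.log(notebook.exportToHtml()); // <div><div></div></div>
```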
Nice, that surely looks simple!
You pause for a second and contemplate your work: This is a good start, but you could potentially improve this a lot by making the method abstract. That way each concrete implementation of Node (e.g. ImageCell, TextCell) can produce a specific HTML output. After all, an ImageCell will very likely produce HTML that is markedly different from that of a TextCell.
So, you go ahead and make the exportToHtml() method abstract and have your different Node subtypes implement it to their specific needs. Here are two examples:
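As a sketch, two such implementations might look as follows. The concrete markup (a p element for text, an img element for images) and the field names are assumptions, and no HTML escaping is done here.

```typescript
export {}; // keeps this file a module, so the local Node class cannot clash with the DOM's global Node type

abstract class Node {
  readonly children: Node[] = [];
  // Each concrete subtype decides what its HTML looks like.
  abstract exportToHtml(): string;
}

abstract class Cell extends Node {}

class TextCell extends Cell {
  text: string;
  constructor(text: string) { super(); this.text = text; }

  exportToHtml(): string {
    return `<p>${this.text}</p>`; // sketch only: the text is not escaped
  }
}

class ImageCell extends Cell {
  url: string;
  constructor(url: string) { super(); this.url = url; }

  exportToHtml(): string {
    return `<img src="${this.url}" />`;
  }
}

// The caller only knows the base type; the subtype picks the output.
const cells: Node[] = [new TextCell("Hello"), new ImageCell("cat.png")];
const html = cells.map((cell) => cell.exportToHtml()).join("\n");
console.log(html);
// <p>Hello</p>
// <img src="cat.png" />
```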
You pause again to think: Now that you have added the export functionality for HTML, you just have to add the same logic for the remaining export formats. And again, you would make those new methods abstract and implement them in their specific subtypes. It's quite a lot of typing, and starts to feel a bit verbose, but for now you just plug away...
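To get a feeling for how this grows, here is a sketch with all three formats on just two cell types. The Cell base class is left out for brevity, and the markup for each format is an assumption.

```typescript
export {}; // keeps this file a module, so the local Node class cannot clash with the DOM's global Node type

abstract class Node {
  readonly children: Node[] = [];
  // One abstract method per export format, and the list keeps growing.
  abstract exportToHtml(): string;
  abstract exportToXml(): string;
  abstract exportToMarkdown(): string;
}

class TextCell extends Node {
  text: string;
  constructor(text: string) { super(); this.text = text; }

  exportToHtml(): string { return `<p>${this.text}</p>`; }
  exportToXml(): string { return `<text>${this.text}</text>`; }
  exportToMarkdown(): string { return this.text; }
}

class ImageCell extends Node {
  url: string;
  constructor(url: string) { super(); this.url = url; }

  exportToHtml(): string { return `<img src="${this.url}" />`; }
  exportToXml(): string { return `<image url="${this.url}" />`; }
  exportToMarkdown(): string { return `![](${this.url})`; }
}
```

Every additional format means one more abstract method on Node and one more implementation in every single subtype.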
Wow, now that escalated quickly. Look at all that code!
Something doesn't feel right here. Your code surely looks simple, but you don't like the idea of having to add a new method for every new export format that could potentially come your way in the future. It works, but this is one hell of a maintenance nightmare.
Clearly, the code you just wrote is open for extension (you can add any export format you want) but it is definitely not closed for modification. On the contrary: with each new export format you would have to modify the Node type and its various subtypes.
But there is more to it: The entire export logic is distinctly different from what those Node subtypes usually deal with. Their core logic revolves around the management of Pages (for Notebooks), Cells (for Pages), text (for TextCells), source code (for SourceCodeCells), etc. The export logic you just added feels misplaced (not to mention the fact that the various target formats such as HTML and XML are themselves incompatible with one another!).
The problem with our export logic inside the Node subtypes is that whenever the requirements regarding the node subtype logic change, you would have to modify the subtypes. Likewise, whenever the requirements for any of the export features change (new version of HTML, Markdown, XML, or the like), you would have to modify the subtypes yet again. Those are too many reasons for your classes to change, hence too many responsibilities.
Looking at your code you come to the conclusion that something is going terribly wrong. By adding just a few simple features, and despite your best efforts of KISSing and using polymorphism, you have managed to violate two core SOLID principles (OCP and SRP). That's quite an achievement!
Now let's see how we can get out of this mess to make you, your fellow developers, and your boss happy.
So far, the core problem is this: We have a heterogeneous, compositional data structure (our tree with its TextCells, SourceCodeCells, ImageCells, Pages, ...) that needs to accommodate different types of operations (exportToHtml(), exportToXml(), exportToMarkdown(), ...). These operations are very specific to the types they operate on, and the set is open-ended in principle: it is more than likely that more operations will have to be added at a later point.
Extending the set of operations should, however, not lead to a modification of the existing class hierarchy. It should be possible to extend the current functionality without modifying the existing classes, thus keeping in line with the Open-Closed Principle.
Moreover, we need to add new operations in such a way that classes remain in charge of single responsibilities, thus obeying the Single Responsibility Principle.
We shall implement the Visitor pattern for our solution in two simple steps.
At this point we know we have to keep the export logic in separate classes in order to satisfy the SRP. We will have to write separate classes, one for each export format. All these classes will have to accept any Node subtype in order to export the entire Notebook.
Let's create a simple interface to this logic and name it NodeExporter:
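One possible shape for such an interface is sketched below, with one method per concrete node type. The method names, the string return type, and the fields of the stand-in node classes are all assumptions made for illustration.

```typescript
export {}; // keeps this file a module, so the local Node class cannot clash with the DOM's global Node type

// Minimal stand-ins for the node classes (fields are assumptions).
abstract class Node { readonly children: Node[] = []; }
class Notebook extends Node { title = "Untitled"; }
class Page extends Node { title = "Untitled"; }
abstract class Cell extends Node {}
class TextCell extends Cell { text = ""; }
class SourceCodeCell extends Cell { code = ""; }
class ImageCell extends Cell { url = ""; }

// One export method per concrete node type (hypothetical method names).
// Each method returns the exported fragment for a single node.
interface NodeExporter {
  exportNotebook(notebook: Notebook): string;
  exportPage(page: Page): string;
  exportTextCell(cell: TextCell): string;
  exportSourceCodeCell(cell: SourceCodeCell): string;
  exportImageCell(cell: ImageCell): string;
}
```

Giving every node type its own method keeps the interface explicit: an exporter that forgets one of the node types simply does not compile.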
We can now implement the different NodeExporter subtypes according to the formats required. Let's begin with the HTML export feature:
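A sketch of an HtmlExporter, assuming the hypothetical NodeExporter method names used here and a simple flat markup (headings for notebooks and pages, one element per cell), might look like this:

```typescript
export {}; // keeps this file a module, so the local Node class cannot clash with the DOM's global Node type

// Minimal stand-ins for the node classes (fields are assumptions).
abstract class Node { readonly children: Node[] = []; }
class Notebook extends Node { title = "Untitled"; }
class Page extends Node { title = "Untitled"; }
abstract class Cell extends Node {}
class TextCell extends Cell { text = ""; }
class SourceCodeCell extends Cell { code = ""; }
class ImageCell extends Cell { url = ""; }

interface NodeExporter {
  exportNotebook(notebook: Notebook): string;
  exportPage(page: Page): string;
  exportTextCell(cell: TextCell): string;
  exportSourceCodeCell(cell: SourceCodeCell): string;
  exportImageCell(cell: ImageCell): string;
}

// Knows how to turn each node type into an HTML fragment, and nothing else.
// Sketch only: values are inserted without HTML escaping.
class HtmlExporter implements NodeExporter {
  exportNotebook(notebook: Notebook): string { return `<h1>${notebook.title}</h1>`; }
  exportPage(page: Page): string { return `<h2>${page.title}</h2>`; }
  exportTextCell(cell: TextCell): string { return `<p>${cell.text}</p>`; }
  exportSourceCodeCell(cell: SourceCodeCell): string { return `<pre><code>${cell.code}</code></pre>`; }
  exportImageCell(cell: ImageCell): string { return `<img src="${cell.url}" />`; }
}
```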
There it is, nice and crispy: one responsibility for the HtmlExporter class, namely exporting various node types to HTML.
Let's do the same for the XML export feature:
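Under the same assumptions (hypothetical method names, stand-in node classes), an XmlExporter could be sketched like this. The element and attribute names are made up for illustration, and each node maps to a single flat element.

```typescript
export {}; // keeps this file a module, so the local Node class cannot clash with the DOM's global Node type

// Minimal stand-ins for the node classes (fields are assumptions).
abstract class Node { readonly children: Node[] = []; }
class Notebook extends Node { title = "Untitled"; }
class Page extends Node { title = "Untitled"; }
abstract class Cell extends Node {}
class TextCell extends Cell { text = ""; }
class SourceCodeCell extends Cell { code = ""; }
class ImageCell extends Cell { url = ""; }

interface NodeExporter {
  exportNotebook(notebook: Notebook): string;
  exportPage(page: Page): string;
  exportTextCell(cell: TextCell): string;
  exportSourceCodeCell(cell: SourceCodeCell): string;
  exportImageCell(cell: ImageCell): string;
}

// Same interface, different target format. Sketch only: no XML escaping.
class XmlExporter implements NodeExporter {
  exportNotebook(notebook: Notebook): string { return `<notebook title="${notebook.title}" />`; }
  exportPage(page: Page): string { return `<page title="${page.title}" />`; }
  exportTextCell(cell: TextCell): string { return `<text>${cell.text}</text>`; }
  exportSourceCodeCell(cell: SourceCodeCell): string { return `<code>${cell.code}</code>`; }
  exportImageCell(cell: ImageCell): string { return `<image url="${cell.url}" />`; }
}
```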
We have implemented the two export features in separate classes, keeping perfectly in line with the SRP. Now how can we plug our classes into our existing class hierarchy?
Now we need to find a way to have any concrete NodeExporter implementation (e.g. XmlExporter and HtmlExporter) operate on our class hierarchy. Specifically, we need to loosely couple those implementations to our class hierarchy, so as to follow the OCP.
We could take a Notebook instance, traverse its tree structure and feed every node inside of it to the desired NodeExporter implementation. As Notebook and all types contained therein derive from Node, we can just extend its interface as follows:
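The extension could be as small as one abstract method on Node. In the sketch below the method is called accept(), which is the conventional name in Visitor implementations but an assumption here; the interface is reduced to a single node type to keep the example short.

```typescript
export {}; // keeps this file a module, so the local Node class cannot clash with the DOM's global Node type

// Reduced to one node type for brevity (hypothetical method name).
interface NodeExporter {
  exportTextCell(cell: TextCell): string;
}

abstract class Node {
  readonly children: Node[] = [];

  // The only addition to the hierarchy: every node accepts an exporter
  // and decides for itself which exporter method to call.
  abstract accept(exporter: NodeExporter): string;
}

class TextCell extends Node {
  text: string;
  constructor(text: string) { super(); this.text = text; }

  accept(exporter: NodeExporter): string {
    return exporter.exportTextCell(this); // the node passes itself along
  }
}

// Any object that satisfies the interface can be plugged in.
const shouting: NodeExporter = {
  exportTextCell: (cell) => cell.text.toUpperCase(),
};

const node: Node = new TextCell("hello");
console.log(node.accept(shouting)); // HELLO
```

Note that Node only depends on the NodeExporter interface, never on a concrete export format.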
Essentially, such double dispatching makes use of polymorphic behavior — very much the same way we did in our initial (naïve) implementation.
Here are a few examples of what our code would look like now with Double Dispatch in place:
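A sketch of three node types with the dispatching method in place, again assuming the name accept() and the hypothetical exporter method names used here:

```typescript
export {}; // keeps this file a module, so the local Node class cannot clash with the DOM's global Node type

interface NodeExporter {
  exportNotebook(notebook: Notebook): string;
  exportPage(page: Page): string;
  exportImageCell(cell: ImageCell): string;
}

abstract class Node {
  readonly children: Node[] = [];
  abstract accept(exporter: NodeExporter): string;
}

class Notebook extends Node {
  title: string;
  constructor(title: string) { super(); this.title = title; }

  // First dispatch: the runtime type of the node selects this accept().
  // Second dispatch: the runtime type of the exporter selects exportNotebook().
  accept(exporter: NodeExporter): string {
    return exporter.exportNotebook(this);
  }
}

class Page extends Node {
  title: string;
  constructor(title: string) { super(); this.title = title; }

  accept(exporter: NodeExporter): string {
    return exporter.exportPage(this);
  }
}

class ImageCell extends Node {
  url: string;
  constructor(url: string) { super(); this.url = url; }

  accept(exporter: NodeExporter): string {
    return exporter.exportImageCell(this);
  }
}
```

Each accept() is a one-liner, and none of them knows anything about HTML, XML, or Markdown.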
Let's have a look at what we have accomplished and test our implementations.
First, we will instantiate our Notebook and add some Pages and Cells to it:
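Building such a tree could look like the following sketch. The constructors, the add() helper, and the sample contents are assumptions; the export-related methods are left out here to keep the focus on the tree itself.

```typescript
export {}; // keeps this file a module, so the local Node class cannot clash with the DOM's global Node type

abstract class Node {
  readonly children: Node[] = [];

  // Hypothetical helper: attaches a child node and returns it.
  add<T extends Node>(child: T): T {
    this.children.push(child);
    return child;
  }
}

class Notebook extends Node {
  title: string;
  constructor(title: string) { super(); this.title = title; }
}

class Page extends Node {
  title: string;
  constructor(title: string) { super(); this.title = title; }
}

abstract class Cell extends Node {}

class TextCell extends Cell {
  text: string;
  constructor(text: string) { super(); this.text = text; }
}

class SourceCodeCell extends Cell {
  code: string;
  constructor(code: string) { super(); this.code = code; }
}

class ImageCell extends Cell {
  url: string;
  constructor(url: string) { super(); this.url = url; }
}

// A notebook with one page that holds three different cells.
const notebook = new Notebook("Visitor Demo");
const page = notebook.add(new Page("Introduction"));
page.add(new TextCell("Hello, Visitor!"));
page.add(new SourceCodeCell("console.log(42);"));
page.add(new ImageCell("diagram.png"));
```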
Now, let's see how we can export our Notebook instance to HTML:
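Putting the pieces together, an end-to-end sketch might look like this. It uses a reduced hierarchy (no SourceCodeCell), the hypothetical accept() and exporter method names from before, and an assumed exportTree() helper that walks the tree and feeds every node to the exporter. The output shown belongs to this sketch, not to the original application.

```typescript
export {}; // keeps this file a module, so the local Node class cannot clash with the DOM's global Node type

interface NodeExporter {
  exportNotebook(notebook: Notebook): string;
  exportPage(page: Page): string;
  exportTextCell(cell: TextCell): string;
  exportImageCell(cell: ImageCell): string;
}

abstract class Node {
  readonly children: Node[] = [];
  add(child: Node): this {
    this.children.push(child);
    return this;
  }
  abstract accept(exporter: NodeExporter): string;
}

class Notebook extends Node {
  title: string;
  constructor(title: string) { super(); this.title = title; }
  accept(exporter: NodeExporter): string { return exporter.exportNotebook(this); }
}

class Page extends Node {
  title: string;
  constructor(title: string) { super(); this.title = title; }
  accept(exporter: NodeExporter): string { return exporter.exportPage(this); }
}

class TextCell extends Node {
  text: string;
  constructor(text: string) { super(); this.text = text; }
  accept(exporter: NodeExporter): string { return exporter.exportTextCell(this); }
}

class ImageCell extends Node {
  url: string;
  constructor(url: string) { super(); this.url = url; }
  accept(exporter: NodeExporter): string { return exporter.exportImageCell(this); }
}

class HtmlExporter implements NodeExporter {
  exportNotebook(notebook: Notebook): string { return `<h1>${notebook.title}</h1>`; }
  exportPage(page: Page): string { return `<h2>${page.title}</h2>`; }
  exportTextCell(cell: TextCell): string { return `<p>${cell.text}</p>`; }
  exportImageCell(cell: ImageCell): string { return `<img src="${cell.url}" />`; }
}

// Hypothetical traversal helper: exports the node itself, then its children, depth-first.
function exportTree(node: Node, exporter: NodeExporter): string {
  const lines: string[] = [node.accept(exporter)];
  for (const child of node.children) {
    lines.push(exportTree(child, exporter));
  }
  return lines.join("\n");
}

const notebook = new Notebook("Visitor Demo").add(
  new Page("Introduction")
    .add(new TextCell("Hello, Visitor!"))
    .add(new ImageCell("diagram.png"))
);

const html = exportTree(notebook, new HtmlExporter());
console.log(html);
// <h1>Visitor Demo</h1>
// <h2>Introduction</h2>
// <p>Hello, Visitor!</p>
// <img src="diagram.png" />
```

Keeping the traversal in a separate helper leaves accept() as a pure dispatch step; letting container nodes or the exporter walk the children would be equally valid design choices.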
Running the export should print the generated HTML to the console.
And the same with XML:
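The XML variant of the sketch differs only in the exporter that is passed in. As before, the hierarchy is reduced, and accept(), exportTree(), the exporter method names, and the element names are assumptions. Each node becomes one flat element, so this simple version does not nest pages inside the notebook element.

```typescript
export {}; // keeps this file a module, so the local Node class cannot clash with the DOM's global Node type

interface NodeExporter {
  exportNotebook(notebook: Notebook): string;
  exportPage(page: Page): string;
  exportTextCell(cell: TextCell): string;
  exportImageCell(cell: ImageCell): string;
}

abstract class Node {
  readonly children: Node[] = [];
  add(child: Node): this {
    this.children.push(child);
    return this;
  }
  abstract accept(exporter: NodeExporter): string;
}

class Notebook extends Node {
  title: string;
  constructor(title: string) { super(); this.title = title; }
  accept(exporter: NodeExporter): string { return exporter.exportNotebook(this); }
}

class Page extends Node {
  title: string;
  constructor(title: string) { super(); this.title = title; }
  accept(exporter: NodeExporter): string { return exporter.exportPage(this); }
}

class TextCell extends Node {
  text: string;
  constructor(text: string) { super(); this.text = text; }
  accept(exporter: NodeExporter): string { return exporter.exportTextCell(this); }
}

class ImageCell extends Node {
  url: string;
  constructor(url: string) { super(); this.url = url; }
  accept(exporter: NodeExporter): string { return exporter.exportImageCell(this); }
}

class XmlExporter implements NodeExporter {
  exportNotebook(notebook: Notebook): string { return `<notebook title="${notebook.title}" />`; }
  exportPage(page: Page): string { return `<page title="${page.title}" />`; }
  exportTextCell(cell: TextCell): string { return `<text>${cell.text}</text>`; }
  exportImageCell(cell: ImageCell): string { return `<image url="${cell.url}" />`; }
}

// Hypothetical traversal helper: exports the node itself, then its children, depth-first.
function exportTree(node: Node, exporter: NodeExporter): string {
  const lines: string[] = [node.accept(exporter)];
  for (const child of node.children) {
    lines.push(exportTree(child, exporter));
  }
  return lines.join("\n");
}

const notebook = new Notebook("Visitor Demo").add(
  new Page("Introduction")
    .add(new TextCell("Hello, Visitor!"))
    .add(new ImageCell("diagram.png"))
);

const xml = exportTree(notebook, new XmlExporter());
console.log(xml);
// <notebook title="Visitor Demo" />
// <page title="Introduction" />
// <text>Hello, Visitor!</text>
// <image url="diagram.png" />
```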
Again, the generated markup is printed to the console.
That's it! We have used the Visitor Pattern to keep our codebase clean and SOLID. We can now add any sort of export format by just implementing the NodeExporter interface in another type. We will not have to touch any part of our existing Node class hierarchy.
Adding new Node subtypes requires us to extend the NodeExporter interface with just another method (or method overload in languages that support these) and to implement it in the existing format exporters, so that they retain compatibility with the new tree structure.
Congratulations, you have just learned about a good deal of standard programming and OOP concepts! If you would like to dig deeper, try to implement a JsonExporter, or add a LinkCell to the class hierarchy of our Notebook app.
Let's start exporting our notebooks to HTML. No biggie, you think, and get to work. You know about the KISS principle (Keep It Simple, Stupid!) and what it requires you to do now: pick the simplest solution that does the job.
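What you are doing here is thinking in terms of a core concept of Object-Oriented Programming (OOP) called polymorphism: subtypes provide their own implementation of an operation that is declared on a common base type.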
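Indeed, looking at your code thoroughly, it starts dawning on you that you have just violated one of the core SOLID principles, namely the Open-Closed Principle (OCP): software entities should be open for extension but closed for modification.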
The export logic being part of the core logic of the Node subtypes now violates yet another principle, the Single-Responsibility Principle (SRP): a class should have only one reason to change.
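These requirements are a perfect fit for the Visitor Pattern: it lets you define new operations on the elements of an object structure without changing the classes of those elements.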
This way we could feed to any of our Node subtypes an instance of any concrete NodeExporter implementation, and this in turn would call the proper method on the NodeExporter instance passing itself along to it. Such double calling of methods is called Double Dispatch and has a long history in OOP.