How many grammars are needed? Only one. Let's see.

We’ll start with a thought experiment. Imagine:
- We have a domain-specific language. For instance, the Bird DSL to define binary data formats.
- We have two identical developers, except…
- Developer #1 uses Notepad. For instance:
- Developer #2 uses Visual Studio Code with syntax highlighting. For instance (same code as above):
Now, which developer is likely to be more productive?
Developer #2. That's the beauty of syntax highlighting: it visualizes both the structure and meaning of code, thereby reducing the cognitive load on the developer.
But how can we build the language technology needed?
Normally, it's hard. At Swat.engineering, though, we use state-of-the-art tooling to make it a lot easier. This blog explains how.
Syntax highlighting 101: What and why
Syntax highlighting is the use of special colors and fonts to display code. The idea is that additional visual cues enhance our ability to comprehend code and catch mistakes. It makes many of us more productive. And scientific evidence backs this up. Syntax highlighting has become one of the cornerstones of modern editors and IDEs.
This example illustrates the utility of syntax highlighting. How many of the following Java strings contain "hello, world" as a substring?
- Without syntax highlighting:
var s1 = /*"+*/"+/*/*"+"+"+"+/*"+/**//*"+ "+hello, world"+*/"+*/"+/**//*"+"+"+"+"+"+"+*/"";
var s2 = /**//**//**/"+"+/*"+"+"+*/"+"+ "+hello, world/*"+/**//*"+*//**//*"+*//**/"+";
- With syntax highlighting (exact same code):
var s1 = /*"+*/"+/*/*"+"+"+"+/*"+/**//*"+ "+hello, world"+*/"+*/"+/**//*"+"+"+"+"+"+"+*/"";
var s2 = /**//**//**/"+"+/*"+"+"+*/"+"+ "+hello, world/*"+/**//*"+*//**//*"+*//**/"+";
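If you would rather let the compiler settle it, here is a minimal sketch that evaluates both declarations and checks each one for the substring. The class and method names (`HelloWorldPuzzle`, `puzzleStrings`, `countContaining`) are made up for illustration; only the two declarations come from the example above.

```java
final class HelloWorldPuzzle {
    // Evaluates the two puzzle declarations exactly as printed above.
    static String[] puzzleStrings() {
        var s1 = /*"+*/"+/*/*"+"+"+"+/*"+/**//*"+ "+hello, world"+*/"+*/"+/**//*"+"+"+"+"+"+"+*/"";
        var s2 = /**//**//**/"+"+/*"+"+"+*/"+"+ "+hello, world/*"+/**//*"+*//**//*"+*//**/"+";
        return new String[] { s1, s2 };
    }

    // Counts how many candidates contain the given substring.
    static int countContaining(String needle, String... candidates) {
        int count = 0;
        for (String candidate : candidates) {
            if (candidate.contains(needle)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String[] strings = puzzleStrings();
        for (int i = 0; i < strings.length; i++) {
            System.out.println("s" + (i + 1) + " contains \"hello, world\": "
                    + strings[i].contains("hello, world"));
        }
        System.out.println("Total: " + countContaining("hello, world", strings));
    }
}
```

Saved as `HelloWorldPuzzle.java`, this can be run with `java HelloWorldPuzzle.java` on Java 11 or later. Only `s2` contains the substring; in `s1`, the text sits inside a block comment.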
Although this example is a bit silly, it hammers home the point: syntax highlighting improves productivity. If new languages – domain-specific or otherwise – are designed to increase productivity, then implementing a syntax highlighter should be worth the investment as well.
How to build a syntax highlighter
Building an effective syntax highlighter is hard because it needs to be both fast and accurate. That is:
- A superfast syntax highlighter that colors the wrong code is mostly distracting.
- A super accurate syntax highlighter that takes ages to color code is mostly annoying.
A common solution for this problem is to deliver speed and accuracy in two separate stages. The first stage is fast but less accurate. The second stage – also known as semantic highlighting – is slower, but accurate. So, each time code is edited, stage #1 immediately gives quick provisional results. Shortly after, stage #2 refines colors and fonts where needed (e.g. different highlighting for function parameters and local variables). This approach is used in all major IDEs, including Visual Studio Code, IntelliJ IDEA, and Xcode.
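To make the two stages concrete, here is a toy sketch in Java (16 or later, because it uses records). Everything in it is an assumption for illustration: the tiny language with `fun`, `var`, and `return`, the token kinds, and the names `stage1` and `stage2`. It is not how any particular editor implements highlighting. Stage 1 only looks at individual words, so it cannot tell parameters from local variables; stage 2 uses a little context to refine the provisional result.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TwoStageHighlighter {
    enum Kind { KEYWORD, IDENTIFIER, NUMBER, PARAMETER, LOCAL }

    record Token(int start, int end, String text, Kind kind) {}

    // Keywords of a hypothetical toy language.
    private static final Set<String> KEYWORDS = Set.of("fun", "var", "return");
    private static final Pattern WORD_OR_NUMBER = Pattern.compile("[A-Za-z_]\\w*|\\d+");

    // Stage 1: fast and purely lexical. Every non-keyword word is just an IDENTIFIER.
    static List<Token> stage1(String code) {
        List<Token> tokens = new ArrayList<>();
        Matcher m = WORD_OR_NUMBER.matcher(code);
        while (m.find()) {
            String text = m.group();
            Kind kind = Character.isDigit(text.charAt(0)) ? Kind.NUMBER
                    : KEYWORDS.contains(text) ? Kind.KEYWORD
                    : Kind.IDENTIFIER;
            tokens.add(new Token(m.start(), m.end(), text, kind));
        }
        return tokens;
    }

    // Stage 2: slower, uses context. Names declared between the first "(" and ")"
    // become PARAMETER; names declared right after "var" become LOCAL.
    static List<Token> stage2(String code, List<Token> provisional) {
        int open = code.indexOf('(');
        int close = code.indexOf(')', open + 1);
        Set<String> parameters = new HashSet<>();
        Set<String> locals = new HashSet<>();
        for (int i = 0; i < provisional.size(); i++) {
            Token t = provisional.get(i);
            if (t.kind() != Kind.IDENTIFIER) {
                continue;
            }
            boolean inParameterList = open >= 0 && close > open
                    && t.start() > open && t.end() <= close;
            boolean afterVar = i > 0 && provisional.get(i - 1).kind() == Kind.KEYWORD
                    && provisional.get(i - 1).text().equals("var");
            if (inParameterList) {
                parameters.add(t.text());
            } else if (afterVar) {
                locals.add(t.text());
            }
        }
        List<Token> refined = new ArrayList<>();
        for (Token t : provisional) {
            Kind kind = t.kind();
            if (kind == Kind.IDENTIFIER && parameters.contains(t.text())) {
                kind = Kind.PARAMETER;
            } else if (kind == Kind.IDENTIFIER && locals.contains(t.text())) {
                kind = Kind.LOCAL;
            }
            refined.add(new Token(t.start(), t.end(), t.text(), kind));
        }
        return refined;
    }

    public static void main(String[] args) {
        String code = "fun add(a, b) { var c = a + b; return c; }";
        List<Token> provisional = stage1(code);
        System.out.println("Stage 1: " + provisional);
        System.out.println("Stage 2: " + stage2(code, provisional));
    }
}
```

For the sample input, stage 1 labels `a`, `b`, and `c` all as plain identifiers. Stage 2 then recolors `a` and `b` as parameters and `c` as a local variable, while leaving keywords untouched.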
This two-stage approach is great for editor users. But is this also true for implementors of the editor? Essentially, two syntax highlighters need to be built: one for stage #1 and another for stage #2. This means that two versions of the language’s grammar need to be defined and maintained. These are typically written in very different notations. Little synergy is to be gained between them, scarcely any expertise can be transferred, and there is hardly any experience to share. It is simply double the amount of effort (if not more). Moreover, for type checking, smell detection, code generation, et cetera, different versions of the grammar are often required.
To cut a long story short, building a syntax highlighter is normally a serious investment.

Syntax highlighters at Swat.engineering
Our approach is different. We use the Rascal language workbench to define only one fully accurate version of a language’s grammar: a Rascal grammar definition. Using this single source of truth, our state-of-the-art tooling is powerful enough to automatically generate a two-stage syntax highlighter. By leveraging open technologies and modern standards (e.g. GLL parsing, TextMate tokenization, and the Language Server Protocol), our syntax highlighters can be readily packaged as extensions to Visual Studio Code. Moreover, when the Rascal definition evolves, the syntax highlighter can be automatically updated.
The Rascal definition of the grammar is also a key component of other language technology we build, such as typecheckers, smell detectors, code generators – you name it. So, there is a lot of synergy to be gained, and it is all modular.
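The single-source-of-truth idea can be illustrated with a deliberately tiny Java sketch (Java 16 or later). To be clear, this is not Rascal and not how our tooling works internally. The `Rule` record, the three lexical rules, the keywords `struct` and `choice`, and both method names are hypothetical. The point is only that one definition feeds two artifacts: a list of TextMate-style `match`/`name` patterns, and a tokenizer that another tool could reuse.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class SingleSourceGrammar {
    // One lexical rule: a name, a regex, and a TextMate-style scope name.
    record Rule(String name, String regex, String scope) {}

    // The single source of truth: a hypothetical toy grammar.
    static final List<Rule> RULES = List.of(
            new Rule("comment", "//[^\\n]*", "comment.line"),
            new Rule("keyword", "\\b(?:struct|choice)\\b", "keyword.control"),
            new Rule("number", "\\b\\d+\\b", "constant.numeric"));

    // Derived artifact 1: TextMate-style patterns rendered as JSON text.
    // Backslashes are doubled so that the regexes survive JSON string escaping.
    static String toTextMatePatterns() {
        return RULES.stream()
                .map(r -> "{\"match\": \"" + r.regex().replace("\\", "\\\\")
                        + "\", \"name\": \"" + r.scope() + "\"}")
                .collect(Collectors.joining(",\n  ", "[\n  ", "\n]"));
    }

    // Derived artifact 2: a tokenizer built from the same rules,
    // using one named capture group per rule.
    static List<String> tokenize(String code) {
        String combined = RULES.stream()
                .map(r -> "(?<" + r.name() + ">" + r.regex() + ")")
                .collect(Collectors.joining("|"));
        Matcher m = Pattern.compile(combined).matcher(code);
        List<String> tokens = new ArrayList<>();
        while (m.find()) {
            for (Rule r : RULES) {
                if (m.group(r.name()) != null) {
                    tokens.add(r.name() + ":" + m.group());
                    break;
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(toTextMatePatterns());
        System.out.println(tokenize("struct 42 // choice"));
    }
}
```

Adding or changing a rule in `RULES` updates both derived artifacts at once, which is the maintenance benefit described above, here on a very small scale.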

Key takeaways
- Syntax highlighting improves the productivity of DSL users.
- However, building a syntax highlighter is normally a serious investment and maintenance burden on the DSL developers.
- At Swat.engineering, we build Rascal-based syntax highlighters using a single-source-of-truth philosophy. This improves the synergy among language tools and simplifies their maintenance.
Get in touch
We’re happy to discuss syntax highlighting or any other language engineering challenges you face.
Header image is a combination of Kingfisher Natural Bird Wildlife by Ellen Chan from Pixabay and the output of PBNify.