Domain Specific Languages in C++ – Part 2: External DSLs

Last week I wrote about what makes C++ a good choice as a host language for embedded DSLs. This week’s post is about external DSLs with C++ as the host language.

Although external DSLs need not necessarily be text based, I won’t go into graphical DSLs and other more exotic stuff here. I will concentrate on DSLs that can be written in a common text editor.

External DSLs with C++ compared to other languages

The main tasks that distinguish external from embedded DSLs are lexical and syntactic analysis, i.e. lexing and parsing, and interpreting the syntax tree or other structures that have been populated by the parsing step.

It is not easy to say why C++ would be worse or better than other languages when it comes to those tasks. If you want to write a parser and lexer by hand, one language is as good as the other; C++ does not have special features that would considerably facilitate those tasks.
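To make the three steps concrete, here is a minimal sketch of a hand-written lexer and recursive-descent parser for a tiny arithmetic DSL. Everything in it (the names `Token`, `lex`, `Parser` and `evaluate`, and the grammar itself) is made up for illustration. To keep it short, it evaluates the expression while parsing instead of building a syntax tree first.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical token set for a tiny arithmetic DSL.
enum class TokenKind { Number, Plus, Minus, Star, Slash, LParen, RParen, End };

struct Token {
    TokenKind kind;
    double value;  // only meaningful for TokenKind::Number
};

// Lexing: turn the raw text into a flat sequence of tokens.
std::vector<Token> lex(const std::string& text) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < text.size()) {
        const unsigned char c = static_cast<unsigned char>(text[i]);
        if (std::isspace(c)) { ++i; continue; }
        if (std::isdigit(c)) {
            std::size_t consumed = 0;
            const double value = std::stod(text.substr(i), &consumed);
            tokens.push_back({TokenKind::Number, value});
            i += consumed;
            continue;
        }
        switch (text[i]) {
            case '+': tokens.push_back({TokenKind::Plus, 0.0}); break;
            case '-': tokens.push_back({TokenKind::Minus, 0.0}); break;
            case '*': tokens.push_back({TokenKind::Star, 0.0}); break;
            case '/': tokens.push_back({TokenKind::Slash, 0.0}); break;
            case '(': tokens.push_back({TokenKind::LParen, 0.0}); break;
            case ')': tokens.push_back({TokenKind::RParen, 0.0}); break;
            default: throw std::runtime_error("unexpected character");
        }
        ++i;
    }
    tokens.push_back({TokenKind::End, 0.0});
    return tokens;
}

// Parsing and interpreting: one member function per grammar rule.
class Parser {
public:
    explicit Parser(std::vector<Token> tokens) : tokens_(std::move(tokens)) {}

    double parse() {
        const double result = expression();
        if (peek().kind != TokenKind::End) throw std::runtime_error("trailing input");
        return result;
    }

private:
    const Token& peek() const { return tokens_[pos_]; }

    // Return the current token and advance, but never move past End.
    Token next() {
        const Token t = tokens_[pos_];
        if (t.kind != TokenKind::End) ++pos_;
        return t;
    }

    // expression := term (('+' | '-') term)*
    double expression() {
        double value = term();
        while (peek().kind == TokenKind::Plus || peek().kind == TokenKind::Minus) {
            const TokenKind op = next().kind;
            const double rhs = term();
            value = (op == TokenKind::Plus) ? value + rhs : value - rhs;
        }
        return value;
    }

    // term := factor (('*' | '/') factor)*
    double term() {
        double value = factor();
        while (peek().kind == TokenKind::Star || peek().kind == TokenKind::Slash) {
            const TokenKind op = next().kind;
            const double rhs = factor();
            value = (op == TokenKind::Star) ? value * rhs : value / rhs;
        }
        return value;
    }

    // factor := NUMBER | '-' factor | '(' expression ')'
    double factor() {
        const Token t = next();
        if (t.kind == TokenKind::Number) return t.value;
        if (t.kind == TokenKind::Minus) return -factor();
        if (t.kind == TokenKind::LParen) {
            const double value = expression();
            if (next().kind != TokenKind::RParen) throw std::runtime_error("expected ')'");
            return value;
        }
        throw std::runtime_error("expected a number or '('");
    }

    std::vector<Token> tokens_;
    std::size_t pos_ = 0;
};

// Convenience wrapper: lex, parse and evaluate in one call.
double evaluate(const std::string& text) {
    return Parser(lex(text)).parse();
}
```

With this in place, `evaluate("(1 + 2) * 3")` goes through all three stages. Note that nothing here is specific to C++: the same structure would look much the same in most other languages, which is the point made above.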

However, for anything beyond a very simple DSL syntax, or a learning project on how to write a parser by hand, I would recommend using tools and libraries that help with the mostly mechanical parts of lexing and parsing. Don’t try to reinvent the wheel.

Tooling

When it comes to tooling, one might think that C++ should have an advantage, because many compilers and interpreters are written in C or C++ and there are tools that support those language implementations.

However, while those tools might be powerful and produce performant parsers, they are mostly targeted at general purpose languages, which means they tend to be more generic and potentially more complex than simpler tools targeted at DSLs.

So, C++ tools might lose the race when it comes to ease of use. There are well-known tools specifically for DSL development in Java, for example ANTLR and Xtext. The latter even gives you support for syntax highlighting and other cool stuff in Eclipse – you probably won’t find a tool like that for any C++ IDE.

The next level: Embedded DSLs in external general purpose languages

This is where C++ can shine again. Instead of inventing your own syntax, you can implement an embedded DSL in some scripting language and include an interpreter for that language in your program.

For many scripting languages, there are fast and lightweight interpreters available as libraries for C++, including good tools to translate objects from that language into C++ constructs and vice versa.

That way you get the best of both embedded and external worlds: You don’t have to care about the parsing and interpreting business, because that’s done by the interpreter in the library.

On the other hand, unlike embedded DSLs in C++, the scripts in that language can be interpreted at runtime, so you may load them dynamically from a file or even have the user type in small chunks of it when needed.
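The host-side shape of such an integration can be sketched without pulling in a real interpreter. The `ScriptHost` class below is a hypothetical stand-in, not the API of Lua, Python or any other library: C++ callables are registered under names, and script text that only becomes known at runtime invokes them. A real scripting language adds variables, control flow and type conversion on top of this, but the binding idea is similar.

```cpp
#include <functional>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the binding layer of an interpreter library.
class ScriptHost {
public:
    using Command = std::function<void(const std::vector<std::string>&)>;

    // Expose a C++ callable to scripts under the given name.
    void bind(const std::string& name, Command command) {
        commands_[name] = std::move(command);
    }

    // Run script text that is only known at runtime, one command per line.
    // Each line has the form: <command-name> <arg> <arg> ...
    void run(const std::string& script) const {
        std::istringstream lines(script);
        std::string line;
        while (std::getline(lines, line)) {
            std::istringstream words(line);
            std::string name;
            if (!(words >> name)) continue;  // skip blank lines

            std::vector<std::string> args;
            for (std::string arg; words >> arg;) args.push_back(arg);

            const auto it = commands_.find(name);
            if (it == commands_.end()) {
                throw std::runtime_error("unknown command: " + name);
            }
            it->second(args);
        }
    }

private:
    std::map<std::string, Command> commands_;
};
```

A program would bind its own functions once, for example `host.bind("add", ...)` with a lambda, and then pass whatever text it reads from a file or from the user to `host.run(...)`.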

Some examples of scripting languages are Python, Lua, JavaScript and ChaiScript, but you could just as well embed Lisp in your C++ program.

Integrating one of those interpreters in your program may be a slightly bigger step than developing your own interpreter for your custom DSL. In addition, be aware that you have less freedom in choosing the syntax, since the syntax of the embedded DSL is constrained by the syntax of the host scripting language.

On the other hand, once you have an interpreter for a scripting language in your application, it is much easier to add more DSLs than to implement each one from scratch.

Conclusion

When it comes to DSLs, you have the choice between embedding them in C++, parsing them as external DSLs, or embedding them in a scripting language. Each approach has its pros and cons, but they are all perfectly doable in C++.
