Last week I wrote about what makes C++ a good choice as a host language for embedded DSLs. This week’s post will be about external DSLs with C++ as host language.
Although external DSLs need not necessarily be text based, I won’t go into graphical DSLs and other more exotic stuff here. I will concentrate on DSLs that can be written in a common text editor.
External DSLs with C++ compared to other languages
The main tasks that distinguish external from embedded DSLs are syntactic and semantic analysis, i.e. lexing and parsing, and interpreting the syntax tree or other structures that have been populated by the parsing step.
It is not easy to say why C++ would be worse or better than other languages when it comes to those tasks. If you want to write a parser and lexer by hand, one language is as good as the other: C++ does not have special features that would considerably facilitate those tasks.
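To give an idea of what "by hand" means, here is a minimal sketch of a lexer and a recursive descent parser for a made-up arithmetic mini-DSL. The grammar and all names (`TokenKind`, `lex`, `Parser`, `evaluate`) are assumptions for illustration only. For brevity it evaluates the expression while parsing instead of building a syntax tree first.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Token kinds of the hypothetical arithmetic mini-DSL.
enum class TokenKind { Number, Plus, Minus, Star, Slash, LParen, RParen, End };

struct Token {
    TokenKind kind;
    double value;  // only meaningful for Number tokens
};

// Lexing: split the input text into tokens.
std::vector<Token> lex(const std::string& src) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        const unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        if (std::isdigit(c)) {
            // Collect digits and dots, then let std::stod do the conversion.
            const std::size_t start = i;
            while (i < src.size() &&
                   (std::isdigit(static_cast<unsigned char>(src[i])) || src[i] == '.')) {
                ++i;
            }
            tokens.push_back({TokenKind::Number, std::stod(src.substr(start, i - start))});
            continue;
        }
        TokenKind kind;
        switch (src[i]) {
            case '+': kind = TokenKind::Plus; break;
            case '-': kind = TokenKind::Minus; break;
            case '*': kind = TokenKind::Star; break;
            case '/': kind = TokenKind::Slash; break;
            case '(': kind = TokenKind::LParen; break;
            case ')': kind = TokenKind::RParen; break;
            default: throw std::runtime_error("unexpected character in input");
        }
        tokens.push_back({kind, 0.0});
        ++i;
    }
    tokens.push_back({TokenKind::End, 0.0});
    return tokens;
}

// Parsing and interpreting: one member function per grammar rule.
class Parser {
public:
    explicit Parser(std::vector<Token> tokens) : tokens_(std::move(tokens)) {}

    double parse() {
        const double result = expression();
        if (peek().kind != TokenKind::End) throw std::runtime_error("trailing input");
        return result;
    }

private:
    const Token& peek() const { return tokens_[pos_]; }

    // Return the current token and advance, but never move past End.
    Token next() {
        const Token t = tokens_[pos_];
        if (t.kind != TokenKind::End) ++pos_;
        return t;
    }

    // expression := term (('+' | '-') term)*
    double expression() {
        double value = term();
        while (peek().kind == TokenKind::Plus || peek().kind == TokenKind::Minus) {
            const TokenKind op = next().kind;
            const double rhs = term();
            value = (op == TokenKind::Plus) ? value + rhs : value - rhs;
        }
        return value;
    }

    // term := factor (('*' | '/') factor)*
    double term() {
        double value = factor();
        while (peek().kind == TokenKind::Star || peek().kind == TokenKind::Slash) {
            const TokenKind op = next().kind;
            const double rhs = factor();
            value = (op == TokenKind::Star) ? value * rhs : value / rhs;
        }
        return value;
    }

    // factor := NUMBER | '-' factor | '(' expression ')'
    double factor() {
        const Token t = next();
        if (t.kind == TokenKind::Number) return t.value;
        if (t.kind == TokenKind::Minus) return -factor();
        if (t.kind == TokenKind::LParen) {
            const double value = expression();
            if (next().kind != TokenKind::RParen) throw std::runtime_error("expected ')'");
            return value;
        }
        throw std::runtime_error("expected a number, '-' or '('");
    }

    std::vector<Token> tokens_;
    std::size_t pos_ = 0;
};

// Convenience wrapper: lex, parse and evaluate in one call.
double evaluate(const std::string& src) { return Parser(lex(src)).parse(); }
```

Calling `evaluate("(1 + 2) * 3")` runs all three steps in sequence. Nothing in this code is specific to C++; the same structure would look much alike in most other languages.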
However, for anything beyond a very simple DSL syntax and learning projects on how to write a parser by hand, I would recommend using tools and libraries that help with the mostly mechanical parts of lexing and parsing. Don’t try to reinvent the wheel.
When it comes to tooling, one might think that C++ should have an advantage, because many compilers and interpreters are written in C or C++ and there are tools that support those language implementations.
However, while those tools might be powerful and produce performant parsers, they are mostly targeted at general-purpose languages, which means they tend to be more generic and potentially more complex than simpler tools targeted at DSLs.
So, C++ tools might lose the race when it comes to ease of use. In Java, there are well-known tools specifically for DSL development, for example ANTLR and Xtext. The latter even gives you support for syntax highlighting and other cool stuff in Eclipse – you probably won’t find a tool like that for any C++ IDE.
The next level: Embedded DSLs in external general purpose languages
This is where C++ can shine again. Instead of inventing your own syntax, you can implement an embedded DSL in some scripting language and include an interpreter for that language in your program.
For many scripting languages, there are fast and lightweight interpreters available as libraries for C++, including good tools to translate objects from that language into C++ constructs and vice versa.
That way you get the best of both embedded and external worlds: You don’t have to care about the parsing and interpreting business, because that’s done by the interpreter in the library.
On the other hand, unlike embedded DSLs in C++, the scripts in that language can be interpreted at runtime, so you may load them dynamically from a file or even have the user type in small chunks of it when needed.
Integrating one of those interpreters in your program may be a slightly bigger step than developing your own interpreter for your custom DSL. In addition, be aware that you have less freedom in choosing the syntax, since the syntax of the embedded DSL is constrained by the syntax of the host scripting language.
On the other hand, once you have an interpreter for a scripting language in your application, it is much easier to add more DSLs than to implement each one from scratch.
When it comes to DSLs, you have the choice between embedding them in C++, parsing them as external DSLs, or embedding them in a scripting language. Each approach has its pros and cons, but they are all perfectly doable in C++.
“For many scripting languages, there are fast and lightweight interpreters available as libraries for C++, including good tools to translate objects from that language into C++ constructs and vice versa.”
Could you suggest one such library that works as an interpreter for Python 3?
Also, if possible, could you elaborate more on this idea? Ideally with a sample implementation of an external DSL as an example.
For Python there are several libraries and binding generators: https://wiki.python.org/moin/IntegratingPythonWithOtherLanguages
As for external DSLs, configuration files are a very simple example. You usually parse them either with hand-coded functionality or with a library. But they need to have a syntax, and they are specific to the configuration of your app. That makes them a simple DSL.
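A minimal sketch of such a hand-coded configuration parser could look like this. It assumes a made-up `key = value` format with `#` comments; the format and the names `trim` and `parseConfig` are hypothetical and only serve as illustration.

```cpp
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Strip leading and trailing whitespace.
std::string trim(const std::string& s) {
    const char* whitespace = " \t\r\n";
    const std::size_t first = s.find_first_not_of(whitespace);
    if (first == std::string::npos) return "";
    const std::size_t last = s.find_last_not_of(whitespace);
    return s.substr(first, last - first + 1);
}

// Parse "key = value" lines. '#' starts a comment, blank lines are skipped.
std::map<std::string, std::string> parseConfig(std::istream& in) {
    std::map<std::string, std::string> config;
    std::string line;
    std::size_t lineNo = 0;
    while (std::getline(in, line)) {
        ++lineNo;
        const std::size_t hash = line.find('#');
        if (hash != std::string::npos) line.erase(hash);  // drop the comment part
        line = trim(line);
        if (line.empty()) continue;

        const std::size_t eq = line.find('=');
        if (eq == std::string::npos) {
            throw std::runtime_error("line " + std::to_string(lineNo) +
                                     ": expected 'key = value'");
        }
        const std::string key = trim(line.substr(0, eq));
        if (key.empty()) {
            throw std::runtime_error("line " + std::to_string(lineNo) + ": empty key");
        }
        config[key] = trim(line.substr(eq + 1));
    }
    return config;
}
```

Since `parseConfig` takes a `std::istream&`, it can be fed from a `std::ifstream` for a real file or from a `std::istringstream` for tests. All values stay strings here; converting and validating them would be the semantic analysis step of this tiny DSL.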