Don’t give me another DSL. Give me a library
I don’t want to learn another domain specific language unless it’s an extension to a programming language. And what I really want are libraries for interacting with your system.
My biggest hesitation in learning Terraform is that I don’t want to have to learn HashiCorp Configuration Language (HCL). Instead, I’d rather have a library that writes out the JSON form of HCL that can be interpreted by Terraform. Then I can use all the strengths of my current language and libraries to define my configuration. Is anyone aware of such a library for Scala (Java) or Python?
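To make the idea concrete, here is a minimal sketch of what such a library could look like in Python. It is not an existing package: `resource` and `merge` are hypothetical helpers, the AMI ID is a placeholder, and the only thing borrowed from Terraform is the JSON nesting (`resource`, then type, then name) that it reads from files ending in `.tf.json`.

```python
import json


def resource(kind, name, **attrs):
    """Return one resource block using Terraform's JSON nesting: resource -> type -> name."""
    return {"resource": {kind: {name: attrs}}}


def merge(*blocks):
    """Deep-merge blocks so that several resources can live in one document."""
    out = {}
    for block in blocks:
        for key, value in block.items():
            if isinstance(value, dict) and isinstance(out.get(key), dict):
                out[key] = merge(out[key], value)
            else:
                out[key] = value
    return out


# An ordinary loop replaces whatever looping construct the DSL would offer.
# "ami-12345678" is a placeholder, not a real image ID.
servers = [
    resource("aws_instance", f"web_{i}", ami="ami-12345678", instance_type="t2.micro")
    for i in range(3)
]
config = merge(*servers)

# Write this string to a file such as main.tf.json for Terraform to pick up.
document = json.dumps(config, indent=2, sort_keys=True)
```

The point of the sketch is that loops, functions, and tests all come from the host language, and the tool only ever sees plain JSON.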
As a graduate research assistant (GRA), I had to learn numerous domain-specific languages (DSLs) to perform my research. Here are a few instances I remember from back in the day (2008–2013):
Editor Configuration. First Vim and then Emacs.
Visualization Tool Configuration. I believe VMD was the main program for visualizing simulations in 3D with various overlays, and I seem to recall it using Tcl for configuration. Tcl was a reasonable choice among the options back when VMD was developed.
POV-Ray Tracer. To get sharper figures for publication and presentations, I learned this configuration language. Luckily, there was a great Python library for generating these configurations programmatically.
Simulation Engine Configuration. I used both NAMD and LAMMPS to simulate chemical systems. I was using LAMMPS so much that I wrote a Python library to generate LAMMPS configurations using programmatic templates. Side note, LAMMPS is a solid piece of software engineering infrastructure because it was developed by real software engineers under government contract. LAMMPS is what made my grad school experience bearable. It taught me what good C++ looks like.
LaTeX Configuration and Macros. Language for writing documents. I will never use LaTeX again. It is such a crufty tool to use and I don’t think the amazing formatting is worth it. I hope I forget everything I ever knew about LaTeX.
BibTeX. Language for managing a database of a thousand or so references for inclusion in LaTeX documents.
Note, I consider the configuration for all of these systems to be DSLs because they each have some combination of macros, loops, and conditionals. They may not all be Turing-complete, but they all strive to allow some level of programmatic configuration.
I was always frustrated to find the DSL lacking relative to the real programming languages I was used to. I would always search for a Python library to generate the DSL code. At times, I even wrote such libraries myself. It was just better to be able to use the full power of a real language and its libraries when representing something programmatically.
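As an illustration of the kind of generator described above (not the library I actually wrote), here is a small sketch that renders LAMMPS input scripts from a template. The command lines are modeled on the standard Lennard-Jones melt example and should be checked against the LAMMPS documentation before use; `render_input`, `temperature_sweep`, and the file naming scheme are made up for this example.

```python
from string import Template

# Input script modeled on the Lennard-Jones melt example (an assumption;
# verify each command against the LAMMPS docs for your version).
INPUT_TEMPLATE = Template("""\
units lj
atom_style atomic
lattice fcc $density
region box block 0 $size 0 $size 0 $size
create_box 1 box
create_atoms 1 box
mass 1 1.0
velocity all create $temperature $seed
pair_style lj/cut 2.5
pair_coeff 1 1 1.0 1.0 2.5
fix 1 all nve
run $steps
""")


def render_input(temperature, density=0.8442, size=10, seed=87287, steps=250):
    """Fill in the template for a single simulation."""
    return INPUT_TEMPLATE.substitute(
        temperature=temperature, density=density, size=size, seed=seed, steps=steps
    )


def temperature_sweep(temperatures):
    """Map a hypothetical file name to the rendered script for each temperature."""
    return {f"in.melt_T{t:.2f}": render_input(t) for t in temperatures}


scripts = temperature_sweep([1.0, 2.0, 3.0])
```

A parameter sweep becomes a one-line comprehension here, instead of variables and loops written in the input language itself.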
Hence, going forward I want to avoid learning any new DSL.
I don’t even care if the DSL makes my code more concise. I can deal with more verbose code. At times, I may even include simple-format, handwritten configuration files in standard formats such as YAML, CSV, INI, or JSON. These are loaded in code using standard library parsers and my logic interprets the data to give me the most elegant way to represent my configuration in data. If your configuration can be represented in one of these standard formats then I’m open to using that.
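A rough sketch of that approach in Python follows. The file contents are inlined as strings so the example is self-contained, and every name and value in them is invented. JSON, INI, and CSV all have parsers in the Python standard library; YAML does not, so it would need a third-party package and is left out here.

```python
import configparser
import csv
import io
import json

# Stand-ins for small handwritten config files (all values are made up).
JSON_TEXT = '{"job": "nightly-report", "retries": 3}'
INI_TEXT = "[database]\nhost = db.example.com\nport = 5432\n"
CSV_TEXT = "table,owner\norders,alice\ncustomers,bob\n"

# Each format is read by a standard library parser.
settings = json.loads(JSON_TEXT)

ini = configparser.ConfigParser()
ini.read_string(INI_TEXT)
database = {"host": ini["database"]["host"], "port": ini.getint("database", "port")}

tables = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# Ordinary code, not a DSL, turns the raw data into the final configuration.
jobs = [
    {
        "name": f'{settings["job"]}-{row["table"]}',
        "owner": row["owner"],
        "retries": settings["retries"],
        "target": f'{database["host"]}:{database["port"]}',
    }
    for row in tables
]
```

The data files stay simple enough to edit by hand, while any looping or conditional logic lives in code that can be tested.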
Further, I can develop my own libraries for handling common tasks within the context of your tool. If these tasks are useful to others, then I can contribute them as open source code. We can all benefit from a collection of community-developed libraries for working with your tech. I think Airflow is a great example of doing this well. This workflow orchestration framework defines workflows in Python and there is a large collection of user-contributed modules for performing common tasks like interacting with AWS or GCP.
Update: I remembered that Spark gives the best of both worlds. You can configure Spark using basic, general format configuration files. You can also set configuration from the command line or programmatically within your Scala-based Spark job. You can even combine these approaches with default values coming from files and specific values overridden in a job. Owing to the simple format of these configuration files, they can be autogenerated with reasonable defaults within managed Hadoop services such as GCP Dataproc and Cloudera Director. This is what I think we should strive for in all systems that require configuration.
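The layering can be sketched in a few lines of plain Python. This is not Spark's API, just an illustration of the idea: `parse_defaults` and `parse_conf_flags` are hypothetical helpers, the property values are invented, and the precedence shown (values set in the job, then command-line flags, then the defaults file) follows the order described in Spark's configuration documentation.

```python
from collections import ChainMap

# Stand-in for a defaults file: one "key value" pair per line, "#" for comments.
DEFAULTS_FILE = """\
# cluster-wide defaults
spark.master            yarn
spark.executor.memory   4g
spark.executor.cores    2
"""


def parse_defaults(text):
    """Parse whitespace-separated key/value lines, skipping blanks and comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, value = line.split(None, 1)
        props[key] = value.strip()
    return props


def parse_conf_flags(argv):
    """Collect `--conf key=value` pairs from a command-line argument list."""
    props = {}
    for flag, arg in zip(argv, argv[1:]):
        if flag == "--conf":
            key, value = arg.split("=", 1)
            props[key] = value
    return props


file_defaults = parse_defaults(DEFAULTS_FILE)
command_line = parse_conf_flags(["--conf", "spark.executor.memory=8g"])
in_job = {"spark.app.name": "nightly-report", "spark.executor.cores": "4"}

# Earlier maps win: job settings, then command line, then file defaults.
effective = dict(ChainMap(in_job, command_line, file_defaults))
```

Because each layer is just a flat mapping, a managed service can generate the defaults file without knowing anything about the jobs that later override it.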
Whatever benefits your DSL provides will always be outweighed by all the benefits of a real programming language. Because I know as soon as I run into something that could be better represented using some sort of complex loop and business logic, I’ll wish I had a real language. I may even write a script to generate that specific part of the configuration and then use some sort of include command. Over time that script may grow into a library so that I never have to write in your DSL again.
What do you think? Let me know in the comment section below.