YAML and Accidental Programming Language Design
I’m not a huge fan of YAML in general, but I do see it being useful for situations when a structured configuration language is needed: something for which JSON would normally be used, but where human readability and maintainability are important.
What I don’t like is seeing YAML being used as a way to define a sequence of actions. I complained about Step Functions, but just as annoying are the various CI/CD services that use YAML as the means of defining the build sequence. Encoding a sequence of actions (complete with branches, loops, callouts, and error handling) using a language geared towards configuration is clunky at best, and I’m always curious as to why vendors choose to use it when a language more fit for purpose would be nicer to work with. (This rant is going to be about YAML, but using JSON in these situations is just as bad, if not worse. At least YAML has comments.)
Obviously, being a well-known syntax that is approachable and already has a number of available parsers is a huge advantage. But I don’t think that’s the whole story, as there are plenty of JavaScript parsers out there as well. It might be necessary to tweak the runtime a little, which is not easy, but certainly not beyond the skills of Amazon and the like. No, I wonder if the desirability of YAML here comes down to two things: the ease of symbolic manipulation, and something that I’ll call “functionality inflation” (this makes it sound like a bad thing, but I guess another way of putting it is “serving the insatiable needs of the user”).
I’ll touch on the latter first, as it’s probably the more sinister one. It all begins with the first release of the feature, which starts off being very simple: just a series of actions that are executed in order. Here too is the moment where you need to define how this sequence of actions is to be encoded. Since the runtime is non-standard (in AWS Step Functions, actions may take up to a year to be completed), and the set of encodable actions is relatively small, it seems like going with a language like JavaScript would be overkill. Better off going with something simpler, like YAML. You don’t need to build the parser; you simply need to define how the actions look in terms of arrays and objects, since that’s what a parsed YAML document would give you. You choose a representation, say an array with each action being an object, and ship the feature. Easy enough.
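To make that first release concrete, here’s a minimal sketch in Python of what such a runtime might look like. The workflow is written as the list of objects a YAML parser would hand back, and the action names and fields ("log", "set", "message" and so on) are invented for illustration; they don’t come from any real service.

```python
# What a parsed YAML workflow might look like: a list of action objects.
# The equivalent YAML would be something like:
#   - action: log
#     message: starting build
# All action names and fields here are hypothetical.
workflow = [
    {"action": "log", "message": "starting build"},
    {"action": "set", "name": "target", "value": "release"},
    {"action": "log", "message": "done"},
]


def run(workflow):
    """Execute each action in order; return the log output and variables."""
    output, variables = [], {}
    for step in workflow:
        if step["action"] == "log":
            output.append(step["message"])
        elif step["action"] == "set":
            variables[step["name"]] = step["value"]
        else:
            raise ValueError(f"unknown action: {step['action']}")
    return output, variables
```

At this stage the whole "language" is a single loop over a list, which is exactly why YAML looks like the sensible choice.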
The problem is that it never remains this simple. As time goes on, and people continue to use your system, their expectations of the system begin to grow. It eventually comes to the point where the single list of actions is not enough. Now you need to consider actions that should only run when a certain condition is met, or actions that should run when another action fails.
So you extend your YAML language. You add guards to your actions so that they only run when a condition is true, or start adding catch clauses to the actions to jump to another one when an error occurs. You begin defining bizarre expression languages on top of your expanded YAML representation that never look quite right, and special fields on actions which act effectively like gotos. Eventually, you ship that, and for a while your users are happy, but not for long. They want variables, or actions that run multiple times or over a set of items, or the ability to call out to other actions in another workflow so they don’t have to copy-and-paste, and on and on. One day, you wake up and you realise that you’ve accidentally built a Turing-complete programming language. The only problem is that it’s written in YAML.
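Here’s a rough sketch, again in Python, of where that inflation can lead. The field names ("when", "on_error", "goto") and the tiny guard format are assumptions of mine rather than any vendor’s actual schema, but the shape should look familiar: the interpreter now needs a program counter, a label table and a jump on error.

```python
# A hypothetical "extended" workflow, as a YAML parser would return it.
workflow = [
    {"id": "test", "action": "fail", "on_error": "notify",
     "when": {"var": "tests_pass", "equals": False}},
    {"id": "deploy", "action": "log", "message": "deploying",
     "when": {"var": "branch", "equals": "main"}},
    {"id": "done", "action": "log", "message": "done", "goto": "end"},
    {"id": "notify", "action": "log", "message": "tests failed"},
]


def guard_passes(guard, variables):
    # The "expression language": a single equality test encoded as an object.
    return guard is None or variables.get(guard["var"]) == guard["equals"]


def execute(step, output):
    if step["action"] == "log":
        output.append(step["message"])
    elif step["action"] == "fail":
        raise RuntimeError(f"step {step['id']} failed")
    else:
        raise ValueError(f"unknown action: {step['action']}")


def run(workflow, variables):
    index_of = {step["id"]: i for i, step in enumerate(workflow)}
    index_of["end"] = len(workflow)  # pseudo-label that stops the run
    output, pc = [], 0
    while pc < len(workflow):
        step = workflow[pc]
        pc += 1  # default: fall through to the next step
        if not guard_passes(step.get("when"), variables):
            continue
        try:
            execute(step, output)
        except RuntimeError:
            if "on_error" not in step:
                raise
            pc = index_of[step["on_error"]]  # the catch clause is a jump
            continue
        if "goto" in step:
            pc = index_of[step["goto"]]  # and so is this
    return output
```

Note that the "done" step needs an explicit jump to "end" just so a successful run doesn’t fall through into the error handler, which is the kind of thing you stop noticing once you’ve written enough of these.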
So what’s the alternative? I’d argue something that resembles a proper, Algol-type programming language is a superior method of representing these workflows. There are many benefits of doing so: the logic is clear, one step follows the other as you flow down the page. There are already standard conventions for indicating branching, loops, error handling and call-outs. And there’s a lot less line noise as you’re not mixing your logic in with your data and configuration.
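As an illustration, consider a made-up pipeline that runs some tests, deploys only from the main branch, and reports a failure if the tests don’t pass. Written in an ordinary language (Python here, purely as a stand-in for whatever purpose-built language a vendor might design), the guard, the error handler and the ordering all use conventions every programmer already knows. The function names are hypothetical.

```python
def run_tests(tests_pass):
    # Stand-in for a real test run; raises the way a failing step would.
    if not tests_pass:
        raise RuntimeError("tests failed")


def pipeline(tests_pass, branch):
    output = []
    try:
        run_tests(tests_pass)
        if branch == "main":          # a guard is just an if
            output.append("deploying")
        output.append("done")
    except RuntimeError as error:     # error handling is just try/except
        output.append(str(error))
    return output
```

There are no labels to keep in sync and no jump fields to follow; the control flow is the indentation.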
I mentioned JavaScript earlier, but would using it, or another existing language, be useful here? Perhaps. They’ve got available runtimes as well, but they may not be easy to work with at the source level. This touches on the second reason why I think YAML is used, which is ease of symbolic manipulation. Given that YAML is just a glorified way to represent arrays and objects, one could argue that the advantage of using YAML is that it’s just as easy to manipulate with machines as it is from a text editor. You could, for example, build a graphical designer on top of your definition language which will manipulate the YAML while preserving any edits done by hand. This is something that’s difficult to do with an existing language like JavaScript or Ruby. Difficult, but not impossible. Plus, there’s nothing stopping vendors from designing a language which is high level enough to be easily manipulated by machines. It just needs to be easily parsed and unparsed between the code and the AST without any information loss. This can be baked in as a top-level requirement of the language during the design.
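As a sketch of what “no information loss” could mean in practice, here’s a toy line-based language in Python where the parser keeps comments, blank lines and indentation in the tree, so that unparsing gives back exactly the text that went in. The language itself (one action per line, "#" for comments) and the Node fields are invented for this example; a real design would need a proper grammar, but the principle is the same.

```python
from dataclasses import dataclass


@dataclass
class Node:
    kind: str         # "action", "comment" or "blank"
    text: str = ""    # original line, kept verbatim for comments and blanks
    indent: str = ""  # leading whitespace of an action line
    name: str = ""    # action name, e.g. "log"
    sep: str = ""     # the separator between name and argument, if any
    arg: str = ""     # everything after the separator


def parse(source):
    nodes = []
    for line in source.split("\n"):
        body = line.lstrip()
        if not body:
            nodes.append(Node("blank", text=line))
        elif body.startswith("#"):
            nodes.append(Node("comment", text=line))
        else:
            indent = line[: len(line) - len(body)]
            name, sep, arg = body.partition(" ")
            nodes.append(Node("action", indent=indent, name=name, sep=sep, arg=arg))
    return nodes


def unparse(nodes):
    lines = []
    for node in nodes:
        if node.kind == "action":
            lines.append(node.indent + node.name + node.sep + node.arg)
        else:
            lines.append(node.text)
    return "\n".join(lines)


source = "# build pipeline\nrun tests\n\n  # only on main\n  log deploying"
nodes = parse(source)
nodes[-1].arg = "shipping"  # a "machine" edit, as a graphical designer might make
edited = unparse(nodes)
```

Because the hand-written comments and spacing live in the tree, a tool can change one action and write the file back without disturbing anything else.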
In any case, it doesn’t need to be this way. We shouldn’t need to encode our logic using a configuration language. We already have better facilities for doing this: they’re called programming languages. I hope that next time a vendor wishes to build something that has a feature like this, they consider designing one instead of using something not quite fit for purpose like YAML.