Indexing In UCL
I’ve been thinking a little about how to support indexing in UCL, as in
getting elements from a list or keyed values from a map. There already
exists an index builtin that does this, but I’m wondering if this can
be, or even should be, supported in the language itself.
I’ve reserved . for this, and it’ll be relatively easy to make use
of it to get map fields. But I do have some concerns with supporting
list element dereferencing using square brackets. The big one being that
if I were to use square brackets the same way that many other languages
do, I suspect (although I haven’t confirmed) that it could lead to the
parser treating them as two separate list literals. This is because the
scanner ignores whitespace, and there are no other syntactic indicators
to separate arguments to proc calls, like commas:
echo $x[4] --> echo $x [4]
echo [1 2 3][2] --> echo [1 2 3] [2]
So I’m not sure what to do here. I’d like to add support for . for
map fields but it feels strange doing just that and having nothing
for list elements.
I can think of three ways to address this.
Do Nothing — the first option is easy: don’t add any new syntax to
the language and just rely on the index builtin. TCL does this with
lindex, as does Lisp with nth, so I’ll be in good company
here.
Use Only The Dot — the second option is to add support for the dot
and not the square brackets. This is what the Go templating language
does for keys of maps or struct fields. It also has an index
builtin, which works with slice elements.
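For reference, this is roughly how that looks in Go's text/template. The render helper and the sample data are only there for illustration; the dot field access and the index function are part of the standard package.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// render is a small illustrative helper that parses and executes a template
// against data, returning the output as a string.
func render(text string, data interface{}) (string, error) {
	tmpl, err := template.New("demo").Parse(text)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	data := map[string]interface{}{
		"hello": "world",
		"items": []string{"a", "b", "c", "d"},
	}
	// .hello reads a map key; index reads a slice element or a map key.
	out, err := render(`{{.hello}} {{index .items 3}} {{index . "hello"}}`, data)
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // world d world
}
```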
I’d probably do something similar but I may extend it to support index
elements. Getting the value of a field would be what you’d expect, but
to get the element of a list, the construct .(x) can be used:
echo $x.hello # returns the "hello" field
echo $x.(4) # returns the fourth element of a list
One benefit of this could be that the .(x) construct would itself be a
pipeline, meaning that string and calculated values could be used as
well:
echo $x.("hello")
echo $x.($key)
echo $x.([1 2 3] | len)
echo $x.("hello" | toUpper)
I can probably get away with supporting this without changing the scanner or compromising the language design too much. It would be nice to add support for ditching the dot completely when using the parentheses, à la BASIC, but I’d probably run into the same issues as with the square brackets if I did, so I think that’s out.
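Because the contents of the parentheses would be evaluated first, the lookup itself only ever sees a finished key. A rough Go sketch of that last step, under some assumptions: the lookup function, the use of []interface{} and map[string]interface{} as the list and map representations, and zero-based list positions are all choices made for this example, not how UCL necessarily works.

```go
package main

import "fmt"

// lookup is a hypothetical stand-in for what $x.(key) might do once the
// pipeline inside the parentheses has been evaluated to a single value.
func lookup(container, key interface{}) (interface{}, error) {
	switch c := container.(type) {
	case []interface{}:
		i, ok := key.(int)
		if !ok {
			return nil, fmt.Errorf("list index must be an int, got %T", key)
		}
		if i < 0 || i >= len(c) {
			return nil, fmt.Errorf("index %d out of range", i)
		}
		return c[i], nil // assumption: positions are zero-based
	case map[string]interface{}:
		k, ok := key.(string)
		if !ok {
			return nil, fmt.Errorf("map key must be a string, got %T", key)
		}
		return c[k], nil // missing keys yield nil
	default:
		return nil, fmt.Errorf("cannot index a %T", container)
	}
}

func main() {
	list := []interface{}{"a", "b", "c", "d"}
	fields := map[string]interface{}{"hello": "world"}

	// Comparable to $x.([1 2 3] | len): the key is computed, then looked up.
	key := len([]int{1, 2, 3})
	v, _ := lookup(list, key)
	fmt.Println(v) // d

	v, _ = lookup(fields, "hello")
	fmt.Println(v) // world
}
```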
Use Parentheses To Be Explicit — the last option is to use square brackets, and modify the grammar slightly to only allow the use of suffix expansion within parentheses. That way, if you want to pass a list element as an argument, you have to use parentheses:
echo ($x[4]) # fourth element of $x
echo $x[4] # $x, along with a list containing "4"
This is what you’d see in more functional languages like Elm and, I think, Haskell. I’ll have to see whether this could work with changes to the scanner and parser if I were to go with this option. I think it may be achievable, although I’m not sure how.
An alternative might be to go the other way, and modify the grammar rules so that the square brackets bind more tightly to the preceding value, which would mean that separate arguments involving square brackets would need to be in parentheses:
echo $x[4] # fourth element of $x
echo $x ([4]) # $x, along with a list containing "4"
Or I could modify the scanner to recognise whitespace characters and use that as a guide to determine whether square brackets following a value are a suffix or a separate argument. No whitespace means the square brackets represent an element suffix, and at least one space means two separate values.
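A minimal Go sketch of that whitespace-aware idea, assuming a bracket that directly follows a value with no space in between is read as an element suffix. The token type, scan, and isIndexSuffix are hypothetical names for this example and are not taken from UCL.

```go
package main

import (
	"fmt"
	"unicode"
)

// token is a hypothetical token that remembers whether whitespace appeared
// immediately before it in the source.
type token struct {
	text        string
	spaceBefore bool
}

// scan splits src into words and square brackets, recording the whitespace
// flag instead of discarding the information.
func scan(src string) []token {
	var out []token
	runes := []rune(src)
	space := false
	for i := 0; i < len(runes); {
		r := runes[i]
		switch {
		case unicode.IsSpace(r):
			space = true
			i++
		case r == '[' || r == ']':
			out = append(out, token{string(r), space})
			space = false
			i++
		default:
			start := i
			for i < len(runes) && !unicode.IsSpace(runes[i]) &&
				runes[i] != '[' && runes[i] != ']' {
				i++
			}
			out = append(out, token{string(runes[start:i]), space})
			space = false
		}
	}
	return out
}

// isIndexSuffix reports whether the "[" at position i directly follows a
// value with no whitespace. A "[" right after another "[" is treated as a
// nested list rather than a suffix.
func isIndexSuffix(toks []token, i int) bool {
	return i > 0 && toks[i].text == "[" &&
		!toks[i].spaceBefore && toks[i-1].text != "["
}

func main() {
	for _, src := range []string{"echo $x[4]", "echo $x [4]"} {
		toks := scan(src)
		for i, t := range toks {
			if t.text == "[" {
				fmt.Printf("%q: suffix=%v\n", src, isIndexSuffix(toks, i))
			}
		}
	}
}
```

The cost is that the parser now depends on layout, which is the kind of complexity the scanner currently avoids by ignoring whitespace.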
So that’s where I am at the moment. I guess it all comes down to what works best for the language as a whole. I can live with option one but it would be nice to have the syntax. I’d rather not go with option three as I’d like to keep the parser simple (I’d rather not add to all the new-line complexities I already have).
Option two would probably be the least compromising to the design as a whole, even if the aesthetics are a bit strange. I can probably get used to them though, and I do like the idea of index elements being pipelines themselves. I may give option two a try, and see how it goes.
Anyway, more on this later.