Gave the sample Storytime episode for my train line a try, and it’s not for me. Aside from not being available wherever I get my other podcasts, the sample was really overproduced, with backing audio and cheesy sound effects. Not a fan of those sorts of podcasts.
The nature of AWS is that, even with things like ChatGPT, there are still traps lying about for those poor souls that don’t know what they don’t know. For example: did you know that you cannot immediately delete a secret value from the console? You can only “schedule” it to be deleted at a future date that’s no earlier than 7 days from now. The secret won’t show up in the console, but you can’t use the same secret ID until it’s actually gone.
So good luck recovering from any mistakes you’ve made creating a secret via the AWS console instead of using CloudFormation, like I did today. I guess some things’ll never change.
Been working on a CloudFormation stack that defines IAM resources: roles, policies, profiles, etc. I can do a little bit already, like change policy documents, but writing this all from scratch is beyond me. ChatGPT has been a great help here. Would’ve been bothering my coworkers all day otherwise.
Code merged and artefacts prepared. Now to deploy it on brand spanking new infrastructure.
So, this is how my morning went.
Apologies to my reviewers for all the notification emails they’re receiving during this battle with the CI/CD build.
Might be that the only way I’ll learn another language is if I put the spoken training audio to music, preferably something that can pass as an entry to Eurovision.
Linux administration is quite fun. I don’t usually get an opportunity to do it as part of my day-to-day, so it’s always a joy having a task that involves SSH and interacting with a shell. 🐧
📺 Fallout: Season 1 (2024)
👨‍💻 New post on Linux over at Coding Bits: Packaging Services With Systemd
More Tools For Blogging Tool
Spent the last week working on Blogging Tool. I want to get as much done as I can before motivation begins to wane, and it begins languishing like every other project I’ve worked on. Not sure I can stop that, but I think I can get the big ticket items in there so it’ll be useful to me while I start work on something else.
I do have plans for some new tools for Blogging Tool: making it easier to make Lightbox Gallery was just the start. This last week I managed to get two of them done, along with some cross-functional features which should help with any other tools I make down the road.
Move To Sqlite
First, a bit of infrastructure. I moved away from Rainstorm as the data store and replaced it with Sqlite 3. I’m using a version of Sqlite 3 that doesn’t use CGO as the Docker container this app runs in doesn’t have libc. It doesn’t have as much support out there as the more popular Sqlite 3 client, but I’ve found it to work just as well.
One could argue that it would’ve been fine sticking with Rainstorm for this. But as good as Rainstorm’s API is, the fact that it takes out a lock on the database file is annoying. I’m running this app using Dokku, which takes a zero-downtime approach to deployments. This basically means that the old and new app containers are running at the same time. The old container doesn’t get shut down for about a minute, and because it’s still holding the lock, I can’t use the new version during that time as the new container cannot access the Rainstorm database file. Fortunately, this is not an issue with Sqlite 3.
It took me a couple of evenings to port the logic over, but fortunately I did this early, while there was no production data to migrate. I’m using Sqlc for generating Go bindings from SQL statements, and a home grown library for dealing with the schema migrations. It’s not as easy to use as the Rainstorm API but it’ll do. I’m finding working with raw SQL again to be quite refreshing so it may end up being better in the long run.
Image Processing
Once that was done, I focused on adding those tools I wanted. The first one, to sit alongside the gallery tool, is something for preparing images for publishing. This will be particularly useful for screenshots. If you look carefully, you’ll notice that the screenshots on this site have a slightly different shadow than the macOS default. It’s because I actually take a screenshot without the shadow, then use a CLI tool to add one prior to upload. I do this because the image margins macOS includes with the shadow are pretty wide, which makes the actual screenshot part smaller than I like. Using the CLI tool is fine, but it’s not always available to me. So it seemed like a natural thing to add to this blogging tool.
So I added an image processing “app” (I’m calling these tools “apps” to distinguish them from features that work across all of them) which takes an image and allows you to apply a processor to it. You can then download the processed image and use it wherever you need.
This is all done within the browser, using the Go code from the CLI tool compiled to WASM. The reason for this is performance. These images can be quite large, and I’d rather avoid the network round-trip. I’m betting that it’ll be faster running it in the browser anyway, even if you consider the amount of time it takes to download the WASM binary (which is probably around a second or so).
One addition I made was to allow processors to define parameters which are shown to the user as input fields. There’s little need for this at the moment (it’s only being used in a simple meme-text processor) but it’s one of those features I’d like to at least get basic support for before my interest wanes. It wouldn’t be the first time I stopped short of finishing something, thinking to myself that I’d add what I’ll need later, then never going back to do so. That said, I do have some ideas of processors which could use this feature for real, which I haven’t implemented yet. More on that in the future, maybe.
Audio Transcoding And Files
The other one I added deals with audio transcoding. I’ve gotten into the habit of narrating the long form posts I write. I usually use QuickTime Player to record these, but it only exports M4A audio files and I want to publish them as MP3s.
So after recording them, I need to do a transcode. There’s an ffmpeg command line invocation I use to do this:
ffmpeg -i in.m4a -c:v copy -c:a libmp3lame -q:a 4 out.mp3
But I have to bring up a terminal, retrieve it from the history (while it’s still in there), pick a filename, etc. It’s not hard to do, but it’s a fair bit of busy work.
I guess now that I’ve written it here, it’ll be less work to remember. But it’s a bit late now since I’ve added the feature to do this for me. I’ve included a statically linked version of ffmpeg in the Docker container (it needs to be statically linked for the same reason why I can’t use CGO: there’s no libc or any other shared objects) and wrapped it in a small form where I upload my M4A.
The transcoding is done on the server (seemed a bit much asking for this to be done in the browser) but I’m hoping that most M4A files will be small enough that it wouldn’t slow things down too much. The whole process is synchronous right now, and I could’ve made the file available then and there, but it wouldn’t be the only feature I’m thinking of that would produce files that I’d like to do things with later. Plus, I’d like to eventually make it asynchronous so that I don’t have to wait for long transcodes, should there be any.
So along with this feature, I added a simple file manager in which these working files will go.
They’re backed by a directory in the container with metadata managed by Sqlite 3. It’s not a full file system: you can’t do things like create directories, for example. Nor is it designed to be long term storage for these files. It’s just a central place where any app can write files out as a result of its processing. The user can download the files, or potentially upload them to a site, then delete them. This would be useful for processors which could take a little while to run, or run on a regular schedule.
I don’t have many uses for this yet, apart from the audio transcoder, but having this cross-functional facility opens it up to features that need something like this. It means I don’t have to hand-roll it for each app.
Anyway, that’s the current state of affairs. I have one, maybe two, large features I’d like to work on next. I’ll write about them once they’re done.
Oof! These mornings have been really cold this last week. Had to bring out my wool and possum fur gloves for the walk to the cafe in 0.5°C weather.
🔗 Adding Github-Style Markdown Alerts to Eleventy
GitHub has alerts (aka callouts) Markdown support where the syntax looks like Obsidian’s.
So apparently, if we were using Github instead of Gitlab, I could’ve had it all.
One other thing I found this morning during my exploration of Markdown and Asciidoc is that many tools have a problem with JSON code blocks containing JavaScript-like comments. They’re reported as syntax errors, and sometimes they break the syntax highlighting. They’re still included in the rendered HTML, but it feels to me like the tools do so begrudgingly. Gitlab even marks them up with a red background colour.
Why so strict? The code blocks are for human consumption, and it’s really useful to annotate them occasionally. I always find myself adding remarks like “this is the new line”; or removing a large, irrelevant chunk of JSON and replacing it with an ellipsis indicating that I’ve done so.
I know that some Markdown parsers support line annotations, but each one has a different syntax, and they don’t work for every annotation I want to make. But you know what does? Comments! I know how to write them, they’re easy to add, and they’re the same everywhere. Just let me use them in blocks of JSON code, please.
Oh, and also let me add trailing commas too.
Asciidoc, Markdown, And Having It All
Took a brief look at Asciidoc this morning.
This is for that Markdown document I’ve been writing in Obsidian. I’ve been sharing it with others using PDF exports, but its importance has grown to a point where I need to start properly maintaining a change log. And also… sharing via PDF exports? What is this? Microsoft Word in the 2000s?
So I’m hoping to move it to a Gitlab repo. Gitlab does support Markdown with integrated Mermaid diagrams, but not Obsidian’s extension for callouts. I’d like to be able to keep these callouts as I used them in quite a few places.
While browsing through Gitlab’s help guide on Markdown extensions, I came across their support for Asciidoc. I haven’t tried Asciidoc before, and after taking a brief look at it, it seemed like a format better suited for the type of document I’m working on. It has things like auto-generated table of contents, built-in support for callouts, proper title and heading separations; just features that work better than Markdown for long, technical documents. The language syntax also supports a number of text-based diagram formats, including Mermaid.
However, as soon as I started porting the document over to Asciidoc, I found it to be no Markdown in terms of mind share. Tool support is quite limited; in fact, it’s pretty bad. There’s nothing like iA Writer for Asciidoc, with the split-screen source text and live preview that updates when you make changes. There’s loads of these tools for Markdown, so many that I can’t keep track of them (the name of the iA Writer alternative always eludes me).
Code editors should work, but they’re not perfect either. GoLand supports Asciidoc, but not with embedded Mermaid diagrams. At least not out of the box: I had to get a separate JAR which took around 10 minutes to download. Even now I’m fighting with the IDE, trying to get it to find the Mermaid CLI tool so it can render the diagrams. I encountered none of these headaches when using Markdown: GoLand supports embedded Mermaid diagrams just fine. I guess I could try VS Code, but to download it just for this one document? Hmm.
In theory the de-facto CLI tool should work, but in order to get Mermaid diagrams working there I need to download a Ruby gem and bundle it with the CLI tool (this is in addition to the same Mermaid command-line tool GoLand needs). Why this isn’t bundled by default in the Homebrew distribution is beyond me.
So for now I’m abandoning my wish for callouts and just sticking with Markdown. This is probably the best option, even if you set tooling aside. After all, everyone knows Markdown, a characteristic of the format that I shouldn’t simply ignore. Especially for these technical documents, where others are expected to contribute changes as well.
It’s a bit of a shame though. I still think Asciidoc could be better for this form of writing. If only those that make writing tools would agree.
Addendum: after drafting this post, I found that Gitlab actually supports auto-generated table of contents in Markdown too. So while I may not have it all with Markdown (callouts, for instance), I can still have a lot.
Must say I enjoyed The Rest Is History’s recent podcast on Dragons. They go into how these mythical beasts developed over the years, how they’re seen differently in different cultures, and how they entered the mainstream. Just watch out for the odd spoiler for House of the Dragon series 1. 🎙️
Eight months in and I’m still enjoying writing technical documents in Obsidian. I’ve never really appreciated how well it works for this form of writing. I wish we were using this for our knowledge base, instead of Confluence.
Key ring.
It’s always after you commit to a deadline that you find the tasks that you forgot to do.
I think if I ever created a Tetris game for the TI-83 graphing calculator, I would call it “Tetris Instruments.”
My Position On Blocking AI Web Crawlers
I’m seeing a lot of posts online about sites and hosting platforms blocking web crawlers used for AI training. I can completely understand their position, and fully support them: it’s their site and they can do what they want.
Allow me to lay my cards on the table. My current position is to allow these crawlers to access my content. I’m choosing to opt in, or rather, not to opt out. I’m probably in the minority here (well, the minority of those I follow), but I do have a few reasons for this, with the principal one being that I use services like ChatGPT and get value from them. So to prevent them from training their models on my posts feels personally hypocritical to me. It’s the same reason why I don’t opt out of Github Copilot crawling my open source projects (although that’s a little more theoretical, as I’m not a huge user of Copilot). To some, this position might sound weird, and when you consider the gulf between what value these AI companies get from scraping the web versus what value I get from them as a user, it may seem downright stupid. And if you approach it from a logical perspective, it probably is. But hey, we’re in the realm of feelings, and right now this is just how I feel. Of course, if I were to make a living out of this site, it would be a different story. But I don’t.
And this leads to the tension I see between site owners making decisions regarding their own content, and services making decisions on behalf of their users. This site lives on Micro.blog, so I’m governed by what Manton chooses to do or not do regarding these crawlers. I’m generally in favour of what Micro.blog has chosen so far: allowing people to block these scrapers via “robots.txt” but not yet blocking requests based on their IP address. I’m aware that others may not agree, and I can’t, in principle, reject the notion of a hosting provider choosing to block these crawlers at the network layer. I am, and will continue to be, a customer of such services.
But I do think some care should be taken, especially when it comes to customers (and non-customers) asking these services to add these network blocks. You may have good reason to demand this, but just remember there are users of these services that have opinions that may differ. I personally would prefer a mechanism where you opt into these crawlers, and this would be an option I’ll probably take (or probably not; my position is not that strong). I know that’s not possible under all circumstances so I’m not going to cry too much if this was not offered to me in lieu of a blanket ban.
I will make a point about some comments that I’ve seen that, if taken in an uncharitable way, imply that creators that have no problem with these crawlers do not care about their content. I think such opinions should be worded carefully. I know how polarising the use of AI currently is, and making such remarks, particularly within posts that are already heated due to the author’s feelings regarding these crawlers, risks spreading this heat to those that read it. The tone gives the impression that creators okay with these crawlers don’t care about what they push online, or should care more than they do. That might be true for some (might even be true for me once in a while) but to make such blanket assumptions can come off as a little insulting. And look, I know that’s not what they’re saying, but it can come across that way at times.
Anyway, that’s my position as of today. Like most things here, this may change over time, and if I become disenchanted with these companies, I’ll join the blockade. But for the moment, I’m okay with sitting this one out.