• Spent the morning looking into why the service I’m working on was performing so badly. Turns out that one of the reasons was just us being stupid, but another was actually quite surprising.

    Without giving too much away, the service I’m working on takes messages from a NATS Jetstream and converts them to jobs to be sent to a pool of workers. The workers need to access an API to do the job, and the service I’m responsible for needs to set up the permissions so the worker can do so. This involves calling out to a bunch of other micro-services and producing a signed JWT to grant it access to the API. If the workers are fully utilised, then the service will send any incoming jobs to a queue. When a worker has finished a job, it will get the next one from the queue.

    One of the responsibilities of this service is to make sure the workers are doing work as much as possible. The pool of workers is fixed, and we’re paying a fixed price for them, so we’d like them to be in use, much like an airline would like their planes to be in the sky. So one of the things I added was for the Jetstream handler to try sending a new job to a worker first, before adding it to the queue to be worked on later. This means that the job could completely skip the queue if a worker is available to pick it up then and there.
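    A minimal sketch of that "worker first, queue second" idea in Go. The Job and Dispatcher types are hypothetical, and the idle workers are modelled as tokens in a buffered channel, which is an assumption rather than how the real service tracks them:

```go
package main

import (
	"fmt"
	"sync"
)

// Job is a hypothetical unit of work; the real job type is not shown in the post.
type Job struct{ ID int }

// Dispatcher hands a job straight to an idle worker when one exists,
// and queues it otherwise.
type Dispatcher struct {
	free chan struct{} // holds one token per idle worker

	mu     sync.Mutex
	direct []Job // jobs that skipped the queue
	queue  []Job // jobs waiting for a worker
}

// NewDispatcher creates a dispatcher with a fixed number of idle workers.
func NewDispatcher(workers int) *Dispatcher {
	d := &Dispatcher{free: make(chan struct{}, workers)}
	for i := 0; i < workers; i++ {
		d.free <- struct{}{}
	}
	return d
}

// Submit tries to claim an idle worker without blocking. It reports
// whether the job bypassed the queue.
func (d *Dispatcher) Submit(j Job) bool {
	select {
	case <-d.free: // an idle worker was available right now
		d.mu.Lock()
		d.direct = append(d.direct, j)
		d.mu.Unlock()
		return true
	default: // everyone is busy, so fall back to the queue
		d.mu.Lock()
		d.queue = append(d.queue, j)
		d.mu.Unlock()
		return false
	}
}

func main() {
	d := NewDispatcher(2)
	for id := 1; id <= 3; id++ {
		bypassed := d.Submit(Job{ID: id})
		fmt.Printf("job %d bypassed queue: %v\n", id, bypassed)
	}
}
```

    The select with a default case is what makes the attempt non-blocking: the handler never waits for a worker, it only takes one that is already free.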

    Turns out that this took a huge toll on performance. When the workers are all working on jobs, any work sent via this queue bypass logic would be refused, making all the prep work for setting up access to the API completely unnecessary. Time spent on calling the micro-services I can completely understand, but I was surprised to find that a significant chunk of the prep work was spent on signing the JWT, a JWT that would never be used. This showed up on the CPU profile trace I took, but it also came through in the output of top and the CPU status on the EC2 dashboard. The CPU was maxed out at 99%, and virtually all of that was spent on the service itself. CPU idle time, system time, and I/O wait time were pretty much at 0%.

    Taking out this queue bypass logic dramatically improved things (although the CPU is still unusually high, which I need to look at), and what I’ll eventually do is add a circuit breaker to the Jetstream handler, so that if no worker is available to pick up the job then and there, all the other jobs will go straight to the queue. There are a few other things we could do here, like raise the number of database connections and Jetstream listeners.
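    A rough sketch of what such a circuit breaker could look like in Go. The Breaker type, the handle function, and the time-based cool-off are all assumptions made for illustration; the post doesn't say when the bypass should be tried again:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Breaker skips the direct-to-worker attempt for a cool-off period
// after a worker has refused a job. The cool-off is an assumption.
type Breaker struct {
	mu        sync.Mutex
	coolOff   time.Duration
	openUntil time.Time
}

// NewBreaker returns a breaker that allows attempts until it is tripped.
func NewBreaker(coolOff time.Duration) *Breaker {
	return &Breaker{coolOff: coolOff}
}

// Allow reports whether a direct attempt should be made right now.
func (b *Breaker) Allow() bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	return !time.Now().Before(b.openUntil)
}

// Trip records a refusal and blocks direct attempts for the cool-off period.
func (b *Breaker) Trip() {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.openUntil = time.Now().Add(b.coolOff)
}

// handle is a hypothetical message handler. tryWorker stands in for the
// expensive path (calling other services, signing the JWT, offering the
// job to a worker) and returns true if a worker accepted the job.
func handle(b *Breaker, tryWorker func() bool, enqueue func()) {
	if b.Allow() {
		if tryWorker() {
			return
		}
		b.Trip() // a refusal means the pool is busy, so stop trying for a while
	}
	enqueue()
}

func main() {
	b := NewBreaker(5 * time.Second)
	attempts, queued := 0, 0
	allBusy := func() bool { attempts++; return false } // every worker refuses
	enqueue := func() { queued++ }

	for i := 0; i < 3; i++ {
		handle(b, allBusy, enqueue)
	}
	fmt.Printf("direct attempts: %d, queued: %d\n", attempts, queued)
}
```

    With every worker busy, only the first of the three jobs pays for the direct attempt. The other two go straight to the queue until the cool-off expires.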

    But I’m shocked that signing a JWT was having such an impact. One of my colleagues suggested building the service with GOAMD64=v3, which I’m guessing would enable some AMD extensions in the compiled binary. I’d be interested to see if this would help.
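    One way to check would be a small timing harness like the sketch below. It assumes the JWT is signed with RS256 (SHA-256 plus RSA PKCS #1 v1.5) and a 2048-bit key, neither of which I've confirmed for this service, and it uses only the standard library rather than whatever JWT package the service actually uses. GOAMD64 is a build-time environment variable that selects the x86-64 microarchitecture level, so the idea would be to build this once normally and once with GOAMD64=v3 set, then compare the numbers on the same machine:

```go
package main

import (
	"crypto"
	"crypto/rand"
	"crypto/rsa"
	"crypto/sha256"
	"fmt"
	"time"
)

// newKey generates a throwaway 2048-bit RSA key (the key size is an assumption).
func newKey() (*rsa.PrivateKey, error) {
	return rsa.GenerateKey(rand.Reader, 2048)
}

// signOnce computes the signature the way RS256 does: a SHA-256 digest
// of the payload, signed with RSA PKCS #1 v1.5.
func signOnce(key *rsa.PrivateKey, payload []byte) ([]byte, error) {
	digest := sha256.Sum256(payload)
	return rsa.SignPKCS1v15(rand.Reader, key, crypto.SHA256, digest[:])
}

// timeSigning returns the average time per signature over n runs.
func timeSigning(key *rsa.PrivateKey, payload []byte, n int) (time.Duration, error) {
	if n <= 0 {
		return 0, fmt.Errorf("n must be positive, got %d", n)
	}
	start := time.Now()
	for i := 0; i < n; i++ {
		if _, err := signOnce(key, payload); err != nil {
			return 0, err
		}
	}
	return time.Since(start) / time.Duration(n), nil
}

func main() {
	key, err := newKey()
	if err != nil {
		fmt.Println("generating key:", err)
		return
	}
	// The payload stands in for the base64url-encoded "header.claims" string.
	avg, err := timeSigning(key, []byte("header.claims"), 200)
	if err != nil {
		fmt.Println("signing:", err)
		return
	}
	fmt.Printf("average signing time: %v\n", avg)
}
```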

  • I generally don’t want to write about things going on at work, but sometimes it helps, and with the self-imposed rule of writing something a day, it’s usually the only thing worthy of comment. Classic case of the topic sometimes choosing you, rather than the other way around.

  • Seeing a problem at work where the performance of the service we’re working on is just not good enough. Tried bumping up the EC2 instance size and raising the DB connection pool to 20. But the pool won’t go higher than 7 connections.

    So it’s either that the connection pool is misconfigured, or the NATS Jetstream client has too few listeners. I suspect it’s the latter. I don’t know what the default is set to, but if it’s a multiple of the number of CPU cores, I’m not sure how well that’ll play with AWS EC2.

    So that’s the next thing to look at.
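    One way to tell the two apart, assuming the service uses Go's database/sql (which opens connections on demand rather than up front), is to read the pool statistics. This is only a sketch: diagnosePool and its messages are made up, and the sample numbers are just the ones from this post. In the real service the stats would come from calling Stats() on the *sql.DB:

```go
package main

import (
	"database/sql"
	"fmt"
)

const (
	verdictSaturated     = "pool is saturated: callers are waiting for connections"
	verdictNotBottleneck = "pool is not the bottleneck: no caller has had to wait for a connection"
	verdictHasWaited     = "callers have waited before: keep an eye on WaitCount and WaitDuration"
)

// statsFromPost mirrors the numbers in the post: the limit was raised
// to 20, but only 7 connections were ever open.
var statsFromPost = sql.DBStats{MaxOpenConnections: 20, OpenConnections: 7, InUse: 7}

// diagnosePool gives a rough reading of database/sql pool statistics.
func diagnosePool(s sql.DBStats) string {
	switch {
	case s.MaxOpenConnections > 0 && s.OpenConnections >= s.MaxOpenConnections && s.WaitCount > 0:
		return verdictSaturated
	case s.WaitCount == 0:
		// Nothing ever blocked on the pool, so the callers (for example
		// the message listeners) never asked for more connections at once.
		return verdictNotBottleneck
	default:
		return verdictHasWaited
	}
}

func main() {
	fmt.Println(diagnosePool(statsFromPost))
}
```

    If the wait count stays at zero while the open connections sit below the limit, that would point towards too few concurrent callers rather than a misconfigured pool.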

  • I enjoyed reading this post by Rohan Ganapavarapu. It’s fascinating getting the perspective of someone born after the early internet yet wishing they were there to experience it. The ending’s quite illuminating:

    There is neocites, and a small community of people who share this philosophy about the web (and that are relatively young), but I have not met anyone my age, in the real world, that would choose to do something like this.

    The majority of people (my age [of 18]) today would think sites like those (and, by extension, their creators) are weird.

    I guess, here’s to the weird ones. 🥂

  • Finished work and now I’m having a quick dinner before heading off to a meetup. And I can’t lie: I’ve been nervous all day. 😬

  • Kind of glad I don’t run a big, multinational corporation.

  • Scare with care. 🎃

    A street pole features a sign with pedestrian crossing instructions, using humorous cartoon characters with jack-o'-lanterns placed over their faces to indicate "Do Not Cross," "Cross With Care," and "Complete Crossing Do Not Start to Cross."

  • Try-Catch In UCL - Some Notes

    Started working on a try command for UCL, which can be used to trap errors that occur within a block. This is very much inspired by try-blocks in Java and Python, where the main block will run, and if any error occurs, it will fall through to the catch block:

        try {
          echo "Something bad can happen here"
        } catch {
          echo "It's all right. I'll run next"
        }

    This is all I’ve got working at the moment, but I want to quickly write some notes on how I’d like this to work, lest I forget it later. Continue reading →

  • I think Slack’s got an opportunity here to displace Confluence as the “system of record” for work documents. Their Canvas editor is quite good, much better than Confluence’s, and it’s one I really like using. Yet once a Canvas is written and published, it’s ridiculously hard to find again.

    Having a central place to browse Canvases and arrange them into categories, much like someone would do in a wiki, would go a long way to making them useful as documents unto themselves, rather than simply “big messages” with short-to-medium-term lifespans.

    That’s not to say there’s not a role for such documents. In fact, I wonder if that’s why wikis are always difficult to navigate: you’re mixing documents that have different expected lifespans. System designs sit alongside retrospectives from 2022, which sit alongside the agenda for a meeting next week.

    Here Canvases in Slack could be created with the default expectation that the lifespan will be a couple of weeks, or a month, and it’s only those that you explicitly “keep” that would be browsable in this new system. These are the ones you expect to last years and be always kept up to date. The others will still be there (Slack can archive them) but they’ll slowly fade into the background, much like the message history.

    Anyway, just some random thoughts I had while starting to work on a design within a Canvas and wondering if it should actually go into Confluence.

  • Discovered that Enya is still releasing albums. Started listening to this one over the weekend. Quite good. 🎵

  • Weekly Update - 20 Oct 2024

    Yeah, I know, it’s been a while… again. A lot has been happening in life and there’ve been many days that I haven’t done any work on anything. Things are starting to settle down now, although I am expecting a few more bumpy days ahead, so we’ll see how we go with project work.

    Cyber Burger

    Yeah, I’m getting pretty tired of this one. I’m right in the trough of despair here, where the initial excitement has worn off and I just want to finish it. Continue reading →

  • I was poking around Dave Winer’s Software Snacks — a brilliant name for those — and I stumbled across Little Card Editor. Decided to give it a try.

    A cozy coffee table setup with a blue knitted item, blue headphones, and a smartphone displaying the time. The title ‘Morning Coffee Table’ is overlaid in the centre in a serif font.

  • Archie is no longer with us sadly, so my sister went out and got a new companion for Ivy. Say hello to Rico. 🦜

    A cockatiel with yellow and gray plumage is perched near a wooden stand and a tray of assorted seeds.

  • So it looks like Squarespace has been acquired by a private equity firm. I wonder if the new owners will keep buying podcast ads, or if they’ll pull them like Akamai did when they acquired Linode. I get the feeling a lot of shows are relying on Squarespace’s consistent ad money to remain viable.

  • I’m going to a meetup surrounding a book that I haven’t read yet. I wanted to finish what I was reading at the time and I figured I’d have about a week to read this book before the meetup started. I thought I remembered buying it, so when it came time to start reading it, I was a little surprised to find it missing from the Kindle app.

    Fearing that I was running out of time, I went to Amazon to buy it again, only to discover that I actually pre-ordered it and that it’s going to be released on the date of the meetup.

    So, yeah, feeling relieved about that.

  • 🔗 How to be confident

    A great post by Annie Mueller. And pretty much spot on, based on my understanding of how to gain confidence.

  • 🔗 Save the Web by Being Nice

    Found this while browsing Dave Winer’s blog-roll on Scripting News. I enjoyed reading this post so I thought I’d take his advice and be nice by sharing a link to it.

  • Oof, it’s been quite the week! Almost over though: only around 30 minutes left, then it’s the weekend.

  • I’m officially a Zio today. 🙌

  • I’m generally not someone who likes to talk to people working on my hair. Even so, if silent cuts were offered to me, I’m not sure I would accept. The occasional “what do you do” and “how’s business”, enough to acknowledge each other, is fine. Not even having that would seem a little strange.

  • Don’t use access permissions to control what a user can and can’t do if the correct functionality of the system you’re building depends on it.

    A user’s permissions should dictate what the user has the right to do and see, based on the policies of the resources themselves. But when it comes to the correct functionality of a system, it should be built such that if you were to disable all the permission checks, the user could do whatever they like without breaking things. Relying on permissions to prevent this feels like a code smell to me, and can leave you with policies that have blanket denies for everyone that just can’t be taken out, and no one remembers why they were added in the first place.

  • I don’t count myself a Safari fan, but full credit to Apple: they’ve made remote debugging for iPad Safari very easy. Plug the iPad in, tap “Trust the Device” a few times[1], and Safari’s developer tools menu shows the iPad right there. It also works for SafariViewController sessions in modals, which is nice.


    1. There might be some setup stuff you’ll need to do on the iPad that I’ve forgotten about.

  • Goland’s LLM-powered auto-complete is really good. It’s got to the point where it feels like Goland is broken when I’m using a version that doesn’t have it. I’m sure they hope to expand on this, and if I can make a request on what they could do next, it would be to add “auto-complete” suggestions in other areas of the code.

    For example, I’m working on a function which uses AWS’s Golang SDK to send an SQS message. I started writing out the call to send a message, when I found out that I forgot to define both the context and queue name in the function I’m working in. Nothing too hard to fix, of course, but it would mean moving away from where I am now, and conducting a mini context-switch away from calling the SDK to fixing my function definition.

    It would be nice for the LLM-based auto-completer to suggest adding the context as the first parameter of the function, as per the convention. The queue name is a little more ambiguous: it could either be suggested as another function parameter or as a field on the provider type. I suppose both are just as likely, but assuming that Goland is refining its model based on my trends, it could suggest adding the queue name as a field, along with adding it in as a parameter to the constructor function.
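    To illustrate the shape I'd want suggested, here's a sketch in Go. It deliberately avoids the real AWS SDK: messageSender, Provider, NewProvider, Notify, and the queue name "jobs-queue" are all hypothetical stand-ins, with a fake sender so the example runs on its own:

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// messageSender is a hypothetical stand-in for the SQS client.
type messageSender interface {
	SendMessage(ctx context.Context, queueName, body string) error
}

// Provider keeps the queue name as a field, set once by the constructor.
type Provider struct {
	client    messageSender
	queueName string
}

// NewProvider takes the queue name as a constructor parameter.
func NewProvider(client messageSender, queueName string) (*Provider, error) {
	if queueName == "" {
		return nil, errors.New("queue name must not be empty")
	}
	return &Provider{client: client, queueName: queueName}, nil
}

// Notify takes the context as its first parameter, as per the convention.
func (p *Provider) Notify(ctx context.Context, body string) error {
	return p.client.SendMessage(ctx, p.queueName, body)
}

// fakeSender records messages instead of talking to a real queue.
type fakeSender struct{ sent []string }

func (f *fakeSender) SendMessage(ctx context.Context, queueName, body string) error {
	if err := ctx.Err(); err != nil {
		return err // respect cancellation like a real client would
	}
	f.sent = append(f.sent, queueName+": "+body)
	return nil
}

// demo wires the pieces together and returns what the fake sender recorded.
func demo() ([]string, error) {
	sender := &fakeSender{}
	p, err := NewProvider(sender, "jobs-queue")
	if err != nil {
		return nil, err
	}
	if err := p.Notify(context.Background(), "hello"); err != nil {
		return nil, err
	}
	return sender.sent, nil
}

func main() {
	sent, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(sent)
}
```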

    Anyway, something for them to look at when they run out of work.

  • Maybe there’s still a chance for Apple to release a car of some sort, although probably not how they were planning to. 😜

  • I think I’ve added more features to my TUI-based table editor over the last couple of weeks than I have over the last couple of years. Today, I added a command to collate two CSV files together based on the value of a particular column — sort of like an inner join in a relational database — and also a command to remove duplicate rows. This is in addition to the changes made last week, which included making the editor “header aware” and a command to map the values of a column. Granted, all these features were implemented using the bare minimum necessary to get my work done, but they’re there, and they weren’t a few weeks ago.
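    For a rough idea of what those two commands do, here's a sketch in Go. It's not the editor's actual code: innerJoin, dedupe, and the sample data are made up for illustration, and it assumes both tables have a header row:

```go
package main

import (
	"encoding/csv"
	"errors"
	"fmt"
	"strings"
)

// parse reads CSV text into rows, with the header as the first row.
func parse(s string) ([][]string, error) {
	return csv.NewReader(strings.NewReader(s)).ReadAll()
}

// columnIndex returns the position of the named column, or -1 if absent.
func columnIndex(header []string, name string) int {
	for i, h := range header {
		if h == name {
			return i
		}
	}
	return -1
}

// joinRow appends rrow to lrow, leaving out rrow's copy of the key column.
func joinRow(lrow, rrow []string, skip int) []string {
	row := append([]string{}, lrow...)
	for i, v := range rrow {
		if i != skip {
			row = append(row, v)
		}
	}
	return row
}

// innerJoin keeps only rows whose key value appears in both tables.
func innerJoin(left, right [][]string, key string) ([][]string, error) {
	if len(left) == 0 || len(right) == 0 {
		return nil, errors.New("both tables need a header row")
	}
	li, ri := columnIndex(left[0], key), columnIndex(right[0], key)
	if li < 0 || ri < 0 {
		return nil, fmt.Errorf("column %q must exist in both tables", key)
	}
	// Index the right-hand rows by their key value.
	byKey := make(map[string][][]string)
	for _, row := range right[1:] {
		if ri < len(row) {
			byKey[row[ri]] = append(byKey[row[ri]], row)
		}
	}
	out := [][]string{joinRow(left[0], right[0], ri)}
	for _, lrow := range left[1:] {
		if li >= len(lrow) {
			continue // skip ragged rows that lack the key column
		}
		for _, rrow := range byKey[lrow[li]] {
			out = append(out, joinRow(lrow, rrow, ri))
		}
	}
	return out, nil
}

// dedupe removes repeated rows, keeping the first occurrence of each.
func dedupe(rows [][]string) [][]string {
	seen := make(map[string]bool)
	var out [][]string
	for _, row := range rows {
		k := strings.Join(row, "\x1f") // unit separator, unlikely to appear in cells
		if !seen[k] {
			seen[k] = true
			out = append(out, row)
		}
	}
	return out
}

func main() {
	people, _ := parse("id,name\n1,ann\n2,bob\n2,bob\n")
	cities, _ := parse("id,city\n2,Perth\n3,Hobart\n")
	joined, err := innerJoin(dedupe(people), cities, "id")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, row := range joined {
		fmt.Println(strings.Join(row, ","))
	}
}
```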