Spent the morning looking into what was causing the service I’m working on to be performing so badly. Turns out that one of the reasons was just us being stupid but another was actually quite surprising.
Without giving too much away, the service I’m working on takes messages from a NATS Jetstream, and converts them to jobs to be sent to a pool of workers. The workers need to access an API to do the job, and the service I’m responsible for needs to set up the permissions so the worker can do so. This involves calling out to a bunch of other micro-services and producing a signed JWT to grant it access to the API. If the workers are fully utilised, then the service will send any incoming jobs to a queue. When a worker has finished a job, it will get the next one from the queue.
One of the responsibilities of this service is to make sure the workers are doing work as much as possible. The pool of workers is fixed, and we’re paying a fixed price for them, so we’d like them to be in use much like an airline would like their planes to be in the sky. So one of the things I added was that the Jetstream handler would try to send a new job to a worker first, before adding it to the queue to be worked on later. This means that the job could completely skip the queue if a worker is available to pick it up then and there.
Turns out that this caused a huge hit to performance. When the workers are all working on jobs, any requested work sent by this queue bypass logic would be refused, making all the prep work for preparing access to the API completely unnecessary. Time spent on calling the micro-services I can completely understand, but I was surprised to find that a significant chunk of prep work was spent on signing the JWT, a JWT that would never be used. This showed up on the CPU profile trace I took, but it also came through in the output of top and the CPU status on the EC2 dashboard. The CPU was maxed out at 99%, and virtually all of that was spent on the service itself. CPU idle time, system time, and I/O wait time were pretty much at 0%.
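To make the cost concrete, here is a minimal sketch in Go of the bypass as described above. Every name in it (Dispatcher, prepare, offer, freeWorkers) is hypothetical and only stands in for the real service: prepare represents the micro-service calls plus the JWT signing, and a simple counter represents the fixed worker pool. The point it illustrates is that the prep work is paid for on every message, even when the worker pool ends up refusing the job.

```go
package main

import "fmt"

// Job is a hypothetical stand-in for a message taken off the stream.
type Job struct{ ID int }

// Dispatcher models the queue bypass. All fields are illustrative.
type Dispatcher struct {
	freeWorkers int   // workers currently idle
	queue       []Job // jobs waiting for a worker
	prepCalls   int   // how many times the prep work ran
	wastedPreps int   // prep work whose result was thrown away
}

// prepare stands in for the expensive part: calling other
// micro-services and signing a JWT for the worker.
func (d *Dispatcher) prepare(j Job) string {
	d.prepCalls++
	return fmt.Sprintf("token-for-%d", j.ID)
}

// offer hands the job to a worker if one is idle.
func (d *Dispatcher) offer(j Job, token string) bool {
	if d.freeWorkers == 0 {
		return false // pool is busy: the job is refused
	}
	d.freeWorkers--
	return true
}

// Handle tries the bypass first and falls back to the queue.
func (d *Dispatcher) Handle(j Job) {
	token := d.prepare(j) // paid for whether or not a worker is free
	if d.offer(j, token) {
		return
	}
	d.wastedPreps++ // the token is never used
	d.queue = append(d.queue, j)
}

func main() {
	d := &Dispatcher{freeWorkers: 2}
	for i := 1; i <= 10; i++ {
		d.Handle(Job{ID: i})
	}
	fmt.Printf("prepared=%d wasted=%d queued=%d\n",
		d.prepCalls, d.wastedPreps, len(d.queue))
	// prints "prepared=10 wasted=8 queued=8"
}
```

With two idle workers and ten incoming jobs, eight of the ten prep runs produce a token nobody uses, which is the shape of the problem when the pool is saturated.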
Taking out this queue bypass logic dramatically improved things (although the CPU is still unusually high, which I need to look at), and what I’ll eventually do is add a circuit breaker to the Jetstream handler, so that if no worker is available to pick up the job then and there, all the other jobs will go straight to the queue. There are a few other things we could do here, like raise the number of database connections and Jetstream listeners.
But I’m shocked that signing a JWT was having such an impact. One of my colleagues suggested building the service with GOAMD64=v3, which I’m guessing would enable some AMD extensions in the compiled binary. I’d be interested to see if this would help.
Playing around with some possible UI design choices for that Android RSS Feed Reader. I think I will go with Flutter for this, seeing that I generally like the framework and it has decent (although not perfect) support for native Material styling.
Started looking at the feed item view. This is what I have so far:
Note that this is little more than a static list view. The items come from nowhere and tapping an item doesn’t actually do anything yet. I wanted to get the appearance right first, as how it feels is downstream from how it works.
The current plan is to show most of the body for items without titles, similar to what other social media apps would show. It occurred to me that in doing so, people wouldn’t see links or formatting in the original post, since they’ll be less likely to click through. So it might be necessary to bring this formatting to the front. Not all possible formatting, mind you: probably just strong, emphasis, and links. Everything else should result in an ellipsis, encouraging the user to open the actual item.
Anyway, still playing at the moment.
I generally don’t want to write about things going on at work, but sometimes it helps, and with the self-imposed rule of writing something a day, it’s usually the only thing worthy of comment. Classic case of the topic sometimes choosing you, rather than the other way around.
Seeing a problem at work where the performance of the service we’re working on is just not good enough. Tried bumping up the EC2 instance size and raising the DB connection pool to 20. But the pool won’t go higher than 7 connections.
So it’s either that the connection pool is misconfigured, or the NATS Jetstream client has too few listeners. I suspect it’s the latter. I don’t know what the default is set to, but if it’s a multiple of the number of CPU cores, I’m not sure how well that’ll play with AWS EC2.
So that’s the next thing to look at.
I enjoyed reading this post by Rohan Ganapavarapu. It’s fascinating getting the perspective of someone born after the early internet yet wishing they were there to experience it. The ending’s quite illuminating:
There is neocites, and a small community of people who share this philosophy about the web (and that are relatively young), but I have not met anyone my age, in the real world, that would choose to do something like this.
The majority of people (my age [of 18]) today would think sites like those (and, by extension, their creators) are weird.
I guess, here’s to the weird ones.
Finished work and now I’m having a quick dinner before heading off to a meetup. And I can’t lie: I’ve been nervous all day.
Kind of glad I don’t run a big, multinational corporation.
Scare with care.
New post over at Workpad: Try-Catch In UCL - Some Notes
I think Slack’s got an opportunity here to displace Confluence as the “system of record” for work documents. Their Canvas editor is quite good, much better than Confluence’s, and it’s one I really like using. Yet once a Canvas is written and published, it’s ridiculously hard to find again.
Having a central place to browse Canvases and arrange them into categories, much like someone would do in a wiki, would go a long way to making them useful as documents unto themselves, rather than simply “big messages” with short-to-medium-term lifespans.
That’s not to say there’s not a role for such documents. In fact, I wonder if that’s why wikis are always difficult to navigate: you’re mixing documents that have different expected lifespans. System designs sit alongside retrospectives from 2022, which sit alongside the agenda for a meeting next week.
Here Canvases in Slack could be created with the default expectation that the lifespan will be a couple of weeks, or a month, and it’s only those that you explicitly “keep” that would be browsable in this new system. These are the ones you expect to last years and be always kept up to date. The others will still be there (Slack can archive them), but they’ll slowly fade into the background much like the message history.
Anyway, just some random thoughts I had while starting to work on a design within a Canvas and wondering if it should actually go into Confluence.
Discovered that Enya is still releasing albums. Started listening to this one over the weekend. Quite good. 🎵
New post over at Workpad: Weekly Update - 20 Oct 2024
I was poking around Dave Winer’s Software Snacks (a brilliant name for those) and I stumbled across Little Card Editor. Decided to give it a try.
Archie is no longer with us sadly, so my sister went out and got a new companion for Ivy. Say hello to Rico.
So it looks like Squarespace has been acquired by a private equity firm. I wonder if the new owners will keep buying podcast ads, or if they’ll pull them like Akamai did when they acquired Linode. I get the feeling a lot of shows are relying on Squarespace’s consistent ad money to remain viable.
I’m going to a meetup surrounding a book that I haven’t read yet. I wanted to finish what I was reading at the time and I figured I’d have about a week to read this book before the meetup started. I thought I remembered buying it, so when it came time to start reading it, I was a little surprised to find it missing from the Kindle app.
Fearing that I was running out of time, I went to Amazon to buy it again, only to discover that I actually pre-ordered it and that it’s going to be released on the date of the meetup.
So, yeah, feeling relieved about that.
How to be confident
A great post by Annie Mueller. And pretty much spot on, based on my understanding of how to gain confidence.
Save the Web by Being Nice
Found this while browsing Dave Winer’s blog-roll on Scripting News. I enjoyed reading this post so I thought I’d take his advice and be nice by sharing a link to it.
Oof, it’s been quite the week! Almost over though: only around 30 minutes left, then it’s the weekend.