I’ve seen a few people online talk about the new Ollama native macOS app. I’ve not used Ollama’s CLI tools myself, and I have very limited experience with running models locally. I think I tried a couple of years ago, and gave up at the step when they asked me to set up a Python virtual environment. Since then I’ve always been under the impression that setting up LLMs to run locally would be painful, and while I liked the idea of doing so, it wasn’t enough to go through that pain again. So hearing about Ollama packaging all this in a native app piqued my interest, and I thought I’d give it a go.

Installation was as expected: just download a DMG and add it to Applications. I was half expecting the package to be huge to accommodate the models, but the download doesn’t include the models themselves, which is fair enough given that there are a few to choose from.

First launch revealed a very ChatGPT-esque experience, making it absolutely clear how this app works. I poked around a little trying to find where I could download a model, but couldn’t find anything. I then tried posting a question and quickly found out that models are downloaded on demand.

[Screenshot: a minimalist chat window with a simple cartoonish character and a text input field.]
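As an aside, my understanding is that the app sits on top of the same local HTTP server the CLI uses, which listens on port 11434 by default. I haven’t dug into this properly, but a quick sketch like the following should list which models have already been downloaded (the only assumption is that default port):

```python
# A minimal sketch: ask the local Ollama server which models are
# already on disk. Assumes Ollama is running on its default port
# (11434); uses only the Python standard library.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"

with urllib.request.urlopen(f"{OLLAMA_URL}/api/tags") as resp:
    data = json.load(resp)

for model in data.get("models", []):
    size_gb = model["size"] / 1e9  # reported size is in bytes
    print(f'{model["name"]:30} {size_gb:.1f} GB')
```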

It was then a matter of trying out some of these models. My first test was with Gemma 3:4b. I asked it a question about the tongue-twister “she sells sea shells by the sea shore”. The fact that it actually produced a result was impressive, yet the quality of the answer was questionable:

[Screenshot: the chat exchange with Gemma 3, humorously discussing a popular saying about women selling seashells at the beach.]

For comparison, here’s the same question posed to ChatGPT:

[Screenshot: the same question posed to ChatGPT, which answers with the tongue twister “She sells sea-shells by the sea-shore”.]

So this model struggled a little, probably because it was on the small side (I’m assuming the “4b” refers to the number of parameters: 4 billion?). I didn’t get a good gauge of how slow the response was. It was slower, but not as slow as I was expecting. Plus my impression was coloured by the time it took to download the model, which itself was not fast (maybe 30 minutes to an hour, although I wasn’t on a fast connection).

I then tried the same question with Deepseek R1:8b, and was greeted with a reminder that this is not a Western-trained model. But after some clarification it got there in the end. Ollama actually displayed the reasoning steps as the model went through its inference, which is a nice touch.

[Screenshot: the Deepseek R1 conversation discussing “She sells seashells by the sea shore”, with an additional note about cultural differences.]

I could go on and try Qwen 3, but you get the idea. This is not really about the models (okay, it’s sort of about the models, but I’m hardly putting them through their paces).

On the whole, I found the app itself quite pleasant to use. The minimal aesthetic makes it very approachable for someone whose only experience is with ChatGPT. I also like that you can go back through your chat and edit past questions. Being able to “alter the context” like this is something I wish the online AI chat apps supported, particularly when the model starts going down the wrong path.

At the same time, I do wish I had a bit more control over the experience. I’m not a user of Ollama’s CLI tool, so I don’t know what’s already possible, but here are some things I’d like to see added down the line:

  1. A way to manage model downloads outside of the chat window.
  2. A way to configure the system prompt. I actually don’t know if Ollama’s applying one, but if it is, it would be nice to see or modify it (the local HTTP API does accept a system message; see the sketch after this list).
  3. A way to export the chat transcript.
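The last two look doable today if you drop down to the local HTTP API instead of the app. This is only a rough sketch under my own assumptions (default port, a model I happen to have pulled, and a file name I made up), but something along these lines sets a system prompt and saves the exchange out as JSON:

```python
# A rough sketch: chat against the local Ollama API with an explicit
# system prompt, then dump the transcript to a JSON file. Assumes the
# default port (11434) and that gemma3:4b has already been downloaded.
import json
import urllib.request

OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"

messages = [
    {"role": "system", "content": "Answer in one short sentence."},
    {"role": "user", "content": "What is the saying about selling sea shells?"},
]

request = urllib.request.Request(
    OLLAMA_CHAT_URL,
    data=json.dumps({
        "model": "gemma3:4b",
        "messages": messages,
        "stream": False,  # return one JSON object rather than a stream
    }).encode(),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(request) as resp:
    reply = json.load(resp)["message"]  # the assistant's message

messages.append(reply)

# "Export the chat transcript" is then just writing the message list out.
with open("transcript.json", "w") as f:
    json.dump(messages, f, indent=2)
```

That’s hardly the same as having it in the app, of course, which is rather the point of a nice native front end.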

So, would I continue using this? Honestly, probably not, mainly because of performance. I’m not sure the computer I’m using (a 2020 MacBook Pro M1 with 16 GB RAM) is powerful enough to be competitive with the online models. Maybe if I had the latest and greatest hardware, with as much memory as possible and significantly more disk space, it would be a viable contender. I do have access to something a little newer (a Mac Mini M2), so maybe I’ll try it out there. It would be interesting to see how OpenAI’s new open-weight models, which were released while I was writing this, perform.

But that said, I’m quite impressed that I was able to run these models at all. I’ve heard of others running local models for a while now, and I like that Ollama makes this easy to do, with a nice, approachable native app.