It’s easy to get an irrational sense of how scalable a particular database technology can be. Take my experience with PostgreSQL. I use it all the time, but I have it in my head that it shouldn’t be used for “large” amounts of data. I get a little nervous when a table goes beyond 100,000 rows, for example.
But just today, I discovered a table with 47 million rows of time-series data, and PostgreSQL seems to handle it just fine. There are a few caveats: it’s a database with significant resources backing it, it only sees a few commits per second, and queries that aren’t optimised for that table’s indices can take a few seconds. But PostgreSQL seems far from being under load. CPU usage is low (around 3%), and the disk queue depth is zero.
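As a quick aside, the usual way to check whether one of those slow queries is actually hitting an index is EXPLAIN ANALYZE. The table and column names below are made up purely to sketch the idea, not taken from the real schema:

```sql
-- Hypothetical time-series table and query, just for illustration.
-- EXPLAIN (ANALYZE, BUFFERS) shows whether the planner used an index
-- or fell back to a sequential scan over all 47 million rows.
EXPLAIN (ANALYZE, BUFFERS)
SELECT time, value
FROM   metrics                                -- assumed table name
WHERE  device_id = 42                         -- assumed column
  AND  time >= now() - interval '1 day'
ORDER  BY time;

-- If the plan shows a Seq Scan, a composite index matching the query
-- predicates is the usual fix. CONCURRENTLY avoids locking out writes
-- while the index builds.
CREATE INDEX CONCURRENTLY idx_metrics_device_time
    ON metrics (device_id, time);
```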
I guess my point is that PostgreSQL continues to be awesome, and I really shouldn’t underestimate how well it can handle what we throw at it.