Reflections on Software Performance

  • > It seems increasingly common these days to not worry about performance at all,

    You don't even have to continue there. People who should know better assume that 'modern cloud stuff' will make this trivial: you just add some auto-scaling and it can handle anything. Until it grinds to a halt because it cannot scale beyond its bottlenecks (most likely the relational database), or the credit card is maxed out trying to pull in more resources on top of the ridiculous amount already being used for a (relatively) tiny number of users.

    This will only get worse as people generally use 'premature optimization' (delivering software for launch is not premature!) and 'people are more expensive than more servers' (no, they are not, once you have some actual traffic and O(n^2)-performing crap) as an excuse to not even try to understand this anymore. Same with storage space: with NoSQL, terabytes of data grow out of nowhere because 'we don't care, as it works and it's "fast" to market' (again, 'programmers are more expensive than more hardware!'). Just run a script to fire up 500 AWS instances backed by Dynamo and fall asleep.

    I am not so worried about premature optimization; I am more worried about never optimizing. And beyond that, I'm really worried about my (mostly younger) colleagues simply not caring because they believe it's a waste of time.
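To make the O(n^2) point above concrete, here is a minimal sketch (the function names and the duplicate-detection task are illustrative assumptions, not taken from the comment). It counts the work done by a pairwise check versus a set-based check on the same input:

```python
from itertools import combinations


def has_duplicates_quadratic(items):
    """Compare every pair: up to n*(n-1)/2 comparisons."""
    comparisons = 0
    for a, b in combinations(items, 2):
        comparisons += 1
        if a == b:
            return True, comparisons
    return False, comparisons


def has_duplicates_linear(items):
    """Track seen values in a set: at most n membership checks."""
    seen = set()
    checks = 0
    for item in items:
        checks += 1
        if item in seen:
            return True, checks
        seen.add(item)
    return False, checks


if __name__ == "__main__":
    for n in (1_000, 2_000, 4_000):
        data = list(range(n))  # worst case: no duplicates at all
        _, quad = has_duplicates_quadratic(data)
        _, lin = has_duplicates_linear(data)
        print(f"n={n}: pairwise={quad} comparisons, set-based={lin} checks")
```

Doubling the input roughly quadruples the pairwise work while only doubling the set-based work, which is the kind of gap that adding servers tends to hide rather than fix.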

  • Yes, performance is a feature.

    You have to plan and architect for it, and you can't just tack it on after the fact by profiling a few hot codepaths (though you should do that too).

    Performance can be different from "scalability" though. Sometimes, there is tension between the two.
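For the "profiling a few hot codepaths" part, a rough sketch using Python's standard `cProfile` and `pstats` modules might look like the following. The pipeline functions (`slow_step`, `fast_step`, `run_pipeline`) are made up for illustration:

```python
import cProfile
import io
import pstats


def slow_step(n):
    # Deliberately quadratic so it dominates the profile.
    total = 0
    for i in range(n):
        for j in range(n):
            total += i ^ j
    return total


def fast_step(n):
    # Linear work, cheap by comparison.
    return sum(range(n))


def run_pipeline(n=300):
    return slow_step(n) + fast_step(n)


def profile_report(func, *args):
    """Run func under cProfile and return the stats table as text."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats()
    return out.getvalue()


if __name__ == "__main__":
    print(profile_report(run_pipeline))
```

A report like this points at the hot function, but it says nothing about whether the overall design allows that function to be made cheaper, which is the part that has to be planned for.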

  • This piece is excellent. I really love how it challenges the "optimize last" philosophy by pointing out that performance is integral to how a tool will be used, and that designing it in as part of the architecture from the very start can produce a fundamentally different product, even if it appears to have the same features.

  • I've heard that performance is a feature but I feel like that understates the effort involved in seeking performance for a piece of software.

    If you want to call it a feature, it's closer to N features: 1 for each feature you have. If you have 10 features and add performance, the effort involved isn't like having 11 features. It's like having 20 features. The effect is multiplicative.

    This is because performance is a cross-cutting concern. Often, cross-cutting concerns are easy to inject or to share effort across. But not performance. You can't just add an @OptimizeThis annotation to speed up your code. Performance tuning tends to be very specific to each chunk of code.

  • > And while the SQLite developers were able to do this work after the fact, the more 1% regressions you can avoid in the first place, the easier this work is.

    That mention of regressions seems, IMO, a slightly out-of-left-field attempt at dismissing how the SQLite example shows that you can, in fact, "make it fast" later. Maybe he should've picked a different example entirely, because it undermined his point a little bit.[1]

    All in all, his entire thesis comes from talking about a typechecker, which is indeed a piece of software in which each component generally contributes to the performance of the whole. It isn't a set of disparate moving parts (at least, from what I remember of my time studying parsers in college), so it's very hard to optimize by sections because all components mostly feed off each other. Most software is not a typechecking tool; plenty (dare I say, most) of software does have specific bottlenecks.

    Though I do agree that, even if we aren't focusing on it right away, we should keep performance in mind from the beginning. If nothing else, we should make the application/system as modular as possible, so that it is easier to replace the slowest moving parts.

    [1] Which is a good thing IMO, as it highlights how this is all about trade-offs. "Premature optimization is the root of all evil", "CPU time is always cheaper than an engineer’s time", etc., are, in fact, mostly true, at least when talking about consumer software/SaaS: it really doesn't matter how fast your application is, because crafting fast software is slower than crafting slow software, and your very performant tool ends up not being used by anyone because everyone is already using that other tool that is slower but came out first.

  • > What is perhaps less apparent is that having faster tools changes how users use a tool or perform a task.

    What matters here is that, for a user, "faster" is measured with respect to achieving the goal.

    At work we've created a module where, instead of punching line items by hand and augmenting the data by memory or web searches, the user can paste data from Excel (or import from OCR) and the system remembers mappings for the data augmentation.

    After a couple of initial runs for the mapping table to build, our users can process thousands of lines in 10 minutes or less, a task that could otherwise take the better part of a day.

    It's not uncommon for new customers to need some follow-up support after they start with this module, so I often get to follow the transformation from before to after.

    They also quickly get accustomed to it. We'll hear about it quickly if those 10 minutes grow to 20 from one build to another; not much thought is given to how 20 minutes is still a lot faster than they'd be able to punch those 8000 lines :)
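As a rough illustration of the "system remembers mappings" idea described above, here is a sketch under assumptions: the `MappingTable` class, the `ask_user` callback and the key normalization are all hypothetical, not the commenter's actual implementation.

```python
class MappingTable:
    """Remembers how a raw imported value was augmented last time."""

    def __init__(self):
        self._known = {}
        self.prompts = 0  # how often the user had to be asked

    def resolve(self, raw_value, ask_user):
        # Normalize so trivial variations reuse the same mapping (assumption).
        key = raw_value.strip().lower()
        if key not in self._known:
            self._known[key] = ask_user(raw_value)
            self.prompts += 1
        return self._known[key]


def import_lines(lines, table, ask_user):
    """Augment each pasted line, asking the user only for unknown values."""
    return [(line, table.resolve(line, ask_user)) for line in lines]


if __name__ == "__main__":
    table = MappingTable()
    ask = lambda raw: raw.strip().upper()  # stand-in for a manual lookup
    pasted = ["bolt m8", "Bolt M8 ", "nut m8"]
    import_lines(pasted, table, ask)
    print("prompts after first run:", table.prompts)
    import_lines(pasted, table, ask)
    print("prompts after second run:", table.prompts)
```

The first runs are slow because the table is still being built; later runs skip the manual step entirely, which is where the day-to-minutes difference would come from.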

  • > the SQLite 3.8.7 release, which was 50% faster than the previous release

    Nit: the link says it’s 10% faster than the previous release. It’s 50% faster than some arbitrary point in the past, perhaps the time when they began their CPU-based profile optimization.
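To get a feel for how small per-change gains (or regressions) add up to figures like these, here is a back-of-the-envelope sketch. It assumes that changes compound multiplicatively and that "X% faster" means a factor of (1 + X); the actual SQLite measurements may be defined differently.

```python
import math


def compounded(per_step, steps):
    """Overall factor after `steps` changes of `per_step` each (0.01 == 1%)."""
    return (1 + per_step) ** steps


def steps_needed(per_step, target):
    """Smallest number of `per_step` changes that reaches `target` overall."""
    return math.ceil(math.log(1 + target) / math.log(1 + per_step))


if __name__ == "__main__":
    print("1% gains needed for 50% overall:", steps_needed(0.01, 0.50))
    print("10% gains needed for 50% overall:", steps_needed(0.10, 0.50))
    print("factor after twenty 1% regressions:", round(compounded(0.01, 20), 4))
```

Under these assumptions it takes a few dozen 1% wins to reach 50%, and a couple of dozen unnoticed 1% regressions are enough to give a fifth of that back.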

  • Nice and clean static layout. A rarity these days when blog post web pages tend to be overloaded with headers, footers, and various crappy interactive elements.

  • > I’ve really strongly come to believe that…

    I’ve come to believe really strongly that…

  • This guy hasn't worked in a startup. Products should be iterative. You can't say your application is slow because it was architected badly.

  • Great stuff. We need to work on the fact at the moment, though that happens as time goes by.