How to speed up the Rust compiler some more in 2019

  • > But I was able to work around this by using a trick: creating two variants of the function, one marked with #[inline(always)] (for the hot call sites) and one marked with #[inline(never)] (for the cold call sites).

    Can't PGO make inlining decisions like this? Failing that, Propeller or LTO might work well.

    > But there’s a trade-off. Sometimes a simpler, smaller function is slower.

    Without a doubt! Imagine the naive/simple/portable memcpy versus a target-aware one that capitalizes on wider or aligned loads and stores.
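    For anyone who hasn't seen the two-variant trick from the first quote, here is a minimal sketch of how it might look. The function names and the checksum body are hypothetical, made up for illustration, and are not taken from the compiler; only the `#[inline(always)]` and `#[inline(never)]` attributes come from the post.

```rust
// Shared body. `checksum_impl` is a hypothetical stand-in for whatever
// small function is hot at some call sites and cold at others.
#[inline(always)]
fn checksum_impl(data: &[u8]) -> u32 {
    data.iter()
        .fold(0u32, |acc, &b| acc.wrapping_mul(31).wrapping_add(u32::from(b)))
}

/// Variant for hot call sites: asks the compiler to inline it everywhere.
#[inline(always)]
pub fn checksum_hot(data: &[u8]) -> u32 {
    checksum_impl(data)
}

/// Variant for cold call sites: stays an out-of-line call, which keeps
/// the callers' code size down.
#[inline(never)]
pub fn checksum_cold(data: &[u8]) -> u32 {
    checksum_impl(data)
}
```

    Both variants compute the same result; the caller picks `checksum_hot` in a tight loop and `checksum_cold` on rarely taken paths. Note that the inline attributes are hints to the compiler, so it is worth measuring rather than assuming the effect.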