(Replying to PARENT post)

I read it and I still don't get it. Can someone (re-)explain what the presence of the print() is doing that helps branch prediction (or any other aspect of the CPU)?

Update: It seems to be the conditional move, see https://news.ycombinator.com/item?id=37245325

👤dataflow🕑2y🔼0🗨️0

(Replying to PARENT post)

I read it three or four times. It's never explained. If the print("lol") version has a branch-less-than, what does the regular version have? It must either be a branch or a conditional move, but we aren't shown that part of the assembly. You can't reach any conclusions about why one version is faster if you don't know what you're comparing to.
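
To make the comparison concrete, here is a minimal sketch in C of the two shapes a running-minimum loop can take at the source level. The article's own code isn't shown in this thread, so the function names are hypothetical and the loop itself is an assumption. Whether a compiler actually emits a conditional branch for the first and a conditional move for the second depends on the compiler, flags, and target, so you'd still have to look at the assembly for both.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical name: the update is written as an if, which a compiler
   may lower to a conditional branch. */
static int min_branchy(const int *v, size_t n) {
    assert(n > 0);  /* the array must be non-empty */
    int min = v[0];
    for (size_t i = 1; i < n; i++) {
        if (v[i] < min) {
            min = v[i];
        }
    }
    return min;
}

/* Hypothetical name: the update is written as a select, which a compiler
   may lower to a conditional move (not guaranteed). */
static int min_select(const int *v, size_t n) {
    assert(n > 0);  /* the array must be non-empty */
    int min = v[0];
    for (size_t i = 1; i < n; i++) {
        min = (v[i] < min) ? v[i] : min;
    }
    return min;
}
```

Both functions return the same value; only the generated code may differ, which is the part you'd need to see before comparing timings.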

👤Calavar🕑2y🔼0🗨️0

(Replying to PARENT post)

The high-level view is that adding code that never gets executed causes the compiler to emit code that the CPU predicts better. IDK if this is the compiler assuming that the `print()` call is cold or the branch predictor getting luckier by chance, but either way this tickles the CPU in the right way to get better performance.

It seems that this is mostly luck in a strange situation. And of course if you ever hit the `print()` it will be way slower than not. You can probably do better by adding something like a `__builtin_expect(...)` intrinsic in the right place to be more explicit about what the goal is here.
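
A rough sketch of what I mean, assuming GCC or Clang (where `__builtin_expect` is available) and assuming the hot loop is a running-minimum scan. The `UNLIKELY` macro and `find_min` are hypothetical names, not the article's code. The hint only tells the compiler which way the branch usually goes; it doesn't guarantee any particular code layout or speedup.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper macro. __builtin_expect(expr, 0) tells GCC/Clang
   that expr is expected to be false most of the time. */
#define UNLIKELY(x) __builtin_expect(!!(x), 0)

/* Hypothetical function: returns the minimum of a non-empty array. */
static int find_min(const int *v, size_t n) {
    assert(n > 0);  /* the array must be non-empty */
    int min = v[0];
    for (size_t i = 1; i < n; i++) {
        /* On random input a new minimum becomes rare as the scan
           proceeds, so the update is marked as the unlikely path. */
        if (UNLIKELY(v[i] < min)) {
            min = v[i];
        }
    }
    return min;
}
```

You'd still want to benchmark and check the assembly, since the compiler is free to ignore the hint.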

👤kevincox🕑2y🔼0🗨️0

(Replying to PARENT post)

I'm in school, so this may be oversimplified, but if the processor correctly predicts the next result, it gets the result faster. The processor only does this prediction with conditional branches. The extra `if` for printing, or the one for finding the min, invokes the predictor with the accuracies stated.

👤trolan🕑2y🔼0🗨️0

(Replying to PARENT post)

As to branch prediction, for anyone interested: https://stackoverflow.com/a/11227902

👤EspressoGPT🕑2y🔼0🗨️0