Optimizing for loops


Disclaimer: I’m not a compiler expert. I’m simply curious and come seeking enlightenment.

I’ve seen people claim that — for efficiency — for loops should generally use a zero comparison for termination. So rather than:

void blink1(int n) {
    for (int i = 0; i < n; i++) {
        blink_led();
    }
}

you should write:

void blink2(int n) {
    for (int i = n; i > 0; i--) {
        blink_led();
    }
}

I thought that was a little silly: why put the burden on the human if a compiler could interpret both cases as "blink_led() n times"?

But using Mr. Godbolt’s Compiler Explorer, I now think I’m wrong. For every compiler I tried, the "compare against zero" version always produced a shorter inner loop. For example, x86-64 gcc 10.2 with -O3 optimization produced the following inner loops:

blink1:
        ...
.L3:
        xor     eax, eax
        add     ebx, 1
        call    blink_led
        cmp     ebp, ebx
        jne     .L3

vs

blink2:
        ...
.L12:
        xor     eax, eax
        call    blink_led
        sub     ebx, 1
        jne     .L12

The down-counting version needs no separate cmp, since sub ebx, 1 already sets the zero flag that jne tests.

So here’s the question:

This seems like such a common case.

Why can’t (or why doesn’t) the compiler notice that the effect of the for loop is simply "do this thing N times" — whether counting up or counting down — and optimize for that?