I tested out a few different compiler optimizations with GNU GCC, in the Code::Blocks IDE. With each optimization setting I ran the same test: I let my program build as big a tree as it could in 1 second. The nodes of the tree were objects of a class I created.
- [-O1]: 78,235 nodes generated
- [-O2]: 78,235 nodes generated
- [-O3]: 78,235 nodes generated
- Optimize more (for speed) [-O1]: 773,019 nodes generated
- Optimize even more (for speed) [-O2]: 773,019 nodes generated
- Optimize fully (for speed) [-O3]: 222,072 nodes generated
So this is confusing: the standard -O1, -O2, and -O3 settings all perform exactly the same. I made sure to do a full rebuild each time I adjusted the compiler settings, so I don't think there's any error on my part.
The extra "Optimize more/even more (for speed)" settings for -O1 and -O2 also produce identical results, but both significantly outperform the "Optimize fully (for speed)" -O3 setting, which is the one I assumed would give maximum speed.
Could anyone explain this?
EDIT: The identical node counts under certain optimizations are actually to be expected; I just remembered why. The program grows the tree by iterative deepening, so the node count increases in huge chunks at a time.
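To illustrate why iterative deepening makes the counts jump in chunks: if the tree is complete with a uniform branching factor (an assumption, my real tree may be irregular), then every depth limit corresponds to one fixed node count, so any two runs that finish on the same final depth report exactly the same number. The hypothetical helper below shows the allowed counts:

```cpp
#include <cstddef>

// Node count of a complete b-ary tree of depth d, i.e. the value an
// iterative-deepening build reports after finishing depth d.
// (Illustrative helper, not from the original program.)
std::size_t nodes_at_depth(std::size_t b, std::size_t d) {
    std::size_t total = 1;  // the root, at depth 0
    std::size_t layer = 1;
    for (std::size_t i = 0; i < d; ++i) {
        layer *= b;      // nodes on the next level
        total += layer;  // running total: 1 + b + b^2 + ... + b^d
    }
    return total;
}
```

For branching factor 2 the only possible counts are 1, 3, 7, 15, 31, ..., which is why two builds that differ in speed can still print identical numbers: they simply completed the same depth before the second ran out.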