Optimization Options
Code optimization is an attempt to improve performance. The trade-off is lengthened compile times and increased memory usage during compilation.
- The bare -O option tells GCC to reduce both code size and execution time.
- It is equivalent to -O1. The types of optimization performed at this level depend on the target processor, but always include at least thread jumps and deferred stack pops.
- Thread jump optimizations (-fthread-jumps) attempt to reduce the number of jump operations.
- Deferred stack pops (-fdefer-pop) occur when the compiler lets arguments accumulate on the stack across several function calls and then pops them all at once, rather than popping each call's arguments as soon as that call returns.
- -O2 includes all -O1 optimizations plus additional ones that involve processor instruction scheduling. At this level, the compiler tries to ensure that the processor has instructions to execute while it waits for the results of other instructions or for data to arrive from cache or main memory. The implementation is highly processor-specific.
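- -O3 includes all -O2 optimizations plus more aggressive, often processor-specific transformations such as automatic function inlining. Loop unrolling is commonly associated with this level, but in GCC it is requested separately with -funroll-loops (described below).
- Depending on how much low-level knowledge you have about a given CPU family, you can use individual -fflag options (where flag names one optimization) to request the specific optimizations you want performed. Three of these flags bear consideration: -ffast-math, -finline-functions, and -funroll-loops.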
- -ffast-math enables floating-point math optimizations that increase speed but violate IEEE and/or ANSI standards, so results may differ from those of strictly conforming code.
- -finline-functions expands all "simple" functions in place, much like preprocessor macro replacements. Of course, the compiler decides what constitutes a simple function.
- -funroll-loops instructs GCC to unroll all loops that have a fixed number of iterations that can be determined at compile time.
- Inlining and loop unrolling can greatly improve a program's execution speed because they avoid the overhead of function calls and of loop control (counter updates, tests, and branches), but the cost is usually a large increase in the size of the binary or object files.
- You will have to experiment to see if the increased speed is worth the increased file size. See the GCC info pages for more details on processor-specific flags.
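One way to carry out that experiment is to compile a small test file at several settings and compare the results. The following is only a sketch: the file name opt_demo.c, the function names, and the DEMO_MAIN macro are all hypothetical and chosen for illustration. It contains a "simple" function that is a candidate for inlining, a loop with a fixed trip count that is a candidate for unrolling, and a floating-point sum whose evaluation order -ffast-math is allowed to change.

```c
/* opt_demo.c (hypothetical file name): small functions whose generated
 * code can change with GCC's optimization flags. */
#include <stddef.h>
#include <stdio.h>

#define N 8  /* trip count known at compile time */

/* A "simple" function: a likely candidate for -finline-functions. */
static int square(int x)
{
    return x * x;
}

/* Loop with a fixed number of iterations: a candidate for -funroll-loops. */
int sum_of_squares(const int v[N])
{
    int total = 0;
    for (size_t i = 0; i < N; i++)
        total += square(v[i]);
    return total;
}

/* Floating-point reduction: under -ffast-math the compiler may reorder
 * the additions, which can change the rounded result for some inputs. */
double sum_doubles(const double *v, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

#ifdef DEMO_MAIN  /* hypothetical macro: define it to build a runnable program */
int main(void)
{
    const int iv[N] = {1, 2, 3, 4, 5, 6, 7, 8};
    const double dv[3] = {1.0, 2.0, 3.0};

    printf("sum_of_squares = %d\n", sum_of_squares(iv));
    printf("sum_doubles    = %f\n", sum_doubles(dv, 3));
    return 0;
}
#endif
```

Compiling with, for example, gcc -O1 -S opt_demo.c -o o1.s and then gcc -O2 -finline-functions -funroll-loops -S opt_demo.c -o tuned.s produces two assembly listings that can be compared side by side; adding -DDEMO_MAIN and dropping -S builds a program that can be timed and whose size can be checked. What actually differs depends on the GCC version and the target processor, and small functions such as square may already be inlined at lower levels.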
Cem Ozdogan
2007-05-16