Many executables can instead be compiled with -fPIC or -fPIE and linked with -pie. The resulting program contains a global offset table, which the dynamic linker uses at program start to resolve all fixed addresses to their proper positions. This effectively eliminates the extra overhead of ET_EXEC random base mapping, as well as all false alarms caused by improper mapping of ET_EXEC binaries when using PaX.
The -fPIE flag will likely break shared libraries. It will also break some program code on some architectures, particularly code that uses inline assembly and accesses all base registers (%eax, %ebx, %ecx, %edx, and the registers that access parts of these). The breakage tends to be immediately apparent: the program refuses to compile or link. Thus, it is not appropriate to simply use -fPIE -pie for all compilation.
On i386, many programs that use inline assembly optimizations, such as the MMX optimizations in Gimp, appear to break. Experience on Gentoo Linux with a modified gcc that produces ET_DYN executables suggests that AMD64 platforms do not suffer from this lack of base registers; AMD64 processors have many more registers available than i386.
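Whether a given build actually produced a position independent executable can be checked from the ELF header: a conventionally linked program has e_type ET_EXEC, while a program linked with -pie has e_type ET_DYN, the same type as a shared object. The following Python sketch reads that field. The names elf_type and elf_type_of_file are illustrative and do not belong to any existing tool.

```python
import struct

# e_type values defined by the ELF specification
ET_NAMES = {1: "ET_REL", 2: "ET_EXEC", 3: "ET_DYN", 4: "ET_CORE"}


def elf_type(header: bytes) -> str:
    """Return the ELF object type name given at least the first 18 header bytes."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    # e_ident[EI_DATA] is byte 5: 1 = little endian, 2 = big endian
    if header[5] == 1:
        fmt = "<H"
    elif header[5] == 2:
        fmt = ">H"
    else:
        raise ValueError("unknown ELF data encoding")
    # e_type is the 16-bit field immediately after the 16-byte e_ident array
    (e_type,) = struct.unpack_from(fmt, header, 16)
    return ET_NAMES.get(e_type, "unknown (%d)" % e_type)


def elf_type_of_file(path: str) -> str:
    """Read the start of the file at path and classify it (hypothetical helper)."""
    with open(path, "rb") as f:
        return elf_type(f.read(18))
```

Under these assumptions, calling elf_type_of_file on a program built with -fPIE -pie would be expected to return "ET_DYN", and on a conventionally linked program "ET_EXEC".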
The gcc specs file, located at /usr/lib/gcc-lib/(platform)/(version)/specs or at /usr/lib/gcc/(platform)/(version)/specs, can be modified so that gcc uses -fPIE by default when -fPIC is not given, and uses -pie by default when linking an executable rather than a shared library. This is the recommended
The above modifications can easily be applied and removed with a shell script, and such a script should be packaged with gcc. The same script could also alter the specs file so that -fstack-protector is applied to everything by default; this is suitable when gcc carries the ProPolice/SSP patch. However, this approach has a problem: compilations that require different specs cannot run at the same time without race conditions. For that reason it should not be used.
Instead of a script, gcc can be told which specs file to use, either via the -specs=[file] switch or via an environment variable. This approach allows multiple specs files to exist, and allows multiple instances of gcc to use different specs files simultaneously. This method is now being examined for Hardened Gentoo, and would allow a single gcc to function as both a hardened and a non-hardened compiler.
It is a design decision for the Debian developers whether gcc should have PIE and/or SSP on or off by default when multiple specs files are present. Although it could be easily disabled, this change alters the behavior of gcc in a way that causes uncontrolled, visible changes for the user; the Debian maintainers could not supply automatic disabling of the specs when the build system of a package known to fail under a hardened specs file is activated.
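The per-invocation approach can be sketched as a small helper that builds the gcc command line. Because the specs file is named in each command rather than swapped in place on disk, two builds can run side by side with different settings. This is only a sketch: the helper name gcc_command and the file name hardened.specs are hypothetical, and the contents of such a specs file are not shown here.

```python
def gcc_command(sources, output, specs=None, extra=()):
    """Return the argv list for one gcc invocation (illustrative helper)."""
    argv = ["gcc"]
    if specs is not None:
        # -specs=FILE is processed after the standard specs file and
        # overrides its defaults for this invocation only.
        argv.append("-specs=" + specs)
    argv.extend(extra)
    argv.extend(["-o", output])
    argv.extend(sources)
    return argv


# Two independent builds; neither one modifies a shared, system-wide file.
hardened = gcc_command(["main.c"], "main-hardened", specs="hardened.specs")
plain = gcc_command(["main.c"], "main-plain")
```

Either list could then be passed to subprocess.run(argv, check=True) to perform the actual compilation.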
The PaX kernel-level security patch allows randomized selection of mmap() bases as well as ET_EXEC executable bases. The mmap() base is mostly used for libraries, which contain position independent code. mmap() randomization by itself does not alter the base of position dependent ET_EXEC binaries; however, PaX supplies a separate option to do just that.
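On Linux, the effect of base randomization can be observed by comparing the mappings listed in /proc/<pid>/maps across two runs of the same program. The sketch below extracts the lowest mapped address for a given file from such a listing. The function name mapping_base is illustrative, and the sample listings and addresses are made up for the example rather than taken from a real system.

```python
def mapping_base(maps_text, pathname):
    """Return the lowest start address mapped from pathname, or None."""
    bases = []
    for line in maps_text.splitlines():
        # Fields: address range, permissions, offset, device, inode, pathname
        fields = line.split(None, 5)
        if len(fields) == 6 and fields[5].strip() == pathname:
            start = fields[0].split("-")[0]
            bases.append(int(start, 16))
    return min(bases) if bases else None


# Hypothetical listings from two runs of the same executable
RUN_1 = """\
5581a2c00000-5581a2c08000 r-xp 00000000 08:01 1311 /bin/cat
5581a2e07000-5581a2e08000 rw-p 00007000 08:01 1311 /bin/cat
7f3c4a000000-7f3c4a1c0000 r-xp 00000000 08:01 2622 /lib/libc.so.6
"""
RUN_2 = """\
55d0e1400000-55d0e1408000 r-xp 00000000 08:01 1311 /bin/cat
55d0e1607000-55d0e1608000 rw-p 00007000 08:01 1311 /bin/cat
7f91b6000000-7f91b61c0000 r-xp 00000000 08:01 2622 /lib/libc.so.6
"""

# Differing bases across runs indicate that the executable base is randomized.
randomized = mapping_base(RUN_1, "/bin/cat") != mapping_base(RUN_2, "/bin/cat")
```

A fixed position executable without base randomization would show the same base address in every run.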
The randomization of the ET_EXEC base imposes a greater overhead than that of ET_DYN shared objects. ET_EXEC code is fixed position, and so access of constant addresses must be adjusted for in some manner. In rare cases, there may be a lack of relocation information, which causes the relocation to fail. The result is that, upon execution of certain code, the binary is killed by PaX. This is called a "false alarm."
Using PIE, the executable base can be randomized on PaX systems without this extra overhead. However, the generated code itself is less optimal for performance: its speed does not depend on whether the kernel is PaX-enabled, but it is a larger set of instructions than position dependent code, and thus not as fast.
It is notable that *all* shared objects are position independent. The nature of their code is the same as the nature of PIE code. Effectively, using Position Independent Executables is the same as replacing your entire system with shared objects; in fact, libc6 can be run as an executable if it is marked as such.
To summarize pie.txt, on i386 there is a 0.99002% performance impact measured in the absence of -fomit-frame-pointer; and a 5.8165% performance impact measured in the presence of -fomit-frame-pointer, using -O3. The gains from -fomit-frame-pointer are effectively lost with PIE. On AMD64, there is a hit of 0.027582% using -O3, which supplies -fomit-frame-pointer.
It can be said that the performance impact is negligible on i386 if two things are assumed. First, you must assume that either a large amount of code is to be executed from shared objects, or that there is no standing objection to moving more code into shared objects. Second, you must assume that the i386 base does not rely greatly on -fomit-frame-pointer for performance.
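The first assumption can be made concrete with a deliberately simple model. If some fraction of run time is already spent in shared objects, that part is already position independent and pays nothing extra, so the measured penalty applies only to the remainder. The linear model, the function name weighted_impact, and the 50% fraction used below are assumptions for illustration, not measurements from pie.txt.

```python
def weighted_impact(pie_penalty_percent, shared_fraction):
    """Estimate the whole-program slowdown in percent under a linear model.

    pie_penalty_percent: slowdown of code newly compiled as PIE
    shared_fraction: share of run time already spent in shared objects (0..1)
    """
    if not 0.0 <= shared_fraction <= 1.0:
        raise ValueError("shared_fraction must be between 0 and 1")
    # Only the portion not already position independent pays the penalty.
    return pie_penalty_percent * (1.0 - shared_fraction)


# i386 figure with -fomit-frame-pointer, assuming half of the run time
# is already spent in shared objects (hypothetical fraction)
estimate = weighted_impact(5.8165, 0.5)
```

The larger the share of time already spent in shared objects, the closer the estimate falls to zero, which is the sense in which the first assumption makes the impact negligible.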
Performance impact on AMD64 is negligible.
Position Independent Executables run exactly as normal executables do on non-PaX kernels. They are loaded into memory, have their GOTs resolved, link with shared libraries, and execute their main() function on any exec() or execve() call, such as the one bash makes when executing a command like `cat'. Thus, no run-time compatibility with non-PaX systems is sacrificed.
It is notable that certain applications may not function properly with PIE. This only happens with very specific designs; the only particular example known at this time to the DSbD project is wine. The wine executable has a preloader which takes control of the dynamic linking process and loads the fixed position Windows executable at a fixed base. For this reason, wine should be built normally (ET_EXEC, non-PIE) and have a PaX marking to disable ET_EXEC base randomization, -x.
A handful of applications cannot be compiled as PIE. These use inline assembly that consumes all base registers. Possible solutions include compiling without assembly "preoptimizations" (for example with --disable-mmx or --disable-3dnow) and disabling the gcc specs modifications that cause PIE-by-default behavior. These solutions should be favored in the order given; compilers have their own optimization passes for a reason.