Generally, linking is the last phase in the build process of C/C++ programs. Most build systems, for example GNU Make, invoke programs defined in the build system, when a given set of prerequisites is completed. That way, linking requires that all linker input files (object files, shared objects, archives, etc.) are built, before linking can start.
While source compilation can mostly be parallelized well, the linking jobs cause CPU underutilization, because it has to wait for all its input files to finish. The result is that the linking phase will add a significant amount of time to the build process, when a program is rebuilt (the edit-compile-run cycle).
This blog post will introduce a technique, called concurrent linking, which makes it possible to start the linking phase during the compilation process. Note that concurrent linking is different from parallel linking, since the latter will speed up the linking process by using multiple threads, while the former technique will speed up the overall build process by reducing core underutilization and thus improving overall parallelism.
GNU Gold 1 is a relative new ELF linker and written from scratch in C++ by Ian Lance Taylor 2. Given the fact that GNU Gold is written from scratch, and designed to speed up the linking process with multiple threads, it is a great candidate to implement concurrent linking.
In the past, I've done build system analysis 3 for Mozilla's JavaScript engine Spidermonkey (see previous blog posts) and one of the conclusions is that linking libmozjs
, followed by the JavaScript shell itself, is the most time consuming part of rebuilding the shell. For my bachelor thesis, I'm implementing concurrent linking in GNU Gold, in order to demonstrate the positive impact of using concurrent linking while rebuilding C/C++ programs.
Once concurrent linking is implemented in GNU Gold, it will speed up the overall (re)build process compared to GNU ld, since it also features incremental linking and parallel linking. The older GNU ld does not support these features (and it probably will never support them). The disadvantage of Gold is that it only targets ELF, while GNU ld support various executable and linking formats.
When GNU Make has constructed its directed acyclic graph of targets and prerequisites, it knows almost all recipes to invoke. My solution is that GNU Make should do a similar trick as it does with recursive Make instances: GNU Make's jobserver creates a pipe to each recursive Make instance and passes on tokens, in order to divide the available job slots to the child Make processes. 4
For concurrent linking, if Make sees that linking using Gold is necessary for a target and the --concurrent-linking
flag is set for Gold, Make starts Gold when the first object file is ready. Make will create a pipe and sends each filename to Gold through the pipe.
Reading from the pipe (Gold's side) will block until:
\0
char (since \n
is considered valid as a character for filenames). In the first case, Gold can produce parts of the final output file, with the input files sent thus far. In the second case, Gold can safely finish the output file (since all files are received through the pipe).
In order to maintain backwards compatibility with older Makefiles and GNU Make, GNU Make should set an environment variable (e.g. GOLD_CONCURRENT_LINKING
or something shorter). If developers want concurrent linking, the Makefile can set --concurrent-linking
, if the Makefile sees that GOLD_CONCURRENT_LINKING
is defined, otherwise it will do nothing. For example:
ifdef USE_GOLD_LINKER ifdef GOLD_CONCURRENT_LINKING LDFLAGS += --concurrent-linking endif endif
Using this approach, build systems can reduce their build times, by improving their parallelism and thus reducing CPU core underutilization.
There are still some missing pieces:
configure
script, which will set USE_GOLD_LINKER
when the testcase which detects the use of GNU Gold succeeds. Experiments will tell what's useful and what's slowing things down. Those experiments are done after implementing concurrent linking in GNU Gold, since these uncertainties are more related to tweaking Makefiles and GNU Make.
GNU Gold is part of GNU Binutils 2.21: http://sourceware.org/binutils/. Note that Gentoo's GNU Binutils version \< 2.21 does not build the Gold linker, however GNU Gold is added to Binutils 2.18 in 2008. http://cygwin.com/ml/binutils/2008-03/msg00162.html ?
Ian Lance Taylor is currently a Google employee, the main author of GNU Gold and has a detailed blog about linkers and loaders. http://www.airs.com/blog/ ?
BSA is a build system analysis tool for GNU Make and PyMake. BSA is mostly a viewer used for analysis and a set of script to generate content for the viewer. The viewer enables a developer to see exactly which part of the build system is time consuming and which part should be altered to improve parallelism. http://smvv.kompiler.org/bsa/ ?
GNU Make's jobserver implementation is descripted in detail at http://mad-scientist.net/make/jobserver.html ?
Compilation-linking ratio is the number of concurrent compilation jobs that will provide sufficient object files in time that a single linker instance can do its work without stalling. ?
联系客服