Investigating the effects of GCC's --without-headers and --with-newlib configuration flags

08/01/2022

The intro: Linux From Scratch and GCC cross-compiler confusion
Why can't we compile libgcc with libc support?
The difference between --with-newlib and --without-headers
The effects of inhibit_libc on libgcc
Conclusion
References

The intro: Linux From Scratch and GCC cross-compiler confusion

Linux From Scratch (LFS) is a popular online book that guides you through creating your own Linux distribution. To create a native compiler for our LFS distribution that is completely separate from our host system, we have to go through three passes of building GCC.

First, we build a cross-compiler (cc1) that will run on our host machine and will create executables for our LFS target.
Next, we use this cross-compiler to build another compiler that runs on and creates code for the LFS machine (cc-lfs).
Finally, we rebuild and test our native LFS compiler (cc-lfs) using itself on our LFS system. This third step ensures that our final compiler is completely isolated from the host system.

When building our cross-compiler in step #1, the LFS book instructs us to pass the following flags in to the configuration script.

The two flags I'll be investigating in this article are --with-newlib and --without-headers.

The LFS book doesn't provide great explanations of these flags. For --with-newlib, they say "Since a working C library is not yet available, this ensures that the inhibit_libc constant is defined when building libgcc. This prevents the compiling of any code that requires libc support." For --without-headers, "When creating a complete cross-compiler, GCC requires standard headers compatible with the target system. For our purposes these headers will not be needed. This switch prevents GCC from looking for them." [1]

What is libgcc?

GNU provides online documentation that describes the internals of GCC. Within this documentation, there's a brief section dedicated towards describing the libgcc library.

"Most of the routines in libgcc handle arithmetic operations that the target processor cannot perform directly. This includes integer multiply and divide on some machines, and all floating-point and fixed-point operations on other machines. libgcc also includes routines for exception handling, and a handful of miscellaneous operations." [5]

If we're to take the LFS explanation at face value, then both these flags perform the same function by ensuring that libgcc will not use target headers from libc (glibc, in our case). However, this isn't exactly the case; there is most definitely a nuanced difference between these two flags. My plan for this article is to is to first describe why we can't compile libgcc with libc support, even though libc is available on the host machine. Then, I want to discuss the high level difference between the two flags. Finally, I'll dive into the source code to see exactly how the inhibit_libc macro affects things.

Why can't we compile libgcc with libc support?

GCC uses the GNU Build System to configure and build itself. If you're not familiar with the GNU Build System (Autotools), I recommend checking out GNU's Autotools Introductions. For a brief summary, I like to refer to this picture:

[2]

The files that developers create are configure.ac and Makefile.am. The autoconf program will consume the configure.ac file and create a configure executable. This executable will accept an input file, Makefile.in, which was generated from Makefile.am by automake. The configure executable will insert values from configure.ac into Makefile.in, and the result of this process will be a regular Makefile.

This process sounds a bit complicated, and it is. But it needs to be. Think about all the configuration options that need to be considered when GCC is built. It has to support different host, build, and target environments. It has to support threads, it has to let users configure the location where output executables should go, it has to let users specify the linker to use, etc. The list of unique configurations for GCC is immense. Thus, it wouldn't make sense to ship a unique Makefile for each possible configuration of GCC, that would be impossible. Therefore, we use the GNU Build System to solve this problem.

Moving on, let's take a look at the top level configure.ac file. If we take a look at lines 128 - 168, we'll see a few important variables defined: build_libs, build_tools, host_libs, host_tools, and target_libraries.

Specifically, we see target_libgcc under target_libraries. We also see the comment "these libraries are built for the target environment, and are built after the host libraries and the host tools (which may be a cross compiler)".

The target environment is the system in which the cross-compiler is generating code for. This means a few things. Firstly, libgcc isn't actually compiled into the GCC executable that runs on the host system. Instead, it's a shared library that sits on the target machine. Therefore, libgcc must be linked to a libc library that was compiled for the target system's architecture. Since we don't initially have libc compiled for the target system (we must compile libc with our cross-compiler itself), and we cannot use the host system's libc since it (theoretically) supports a different architecture, we must make sure that libgcc does not include any code from libc when it is built.

Ok, now that we know why libgcc must not link to libc, it's time to figure out how the --with-newlib and --without-header flags play into this.

High-level difference between --with-newlib and --without-header

I found an old email in the gcc-help mailing list archive where Ian Taylor, a Global Reviewer for GCC, explains the interplay between these two flags. Here's the archived email, and here's a summary.

--without-headers is the default for a cross-compiler.
--without-headers forces libgcc (only libgcc) to be built without including any C library headers.
--with-newlib configures libgcc and the other libraries in the GCC package to use newlib headers wherever possible (newlib is a C library for embedded systems).

The above points imply that using --with-newlib alone causes libgcc and the other GCC libraries (e.g. libgomp) to be built with newlib headers. Using --without-headers alone will cause libgcc to be built without the presence of any C library headers, but will not affect other libraries included in the GCC distribution (in LFS, we disable building these other libraries until we have the C library available). Using both --with-newlib and --without-headers will cause a) libgcc to be built without requiring any libc headers and b) other libraries to use newlib headers.

So, to build a cross-compiler for our Linux From Scratch distribution, do we need to specify --with-newlib? No, I don't think we do. Let's look at the GCC build system source code.

The LFS description of --with-newlib implies that we need this flag to define the inhibit_libc macro. However, if we look at GCC's top level configure.ac file, we'll see the following lines:

The condition on lines 2408 - 2410 define the scenarios in which inhibit_libc=true. There are two specific ones that I want to point out:

We're compiling a cross-compiler, and (we've either set --without-headers or stdio.h doesn't exist in the directory where the target's header files will go). stdio.h not being there suggests that the target platform does not have the libc headers available to it.
We've supplied the --with-newlib flag, and (we've set --without-headers or stdio.h doesn't exist in the target's standard header directory).

As Ian Taylor mentioned, --without-headers is the default for a cross-compiler
(test x$host != x$target). Thus, using the flag --with-newlib is redundant, since even if this flag is not set, the conditions shown earlier will ensure that inhibit_libc=true.

The other part of this code block that I want to point out is line 2413:

AC_SUBST(inhibit_libc).

AC_SUBST is an Autoconf macro for creating output variables. AC_SUBST(inhibit_libc) will cause AC_OUTPUT, the last macro that we call in configure.ac, to replace instances of @inhibit_libc@ in input files (Makefile.in) with the value of inhibit_libc.

If we look at lines 734 - 739 of gcc/Makefile.in, we'll see that this is where @inhibit_libc@ is substituted with the value true. This in turn sets INHIBIT_LIBC_CFLAGS = -Dinhibit_libc.

Further down in the same input Makefile, we create the file libgcc.mvars and write the INHIBIT_LIBC_CFLAGS variable to it. Thus, libgcc.mvars will contain the line INHIBIT_LIBC_CFLAGS = -Dinhibit_libc.

libgcc.mvars will get evaluated by the libgcc input Makefile via this include directive on line 183.

Thus, the libgcc Makefile will have access to the INHIBIT_LIBC_CFLAGS variable. It will append the value of this variable to LIBGCC2_CFLAGS.

If we keep following along, we'll see that the value of LIBGCC2_CFLAGS is appended to INTERNAL_CFLAGS.

Then, on lines 317-320, the flags bound to INTERNAL_CFLAGS are appended to the variables gcc_compile and gcc_s_compile.

These variables will be used when compiling the libgcc library. This means that -Dinhibit_libc will be passed as a preprocessor option to GCC. This will define inhibit_libc as a macro with the value 1. This proves that we do not need --with-newlib to define this macro.

The effects of inhibit_libc on libgcc

A tangential question that I want to explore is how the inhibit_libc macro will degrade our libgcc library.

If we do a search for the inhibit_libc constant inside of the libgcc source code, the following files return a match.

Many of these files are implementing stack unwinding for DWARF2 exception handling. By inhibiting libc, these files won't be compiled, as most of their content is under #ifndef inhibit_libc lines. Thus, libgcc won't be able to support exception handling features, like try-catch blocks in C++.

Other files listed here are related to libgcov. According to the GCC docs, "gcov is a tool you can use in conjunction with GCC to test code coverage in your programs." [4] Essentially, gcov is a profiler that lets you find performance statistics like how often each line of code executes and how much computing time each section of code uses. Since we inhibit the use of the C library, the gcov runtime library interfaces will only be stubbed and thus not functional.

Next, we have generic-morestack.c and generic-morestack-thread.c. These files provide library and thread library support for -fsplit-stack, respectively. The GCC man page describes the fsplit-stack option with "Generate code to automatically split the stack before it overflows... This is most useful when running threaded programs, as it is no longer necessary to calculate a good stack size to use for each thread." [3] These files won't get compiled if we have inhibit_libc defined.

To summarize, when building a cross-compiler for Linux From Scratch, we can ignore the --with-newlib flag. The build process for cross-compilers, by default, will define inhibit_libc, which will prevent libgcc from requiring libc support. This will have several important effects on libgcc:
we'll lose the ability to support exception handling, gcov won't be functional, and we won't be able to provide the --fsplit-stack option when compiling code. However, even with this degraded libgcc, we'll still be able to build a fully functional glibc; none of these missing features from libgcc will impact glibc. Understanding this last point was absolutely crucial for me in understanding the Linux From Scratch book and the build process itself.

References

[1] “5.3. GCC-11.2.0 - pass 1,” Linux From Scratch - Version 11.1 . [Online]. Available: https://www.linuxfromscratch.org/lfs/view/stable/chapter05/gcc-pass1.html. [Accessed: 29-Jul-2022].
[2] D. A. Wheeler, “Introduction to the autotools, part 1,” YouTube, 09-Mar-2012. [Online]. Available: https://www.youtube.com/watch?v=4q_inV9M_us. [Accessed: 29-Jul-2022].
[3] “gcc(1) - Linux manual page,” Linux man pages online. [Online]. Available: https://man7.org/linux/man-pages/man1/gcc.1.html. [Accessed: 29-Jul-2022].
[4] “gcov - a Test Coverage Program,” Gcov (using the GNU compiler collection (GCC)). [Online]. Available: https://gcc.gnu.org/onlinedocs/gcc/Gcov.html. [Accessed: 29-Jul-2022].
[5] “The GCC low-level runtime library,” Libgcc (GNU Compiler Collection (GCC) internals). [Online]. Available: https://gcc.gnu.org/onlinedocs/gccint/Libgcc.html. [Accessed: 29-Jul-2022].