One of the first tasks for my project (GSoC, Rustify GJS) was simply to get Rust building alongside the C++ code using autotools. To do so I had to learn some of the autotools suite, and how to write the configuration and makefile input.
I can tell you honestly that I'm not a fan of autotools after this. Sure, it does the job, but the insane amount of macros used for setup/configuration and so on is mind-bending.
Rust and compilation
There are a few ways to compile Rust, and each has pros and cons depending on your end goal. Example use cases for Rust are:
- Embedded controllers
- Application development
- Libraries
- Embedding in other languages
There are many more use cases than the above of course, but these will cover the examples I want to show here. I'll start with the simplest use case, that of compiling Rust on its own for an application.
Compiling Rust Code
This is dead simple, but! There are two ways to do so.
Cargo is the standard way to create and build Rust software. It performs rather a lot of functions: creating new projects, compilation, testing, benchmarking, documentation generation, publishing projects as crates, and a few more.
A Quick Binary
Let's create a new Rust project. Run cargo new --bin hello_rust; this creates a new cargo project, which is a binary, in a subdirectory of the current directory with the name hello_rust. The directory structure is:

    .
    ├── Cargo.toml
    └── src
        └── main.rs

Cargo has also helpfully created an fn main() which prints "Hello, world!". So let's compile it with cargo build. Cargo by default builds a debug version of everything since this is the most commonly requested mode. To build a release version run cargo build --release.
You can also compile with rustc main.rs. However, if you use rustc on its own to compile, you will need to do a lot of extra work manually, such as adding the compiler flags that cargo build --release adds if you want an equivalent release build; this is generally rustc -C opt-level=3 -C debuginfo=0. Using rustc on its own gets pretty harsh once you start to include external crates, link other libs and so on, so for the rest of this post I will focus on using only cargo, since it handles a lot of this for us in the background. Where it may be instructive I will include the equivalent rustc commands.
Building a Library
Rust libraries, and the integration of a Rust lib in to C++ (or any other language), are the focus of my project, so let's get started!
The project I'm going to use as an example will use autotools to control compilation, and use both C++ and Rust, with C++ having the main call point, and both languages calling functions plus passing variables to each other.
Start by creating a directory to store the project in, and in that, create a src directory.
In src/ create a main.c with the following content:

    #include <stdio.h>

    extern void hello_world(); // declare the Rust function

    int main(void) {
        hello_world();
    }
This is fairly standard for C. What we're interested in though is the declaration of the Rust function: extern void hello_world(). The extern here tells the compiler that what follows is a declaration only, and not to allocate memory for it as it will be found elsewhere at link time. In other words, this is declared but not defined; it is defined somewhere else. In our case, it will be defined in the Rust source, which will then export the symbol (compiled function definition) at compile time so that it can be linked.
Change in to src and create a new Rust project using cargo new --lib rs_hello --name rs_hello. This creates our project under rs_hello with a Cargo.toml and src/lib.rs, and names it. The source file contains only a simple test to run, and no functions or other code. You can erase the test code or leave it there; it won't affect anything being done in this post, but it is good to learn how Rust tests are built and run.
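For anyone who has not seen a Rust unit test before, here is a minimal sketch of the shape one takes in src/lib.rs. The greeting function and the module name are made up for illustration and are not part of the rs_hello example; the exact template that cargo generates also differs between versions.

```rust
/// Hypothetical helper, used only to have something to test.
pub fn greeting(name: &str) -> String {
    format!("hello {}!", name)
}

// This module is only compiled when the crate is built for testing.
#[cfg(test)]
mod greeting_tests {
    use super::greeting;

    // Every function marked #[test] is run by the test harness.
    #[test]
    fn greets_by_name() {
        assert_eq!(greeting("world"), "hello world!");
    }
}
```

Running cargo test inside src/rs_hello builds the library with the test module enabled and runs each function marked #[test].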
In src/rs_hello/src/lib.rs add the function that we declared in the C source:

    #[no_mangle]
    pub extern "C" fn hello_world() {
        println!("hello world!");
    }
It's as simple as that, but there are two things to note:
- #[no_mangle] - this tells the Rust compiler not to mangle the function name (mangling adds a hash, among other things), so the symbol is exported as plain hello_world.
- pub extern "C" - here we're declaring that the function is publicly accessible (pub), and that it uses the C calling convention.
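The same two pieces also cover passing values in and out. A minimal sketch, assuming a hypothetical function rs_add that is not part of the example project; note that recent toolchains spell the attribute #[unsafe(no_mangle)] (accepted from Rust 1.82, required in the 2024 edition), while older compilers want the plain #[no_mangle] form.

```rust
use std::os::raw::c_int;

/// Hypothetical exported function: adds two C ints.
/// The C side would declare it as: extern int rs_add(int a, int b);
#[unsafe(no_mangle)] // on older toolchains write #[no_mangle]
pub extern "C" fn rs_add(a: c_int, b: c_int) -> c_int {
    // wrapping_add avoids an overflow panic in debug builds;
    // a panic should not be allowed to escape into C code.
    a.wrapping_add(b)
}
```

Using the c_int alias rather than i32 keeps the Rust signature tied to whatever int is on the target platform.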
You can compile and run this right now if you want to; from the base directory run:

    rustc --crate-type staticlib -o librs_hello.a src/rs_hello/src/lib.rs && \
    gcc -o hello src/main.c librs_hello.a -ldl -lrt -lpthread -lgcc_s -lc -lm -lrt -lutil
Running cargo build on a new library project will by default produce a Rust library (.rlib), which cannot be linked from non-Rust code. To change that, open src/rs_hello/Cargo.toml and append to the end:

    [lib]
    crate-type = ["staticlib"]
Using cargo build in src/rs_hello will produce the static library in src/rs_hello/target/debug by default; to link it with main.c, just prepend that path to librs_hello.a.
Note: libraries built with cargo will have lib prepended to their name.
Building with autotools
Now, on to autotools!
We will need two files in the base directory: configure.ac and Makefile.am. The content of configure.ac is:

    AC_PREREQ([2.60])
    AC_INIT([rust_hello], [0.1])
    AM_INIT_AUTOMAKE([1.6 foreign subdir-objects])
    m4_ifdef([AM_SILENT_RULES], [
        AM_SILENT_RULES([yes])
    ])

    AC_CANONICAL_HOST

    AC_PROG_CC_C99
    AM_PROG_CC_C_O

    AC_PATH_PROG([CARGO], [cargo], [notfound])
    AS_IF([test "$CARGO" = "notfound"], [AC_MSG_ERROR([cargo is required])])

    AC_PATH_PROG([RUSTC], [rustc], [notfound])
    AS_IF([test "$RUSTC" = "notfound"], [AC_MSG_ERROR([rustc is required])])

    LT_INIT

    AC_CONFIG_MACRO_DIRS([m4])

    AC_CONFIG_FILES([
        Makefile
    ])
    AC_OUTPUT
As far as I can tell (and I'm absolutely not an autotools expert here) this is fairly standard for an ultra basic configure.ac. We're only going to be focusing on the relevant Rust bits however, as that is what makes our build tick.
AC_PATH_PROG([CARGO], [cargo], [notfound]) is a macro (AC_PATH_PROG) that checks whether a program (cargo) exists on the PATH and stores its full path in the variable CARGO; if it doesn't exist, it stores notfound in the variable.
AS_IF([test "$CARGO" = "notfound"], [AC_MSG_ERROR([cargo is required])]) tests the variable CARGO and checks whether its content matches "notfound"; if it does, it calls the error macro AC_MSG_ERROR, which prints the message and stops configure.
The content of Makefile.am is:

    ACLOCAL_AMFLAGS = -I m4

    RSHELLO_DIR = src/rs_hello
    RSHELLO_TARGET = $(RSHELLO_DIR)/target/release

    bin_PROGRAMS = hello_rust
    hello_rust_SOURCES = src/main.c
    hello_rust_LDADD = $(RSHELLO_TARGET)/librs_hello.a
    hello_rust_LDFLAGS = -lrt -ldl -lpthread -lgcc_s -lpthread -lc -lm -lrt -lutil

    $(RSHELLO_TARGET)/librs_hello.a:
    	cd $(srcdir)/$(RSHELLO_DIR); \
    	$(CARGO) rustc --release -- \
    	-C lto --emit dep-info,link=$(abs_builddir)/$@

    clean-local:
    	cd $(srcdir)/$(RSHELLO_DIR); cargo clean
Again, a fairly standard layout. bin_PROGRAMS declares the name of our program, and the lines beginning with hello_rust_ declare much of the same stuff that we used for the gcc command above. We haven't included the Rust source on the SOURCES line however, since autotools is geared towards compilation of C/C++.
How does it build the Rust source then? It looks at hello_rust_LDADD = $(RSHELLO_TARGET)/librs_hello.a, sees that it needs librs_hello.a in the src/rs_hello/target/release directory, and then looks for the commands to build that file if it doesn't exist. That's where $(RSHELLO_TARGET)/librs_hello.a: comes in to play. This is a rule whose target make matches against, and it basically says "for the file named librs_hello.a in directory src/rs_hello/target/release, perform the following operations":
- cd in to $(srcdir)/$(RSHELLO_DIR) - srcdir is a variable that configure sets to the directory containing the sources (the same as pwd for an in-tree build), and RSHELLO_DIR is the variable we set near the top of the file.
- run cargo, whose path is contained in the variable CARGO, with the following arguments:
  - rustc --release - instructs cargo to use its rustc subcommand, which allows us to pass arguments to rustc, and to use the "release" profile.
  - -- - arguments to rustc begin.
  - -C lto - this is not a default option in --release mode. lto is "link-time optimization".
  - --emit dep-info,link=$(abs_builddir)/$@ breaks down to:
    - --emit - output the following,
    - dep-info - a makefile-style dependency file listing the source files the crate was built from,
    - link - the final linked artifact, here the static library,
    - =$(abs_builddir)/$@ - write the linked output to the builddir (generally the base dir of the source if not set). $@ is an automatic variable that make expands to the target of the rule, that is, the file name before the colon: $(RSHELLO_TARGET)/librs_hello.a.
The last block, clean-local:, is run along with the usual clean when you run make clean. Since Rust and cargo place files in different locations to what autotools expects, we need to clean up manually. This cds in to the cargo project and runs cargo clean.
With those two files done, you now need to run autoreconf -si to generate all the files needed. Then run ./configure followed by make.
Congratulations! You've built a Rust library used by a C program, using autotools. So with that groundwork out of the way, let's dive a little deeper.
Types of Libraries
You'll recall that above we had to pass the staticlib option to rustc, and add it to the Cargo.toml for use with cargo. This is because Rust builds Rust libraries (.rlib) by default, which are native to Rust only. The format of these is still unstable afaik, and may change between Rust versions. They also include extra metadata for Rust, and don't require the use of unsafe blocks when you want to use functions/data from them. They cannot be used with other languages.
For this reason we need the staticlib option. This produces a static library which contains all of the Rust project's generated code and its upstream dependencies. As such it will not have external dependencies on Rust libraries.
There are other options too!
dylib produces a dynamic Rust-only library. This can be used with other languages at the moment but will eventually be used for Rust only. The file extension is *.so on Linux. You should probably avoid using this altogether and use either lib for Rust libraries, or one of the below for external use cases.
cdylib is a dynamic library, a newer output format introduced in Rust 1.10 specifically for embedding in other languages. It exports public Rust symbols as a C API using C calling conventions. It is meant to be linked in to the binaries that use it at run time, typically through the system's dynamic linker. The file extension is *.so.
staticlib is meant to be compiled and linked in to other projects statically - this means it is copied in to the binary that uses it, at compile time. Suitable for embedding in other languages. The file extension is *.a.
lib is the default, and will be whatever Rust needs it to be to produce a compiler recommended Rust library.
rlib is a static Rust library.
A small note: if you were to produce a library for use with other Rust projects, you should use the default lib. If you use cdylib or staticlib, Rust projects will need to use unsafe blocks.
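Part of the reason unsafe shows up is that only plain values and raw pointers cross a C ABI boundary, and the compiler cannot check a raw pointer for you. A minimal sketch of a C string being passed in to Rust, assuming a hypothetical function rs_name_len that is not part of the example project (as before, older toolchains want #[no_mangle] rather than #[unsafe(no_mangle)]):

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Hypothetical: returns the length in bytes of a NUL-terminated C string,
/// or 0 for a null pointer.
/// The C side would declare it as: size_t rs_name_len(const char *name);
///
/// # Safety
/// `name` must be null or point to a valid NUL-terminated string.
#[unsafe(no_mangle)] // on older toolchains write #[no_mangle]
pub unsafe extern "C" fn rs_name_len(name: *const c_char) -> usize {
    if name.is_null() {
        return 0;
    }
    // Rust has to trust the caller that the pointer is valid,
    // which is exactly what the unsafe block states.
    let s = unsafe { CStr::from_ptr(name) };
    s.to_bytes().len()
}
```

Checking for null before touching the pointer is about the only validation Rust can do here; everything else is a contract with the caller.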
Static vs Dynamic Linking
Linking on Linux is typically done using ld, and is the last step of compilation. If you run man ld to view the man page for it, the first sentence of the description states:
ld combines a number of object and archive files, relocates their data and ties up symbol references.
This gives a pretty good idea of what linking is. When building a typical C/C++ program, the compiler will compile each source file to an object file, then as the last step it will invoke ld to tie them all together.
Each declared function or data structure in one source file that is meant to be public to another source file (as in our example above, pub extern "C" fn hello_world()) is exported and exposed as a symbol. When another source file references this function, the linker looks for the related symbol and links them together.
The way linking is done for static vs dynamic is different.
- static linking resolves all references to external symbols in a compiled object by copying the actual code needed in to the final binary at link time
- dynamic linking will instead put a reference to the library being linked to in the compiled binary/library, and will not link to it until runtime. A dynamic library can be shared between many programs.
Rust by default statically links all Rust dependencies, including the Rust std library; that is, it copies in the parts of the libraries that are used.
If you create a library using dylib or cdylib, that library is dynamically linkable to other projects, and also statically links the Rust std library. Whereas if you create a staticlib, that library is copied in to other projects that use it (along with the Rust library parts it contains).
Rust does, however, dynamically link system libraries such as libc and pthreads. You can statically link system libraries if you use an alternative libc such as musl.
Rust and Objects
In Types of Libraries we outlined a few types of libraries that Rust can build - we can also output object files much like C/C++ compilation does. This can complicate things a bit though and I won't go in to much detail here except to outline it. If you did want to output objects for linking, then you will be losing the benefit of cargo handling linking for you - this means you need to manually link any Rust libraries you depend on.
When you're dealing with library names such as /usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-912d6e6c7cbc93f3.so, well, I don't recommend forgoing cargo unless you really need to. Unusual filename, right? This hash will change with each distribution that compiles the Rust compiler from scratch, so it is a bad idea to hardcode the name in to build scripts. Hence sticking to cargo is a good idea.
Another use-case is using bare metal rust on an embedded controller. Bare metal Rust is Rust code only, no standard lib, no libraries that may require external dependencies - this makes it much easier to deal with linking.
Rust is not ABI stable
Rust does not have a stable ABI as of yet, and may not for some time. What this means to us in terms of linking is that a project that dynamically links the Rust std library will only work with the library that it was compiled with. This shouldn't be an issue with Linux distribution supplied Rust, but if you switch between Rustup and distro supplied, it likely won't work with one.
Optimization
Now that we have covered what types of libraries there are, let's have a look at ways to optimize Rust.
In all honesty, there isn't that much you need to do - the default --release produces very fast binaries with the following defaults:
- opt-level = 3
- debug = false
- lto = false
- panic = 'unwind'
But there are some things you can do such as reducing size via link-time optimization, using a different allocator, and a few other tricks.
Covering how LTO works in detail is well beyond my abilities, but I may be able to adequately simplify it. At compile time the objects produced consist of everything that may be used, e.g. all of a library. For C, the first pass of a linker may find that function foo() is not actually used, and so it is removed from the object. A second pass may find some condition is always false and so bar() is never called. On a third pass, since fizz() was being called by bar(), and bar() was removed, fizz() is no longer called and so is removed too.
Using LTO with Rust works similarly: it will find all the functions etc. that are never called and remove them - this results in a very nice size drop. One of the differences between Rust and C here is that Rust will warn you that a block of code isn't reachable (the compiler treats it as an error if it is a pattern matching block) and implores you to remove it.
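The foo()/bar()/fizz() story boils down to reachability over a call graph. The following is only a toy model of that idea, not how rustc or the linker actually implement LTO; the reachable and removable helpers are made up for illustration, and the "condition is always false" step is modelled simply by leaving out the call to bar().

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Toy model: given (caller, callee) pairs and the entry points,
/// return every function that can actually be reached.
pub fn reachable(calls: &[(&str, &str)], roots: &[&str]) -> BTreeSet<String> {
    // Build a caller -> callees map.
    let mut graph: BTreeMap<&str, Vec<&str>> = BTreeMap::new();
    for &(caller, callee) in calls {
        graph.entry(caller).or_default().push(callee);
    }

    // Depth-first walk starting from the entry points.
    let mut seen: BTreeSet<String> = BTreeSet::new();
    let mut stack: Vec<&str> = roots.to_vec();
    while let Some(f) = stack.pop() {
        // insert returns false if f has been visited already
        if seen.insert(f.to_string()) {
            if let Some(callees) = graph.get(f) {
                stack.extend(callees.iter().copied());
            }
        }
    }
    seen
}

/// Everything that is defined but never reached is a candidate for removal.
pub fn removable(defined: &[&str], calls: &[(&str, &str)], roots: &[&str]) -> Vec<String> {
    let live = reachable(calls, roots);
    defined
        .iter()
        .filter(|f| !live.contains(**f))
        .map(|f| (*f).to_string())
        .collect()
}
```

With main calling only used(), and bar() calling fizz(), the functions foo, bar and fizz all come back as removable in a single walk, where the description above needed three passes.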
So how do you use LTO? Two ways:
- pass -C lto to rustc as an arg, or via cargo rustc --release -- -C lto if using cargo
- or, (also for cargo) add a section in the Cargo.toml as follows (the debug profile is named dev there):

    [profile.release]
    lto = true

    [profile.dev]
    lto = true
Currently, for the small amount of Rust I have in GJS so far, using LTO reduces the size of libgjs.so from 12 MB to 7.7 MB - quite a decent saving.
Another way to reduce the final size is with the use of strip - a tool used to remove symbols from a binary/object. Handy also for making reverse-engineering harder (you'll probably never stop Matthew Garrett though).
Running strip on libgjs.so with my Rust code compiled in without LTO reduces the size down to 1.9 MB. Using LTO and strip reduces it to 912K.
The usual way to use strip is to remove only the debug symbols, via strip --strip-debug. Running this on libgjs.so along with LTO reduced the size from 7.7 MB to 926K.
This step is typically performed by Linux distributions as part of their packaging process - they strip the debug symbols out to a separate file/s and package these alongside the stripped binary/library. The end user doesn't require them normally.
You can pass a strip argument to rustc with

    rustc -C link-args=-s

If you are using cargo this would be

    cargo rustc --release -- -C link-args=-s
The last thing we can try is changing how panics are handled. The default handling for a panic is to include code to unwind the stack, which runs cleanup code on the way up and helps debugging. We can remove the code for unwinding, and just abort, by passing an arg to rustc:

    rustc -C panic=abort

or with cargo, add panic = "abort" to the relevant profile section. The saving here isn't all that much though, ~100K, but this may be useful for embedded devices etc.
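To see what the unwinding code buys you, here is a minimal sketch using std::panic::catch_unwind from the standard library. The panicked helper is made up for illustration; it only behaves as described under the default panic = "unwind" setting.

```rust
use std::panic;

/// Runs `work` and reports whether it panicked.
/// With panic = "unwind" (the default) the panic travels back up the
/// stack to catch_unwind. With panic = "abort" the process would
/// terminate inside `work` and this function would never return.
pub fn panicked<F>(work: F) -> bool
where
    F: FnOnce() + panic::UnwindSafe,
{
    panic::catch_unwind(work).is_err()
}
```

Calling panicked with a closure that panics returns true (the panic message is still printed to stderr), which is only possible because the unwinding machinery is compiled in.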
Finally
In light of all the testing and getting to grips with autotools and how various bits of the Rust compiler work, I've decided for the "Rustify GJS" project to use basically what is covered in the examples.
- the default args for --release are quite adequate
- to reduce final size I have used lto
- stripping is to be left to distributions
- static linking the Rust code in will be best to keep libgjs.so whole.
And one last thing: you can pass global args via the RUSTFLAGS environment variable, such as RUSTFLAGS="-C lto -C panic=abort" cargo build. I will likely switch to this method at some point. Setting the RUSTFLAGS env-var means that Rust crate dependencies also use these flags, whereas without it they use the Rust defaults.
Please email me if you see anything factually inaccurate that needs correction, or even just better explanations.
Note: Makefiles require the use of actual tabs, not spaces.
TODO: Parallel build fails due to Rust not finishing build before C++ linking.