Background Information

This is the most technical part of this book, but if we truly want to understand what is going on, we have to go through it. I promise to keep it as fast and to the point as possible, and we'll move on to the code soon enough.

So here we go! First of all, we are going to interfere with and control the CPU directly. This is not very portable, since there are many kinds of CPUs out there. The main ideas are the same, but some of the implementation details will differ.

We will cover one of the more commonly used architectures: x86-64.

In this architecture the CPU features a set of 16 general-purpose registers.


If you're interested you can find the rest of the specification here: https://github.com/hjl-tools/x86-psABI/wiki/X86-psABI.

Of interest to us right now are the registers marked as "callee saved". These are the registers that keep track of our context: the next instructions to run, the base pointer, the stack pointer and so on. We'll get to know these in more detail later.
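As a rough sketch of what "our context" could look like in code, the struct below holds one 64-bit slot for each register the System V psABI preserves across calls. The name `CalleeSaved` and the field order are assumptions made for illustration only, not the layout used later in the book.

```rust
/// Hypothetical container for the registers the psABI treats as callee saved.
#[derive(Debug, Default)]
#[repr(C)]
struct CalleeSaved {
    rsp: u64, // stack pointer
    rbp: u64, // base (frame) pointer
    rbx: u64,
    r12: u64,
    r13: u64,
    r14: u64,
    r15: u64,
}

fn main() {
    let ctx = CalleeSaved::default();
    // Seven 64-bit registers occupy 7 * 8 = 56 bytes.
    println!("{} bytes: {:?}", std::mem::size_of::<CalleeSaved>(), ctx);
}
```

Saving and restoring exactly these values is what lets one piece of code pause and another continue where it left off.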

If we want to direct the CPU ourselves, we need some minimal code written in assembly. Fortunately, we only need to know a few very basic assembly instructions for our mission, such as how to move values to and from registers:

mov %rsp, %rax
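To see this instruction run, here is a minimal sketch that wraps it in Rust inline assembly. It assumes an x86-64 target and a recent stable toolchain where the `std::arch::asm!` macro is available; that macro defaults to Intel syntax on x86, so the sketch asks for AT&T with `options(att_syntax)`. The helper name `read_stack_pointer` is hypothetical, and the compiler picks the destination register instead of hard-coding `%rax`.

```rust
use std::arch::asm;

/// Hypothetical helper: copies the current stack pointer into a variable.
fn read_stack_pointer() -> u64 {
    let sp: u64;
    unsafe {
        // AT&T operand order is `mov source, destination`.
        asm!(
            "mov %rsp, {0}",
            out(reg) sp,
            options(att_syntax, nomem, nostack)
        );
    }
    sp
}

fn main() {
    let sp = read_stack_pointer();
    println!("rsp = {:#x}", sp);
}
```

The printed address changes from run to run, but it always points somewhere inside the current thread's stack.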

Windows has a slightly different convention. On Windows the registers XMM6 through XMM15 are also callee-saved and must be saved and restored if our functions use them. Our code runs fine on Windows even though we only follow the psABI convention in this example.

There is one more subtle difference as well, which you can read about in Appendix: Supporting Windows, where we go through everything. You can follow along anyway, since everything will work on Windows, but it will not be a correct implementation.

A super quick introduction to Assembly

First and foremost, assembly language is not very portable. Each CPU might have its own special set of instructions, although some are common to most desktop computers today.

There are two popular dialects: AT&T dialect and Intel dialect.

The AT&T dialect is the standard when writing inline assembly in Rust, but we can specify that we want to use the "Intel" dialect instead. Rust mainly leaves it up to LLVM to deal with inline assembly, and LLVM's inline assembly closely resembles the syntax used when writing inline assembly in C. That makes it a lot easier to learn from C inline ASM, since the syntax will be very familiar (though not exactly the same).

We will use the AT&T dialect in our examples.

Assembly has a strong backwards compatibility guarantee. That's why you will see the same register addressed in different ways. Let's look at the %rax register we used as an example above for an explanation:

%rax # 64 bit register (8 bytes)
%eax # 32 low bits of the "rax" register
%ax # 16 low bits of the "rax" register
%ah # 8 high bits of the "ax" part of the "rax" register
%al # 8 low bits of the "ax" part of the "rax" register

As you can see, this is basically watching the history of CPUs evolve in front of us. Since most CPUs today are 64 bits, we will use the 64 bit registers in our code.
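The relationship between these names can be mimicked in plain Rust with casts and shifts. The sketch below is only an illustration of which bits each name covers; the function `sub_registers` is hypothetical and does not touch any real register.

```rust
/// Hypothetical helper: splits a 64-bit value the way the register names do.
fn sub_registers(rax: u64) -> (u32, u16, u8, u8) {
    let eax = rax as u32;      // low 32 bits
    let ax = rax as u16;       // low 16 bits
    let ah = (rax >> 8) as u8; // bits 8..16, the high half of "ax"
    let al = rax as u8;        // low 8 bits
    (eax, ax, ah, al)
}

fn main() {
    let (eax, ax, ah, al) = sub_registers(0x1122_3344_5566_7788);
    // prints "eax=0x55667788 ax=0x7788 ah=0x77 al=0x88"
    println!("eax={:#x} ax={:#x} ah={:#x} al={:#x}", eax, ax, ah, al);
}
```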

The word size in assembly also has historical reasons. It stems from the time when CPUs had 16-bit data buses, so a word is 16 bits. This is relevant because in the AT&T dialect you will see many instructions suffixed with "q" (quad-word, 64 bits) or "l" (long-word, 32 bits). So a movq means a move of 4 * 16 bits = 64 bits.

A plain mov takes its size from the registers you use. This is the norm in the Intel dialect, and the unsuffixed form is the one we will use in our code.
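The difference between the suffixed and unsuffixed forms can be sketched with three moves of the same value. This assumes an x86-64 target and the stable `std::arch::asm!` macro with `options(att_syntax)`; the function name `copy_with_suffixes` is hypothetical. The `:e` modifier asks for the 32-bit name of a register (for example `%eax` instead of `%rax`).

```rust
use std::arch::asm;

/// Hypothetical helper: copies `value` using movq, movl and a plain mov.
fn copy_with_suffixes(value: u64) -> (u64, u64, u64) {
    let quad: u64;
    let long: u64;
    let plain: u64;
    unsafe {
        asm!(
            // "q" suffix: moves all 64 bits.
            "movq {src}, {quad}",
            // "l" suffix: moves the low 32 bits; on x86-64 a 32-bit write
            // also clears the upper 32 bits of the destination register.
            "movl {src:e}, {long:e}",
            // No suffix: the size is taken from the 64-bit register names.
            "mov {src}, {plain}",
            src = in(reg) value,
            quad = out(reg) quad,
            long = out(reg) long,
            plain = out(reg) plain,
            options(att_syntax, nomem, nostack)
        );
    }
    (quad, long, plain)
}

fn main() {
    let (quad, long, plain) = copy_with_suffixes(0x1122_3344_5566_7788);
    println!("movq={:#x} movl={:#x} mov={:#x}", quad, long, plain);
}
```

Only the movl result loses the upper half of the value; movq and the plain mov on 64-bit registers behave the same.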

We will go through a bit more of the syntax of inline assembly in the next chapter.

One more thing to note is that the stack alignment on x86-64 is 16 bytes. Just remember this for later.
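In practice, 16-byte alignment means that an address used as a stack pointer should be a multiple of 16. A minimal sketch of how such an address could be computed is shown below; the helper `align_down_16` and the 48-byte buffer are assumptions for illustration. Rounding down is used because the stack grows toward lower addresses, so the result stays inside the buffer.

```rust
/// Hypothetical helper: rounds an address down to the nearest multiple of 16.
fn align_down_16(addr: usize) -> usize {
    // Clearing the four lowest bits leaves a multiple of 16.
    addr & !15
}

fn main() {
    // A small buffer standing in for a stack we allocated ourselves.
    let stack = vec![0u8; 48];
    let top = stack.as_ptr() as usize + stack.len();
    let aligned = align_down_16(top);
    assert_eq!(aligned % 16, 0);
    println!("{:#x} -> {:#x}", top, aligned);
}
```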