The compiler sees characters, not code
Here's one line of Rust: let answer: i32 = 40 + 2;
When I read that line, I see a variable named answer, a type, and some math. When the compiler first receives it, it sees l. Then e. Then t. Then a space. That's the entire input: a long row of characters, exactly as meaningless to the machine as this sentence is.
There is no variable in that file. No type, no addition. Those are ideas, and something has to build the ideas out of the text before anything else can happen. In part 1 I said Clippy reads a tree that the compiler builds. This part is about how that tree gets built. It happens in two big moves: first find the pieces, then find the structure.
Move one: finding the pieces
You can't reason about a program one character at a time, the same way you can't read a book one letter at a time. So the first pass over your file, called the lexer, groups characters into tokens: the smallest chunks that mean something in the language. The character 4 next to the character 0 isn't two things, it's one number, 40. The characters l, e, t in a row aren't three letters, they're the keyword let. Run the lexer over our line and out come nine tokens: let, answer, :, i32, =, 40, +, 2, and ;.
Two things about this step stuck with me. First, the lexer throws things away. Spaces, newlines, comments: gone. No token, no trace. That's the real reason formatting doesn't affect behavior. It's not that the compiler graciously ignores your weird indentation. It's that by the time anything meaningful happens, your indentation no longer exists.
Second, look at how tokens get recognized. "A number is one or more digits in a row." "A name is letters, digits and underscores, not starting with a digit." If those rules sound like regex to you, good instincts: the theory behind regex (regular languages and finite automata) is the same machinery lexers are built on. Which pays off the question from part 1 in a satisfying way. The linter isn't regex. But at the very bottom of the compiler, the first thing that ever touches your file is doing something extremely regex-shaped. It's the right tool for exactly this step, and pretty much only this step.
One more thing the lexer is not: smart. It knows pieces, not sense. As far as it's concerned, i32 is just a "name," the same kind of token as answer. It has no idea i32 is a type. That's knowledge from a later stage. And it would happily tokenize = = let 40 answer without blinking, because a token stream doesn't have to make sense. Making sense is somebody else's department.
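To make the lexer less abstract, here is a minimal sketch of one in Rust. It is a toy, not how rustc's lexer is written: the Token enum, the lex function, and the choice of only four token kinds are all my own assumptions, and it doesn't handle comments, strings, or multi-character operators. It only knows the rules described above: digits form a number, letters, digits and underscores form a name, and whitespace produces nothing.

```rust
// A toy lexer. The Token kinds and the lex function are illustrative
// assumptions, not rustc's real data structures.
#[derive(Debug, PartialEq, Clone)]
enum Token {
    Keyword(String), // only `let` in this sketch
    Name(String),    // letters, digits, underscores; not starting with a digit
    Number(String),  // one or more digits in a row
    Symbol(char),    // any other single character, such as : = + ;
}

fn lex(source: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut chars = source.chars().peekable();

    while let Some(&c) = chars.peek() {
        if c.is_whitespace() {
            // Whitespace is consumed and produces no token at all.
            chars.next();
        } else if c.is_ascii_digit() {
            // Keep consuming digits: '4' then '0' becomes the single number 40.
            let mut text = String::new();
            while let Some(&d) = chars.peek() {
                if d.is_ascii_digit() {
                    text.push(d);
                    chars.next();
                } else {
                    break;
                }
            }
            tokens.push(Token::Number(text));
        } else if c.is_ascii_alphabetic() || c == '_' {
            // Keep consuming letters, digits and underscores.
            let mut text = String::new();
            while let Some(&d) = chars.peek() {
                if d.is_ascii_alphanumeric() || d == '_' {
                    text.push(d);
                    chars.next();
                } else {
                    break;
                }
            }
            // The lexer only distinguishes keywords from names.
            // It has no idea that `i32` is a type.
            if text == "let" {
                tokens.push(Token::Keyword(text));
            } else {
                tokens.push(Token::Name(text));
            }
        } else {
            tokens.push(Token::Symbol(c));
            chars.next();
        }
    }
    tokens
}

fn main() {
    for token in lex("let answer: i32 = 40 + 2;") {
        println!("{:?}", token);
    }
    // Nonsense tokenizes just as happily as sense does.
    println!("{:?}", lex("= = let 40 answer"));
}
```

Feeding it the same line with wildly different spacing gives back the identical token list, which is the "your indentation no longer exists" point in code form.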
Move two: finding the structure
That somebody is the parser, and its problem is that tokens are a flat list while meaning is not. Take the expression 5 + 3 * 2. As tokens it's just: number, plus, number, star, number. Read flatly, that could mean add 5 and 3 first, then multiply, which gives 16. Or multiply 3 and 2 first, then add, which gives 11. The flat list genuinely does not say which.
What decides it is the language's grammar: the rulebook of what's allowed to follow what, and which operations bind tighter. Multiplication binds tighter than addition, so it's 11. The parser is the thing that applies that rulebook to the token list. And its output is where the whole picture snapped together for me. The parser's output is a tree.
In the tree, the ambiguity isn't resolved so much as it becomes impossible to express. The multiplication sits below the addition, so it happens first, and there's no way to read it otherwise. The shape is the meaning. Even parentheses don't survive into the tree: if you write (5 + 3) * 2, the parens exist only to tell the parser which shape to build, and once the tree is built, they're gone. That's why this thing is called an abstract syntax tree, or AST. Abstract, because it drops everything about how the code was written (spacing, parens, semicolons) and keeps only what the code says.
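Here is a small sketch of that idea in Rust, assuming a deliberately tiny grammar: single-digit numbers, +, *, and parentheses. The Expr enum and the Parser struct are hypothetical names for illustration, and the "lexer" is faked by treating every non-space character as a token. The technique is a plain recursive descent parser with one function per precedence level, which is one common way to encode "multiplication binds tighter," not necessarily what any particular compiler does.

```rust
// A toy AST: note there is no variant for parentheses.
#[derive(Debug, PartialEq)]
enum Expr {
    Number(i64),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

struct Parser {
    tokens: Vec<char>, // single-character tokens: digits, + * ( )
    pos: usize,
}

impl Parser {
    fn new(source: &str) -> Self {
        // Stand-in for a real lexer: drop whitespace, keep everything else.
        let tokens: Vec<char> = source.chars().filter(|c| !c.is_whitespace()).collect();
        Parser { tokens, pos: 0 }
    }

    fn peek(&self) -> Option<char> {
        self.tokens.get(self.pos).copied()
    }

    // sum := product ('+' product)*
    fn parse_sum(&mut self) -> Expr {
        let mut left = self.parse_product();
        while self.peek() == Some('+') {
            self.pos += 1;
            let right = self.parse_product();
            left = Expr::Add(Box::new(left), Box::new(right));
        }
        left
    }

    // product := atom ('*' atom)*
    // Products are parsed below sums, so they end up deeper in the tree.
    fn parse_product(&mut self) -> Expr {
        let mut left = self.parse_atom();
        while self.peek() == Some('*') {
            self.pos += 1;
            let right = self.parse_atom();
            left = Expr::Mul(Box::new(left), Box::new(right));
        }
        left
    }

    // atom := digit | '(' sum ')'
    fn parse_atom(&mut self) -> Expr {
        match self.peek() {
            Some('(') => {
                self.pos += 1;
                let inner = self.parse_sum();
                assert_eq!(self.peek(), Some(')'), "expected a closing paren");
                self.pos += 1;
                inner // the parens only steered the shape; no node records them
            }
            Some(c) if c.is_ascii_digit() => {
                self.pos += 1;
                Expr::Number(c.to_digit(10).unwrap() as i64)
            }
            other => panic!("unexpected token: {:?}", other),
        }
    }
}

fn parse(source: &str) -> Expr {
    Parser::new(source).parse_sum()
}

// Evaluating the tree: children are computed before their parent.
fn eval(expr: &Expr) -> i64 {
    match expr {
        Expr::Number(n) => *n,
        Expr::Add(a, b) => eval(a) + eval(b),
        Expr::Mul(a, b) => eval(a) * eval(b),
    }
}

fn main() {
    let tree = parse("5 + 3 * 2");
    println!("{:?} = {}", tree, eval(&tree)); // evaluates to 11
    let grouped = parse("(5 + 3) * 2");
    println!("{:?} = {}", grouped, eval(&grouped)); // evaluates to 16
}
```

The two inputs differ only by parentheses, yet the trees differ in shape: in the first the Mul node sits under Add, in the second the Add node sits under Mul. Neither tree contains anything resembling a paren.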
And this is the tree from part 1. When Clippy flags is_ready == true, it's standing on a comparison node with a variable on one side and the literal true on the other. Nothing about text patterns. Everything about shape.
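To show what "standing on a comparison node" might look like, here is a rough sketch of a shape-based check in Rust. This is not Clippy's actual implementation: the Expr enum and the lint_bool_comparison function are names I made up, and the tree is built by hand instead of coming from a parser. The point is only that the check is a pattern match on tree shape, with no text matching anywhere.

```rust
// A hypothetical, hand-rolled AST with just enough nodes for this example.
#[derive(Debug)]
enum Expr {
    Variable(String),
    BoolLiteral(bool),
    Equals(Box<Expr>, Box<Expr>),
}

// Returns a warning when a comparison node has a variable on one side
// and the literal `true` on the other. Anything else passes silently.
fn lint_bool_comparison(expr: &Expr) -> Option<String> {
    if let Expr::Equals(left, right) = expr {
        match (left.as_ref(), right.as_ref()) {
            (Expr::Variable(name), Expr::BoolLiteral(true))
            | (Expr::BoolLiteral(true), Expr::Variable(name)) => {
                return Some(format!(
                    "comparing `{}` with `true` is redundant; `{}` alone is enough",
                    name, name
                ));
            }
            _ => {}
        }
    }
    None
}

fn main() {
    // The tree for: is_ready == true
    let flagged = Expr::Equals(
        Box::new(Expr::Variable("is_ready".to_string())),
        Box::new(Expr::BoolLiteral(true)),
    );
    println!("{:?}", lint_bool_comparison(&flagged));

    // The tree for: a == b, which has the wrong shape for this lint.
    let fine = Expr::Equals(
        Box::new(Expr::Variable("a".to_string())),
        Box::new(Expr::Variable("b".to_string())),
    );
    println!("{:?}", lint_bool_comparison(&fine));
}
```

Because the check runs on the tree, it doesn't matter how the original line was spaced or whether it was wrapped in extra parentheses. By this point, none of that exists.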
A perfectly structured lie
So are we done understanding the program? Watch this: let age: i32 = "twenty-two";
The lexer is happy. Every character groups into a clean token, including one string token. The parser is happy too, and builds a lovely little tree: a declaration, name age, type i32, value "twenty-two". Grammatically, that line is flawless. It's also nonsense, the way "the fridge is jealous of my haircut" is a flawless English sentence and also nonsense. Structure can be perfect while meaning is broken.
Catching that is a third job with its own name, semantic analysis: walking the tree and interrogating it about meaning. Does this name exist? What type does this expression have? Is a string allowed where an i32 was promised? No, so the compiler stops right here with the "mismatched types" error every Rust developer has memorized. This stage is where "type safe" actually lives. In Rust it grows into much more ambitious questions too, like whether every borrow follows the ownership rules, though some of those checks actually run a bit further down the pipeline, on a lower-level form of the program that we haven't met yet. That's a later part's story.
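A stripped-down sketch of that interrogation, in Rust. Everything here is an assumption made for illustration: the Type, Value, and Declaration types, the check function, and the wording of the error are mine, and real semantic analysis in rustc covers far more than literals. It shows only the one question from above: does the type of the value match the type that was promised?

```rust
// Hypothetical types for a tiny "declaration" tree node.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Type {
    I32,
    Str,
}

enum Value {
    IntLiteral(i32),
    StrLiteral(String),
}

struct Declaration {
    name: String,
    declared: Type,
    value: Value,
}

// What type does this expression have?
fn type_of(value: &Value) -> Type {
    match value {
        Value::IntLiteral(_) => Type::I32,
        Value::StrLiteral(_) => Type::Str,
    }
}

// Render a value roughly the way it was written, for error messages.
fn show(value: &Value) -> String {
    match value {
        Value::IntLiteral(n) => n.to_string(),
        Value::StrLiteral(s) => format!("{:?}", s),
    }
}

// Is the value's type the one the declaration promised?
fn check(decl: &Declaration) -> Result<(), String> {
    let found = type_of(&decl.value);
    if found == decl.declared {
        Ok(())
    } else {
        Err(format!(
            "mismatched types: `{}` was declared as {:?}, but {} has type {:?}",
            decl.name,
            decl.declared,
            show(&decl.value),
            found
        ))
    }
}

fn main() {
    // The tree for: let age: i32 = "twenty-two";
    let age = Declaration {
        name: "age".to_string(),
        declared: Type::I32,
        value: Value::StrLiteral("twenty-two".to_string()),
    };
    println!("{:?}", check(&age));

    // The tree for: let answer: i32 = 42;
    let answer = Declaration {
        name: "answer".to_string(),
        declared: Type::I32,
        value: Value::IntLiteral(42),
    };
    println!("{:?}", check(&answer));
}
```

Both declarations are structurally identical nodes. Only the check on meaning tells them apart.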
We still haven't run anything
Here's the status of our one line of code after all that machinery: characters, then tokens, then a tree, then a tree that's been interrogated and confirmed to make sense. And notice what hasn't happened. Nothing has run. Not one instruction has executed. Not a single number has been added to another number. Every bit of this (the lexer, the parser, the semantic checks) is just the toolchain understanding what I wrote, and we've already gone deeper than the tutorials that shaped my mental model ever went.
The gap between "understood program" and "program actually running on a CPU" is where all the famous words live: compile, bytecode, interpreter, JIT. It's also where the tutorial story gets the fuzziest, because it turns out "compiling" isn't one step. It's a pipeline of transformations, with middle layers almost nobody tells beginners about. That's next: what "compiling" actually means.