Creating a programming language
Thorsten Ball joined the show to talk about creating a programming language, writing an interpreter, why he wrote the book "Writing An Interpreter in Go", how writing a language/interpreter will help you better understand other programming languages, building a computer from Nand to Tetris, and his thoughts on imposter syndrome.
Thorsten Ball: [23:47] Yeah, that's exactly the point you make. I think it's really funny because a lot of people, they… What you said is absolutely correct - the market of compiler writers is a small one. You don't see many advertisers or recruiters sending out emails like "Do you wanna write a compiler?" But a compiler is hugely complex, it's interesting, it has a lot of parts, and if you understand how they work, you can take those parts and use them in other places. If you look at those parts, you can recognize patterns and then use those patterns again.
The basic idea behind a compiler is it takes input, which is programming code or code, and it takes this input, transforms it and puts out something the computer can understand and execute. You take puts "Hello World" and give it to a compiler, and the compiler outputs machine code. This machine code is much longer than puts "Hello World" and it contains all the machine code instructions that tell the CPU and the computer how to display Hello World on your screen. It does this by having certain stages… You always talk about stages and passes with compilers. Source code comes in on one end, and on the other end comes out machine code, or some other form of code. I don't wanna escalate this conversation, but there are certain compilers that do not translate to machine code, but other programming languages; they're sometimes called transpilers.
In the end, it's the same idea - you take source code and output something that a computer can understand. It does this by first parsing the input; most of the time it constructs an internal tree, a syntax tree, and it then has several passes or phases where it takes this tree and tries to look at it in detail and find out if there are some parts of the tree it can move, throw away, or if there are some parts of the tree it can fold together, or if there are duplicates, if there are errors, if there are parsing errors in there. Then it takes this tree and it kind of - I'm simplifying, right? - walks down the tree and it outputs machine code that lets the computer execute what this tree is supposed to mean. It gives the tree meaning, it gives the source code which you input meaning. Does that make sense?
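Editor's sketch: the stages described above (parse the input into a syntax tree, run a pass that folds parts of the tree together, then walk the tree and emit code) can be illustrated with a very small Go program. This is only a rough illustration under stated assumptions, not code from the episode or from the book. The names `Node`, `Parse`, `Fold`, `Emit` and `Compile` are hypothetical, the input language is limited to space-separated `+` and `*` expressions over integers and lowercase names, and the output is a list of instructions for a made-up stack machine rather than real machine code.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Node is a syntax-tree node (hypothetical shape for this sketch).
type Node struct {
	Op          byte   // '+' or '*' for an operation, 0 for a leaf
	Name        string // variable name (leaf only, "" for a number)
	Value       int    // number value (leaf only)
	Left, Right *Node
}

type parser struct {
	toks []string
	pos  int
}

func (p *parser) peek() string {
	if p.pos < len(p.toks) {
		return p.toks[p.pos]
	}
	return ""
}

// binary parses "next { op next }" and builds a left-leaning tree.
func (p *parser) binary(op string, next func() (*Node, error)) (*Node, error) {
	left, err := next()
	for err == nil && p.peek() == op {
		p.pos++
		var right *Node
		if right, err = next(); err == nil {
			left = &Node{Op: op[0], Left: left, Right: right}
		}
	}
	return left, err
}

// "*" binds tighter than "+", so expr is built out of terms.
func (p *parser) expr() (*Node, error) { return p.binary("+", p.term) }
func (p *parser) term() (*Node, error) { return p.binary("*", p.operand) }

// operand parses an integer literal or a lowercase variable name.
func (p *parser) operand() (*Node, error) {
	tok := p.peek()
	if v, err := strconv.Atoi(tok); err == nil {
		p.pos++
		return &Node{Value: v}, nil
	}
	if tok != "" && strings.Trim(tok, "abcdefghijklmnopqrstuvwxyz") == "" {
		p.pos++
		return &Node{Name: tok}, nil
	}
	return nil, fmt.Errorf("expected number or name, got %q", tok)
}

// Parse turns space-separated source such as "x + 2 * 3" into a tree.
func Parse(src string) (*Node, error) {
	p := &parser{toks: strings.Fields(src)}
	n, err := p.expr()
	if err == nil && p.pos < len(p.toks) {
		err = fmt.Errorf("unexpected token %q", p.toks[p.pos])
	}
	return n, err
}

func isNum(n *Node) bool { return n.Op == 0 && n.Name == "" }

// Fold is one pass over the tree: an operation whose operands are
// both number literals is replaced by the computed literal.
func Fold(n *Node) *Node {
	if n.Op == 0 {
		return n
	}
	l, r := Fold(n.Left), Fold(n.Right)
	if isNum(l) && isNum(r) {
		if n.Op == '+' {
			return &Node{Value: l.Value + r.Value}
		}
		return &Node{Value: l.Value * r.Value}
	}
	return &Node{Op: n.Op, Left: l, Right: r}
}

// Emit walks the tree and appends instructions for a made-up
// stack machine: operands first, then the operation.
func Emit(n *Node, out []string) []string {
	switch {
	case n.Op == '+':
		return append(Emit(n.Right, Emit(n.Left, out)), "ADD")
	case n.Op == '*':
		return append(Emit(n.Right, Emit(n.Left, out)), "MUL")
	case n.Name != "":
		return append(out, "LOAD "+n.Name)
	}
	return append(out, "PUSH "+strconv.Itoa(n.Value))
}

// Compile chains the stages: parse, fold, emit.
func Compile(src string) ([]string, error) {
	tree, err := Parse(src)
	if err != nil {
		return nil, err
	}
	return Emit(Fold(tree), nil), nil
}

func main() {
	code, err := Compile("x + 2 * 3")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(strings.Join(code, "\n"))
}
```

For the input `x + 2 * 3`, the fold pass collapses `2 * 3` into the literal `6` before any code is emitted, so the program prints `LOAD x`, `PUSH 6`, `ADD`. A real compiler has many more passes and a far richer tree, but the shape (source in, tree in the middle, instructions out) is the same.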