The Compiler Writer Resource Page


I know complete pans of the literature are left out, but this is a page for amateur compiler writers. Anything that I did not find practical is not listed here. (I also did not include the things that I do not yet know!) All the remarks in grey and even the selection of documents are personal. If you have suggestions of papers to include, please contact me! Finally, the order of items in the various sections is totally arbitrary.

Philosophy

  • What every compiler writer should know about programmers. M. Anton Ertl. [pdf] Contains some good warnings about taking the slippery path many compiler writers are taking with respect to undefined behavior.

Compiler Descriptions

  • A New C Compiler. Ken Thompson. [pdf] A description of the portable C99 Plan 9 compilers.
  • A Tour Through the Portable C Compiler. S. C. Johnson. [pdf] It is probably not a very good reference, but full compiler descriptions are pretty rare. The PCC compiler has changed quite a bit since this writeup.
  • A Retargetable C Compiler: Design and Implementation. David R. Hanson and Christopher W. Fraser. [amazon] A resonably sized book exposing the full source code of a retargetable compiler (lcc). I found the parts describing the frontend and the automatic code generator most interesting.

Books

  • Compiler Construction. Niklaus Wirth. [pdf] An elementary and short textbook to learn the basics of compilation. Written by Wirth, a Turing award winner, inventor of Pascal.
  • Compilers: Principles, Techniques, and Tools. Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman. [amazon] Despite being intimidating, it is a very readable book with many examples. It has the best material available on lexing and parsing. Some key concepts of more advanced optimizations are also very nicely exposed.
  • Advanced Compiler Design and Implementation. Steven Muchnick. [amazon] A pretty comprehensive exposition of compiler technology. Many details are left aside and some code is wrong. Its main value is I think the span of its coverage.
  • Hacker's Delight. Henry S. Warren, Jr. [amazon] This is a really fun book for computer geeks. It is also filled with tricks that can be used for strength reduction. For example, it will teach you how to turn a division by a constant into a multiplication.

Dominators and Static Single Assignment

  • The Static Single Assignment Book. [pdf] A very detailed and practical book about many aspects of the SSA intermediate representation.
  • Simple and Efficient Construction of Static Single Assignment Form. Matthias Braun, Sebastian Buchwald, Sebastian Hack, Roland Leißa, Christoph Mallon, and Andreas Zwinkau. [pdf] Combining laziness and memoization, the authors make a naive construction algorithm both efficient and correct. The algorithm is especially useful for fixing SSA invariants locally broken.
  • An Efficient Method of Computing Static Single Assignment Form. Ron Cytron, Jeanne Ferrante, Barry K. Rosen, Mark N. Wegman, and F. Kenneth Zadeck. [pdf] The paper explaining the classic construction algorithm of SSA form using domination information.
  • A Simple, Fast Dominance Algorithm. Keith D. Cooper, Tymothy J. Harvey, and Ken Kennedy. [pdf] An algorithm to compute the domination information very practical and simple to implement. It can be slower than the more complex Lengauer-Tarjan algorithm on huge input programs.

Optimizations

  • Constant Propagation with Conditional Branches. Mark N. Wegman and F. Kenneth Zadeck. [pdf] The state-of-the-art algorithm for constant propagation (known as sparse conditional constant propagation). Despite being cutting edge, the algorithm remains simple.
  • Global Code Motion Global Value Numbering. Cliff Click. [pdf] Explains how to make value numbering a global optimization by leveraging SSA form and an independent global code motion algorithm.

Register Allocation

  • Linear Scan Register Allocation. Massimiliano Poletto and Vivek Sarkar. [pdf] The classic linear scan paper, this algorithm is fairly popular in JIT compilers because it compiles quickly and generates reasonable code.
  • Linear Scan Register Allocation on SSA Form. Christian Wimmer Michael Franz. [pdf] An adaptation of the classic and fast linear scan algorithm to work on SSA form. It presents the challenges posed by SSA form pretty well.
  • Iterated Register Coalescing. Lal George and Andrew W. Appel. [pdf] A classic paper for register allocation using graph coloring. It is usually said to be slow and not used in JIT compilers. It is also not often used in big compilers because not flexible enough (see below for a fix).
  • A Generalized Algorithm for Graph-Coloring Register Allocation. Michael D. Smith, Norman Ramsey, and Glenn Holloway. [pdf] Graph-coloring as presented in textbooks is not suited to real computers. This paper presents various extensions to handle register classes and aliases. It is the algorithm used in PCC.

Code Generation

  • Engineering a Simple, Efficient Code Generator Generator. Christopher W. Fraser, David R. Hanson, and Todd A. Proebsting. [pdf] This paper presents iburg, a flexible code generator generator that generates compact and simple code. It is used in the retargetable C compiler lcc.

Machine Specific

  • What Every Computer Scientist Should Know About Floating Point Arithmetic. David Glodberg. [pdf]
  • System V Application Binary Interface (AMD64). [pdf] (Almost) Everything you need to know to interface with C code on x64 platforms running a UNIX OS.
  • Procedure Call Standard for the ARM 64-bit Architecture. [pdf] The ABI for AARCH64, saner and better explained than the x64 one.
  • Short X86 Instruction Set Reference. [www] Clearly not exhaustive, but still pretty useful for quick lookups.