-----

Overview of the Compiler

-----

The compiler translates user-defined coprocessor functions (the input to bulk-define) into the coprocessor's assembly language and, optionally, into C. The first eight steps convert the input code into an intermediate language; the final three steps translate this intermediate language into C or assembler. Until those final steps, the code keeps a Scheme-like syntactic form.

Type inference and some other functions add type markings to the heads of forms by replacing the head (a symbol) with a vector consisting of the type and the head. Compiler rules from step 4 onward either require typed heads (as explicitly indicated in their descriptions) or accept both sorts of heads.
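The head-marking convention can be illustrated with a small sketch. It is written in Python rather than in the compiler's own language, and everything in it is an assumption made for illustration: forms are modeled as Python lists, symbols as strings, the typed head as a (type, symbol) tuple standing in for the vector, and the type names "integer" and "real" are placeholders rather than the compiler's actual type names.

```python
# Hypothetical model: a form is a list whose first element is its head.
# A plain head is a symbol (str); a typed head is a (type, symbol) tuple
# standing in for the vector of type and head described above.

def head_symbol(head):
    """Return the bare symbol for either a plain or a typed head."""
    return head[1] if isinstance(head, tuple) else head

def head_type(head):
    """Return the type marking, or None if the head is untyped."""
    return head[0] if isinstance(head, tuple) else None

def mark_head(form, type_name):
    """Return a copy of form whose head carries the given type marking."""
    head, *args = form
    return [(type_name, head_symbol(head)), *args]

plain = ["+", "x", 1]
typed = mark_head(plain, "integer")   # [("integer", "+"), "x", 1]
```

A rule that accepts both sorts of heads would go through head_symbol, while a rule that requires typed heads could check that head_type does not return None.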

Five utilities are called more than once during compilation: percolation, type inference, arithmetic identities, real contagion, and removal of useless forms.

The compilation process common to C and assembler is organized into eight steps numbered (for historical reasons) as: 1, 2, 3, 4, 5a, 5b, 6a, and 6b.

step 1: Register Function
    check basic format of input and register the function
step 2: Check Syntax
    check syntax, eliminate variant syntax, trivial rewriting rules
step 3: Rename Variables
    assign unique names to variables, remove binding forms, detect undefined symbols
step 4: Pre-type Rewriting
    rewriting rules that introduce new symbols but don't require type information
step 5a: Type Inference and Type-Dependent Rewriting
step 5b: Restructuring
step 6a: Rewriting Specific Operators
step 6b: Final Clean-up
    miscellaneous polishing of output code
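As an illustration of what step 3 (Rename Variables) involves, the following Python sketch assigns unique names, removes a binding form, and detects undefined symbols. It is not the compiler's implementation. The list-based form representation, the name_N naming scheme, the rewriting of let into begin plus set!, and the names rename and UndefinedSymbol are all assumptions chosen for the example.

```python
import itertools

class UndefinedSymbol(Exception):
    """Raised when a variable is used without an enclosing binding."""

def rename(form, env, counter, symbols):
    """Rename variables in form; env maps source names to unique names."""
    if isinstance(form, str):                  # variable reference
        if form not in env:
            raise UndefinedSymbol(form)
        return env[form]
    if not isinstance(form, list):             # numeric literal
        return form
    head, *rest = form
    if head == "let":                          # (let ((name init) ...) body ...)
        bindings, *body = rest
        inner = dict(env)
        out = ["begin"]                        # assumed replacement for the binding form
        for name, init in bindings:
            value = rename(init, env, counter, symbols)   # inits see the outer scope
            unique = f"{name}_{next(counter)}"
            symbols.append(unique)             # record the name in the symbol table
            inner[name] = unique
            out.append(["set!", unique, value])
        out.extend(rename(b, inner, counter, symbols) for b in body)
        return out
    # Ordinary form: the head is an operator and is left alone.
    return [head, *(rename(arg, env, counter, symbols) for arg in rest)]

symbols = []
source = ["let", [["x", 2]],
          ["let", [["x", ["+", "x", 1]]],
           ["*", "x", "x"]]]
result = rename(source, {}, itertools.count(1), symbols)
```

In this sketch the two bindings of x become x_1 and x_2, so later steps no longer need to track scope; a reference to a name with no binding raises UndefinedSymbol.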

At the end of step 6b, the compiler caches the output code, the rewritten lambda-list, and the symbol table (with type markings). The output code is in a common intermediate language that differs significantly from the user-level language.

The common intermediate language is converted to the coprocessor's assembly language (byte code) by steps 7 through 9:

step 7: Assembler-Specific Rewrite Rules
    a few trivial rewrites
step 8: Assembler Linearization
    generate instructions in linear order by walking the code recursively
step 9: Assembler Linking
    convert symbolic variable names into stack locations, symbolic jump points into line numbers
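The division of labor between linearization (step 8) and linking (step 9) can be sketched for a toy stack machine. This Python example is only an illustration under stated assumptions: the instruction names (load, push, call, jump, jump-if-false, label, return) are invented and are not the coprocessor's instruction set, and the functions linearize and link are hypothetical.

```python
import itertools

def linearize(form, out, labels):
    """Append instructions for form to out by walking it recursively."""
    if isinstance(form, str):
        out.append(("load", form))               # symbolic variable name
    elif not isinstance(form, list):
        out.append(("push", form))               # literal
    elif form[0] == "if":
        _, test, then, other = form
        else_label = f"L{next(labels)}"
        end_label = f"L{next(labels)}"
        linearize(test, out, labels)
        out.append(("jump-if-false", else_label))
        linearize(then, out, labels)
        out.append(("jump", end_label))
        out.append(("label", else_label))        # symbolic jump point
        linearize(other, out, labels)
        out.append(("label", end_label))
    else:
        head, *args = form
        for arg in args:
            linearize(arg, out, labels)
        out.append(("call", head, len(args)))
    return out

def link(code):
    """Resolve variable names to stack slots and labels to line numbers."""
    # First pass: drop label pseudo-instructions, recording where they point.
    targets, body = {}, []
    for ins in code:
        if ins[0] == "label":
            targets[ins[1]] = len(body)
        else:
            body.append(ins)
    # Second pass: substitute stack slots and line numbers.
    slots, linked = {}, []
    for ins in body:
        if ins[0] == "load":
            linked.append(("load", slots.setdefault(ins[1], len(slots))))
        elif ins[0] in ("jump", "jump-if-false"):
            linked.append((ins[0], targets[ins[1]]))
        else:
            linked.append(ins)
    return linked

# Absolute value of x: (if (< x 0) (- 0 x) x)
form = ["if", ["<", "x", 0], ["-", 0, "x"], "x"]
code = linearize(form, [], itertools.count())
code.append(("return",))
linked = link(code)
```

After linking, the jump-if-false instruction in this example targets line 8 and the unconditional jump targets line 9, and every reference to x has become stack slot 0.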

Production of C code is also divided into three steps:

step 7': C-Specific Rewrite Rules
    reshape the code to match C's syntax
step 8': C Code Generation
    convert the Scheme-like forms into correct C syntax
step 9': C Wrapper Generation
    write the "wrappers" which allow interpreted code to call compiled code
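To give a feel for step 8', here is a minimal Python sketch of turning prefix forms into C expression text. It assumes that the earlier reshaping has already left only binary arithmetic, comparisons, if expressions, and function calls, and that variable names are already valid C identifiers. The function to_c and the INFIX operator set are hypothetical and are not taken from the compiler.

```python
# Operators assumed to map directly onto C infix operators.
INFIX = {"+", "-", "*", "/", "<", ">"}

def to_c(form):
    """Render a prefix form as a fully parenthesized C expression."""
    if not isinstance(form, list):
        return str(form)                       # variable name or literal
    head, *args = form
    if head in INFIX and len(args) == 2:
        return f"({to_c(args[0])} {head} {to_c(args[1])})"
    if head == "if" and len(args) == 3:
        test, then, other = args
        return f"({to_c(test)} ? {to_c(then)} : {to_c(other)})"
    # Anything else is emitted as an ordinary C function call.
    return f"{head}({', '.join(to_c(a) for a in args)})"

abs_x = to_c(["if", ["<", "x", 0], ["-", 0, "x"], "x"])
hyp = to_c(["sqrt", ["+", ["*", "a", "a"], ["*", "b", "b"]]])
```

Here abs_x is the string "((x < 0) ? (0 - x) : x)" and hyp is "sqrt(((a * a) + (b * b)))". Parenthesizing every subexpression avoids having to reason about C operator precedence, at the cost of some redundant parentheses.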
