Overview of the Compiler

The compiler translates user-defined coprocessor functions (the input to bulk-define) into the coprocessor's assembler and, optionally, into C. The first eight steps convert the input code into an intermediate language; the final three steps translate this intermediate language into C or assembler. Until those final steps, the code keeps a Scheme-like syntactic form.

Type inference, and some other functions, add type markings to the heads of forms. This is done by replacing the head (a symbol) with a vector consisting of the type and the head. Compiler rules from step 4 onward either require typed heads (explicitly indicated in their descriptions) or understand both sorts of heads.
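
As a rough illustration of typed heads, the sketch below models Scheme-like forms as nested Python lists and stands in for the (type, head) vector with a tuple. The helper names mark_head, head_symbol, and head_type are hypothetical and are not taken from the compiler; the point is only that a rule which "understands both sorts of heads" can read the head symbol through one accessor.

```python
def mark_head(form, type_name):
    """Return a copy of form whose head symbol is replaced by (type, head)."""
    head, *args = form
    return [(type_name, head), *args]


def head_symbol(form):
    """Return the head symbol, whether or not the head carries a type marking."""
    head = form[0]
    return head[1] if isinstance(head, tuple) else head


def head_type(form):
    """Return the type marking of the head, or None for an untyped head."""
    head = form[0]
    return head[0] if isinstance(head, tuple) else None


plain = ["+", "x", 1]
typed = mark_head(plain, "real")
```

A rule that requires typed heads could check head_type(form) and refuse to run when it is None, while a rule that accepts both would only ever call head_symbol.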

Five utilities are called more than once during compilation: percolation, type inference, arithmetic identities, real contagion, and removal of useless forms.

The compilation process common to C and assembler is organized into eight steps numbered (for historical reasons) as: 1, 2, 3, 4, 5a, 5b, 6a, and 6b.

step 1: Register Function

check basic format of input and register the function

step 2: Check Syntax

check syntax, eliminate variant syntax, trivial rewriting rules

step 3: Rename Variables

assign unique names to variables, remove binding forms, detect undefined symbols
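
The three jobs of step 3 can be sketched together in a few lines. This is a minimal sketch under assumptions: forms are nested Python lists, the only binding form handled is let, binding forms are lowered to begin plus set! (the real compiler may remove them differently), and the names rename, UndefinedSymbol, and the builtins set are hypothetical.

```python
import itertools


class UndefinedSymbol(Exception):
    """Raised when a symbol is neither bound nor a known operator."""


def rename(form, env, counter, builtins):
    if isinstance(form, str):              # variable reference
        if form in env:
            return env[form]
        raise UndefinedSymbol(form)
    if not isinstance(form, list):         # numeric literal
        return form
    head, *rest = form
    if head == "let":
        bindings, *body = rest
        new_env = dict(env)
        assigns = []
        for name, init in bindings:
            # Initial values are renamed in the outer scope, as for let.
            init_renamed = rename(init, env, counter, builtins)
            unique = f"{name}_{next(counter)}"
            new_env[name] = unique
            assigns.append(["set!", unique, init_renamed])
        renamed_body = [rename(b, new_env, counter, builtins) for b in body]
        return ["begin", *assigns, *renamed_body]
    if head not in builtins:
        raise UndefinedSymbol(head)
    return [head, *[rename(arg, env, counter, builtins) for arg in rest]]


code = ["let", [["x", 1]],
        ["let", [["x", ["+", "x", 1]]],
         ["*", "x", 2]]]
result = rename(code, {}, itertools.count(1), {"+", "*"})
```

After renaming, the two variables called x have become x_1 and x_2, so later steps no longer need to track scope.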

step 4: Pre-type Rewriting

rewriting rules that introduce new symbols but don't require type information

miscellaneous rewriting rules and code to trigger run-time errors if missing values are passed to forms that cannot accept them
arithmetic identities
expand sheet handlers
purity analysis
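
The arithmetic identities utility listed above is not specified in this overview. As a hedged illustration of the general idea, the sketch below removes additions of the integer constant 0 and multiplications by the integer constant 1 from forms modeled as nested Python lists. The function names and the particular set of identities are assumptions. Only integer constants are matched here, since dropping a real constant could change the type of the result.

```python
def is_int_const(form, value):
    """True when form is the integer literal value (not a real, not a symbol)."""
    return isinstance(form, int) and not isinstance(form, bool) and form == value


def simplify(form):
    """Apply x+0 -> x and x*1 -> x bottom-up to a nested-list form."""
    if not isinstance(form, list):
        return form
    head, *args = form
    args = [simplify(a) for a in args]
    if head == "+" and len(args) == 2:
        a, b = args
        if is_int_const(a, 0):
            return b
        if is_int_const(b, 0):
            return a
    if head == "*" and len(args) == 2:
        a, b = args
        if is_int_const(a, 1):
            return b
        if is_int_const(b, 1):
            return a
    return [head, *args]


simplified = simplify(["+", ["*", "x", 1], 0])
partly = simplify(["*", ["+", 0, "y"], "z"])
```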

step 5a: Type Inference and Type-Dependent Rewriting

fill in type information for input and output variables
infer types of variables
infer types of sub-forms
coerce outputs to the declared type of output variables and wrap the code in a return form if it returns values
in-line scanner substitution (recursively calls type inference)
real contagion
specialize guaranteed-divide (recursively calls type inference)
give distinctive names to 2D and 3D vector operations
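
Real contagion, listed among the step 5a passes above, is sketched here under the usual meaning of the term: when an arithmetic form mixes integer and real arguments, the integer arguments are promoted to real. The operator set, the conversion form integer->real, and the function names are hypothetical, and the tiny type_of helper is only a stand-in for the compiler's type inference.

```python
ARITH = {"+", "-", "*", "/"}


def type_of(form, var_types):
    """Very small stand-in for type inference: 'integer' or 'real'."""
    if isinstance(form, float):
        return "real"
    if isinstance(form, int):
        return "integer"
    if isinstance(form, str):
        return var_types[form]
    head, *args = form
    if head == "integer->real":
        return "real"
    arg_types = [type_of(a, var_types) for a in args]
    return "real" if "real" in arg_types else "integer"


def promote(arg):
    """Convert an integer literal in place; wrap anything else in a conversion."""
    if isinstance(arg, int):
        return float(arg)
    return ["integer->real", arg]


def contagion(form, var_types):
    """Promote integer arguments of mixed integer/real arithmetic forms."""
    if not isinstance(form, list):
        return form
    head, *args = form
    args = [contagion(a, var_types) for a in args]
    if head in ARITH:
        types = [type_of(a, var_types) for a in args]
        if "real" in types and "integer" in types:
            args = [promote(a) if t == "integer" else a
                    for a, t in zip(args, types)]
    return [head, *args]


var_types = {"i": "integer", "r": "real"}
promoted = contagion(["+", "i", ["*", "r", 2]], var_types)
```

Because promotion changes the types of sub-forms, a pass like this plausibly explains why type inference is run again afterwards.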

step 5b: Restructuring

remove useless forms
percolate (required by the next step)
rewrite binary-if forms so they never return values
remove all begin forms from expression contexts
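
One plausible reading of "rewrite binary-if forms so they never return values" is that an assignment whose value is a two-branch if gets pushed into the branches, so the if is used only for control flow. The sketch below shows that reading on nested Python lists; it is an assumption about the rule, not a description of the compiler's actual rewrite, and push_assignment is a hypothetical name.

```python
def push_assignment(form):
    """Rewrite (set! v (if c a b)) as (if c (set! v a) (set! v b)), recursively."""
    if not isinstance(form, list):
        return form
    head, *args = form
    if head == "set!" and isinstance(args[1], list) and args[1][0] == "if":
        target = args[0]
        _, test, then_branch, else_branch = args[1]   # assumes a two-branch if
        return ["if", push_assignment(test),
                push_assignment(["set!", target, then_branch]),
                push_assignment(["set!", target, else_branch])]
    return [head, *[push_assignment(a) for a in args]]


before = ["set!", "y", ["if", ["<", "x", 0], 0, ["if", "c", 1, 2]]]
after = push_assignment(before)
```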

step 6a: Rewriting Specific Operators

expand even-quotient-scalar
add code to generate a missing value if any input to an expression is missing
expand 2D and 3D vector operations
simple expansions
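
The missing-value rule in step 6a can be pictured as wrapping an expression in a guard. In the sketch below, the predicate missing? and the constant missing are hypothetical names for whatever the intermediate language provides, and only variable arguments are checked; literals can never be missing.

```python
def guard_missing(form):
    """Wrap form so it yields a missing value when any variable input is missing."""
    head, *args = form
    variables = [a for a in args if isinstance(a, str)]
    if not variables:
        return form                      # nothing that could be missing
    checks = [["missing?", v] for v in variables]
    test = checks[0] if len(checks) == 1 else ["or", *checks]
    return ["if", test, "missing", form]


guarded = guard_missing(["+", "a", "b"])
unguarded = guard_missing(["+", 1, 2])
```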

step 6b: Final Clean-up

miscellaneous polishing of output code

type inference (for the benefit of the next step)
real contagion
arithmetic identities
constant promotion
remove useless forms
percolate
arithmetic identities
type inference (for the benefit of steps 7-9)
remove unused symbols from symbol table
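
The last item above, pruning the symbol table, amounts to collecting every symbol that still appears in the output code and dropping the rest. A minimal sketch, assuming the symbol table is a mapping from variable name to type and the code is a nested Python list (both are assumptions about representation, and the function names are hypothetical):

```python
def used_symbols(form):
    """Collect every symbol (string) that appears anywhere in form."""
    if isinstance(form, str):
        return {form}
    if isinstance(form, list):
        found = set()
        for sub in form:
            found |= used_symbols(sub)
        return found
    return set()


def prune_symbol_table(code, table):
    """Keep only the table entries whose names still occur in code."""
    used = used_symbols(code)
    return {name: typ for name, typ in table.items() if name in used}


code = ["set!", "y", ["+", "x", 1]]
table = {"x": "integer", "y": "integer", "tmp_3": "real"}
pruned = prune_symbol_table(code, table)
```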

At the end of step 6b, the compiler caches the output code, rewritten lambda-list, and symbol table (with type markings). The output code is in a common intermediate language significantly different from the user-level language.

The common intermediate language is converted to the coprocessor's assembly language (byte code) by steps 7 through 9:

step 7: Assembler-Specific Rewrite Rules: a few trivial rewrites
step 8: Assembler Linearization: generate instructions in linear order by walking the code recursively
step 9: Assembler Linking: convert symbolic variable names into stack locations, symbolic jump points into line numbers
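
Steps 8 and 9 can be illustrated with a toy stack machine. The instruction names (push, load, store, jump, jump-if-false, label) are invented for this sketch and are not the coprocessor's real instruction set. linearize walks the code recursively and emits instructions with symbolic variables and labels; link then replaces variable names with stack slots and labels with instruction indices.

```python
import itertools


def linearize(form, out, labels):
    """Emit instructions for form in linear order, using symbolic names."""
    if isinstance(form, (int, float)):
        out.append(("push", form))
    elif isinstance(form, str):
        out.append(("load", form))
    elif form[0] == "set!":
        linearize(form[2], out, labels)
        out.append(("store", form[1]))
    elif form[0] == "if":
        else_label = f"L{next(labels)}"
        end_label = f"L{next(labels)}"
        linearize(form[1], out, labels)
        out.append(("jump-if-false", else_label))
        linearize(form[2], out, labels)
        out.append(("jump", end_label))
        out.append(("label", else_label))
        linearize(form[3], out, labels)
        out.append(("label", end_label))
    elif form[0] == "begin":
        for sub in form[1:]:
            linearize(sub, out, labels)
    else:                                   # ordinary operator call
        for arg in form[1:]:
            linearize(arg, out, labels)
        out.append((form[0], len(form) - 1))
    return out


def link(instructions):
    """Resolve labels to instruction indices and variables to stack slots."""
    targets, body = {}, []
    for ins in instructions:
        if ins[0] == "label":
            targets[ins[1]] = len(body)     # index of the next real instruction
        else:
            body.append(ins)
    slots, linked = {}, []
    for op, arg in body:
        if op in ("load", "store"):
            arg = slots.setdefault(arg, len(slots))
        elif op in ("jump", "jump-if-false"):
            arg = targets[arg]
        linked.append((op, arg))
    return linked


code = ["if", ["<", "x", 0], ["set!", "y", 0], ["set!", "y", "x"]]
symbolic = linearize(code, [], itertools.count())
program = link(symbolic)
```

Keeping labels symbolic until linking means the linearizer never has to know how long a branch will be when it emits the jump over it.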

Production of C code is also divided into three steps:

step 7': C-Specific Rewrite Rules: reshape the code to match C's syntax
step 8': C Code Generation: convert the Scheme-like forms into correct C syntax
step 9': C Wrapper Generation: write the "wrappers" which allow interpreted code to call compiled code
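
To give a feel for step 8', the sketch below prints a few Scheme-like forms as C text. It is a simplified illustration, not the compiler's generator: forms are nested Python lists, the set of infix operators is an assumption, and variable names are assumed to be valid C identifiers already (presumably something step 7' would have to guarantee).

```python
INFIX = {"+", "-", "*", "/", "<", ">", "=="}


def to_c_expr(form):
    """Render an expression form as a fully parenthesized C expression."""
    if isinstance(form, (int, float)):
        return repr(form)
    if isinstance(form, str):
        return form
    head, *args = form
    if head in INFIX and len(args) == 2:
        return f"({to_c_expr(args[0])} {head} {to_c_expr(args[1])})"
    return f"{head}({', '.join(to_c_expr(a) for a in args)})"


def to_c_stmt(form, indent=""):
    """Render a statement form (begin, set!, if, return, or a call) as C."""
    head = form[0]
    if head == "begin":
        return "\n".join(to_c_stmt(f, indent) for f in form[1:])
    if head == "set!":
        return f"{indent}{form[1]} = {to_c_expr(form[2])};"
    if head == "if":
        inner = indent + "    "
        return "\n".join([
            f"{indent}if ({to_c_expr(form[1])}) {{",
            to_c_stmt(form[2], inner),
            f"{indent}}} else {{",
            to_c_stmt(form[3], inner),
            f"{indent}}}",
        ])
    if head == "return":
        return f"{indent}return {to_c_expr(form[1])};"
    return f"{indent}{to_c_expr(form)};"


expr_text = to_c_expr(["+", "x", ["*", 2, "y"]])
stmt_text = to_c_stmt(["if", ["<", "x", 0], ["set!", "y", 0], ["return", "x"]])
```

A generator this simple only works because earlier steps have already removed begin forms from expression contexts and made if forms value-free, so every form is clearly either a statement or an expression.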
