Ch.2 Setting up SML# programming environment

§ 2.2. Bootstrapping the SML# compiler

This section outlines the structure of the SML# compiler and the method of building (bootstrapping) it. You do not need to understand this section to install and use the SML# compiler, but you may find it helpful for understanding the various messages printed while SML# is being compiled, and the structure of a compiler in general.

The SML# system consists of a single compiler that performs separate compilation. Its interactive mode is realized by a top-level loop that performs the following steps:

  1. compile the user input using the current static environment as its interface,

  2. link the object file with the current system to generate a shared executable file,

  3. dynamically load the shared executable in the current system, and call its entry point.

SML# is written in SML# and C. In addition, the following tools are used when compiling the SML# compiler.

  • ml-lex, ml-yacc: a lexical analyzer generator and a parser generator.

  • SMLFormat: a printer generator.

  • The Standard ML Basis Library.

All of them are written in Standard ML.

The SML# compiler compiles each SML# source file (source.sml; SML# is a superset of Standard ML) into a standard system object file (source.o). The compiled files are then linked by the standard linker (ld on Unix-family operating systems) to generate an executable file. So, to build the SML# compiler, it is sufficient to have a C compiler and an SML# compiler. Of course, when the SML# compiler is first built, no SML# compiler is available yet. The standard way to solve this bootstrap problem is as follows.

  1. Build minismlsharp command

    1. Obtain a set of assembly source files of a minimal compiler minismlsharp that is sufficient for compiling all the source files used in the SML# compiler. The assembly files are typically generated by an older version of SML# compiler.

    2. On the system where the target SML# compiler is to be installed, assemble the minismlsharp assembly files and link them together to create the minismlsharp command.

  2. Build ml-lex and ml-yacc commands

    1. Compile the library files required by ml-lex and ml-yacc using minismlsharp.

    2. Compile ml-lex and ml-yacc sources using minismlsharp.

    3. Create ml-lex and ml-yacc commands using the system linker.

  3. Build SMLFormat command

    1. Compile the library files required by SMLFormat using minismlsharp.

    2. Generate the parser source file used in SMLFormat using ml-yacc and ml-lex.

    3. Compile SMLFormat source files using minismlsharp.

    4. Create SMLFormat command using the system linker.

  4. Build smlsharp command

    1. Compile all the library files using minismlsharp.

    2. Generate the parser source files of SML# using ml-yacc and ml-lex.

    3. Generate the printer source files of SML# using SMLFormat.

    4. Compile SML# source files using minismlsharp.

    5. Compile C source files of the SML# runtime system.

    6. Invoke the system linker to link all the object files. This yields the smlsharp (SML# compiler) command.

  5. Install the compiler. Use the system install command to install the following files.

    • Library object files. They are linked with compiled user source files.

    • Library interface files. They are used when a user source file is compiled.

    • Library signature files.

    • The ml-lex, ml-yacc, SMLFormat, and smlsharp commands.
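
The ordering constraints among the five steps above can be sketched as a small dependency graph. The following Python fragment is only an illustration: the table DEPENDS_ON is one reading of the steps listed above (the product names are taken from the text, and "install" stands for step 5), and the function build_order is a hypothetical helper, not part of the SML# build system. It uses the standard library's topological sorter to produce one valid build order.

```python
from graphlib import TopologicalSorter

# Each key is a build product; its value is the set of products that must
# already exist before it can be built (one reading of steps 1-5 above).
DEPENDS_ON = {
    "minismlsharp": set(),
    "ml-lex": {"minismlsharp"},
    "ml-yacc": {"minismlsharp"},
    "SMLFormat": {"minismlsharp", "ml-lex", "ml-yacc"},
    "smlsharp": {"minismlsharp", "ml-lex", "ml-yacc", "SMLFormat"},
    "install": {"ml-lex", "ml-yacc", "SMLFormat", "smlsharp"},
}


def build_order(depends_on):
    """Return one valid order in which the products can be built."""
    # static_order() yields every node after all of its prerequisites.
    return list(TopologicalSorter(depends_on).static_order())


order = build_order(DEPENDS_ON)
print(" -> ".join(order))
```

Because ml-lex and ml-yacc depend only on minismlsharp, either may come first; any order the sorter returns places minismlsharp first and the install step last.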

As outlined above, there are complex dependencies among the source files and commands. Furthermore, the processing of some of these files depends on the underlying OS. This is a typical situation in the development of a large system. One well-established way to handle these dependencies is to use a configure script generated by GNU Autoconf together with the make command.

The SML# compiler compiles each source file according to its interface file, which describes the set of files required by the source file. The SML# compiler can also generate, in the Makefile format that the make command can process, a list of the files on which each source file depends. It does so when it is given one of the following switches.

  1. smlsharp -M smlFile. The compiler generates, in the Makefile format, the dependencies needed to compile the source file smlFile.

  2. smlsharp -Ml smiFile. The compiler assumes that the file smiFile specifies the top level system, and generates the list of necessary object files in the Makefile format.
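
To make "Makefile format" concrete, the following Python sketch parses dependency rules of the form "target: prerequisite ..." into a table. The rule text in SAMPLE and the file names in it (main.o, main.sml, main.smi, lib.sml, lib.smi) are invented for illustration and are not actual smlsharp output; the exact layout the compiler emits may differ. The function parse_rules is a hypothetical helper that handles only this simple rule shape.

```python
def parse_rules(text):
    """Parse 'target: prereq ...' rules, joining backslash-continued lines."""
    # A backslash at the end of a line continues the rule on the next line.
    joined = text.replace("\\\n", " ")
    rules = {}
    for line in joined.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line or ":" not in line:
            continue
        target, prereqs = line.split(":", 1)
        rules.setdefault(target.strip(), []).extend(prereqs.split())
    return rules


# Hypothetical rules; the file names are made up for this example.
SAMPLE = """\
main.o: main.sml main.smi \\
  lib.smi
lib.o: lib.sml lib.smi
"""

rules = parse_rules(SAMPLE)
for target, prereqs in rules.items():
    print(target, "<-", ", ".join(prereqs))
```

Each entry says that the target must be rebuilt whenever one of its prerequisites changes, which is the information make needs in order to skip up-to-date files.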

In the SML# project, we write a Makefile that uses the dependency-generation functionality of SML# described above to perform the complicated sequence of compilation and linking steps. Invoking the make command on this Makefile recompiles only the files necessary to build the SML# compiler.
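
The rule by which make decides that a file is "necessary" to recompile can be sketched as a timestamp comparison: a target is rebuilt if it does not exist or if any prerequisite is newer than it. The Python fragment below is a simplified model of that rule, not the SML# Makefile itself; needs_rebuild and demo are hypothetical names, and the files a.sml and a.o are created in a temporary directory purely for the demonstration.

```python
import os
import tempfile


def needs_rebuild(target, prereqs):
    """True if target is missing or older than any of its prerequisites."""
    if not os.path.exists(target):
        return True
    built = os.path.getmtime(target)
    return any(os.path.getmtime(p) > built for p in prereqs)


def demo():
    """Return (result before editing the source, result after editing it)."""
    with tempfile.TemporaryDirectory() as d:
        src = os.path.join(d, "a.sml")  # hypothetical file names
        obj = os.path.join(d, "a.o")
        for path in (src, obj):
            open(path, "w").close()
        os.utime(src, (1000, 1000))  # source last modified at t=1000
        os.utime(obj, (2000, 2000))  # object file built later, at t=2000
        before = needs_rebuild(obj, [src])
        os.utime(src, (3000, 3000))  # source edited after the build
        after = needs_rebuild(obj, [src])
        return before, after


print(demo())
```

In this model the object file is left alone while it is newer than its source, and is rebuilt only once the source has been modified after the last build.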