PF11 -- An ANS Forth Implementation for the 68HC11

Andrew Sterian
Padnos School of Engineering
Grand Valley State University


Version 1.0

July 18, 2003
Introduction | Compilation Process | Stripping Forth | Memory Analysis | Merging BUFFALO | Serial Driver | Forth Tests | FSEND


Introduction

This document contains several miscellaneous notes about the PF11 implementation. Please read this file at least once if you are planning on working with PF11 extensively.

Compilation Process

The compilation of PF11 is complicated. It is complicated because:
The above complications are dealt with in the following general ways:
The final program is compiled with the PF11_PREBUILT manifest defined. The simulation-only version has this manifest left undefined.
The src/pf_config.h file chooses between the mutually exclusive PF11_RAM_TARGET and PF11_ROM_TARGET. The top-level makefile (through config.mk) sets PF11_ROM_TARGET if a ROM-based target is chosen, otherwise this manifest is left undefined and src/pf_config.h then sets PF11_RAM_TARGET.

By default, all functions are compiled into the text section ("low memory", before EEPROM). The TEXT2 manifest is defined in src/pf_config.h for the ROM-based target to place a function into the eeprom section (i.e., "high memory", after EEPROM). By manually assigning some functions to TEXT2, the hole is avoided.

This process is hit-and-miss. If the source code to PF11 changes, some reshuffling of functions and sections may be necessary.

The easiest solution is to just disable the EEPROM in the 68HC11 CONFIG register and treat all of memory as contiguous. In this case, the TEXT2 manifest can be left blank.

The following sections describe the above processes in more detail.

Two-Stage Build Process

In the first stage, the "bootstrap" program is built in the bootstrap/ directory from the files in the src/ directory. Prior to doing so, Forth files are stripped (see below) in the system/ directory and C source file dependencies are calculated by a "make dep" in the src/ directory.

The Makefile variable MEMX1_FILE is used in the bootstrap phase to determine the linker file that describes the memory map. This memory map is fake as it is only used for simulation. However, the key consideration in constructing this file is that it places the components to be pre-built (dictionary and code space) in exactly the same memory locations as the final memory map (second stage).  The bootstrap memory map needs a lot more ROM space (i.e., text section) to store the Forth system dictionary files (in the system/ directory).

The only difference in code between the bootstrap and final program versions is in the file src/pf_build.c. Thus, this file must come last in the list of files to link (and the Makefile ensures this).

In the bootstrap stage, the src/pf_build.c file does the following:
Once the bootstrap version of PF11 is constructed, it is run under the GDB simulator. The bootstrap/batch.gdb file (created by the Python program tools/genbatch.py) contains a list of commands to GDB to run PF11 and dump the dictionary and code space area to disk. The process is as follows:
These three header files are copied to the src/ directory and the bootstrap process is complete. These three header files are the entire result of the bootstrap process. The files are used as follows in the second build stage.
The second build stage occurs in the obj/ directory. The compilation process is the same except -DPF11_PREBUILT is added to the C compiler flags so that PF11_PREBUILT is a defined manifest during compilation. Also, the MEMX2_FILE Makefile variable is used to point to the final program linker configuration file to define the memory map.

Section Control

Placing data and code in the right sections can often be a challenge with the GCC tools, especially since both RAM-based and ROM-based targets must be accommodated. The main sections of interest are as follows.
The remaining space in this section is available for custom C code or for converting more PF11 C variables to reside in this section (by declaring them with the PAGE0 macro...see the src/pf_core.c file for examples).
The text memory map region also contains the rodata linker section, which represents read-only data like string constants, and variables declared with the const keyword. The linker places data in the rodata section right after the text section in the memory map.

The compilation process ensures that the GCC library file crt0.o is linked first, thus the label _start (where the whole program begins) resides as the first location in the text section. The end of the text section is given by the C variable _etext (defined by the linker). Note that this is the address where the rodata section begins.
The data section has a "load address" given by the C variable __data_image (automatically created by the linker) and a runtime address given by the C variable __data_section_start. The size of the data section is given by __data_section_size. The load address is a place in ROM (should be just beyond the end of the text section) where the initialization data is stored. The runtime address is a place in RAM where the data structure actually resides during program operation. Thus, the runtime loader automatically copies __data_section_size bytes from __data_image to __data_section_start prior to calling main().
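The startup copy described above can be pictured with a small Python simulation. This is only a sketch: the 64 KB bytearray stands in for the 68HC11 address space, and the addresses used in the usage note are made up for illustration. The three parameter names mirror the linker symbols named in the text.

```python
def copy_data_section(memory, data_image, data_section_start, data_section_size):
    """Simulate what the runtime loader does before calling main().

    memory is a bytearray modelling the address space (an assumption of
    this sketch); the other arguments mirror the linker symbols
    __data_image, __data_section_start and __data_section_size.
    """
    # Read the initialization data from its load address (in ROM)...
    image = memory[data_image:data_image + data_section_size]
    # ...and store it at the runtime address (in RAM).
    memory[data_section_start:data_section_start + data_section_size] = image


# Hypothetical addresses, chosen only for this example.
memory = bytearray(0x10000)
memory[0x9000:0x9004] = b"\x01\x02\x03\x04"   # initial values stored in ROM
copy_data_section(memory, 0x9000, 0x0100, 4)
```

After the call, the four bytes at 0x0100 hold the same values as the image at 0x9000, which is the state main() expects to find.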
The ROM-based target section usage is described first, as it's easier.
At run time, the function pfBuildDictionary (in the second stage program) copies td_DictPrebuilt to td_Dict and td_ExecPrebuilt to td_Exec. One may well wonder why this scheme is necessary, specifically why td_Dict and td_Exec were not defined to reside in the data section and the GCC built-in loader used to initialize them. The answer is that the linker would not consistently place these in the same location between the pre-built and final programs. When they are placed in bss, the two-stage bootstrap process works as desired.

The distinction between rodata and eeprom for td_DictPrebuilt and td_ExecPrebuilt is simply to allow the code to fit into both low memory and high memory. There is nothing special about placing td_ExecPrebuilt in the eeprom section.

For a RAM-based target, the section usage is as follows.
Note that the dictionary and code space are initialized directly from the C strings in predict.h and preexec.h rather than indirectly through td_DictPrebuilt and td_ExecPrebuilt (which are not defined for the RAM-based target).

The initialized structures (dictionary and code area) are placed in text instead of data so that memory is not wasted. If these structures were declared as normal initialized global variables (i.e., placed in data) then there would be both run-time and load-time space allocated for them, doubling the memory requirements for each structure. Since all code is going into RAM, placing dictionary and code space in text allows the pre-built structures to be loaded directly into the 68HC11 and still be mutable by user code.

As for placing the two stacks in text instead of bss, the goal here is to simplify the construction of the memory configuration file. The linker cannot be told to simply place everything (text, data, bss, the works) into one big memory area, which would really be the simplest thing to do on a 68HC11, since it has no virtual memory. Thus, you have to guess ahead of time where text will end and data will begin when constructing the memory configuration file. By making the data section as small as possible, the process becomes easier. It is also easier to modify the size of stacks and recompile without having to modify the memory configuration file.

If you compile for the RAM-based target and inspect obj/program.map you will notice that the entire data memory region occupies less than 600 bytes, thus making it easy to configure the memory map.

Stripping Forth

The Python program tools/fthstrip.py is a simple Forth file stripper. This program strips out all of the following from a Forth source file:
The program understands the following Forth constructs that define untouchable strings:
For example, the Forth string 'word1      word2' will be stripped to 'word1 word2', but '." Hello      there"' will be left as-is.

The Forth words ':', 'compile', '[compile]', and 'postpone' cause the next word to not define an untouchable string. Thus, 'postpone S" word1      word2' will indeed strip out the unnecessary space between word1 and word2. This hack allows the above words to be defined without indicating an untouchable string word.
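The whitespace rules above can be sketched in a few lines of Python. This is not the code of tools/fthstrip.py; it is a minimal illustration that handles one line at a time and assumes that only ." and S" open untouchable strings (the two constructs that appear in the examples above). The function name strip_line is hypothetical.

```python
# Words assumed to open an untouchable string (only these two are
# confirmed by the examples in the text).
STRING_WORDS = {'."', 's"'}
# Words after which the next word is treated as an ordinary word.
QUOTING_WORDS = {':', 'compile', '[compile]', 'postpone'}


def strip_line(line):
    """Collapse runs of whitespace, except inside untouchable strings."""
    out = []
    pos = 0
    prev = ''
    n = len(line)
    while pos < n:
        # Skip the whitespace that separates words.
        while pos < n and line[pos].isspace():
            pos += 1
        if pos >= n:
            break
        start = pos
        while pos < n and not line[pos].isspace():
            pos += 1
        word = line[start:pos]
        if word.lower() in STRING_WORDS and prev not in QUOTING_WORDS:
            # Copy everything up to the closing quote verbatim.
            end = line.find('"', pos)
            if end == -1:
                end = n - 1          # unterminated: keep the rest as-is
            word = line[start:end + 1]
            pos = end + 1
        out.append(word)
        prev = word.lower()
    return ' '.join(out)
```

Applied to the examples in the text, strip_line collapses 'word1      word2' to a single space, leaves '." Hello      there"' untouched, and collapses the space after 'postpone S"'.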

Clearly, this program is fragile and was only intended to work with the Forth files in the system/ directory, to minimize the amount of memory space required during the bootstrap process. Typing "make" in the system/ directory causes all *.FT files to be stripped with the results stored in *.FS files. The tools/fthwords.py Python program then takes all of the *.FS files and generates the "forthwords.h" C include file for constructing the second-stage program.

Memory Analysis

Typing "make analyze" in the top-level PF11 directory runs the tools/analyzemem.py Python program. This program reads the obj/program.map linker map file and parses it to extract memory usage information. The program reports on how many bytes are used and how many are free in each memory region. If the MERGE_BUFFALO option is set in the config.mk file (see below) then the analysis program assumes that BUFFALO occupies memory from 0xE000 to 0xFFFF and adjusts its results accordingly.
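The arithmetic behind such a report can be illustrated as follows. This sketch does not parse a map file and is not taken from tools/analyzemem.py. The function region_report and its parameters are hypothetical, and it assumes that "adjusting" for BUFFALO means removing the 0xE000 to 0xFFFF range from the space a region can offer.

```python
BUFFALO_START = 0xE000   # from the text above
BUFFALO_END = 0xFFFF     # inclusive


def region_report(origin, length, used, merge_buffalo=False):
    """Return (used, free) byte counts for one memory region.

    origin and length describe the region; used is the number of bytes
    already occupied. All three would come from the linker map file.
    """
    usable = length
    if merge_buffalo:
        region_end = origin + length - 1
        # Number of bytes the region shares with the BUFFALO range.
        overlap = min(region_end, BUFFALO_END) - max(origin, BUFFALO_START) + 1
        if overlap > 0:
            usable -= overlap
    return used, max(usable - used, 0)
```

For a made-up region starting at 0xB800 with length 0x4800 and 0x1000 bytes used, the free space is 0x3800 bytes without BUFFALO and 0x1800 bytes with it.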

Merging BUFFALO

If MERGE_BUFFALO=file.s19 is specified in the config.mk file and a ROM-based target is chosen, then once the second-stage program is built, it is converted from ELF format to Motorola S-records, then the S-records for the BUFFALO monitor (in the specified file.s19 file) are merged in. This process is performed by the tools/s19merge.py Python program. The program ignores everything except S1 records (data) and the S9 record (start address) of the first file specified.
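The merge rule can be sketched in Python as shown here. This is an illustration of the rule as stated, not the code of tools/s19merge.py. The function names are hypothetical, and the checksum validation is an addition of this sketch that relies on the standard S-record property that all bytes after the record type sum to 0xFF modulo 256.

```python
def srec_checksum_ok(record):
    """True if the record's trailing checksum byte is valid."""
    body = bytes.fromhex(record[2:])   # count, address, data, checksum
    return (sum(body) & 0xFF) == 0xFF


def merge_s19(first_lines, second_lines):
    """Keep S1 records from both files and the S9 record of the first."""
    data_records = []
    start_record = None
    for index, lines in enumerate((first_lines, second_lines)):
        for line in lines:
            record = line.strip()
            if record.startswith('S1'):
                if not srec_checksum_ok(record):
                    raise ValueError('bad checksum: ' + record)
                data_records.append(record)
            elif record.startswith('S9') and index == 0:
                start_record = record
            # Every other record type (S0 headers and so on) is ignored.
    if start_record is not None:
        data_records.append(start_record)
    return data_records
```

Which file is passed first decides whose S9 record survives, so the order of the arguments matters.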

The BUFFALO program buf34x.s19 in the misc/ directory is provided as a possible option. It comes from the Axiom Manufacturing CMM11E1 support tools and may not be suitable for other 68HC11 systems. The original BUFFALO from Motorola can be found by searching the net for buf34.asm and the as11.exe assembler which can be used to compile it.

If you want to merge a monitor other than BUFFALO it is certainly possible to do so. Just remember that the S19 file you specify will have its interrupt vector table used rather than PF11's and it must not conflict with the memory layout for PF11.

Serial Driver

The serial I/O driver built into PF11 is implemented in C in the src/sio.c file. There are two implementations, selected by the PF11_INTERRUPT_SIO manifest in the top-level config.h file. When this manifest is undefined, no interrupts are used and serial I/O is performed using SCI polling. There is no handshaking in this case.

When PF11_INTERRUPT_SIO is defined, the SCI interrupt is used to perform interrupt-driven I/O and XON/XOFF handshaking. Some details of the implementation are described here. The full story, of course, is told by the code.

The SCI interrupt vector is installed at 0xFFD6 if this location is writeable (perhaps NVRAM?). If a write to this location fails, then PF11 assumes that something like BUFFALO is present wherein the address at 0xFFD6 points to a secondary vector, i.e., a JMP >XXXX instruction somewhere in RAM. For example, BUFFALO has 0x00C4 stored at 0xFFD6. At 0x00C4 is a JMP instruction to the actual vector, and since it's in RAM, it can be changed. Thus, if writing directly to 0xFFD6 fails, PF11 will overwrite the XXXX in the JMP >XXXX instruction in RAM to its own vector location.

The receive and transmit buffers are defined as arrays of size SIO_RX_BUFSIZE and SIO_TX_BUFSIZE, respectively (both configurable in the top-level config.h file). As data is received by the interrupt service routine, it is stored in the circular receive buffer, from which it is removed when PF11 calls the input() or inchar() C functions. If the circular buffer overflows, a brief error message is displayed. Similarly, when PF11 calls the output() C function, characters are stored in the transmit buffer, after which they are sent by the interrupt service routine. Thus, output() can be considered to be non-blocking, except when the transmit buffer is full, in which case output() blocks until all characters to display are stored in the transmit buffer.

When an XOFF character is received, transmission stops immediately until an XON character is received. Upon reception, if the receive circular buffer has SIO_RX_XOFFLEVEL bytes in it (configurable in config.h) then an XOFF character is transmitted to the host. When the number of bytes in this circular buffer drops below SIO_RX_XONLEVEL, an XON character is transmitted to the host. The default values for these levels are 3/4 and 1/4 of the buffer size, respectively.
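The receive-side watermark logic can be modelled in Python as a rough illustration. The real driver is C code in src/sio.c and may differ in detail. The class RxRing, its method names, and the "paused" flag that prevents repeated XOFF characters are assumptions of this sketch; the 3/4 and 1/4 levels follow the defaults stated above.

```python
XON = 0x11    # ASCII DC1
XOFF = 0x13   # ASCII DC3


class RxRing:
    """Circular receive buffer with XON/XOFF watermarks (a model only)."""

    def __init__(self, size=64):
        self.buf = bytearray(size)
        self.size = size
        self.head = 0                       # next index to write
        self.tail = 0                       # next index to read
        self.count = 0                      # bytes currently stored
        self.xoff_level = (3 * size) // 4   # models SIO_RX_XOFFLEVEL
        self.xon_level = size // 4          # models SIO_RX_XONLEVEL
        self.paused = False                 # True after XOFF was sent
        self.sent = []                      # flow-control bytes sent to host

    def isr_receive(self, byte):
        """Called for each received byte; returns False on overflow."""
        if self.count == self.size:
            return False
        self.buf[self.head] = byte
        self.head = (self.head + 1) % self.size
        self.count += 1
        if self.count >= self.xoff_level and not self.paused:
            self.paused = True
            self.sent.append(XOFF)
        return True

    def inchar(self):
        """Remove and return the oldest byte, or None if empty."""
        if self.count == 0:
            return None
        byte = self.buf[self.tail]
        self.tail = (self.tail + 1) % self.size
        self.count -= 1
        if self.paused and self.count < self.xon_level:
            self.paused = False
            self.sent.append(XON)
        return byte
```

The gap between the two levels provides hysteresis, so the host is not flooded with alternating XOFF and XON characters when the buffer hovers around a single threshold.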

Note that rewriting the serial driver in assembly language is a good idea, as it is (most likely...but not certainly) a limiting factor in how quickly programs can be downloaded to PF11.

Forth Tests

The tests/ subdirectory contains various Forth files that test the PF11 Forth implementation. These tests come from various sources. The t_xxx.ft tests all expect that the t_tools.ft file has been downloaded first, prior to each test. The README file in this subdirectory provides a guide for the various tests.

Note that the fsend.py program (see below) may be useful (on Unix-ish systems) for running the tests as it automatically strips the files. For example:
	cd tests
	../fsend/fsend.py t_tools.ft t_core.ft
The above commands will run the core word set test. Remember that you must download t_tools.ft prior to any other t_xxx.ft test as all of the words defined by t_tools.ft are forgotten by the actual test code (to save dictionary space).

FSEND

The FSEND program (fsend/fsend.py) is a simplistic Python program for sending Forth code to a running PF11 session. Terminal programs like minicom and TeraTerm can also send text files to a target board; however, FSEND is useful because:
FSEND can also be used for running the Forth test programs, as described in the previous section.

FSEND is only meant for Unix environments.

For a quick help summary, type "python fsend.py -h" or just "fsend.py -h" if the fsend.py file has been given the execute flag.

FSEND automatically opens the I/O devices with XON/XOFF flow control. If this fails for whatever reason, FSEND can implement "soft" XON/XOFF flow control, although buffering in the kernel can make this nearly useless (i.e., by the time that FSEND has received an XOFF, there might be thousands of bytes already queued up in the kernel). In this case, implementing a short delay after each transmitted line (called line pacing) is useful. The --line-pacing option can be used to specify a pacing interval (try 0.05 to begin with and go up if you get errors, down to make downloads faster).
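Line pacing itself is simple, as this Python sketch shows. It is not FSEND's code. The function send_paced is hypothetical, the write and sleep callables are injected so the example runs without a serial port, and ending each line with a carriage return is an assumption of this sketch.

```python
import time


def send_paced(lines, write, line_pacing=0.05, sleep=time.sleep):
    """Send each line through write(), pausing line_pacing seconds after it.

    Returns the number of lines sent.
    """
    sent = 0
    for line in lines:
        # Replace whatever line ending the file used with a single CR
        # (an assumption; the target may expect something else).
        write(line.rstrip('\r\n') + '\r')
        sent += 1
        if line_pacing > 0:
            sleep(line_pacing)   # give the target time to process the line
    return sent


# Usage with stand-ins for the serial port and the clock.
out = []
delays = []
count = send_paced([': sq dup * ;\n', '3 sq .\n'], out.append, 0.05, delays.append)
```

In real use, write would be the write method of the opened serial device and sleep would be left at its default.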

FSEND is a quick-and-dirty program and could use quite a bit of improvement.


© 2003, Copyright by Andrew Sterian; All Rights Reserved. mailto: steriana@claymore.engineer.gvsu.edu