PFAVR -- An ANS Forth Implementation for the Atmel AVR

Andrew Sterian
Padnos School of Engineering
Grand Valley State University


Version 1.2

February 28, 2004
Introduction | Compilation Process | Stripping Forth | Memory Analysis | Serial Driver | Forth Tests | FSEND


Introduction

This document contains several miscellaneous notes about the PFAVR implementation. Please read this file at least once if you are planning on working with PFAVR extensively.

Compilation Process

The compilation of PFAVR is complicated. It is complicated because:
The above complications are dealt with in the following general ways:
The final program is compiled with the PFAVR_PREBUILT manifest defined. The simulation-only version has this manifest left undefined.
The following sections describe the above processes in more detail.

Two-Stage Build Process

In the first stage, the "bootstrap" program is built in the bootstrap/ directory from the files in the src/ directory. Prior to doing so, Forth files are stripped (see below) in the system/ directory and C source file dependencies are calculated by a "make dep" in the bootstrap/ directory.

The key consideration in constructing the bootstrap program is that it places the components to be pre-built (dictionary and code space) in exactly the same memory locations as the final program (second stage).  The bootstrap program needs a lot more flash and external RAM space to store the Forth system dictionary files (in the system/ directory).

The only difference in code between the bootstrap and final program versions is in the file src/pf_build.c. Thus, this file must come last in the list of files to link (and the Makefile ensures this).

In the bootstrap stage, the src/pf_build.c file does the following:
Once the bootstrap version of PFAVR is constructed, it is run under the GDB/SimulAVR simulator. The bootstrap/batch.gdb file (created by the Python program tools/genbatch.py) contains a list of commands to GDB to run PFAVR and dump the dictionary and code space area to disk. The process is as follows (automatically performed by commands in the bootstrap/batch.gdb file):
These three header files are copied to the src/ directory and the bootstrap process is complete. These three header files are the entire result of the bootstrap process. The files are used as follows in the second build stage.
The second build stage occurs in the obj/ directory. The compilation process is the same except -DPFAVR_PREBUILT is added to the C compiler flags so that PFAVR_PREBUILT is a defined manifest during compilation.

A key consideration in this two-stage process is that the td_Dict dictionary and td_Exec code space areas must be in exactly the same memory locations in both stages (and, of course, have the same size). For this reason, the pf_core5.c and pf_core6.c source files (which define the dictionary and code space areas) are listed first in the set of source files in config.mk so that variations in compilation between the two stages do not affect the placement of these two files.

Section Control

Placing data and code in the right sections can often be a challenge with the GCC tools. The main sections of interest are as follows.
The compilation process ensures that the GCC library file crt0.o is linked first, thus the label _start (where the whole program begins) resides as the first location in the text section after the interrupt vectors, which begin at 0x0. The end of the text section is given by the C variable _etext (defined by the linker). Note that this is the address where the load-time data section begins (see below).
The data section has a "load address" given by the C variable __data_load_start (automatically created by the linker) and a runtime address given by the C variable __data_start. The end of the data section is given by __data_end. The load address is a place in flash (should be just beyond the end of the text section) where the initialization data is stored. The runtime address is a place in RAM where the data structure actually resides during program operation. Thus, the runtime loader automatically copies (__data_end-__data_start) bytes from __data_load_start in FLASH to __data_start in RAM prior to calling main().

In the second stage "final" program, the pre-built dictionary and code space are forced to reside in the text section, since if they resided in the data section they would occupy an unpredictable space in external RAM. By putting the dictionary and code space areas themselves in the bss section (see below), we can make sure their locations are the same in both build stages.
The section usage is described below.
At run time, the function pfBuildDictionary (in the second stage program) copies td_DictPrebuilt to td_Dict and td_ExecPrebuilt to td_Exec. One may well wonder why this scheme is necessary, specifically why td_Dict and td_Exec were not defined to reside in the data section and the GCC built-in loader used to initialize them. The answer is that the linker would not consistently place these in the same location between the pre-built and final programs. When they are placed in bss, the two-stage bootstrap process works as desired.

Stripping Forth

The Python program tools/fthstrip.py is a simple Forth file stripper. This program strips out all of the following from a Forth source file:
The program understands the following Forth constructs that define untouchable strings:
For example, the Forth string 'word1      word2' will be stripped to 'word1 word2', but '." Hello      there"' will be left as-is.

The Forth words ':', 'compile', '[compile]', and 'postpone' cause the next word to not define an untouchable string. Thus, 'postpone S" word1      word2' will indeed strip out the unnecessary space between word1 and word2. This hack allows the above words to be defined without indicating an untouchable string word.
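The stripping rules above can be illustrated with a minimal Python sketch. This is not the code of tools/fthstrip.py. It assumes that only whitespace runs are collapsed and that only the two string words appearing in the examples above (." and S") define untouchable strings. The names strip_line, STRING_WORDS and ESCAPE_WORDS are hypothetical.

```python
# Hypothetical sketch of the stripping idea; not the actual tools/fthstrip.py.
# Assumption: only ." and S" start untouchable strings (the real list may be longer).
STRING_WORDS = {'."', 's"'}
# From the text: after these words, the next word does not start an untouchable string.
ESCAPE_WORDS = {':', 'compile', '[compile]', 'postpone'}


def strip_line(line):
    """Collapse whitespace runs in one Forth line, keeping string bodies intact."""
    out = []
    prev = None
    pos = 0
    n = len(line)
    while pos < n:
        # Skip the whitespace run that separates two words.
        while pos < n and line[pos].isspace():
            pos += 1
        if pos >= n:
            break
        start = pos
        while pos < n and not line[pos].isspace():
            pos += 1
        word = line[start:pos]
        escaped = prev is not None and prev.lower() in ESCAPE_WORDS
        if word.lower() in STRING_WORDS and not escaped:
            # Copy everything up to and including the closing quote verbatim.
            end = line.find('"', pos + 1)
            if end == -1:
                end = n - 1  # unterminated string: keep the rest of the line
            out.append(word + line[pos:end + 1])
            pos = end + 1
            prev = None
        else:
            out.append(word)
            prev = word
    return ' '.join(out)
```

With this sketch, strip_line('word1      word2') gives 'word1 word2', while a line starting with ." keeps its string body unchanged, and a string word preceded by postpone is stripped like ordinary text.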

Clearly, this program is easily fooled and was only intended to work with the Forth files in the system/ directory, to minimize the amount of memory space required during the bootstrap process. Typing "make" in the system/ directory causes all *.FT files to be stripped with the results stored in *.FS files. The tools/fthwords.py Python program then takes all of the *.FS files and generates the "forthwords.h" C include file for constructing the second-stage program.

Memory Analysis

Typing "make analyze" in the top-level PFAVR directory runs the tools/analyzemem.py Python program. This program reads the obj/pforth.map linker map file and parses it to extract memory usage information. The program reports on how many bytes are used and how many are free in each memory region. Note that this program truly reports bytes; the number of FLASH words used is therefore the size of the text section divided by 2.
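The byte/word arithmetic can be sketched as follows. This is only an illustration and does not parse the map file as tools/analyzemem.py does. The function names and the region sizes in the usage note are hypothetical.

```python
# Hypothetical helpers; the real analyzemem.py reads obj/pforth.map instead.

def region_summary(used_bytes, total_bytes):
    """Return (used, free) in bytes for one memory region."""
    return used_bytes, total_bytes - used_bytes


def flash_words(text_bytes):
    """AVR flash is organized in 16-bit words, so words = bytes / 2.

    Assumes the text section size is an even number of bytes.
    """
    return text_bytes // 2
```

For example, with an illustrative 131072-byte flash region and a 50000-byte text section, region_summary(50000, 131072) reports 50000 bytes used and 81072 bytes free, and flash_words(50000) gives 25000 words.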

Serial Driver

The serial I/O driver built into PFAVR is implemented in C in the src/sio.c file. There are two implementations, selected by the PFAVR_INTERRUPT_SIO manifest in the top-level config.h file. When this manifest is undefined, no interrupts are used and serial I/O is performed using USART polling. There is no handshaking in this case.

When PFAVR_INTERRUPT_SIO is defined, the USART interrupts are used to perform interrupt-driven I/O and XON/XOFF handshaking. Some details of the implementation are described here. The full story, of course, is told by the code.

The receive and transmit buffers are defined as arrays of size SIO_RX_BUFSIZE and SIO_TX_BUFSIZE, respectively (both configurable in the top-level config.h file). As data is received by the interrupt service routine, it is stored in the circular receive buffer, from which it is removed when PFAVR calls the input() or inchar() C functions. If the circular buffer overflows, a brief error message is displayed. Similarly, when PFAVR calls the output() C function, characters are stored in the transmit buffer, after which they are sent by the interrupt service routine. Thus, output() can be considered to be non-blocking, except when the transmit buffer is full, in which case output() blocks until all characters to display are stored in the transmit buffer.

When an XOFF character is received, transmission stops immediately until an XON character is received. Upon reception, if the receive circular buffer has SIO_RX_XOFFLEVEL bytes in it (configurable in config.h) then an XOFF character is transmitted to the host. When the number of bytes in this circular buffer drops below SIO_RX_XONLEVEL, an XON character is transmitted to the host. The default values for these levels are 3/4 and 1/4 of the buffer size, respectively.
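The receive-side handshaking can be modelled in a few lines of Python. This is a behavioural sketch only, not a translation of src/sio.c. The class name RxModel and the 64-byte buffer size are hypothetical, and the sketch assumes the driver remembers that it has paused the host so that XOFF is sent once per crossing of the threshold.

```python
from collections import deque

XON, XOFF = 0x11, 0x13  # standard ASCII DC1/DC3 control characters


class RxModel:
    """Hypothetical model of the receive buffer and its XON/XOFF thresholds."""

    def __init__(self, bufsize=64):
        self.bufsize = bufsize
        self.xoff_level = bufsize * 3 // 4  # default: 3/4 of the buffer
        self.xon_level = bufsize // 4       # default: 1/4 of the buffer
        self.buf = deque()
        self.host_paused = False            # assumed state flag
        self.sent = []                      # control characters sent to the host

    def on_receive(self, byte):
        """Called for each received byte; returns False on overflow."""
        if len(self.buf) >= self.bufsize:
            return False  # the real driver displays a brief error message
        self.buf.append(byte)
        if not self.host_paused and len(self.buf) >= self.xoff_level:
            self.sent.append(XOFF)
            self.host_paused = True
        return True

    def inchar(self):
        """Remove one byte; resume the host once the buffer has drained."""
        byte = self.buf.popleft()
        if self.host_paused and len(self.buf) < self.xon_level:
            self.sent.append(XON)
            self.host_paused = False
        return byte
```

With a 64-byte buffer, the model sends XOFF when the 48th byte arrives and XON once reading has brought the count below 16.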

Note that rewriting the serial driver in assembly language is a good idea, as it is (most likely...but not certainly) a limiting factor in how quickly programs can be downloaded to PFAVR.

Forth Tests

The tests/ subdirectory contains various Forth files that test the PFAVR Forth implementation. These tests come from various sources. The t_xxx.ft tests all expect that the t_tools.ft file has been downloaded first, prior to each test. The README file in this subdirectory provides a guide for the various tests.

Note that the fsend.py program (see below) may be useful (on Unix-ish systems) for running the tests as it automatically strips the files. For example:
	cd tests
	python ../fsend/fsend.py t_tools.ft t_core.ft
The above commands will run the core word set test. Remember that you must download t_tools.ft prior to any other t_xxx.ft test as all of the words defined by t_tools.ft are forgotten by the actual test code (to save dictionary space).

FSEND

The FSEND program (fsend/fsend.py) is a simplistic Python program for sending Forth code to a running PFAVR session. Terminal programs like minicom and TeraTerm can also send text files to a target board; however, FSEND is useful because:
FSEND can also be used for running the Forth test programs, as described in the previous section.

FSEND is only meant for Unix environments.

For a quick help summary, type "python fsend.py -h" or just "fsend.py -h" if the fsend.py file has been given the execute flag.

FSEND automatically opens the I/O devices with XON/XOFF flow control. If this fails for whatever reason, FSEND can implement "soft" XON/XOFF flow control, although buffering in the kernel can make this nearly useless (i.e., by the time that FSEND has received an XOFF, there might be thousands of bytes already queued up in the kernel). In this case, implementing a short delay after each transmitted line (called line pacing) is useful. The --line-pacing option can be used to specify a pacing interval (try 0.05 to begin with and go up if you get errors, down to make downloads faster).
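Line pacing itself is simple, as the following sketch shows. This is not the code of fsend.py. The function send_paced and its parameters are hypothetical, and the sketch assumes that each line is terminated by a newline and that the caller supplies a write function for the already-opened serial device.

```python
import time


def send_paced(lines, write, pacing=0.05, sleep=time.sleep):
    """Send each line through write(), then wait `pacing` seconds.

    `write` is any callable accepting one string (an assumption; a real
    serial device would typically need the text encoded to bytes first).
    Returns the number of lines sent.
    """
    count = 0
    for line in lines:
        write(line.rstrip("\r\n") + "\n")  # assumed line terminator
        sleep(pacing)                      # give the target time to catch up
        count += 1
    return count
```

Passing the sleep function as a parameter keeps the sketch easy to try without real hardware: a list's append method can stand in for both write and sleep to record what would have been sent and how long the sender would have waited.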

FSEND is a quick-and-dirty program and could use quite a bit of improvement.


© 2003-2004, Copyright by Andrew Sterian; All Rights Reserved. mailto: steriana@claymore.engineer.gvsu.edu