PFAVR -- An ANS Forth Implementation for the Atmel AVR
Andrew Sterian
Padnos School of Engineering
Grand Valley State University
Introduction
This document contains several miscellaneous notes about the PFAVR
implementation. Please read this file at least once if you are planning
on working with PFAVR extensively.
Compilation Process
The compilation of PFAVR is complicated, for two reasons:
- The Forth words defined in the system/ directory must be
compiled into a pre-built dictionary that is ready to run
- The GNU AVR compiler tools must be forced to put things in the
right place
The above complications are dealt with in the following general ways:
- The pre-built dictionary is created by a two-stage build process.
- First, a simulation-only version of PFAVR is compiled that
contains the source files in the system/
directory as hard-coded strings in the program.
- The GDB AVR simulator is used to run this program and
construct the in-memory dictionary/code space. SimulAVR is used as the
back-end for GDB.
- The contents of the dictionary/code-space memory are dumped out
to C arrays.
- These C arrays are used to build a second version of PFAVR that
will be the final version. This version copies the contents of the C
arrays directly to the dictionary/code-space to provide a pre-built
dictionary of words.
The final program is compiled with the
PFAVR_PREBUILT manifest defined. The simulation-only version has this
manifest left undefined.
- Putting things in the right place is controlled by manipulating
the sections in which
data reside and manipulating where these sections start in the linking
phase.
The following sections describe the above processes in more detail.
Two-Stage Build Process
In the first stage, the "bootstrap" program is built in the bootstrap/
directory from the files in the src/ directory. Before that, the Forth
files in the system/ directory are stripped (see below) and C source
file dependencies are calculated by running "make dep" in the
bootstrap/ directory.
The key consideration in constructing the bootstrap program is that it
places the components to be pre-built (dictionary and code space) in
exactly the same memory locations as the final program (second stage).
The bootstrap program needs a lot more flash and external RAM space
than the final program, because it must also store the Forth system
dictionary files (in the system/ directory).
The only difference in code between the bootstrap and final program
versions is in the file src/pf_build.c.
This file must therefore come last in the list of files to link (and
the Makefile ensures this), so that the difference in its size does not
shift the placement of anything linked before it.
In the bootstrap stage, the src/pf_build.c
file does the following:
- The character array ForthWordsText
is defined as the string concatenation of all stripped Forth
definitions
in the system/ directory.
This string will be sent to the
Forth interpreter to build the dictionary. This process occurs in the LoadForthWordsText() function.
The ForthWordsText string
is very large, approximately 16k. This string will occupy space in
flash memory and will get copied to external memory at program load
time.
- The function pfBuildDictionary()
is the main entry point in this file. This function creates the
built-in
words one at a time. Then, it calls LoadForthWordsText() to add the
Forth words defined in the system/
directory.
Once the bootstrap version of PFAVR is constructed, it is run under the
GDB/SimulAVR simulator. The bootstrap/batch.gdb
file (created by the Python program tools/genbatch.py) contains a
list of commands to GDB to run PFAVR and dump the dictionary and code
space area to disk. The process is as follows (automatically performed
by commands in the bootstrap/batch.gdb
file):
- Load the bootstrap PFAVR into the simulator
- Set a breakpoint at the gdb_stop_here()
C function (an empty function that is simply a convenient place to stop
at).
- Run the program until the breakpoint. This breakpoint occurs in pfBuildDictionary(), after the
dictionary has been completely built.
- Execute a variety of GDB commands to dump the dictionary and code
space area to a file named temp.out.
- Run a Python program tools/doprebuild.py
to process the temp.out
file and generate C header files predict.h, preexec.h, and prevars.h.
These three header files are copied to the src/ directory and the
bootstrap process is complete. They are the entire result of the
bootstrap process, and are used as follows in the second build stage.
- The predict.h file
contains the contents of the dictionary as one large C string. It is
used in src/pf_build.c to initialize td_DictPrebuilt, which will be
copied at run-time from program memory to td_Dict in external RAM.
- The preexec.h file
contains the contents of the code space area as one large C string. It
is used in src/pf_build.c
to initialize td_ExecPrebuilt,
which will be
copied at run-time from program memory to td_Exec in external RAM.
- The prevars.h file
is used in src/pf_build.c
to initialize some variables like td_DictPtr
(to point to the last dictionary entry), td_ExecPtr (to point to the
first free location in code space), etc. The variables td_Dict_check and td_Exec_check store the
starting
addresses of the dictionary and code space area in the bootstrap
program. They are compared against the same addresses in the second
stage program, and of course they must be identical (since pointers to
absolute addresses are used extensively).
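The conversion from a raw memory dump to a header such as predict.h can
be pictured as follows. This is a minimal Python sketch, not the actual
tools/doprebuild.py: the function name bytes_to_c_string, the
octal-escape layout, and the sample bytes are all assumptions made for
illustration.

```python
def bytes_to_c_string(data, per_line=16):
    """Render raw bytes as adjacent C string literals, one per output line.

    Three-digit octal escapes are used because they have a fixed length,
    so a digit that follows an escape cannot be absorbed into it (which
    can happen with hexadecimal escapes in C).
    """
    lines = []
    for start in range(0, len(data), per_line):
        chunk = data[start:start + per_line]
        lines.append('"' + "".join("\\%03o" % b for b in chunk) + '"')
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical four-byte dump standing in for the real dictionary image.
    print(bytes_to_c_string(bytes([0x00, 0x41, 0xFF, 0x0A])))
```

Adjacent string literals are concatenated by the C compiler, so the
generated lines can be dropped directly into an array initializer.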
The second build stage occurs in the obj/ directory. The compilation
process is the same except -DPFAVR_PREBUILT
is added to the C compiler flags so that PFAVR_PREBUILT is a defined
manifest during compilation.
A key consideration in this two-stage process is that the td_Dict dictionary and td_Exec code space areas must
be in exactly the same memory locations in both stages (and, of course,
have the same size). For this reason, the pf_core5.c and pf_core6.c source files (which
define the dictionary and code space areas) are listed first in the set
of source files in config.mk
so that variations in compilation between the two stages do not affect
the placement of these two files.
Section Control
Placing data and code in the right sections can often be a challenge
with the GCC tools. The main sections of interest are as
follows.
- The text section
contains program code. We also explicitly place the pre-built
dictionary and code space data in the text section when compiling the
second stage "final program" to not waste memory (see the data section description
below). Note that the word "text" is used to mean two different things:
the name of an actual linker relocatable section, and a region in the memory map. The text
linker section does go into the text memory map region, but other
things
do too.
The compilation process ensures that the GCC library file crt0.o is
linked first, thus the label _start (where the whole program begins)
resides as the first location in the text section after the interrupt
vectors, which begin at 0x0.
The end of the text section is given by the C variable _etext (defined
by the linker). Note that this is the address where the load-time data
section begins (see below).
- The data section
contains global variables that are initialized. This section actually
occupies two places in the memory map: the run-time section and the
load-time section. The run-time section is where the "live" variables
reside in external RAM. These are the addresses that C code actually
manipulates when
working with these variables. The load-time section contains the
initialization data for these variables in flash. Before main() is called, built-in code
generated by GCC copies all data from the load-time section to the
run-time section, thus initializing the variables.
The data section has a "load address" given by the C variable
__data_load_start (automatically created by the linker) and a runtime
address given by the C variable __data_start. The end of the data
section is given by __data_end.
The load address is a place in flash (should be just beyond the end of
the text section) where the initialization data is stored. The runtime
address is a place in RAM where the data structure actually resides
during program operation. Thus, the runtime loader automatically copies
(__data_end-__data_start) bytes from __data_load_start in FLASH to
__data_start in RAM prior to calling main().
In the second stage "final" program, the pre-built dictionary and code
space are forced to reside in the text section, since if they resided
in the data section they would occupy an unpredictable location in
external RAM. By putting the dictionary and code space areas in the bss
section (see below) we can make sure their locations are unchanged in
both build stages.
- The bss section
contains global variables that are not initialized. The contents of
this
section are automatically set to 0 by the loader prior to calling main(). The beginning of this
section is given by the C variable __bss_start
and ends at __bss_end.
The section usage is described below.
- The stack (td_Stack)
resides in bss.
- The return stack (td_Return)
resides in bss.
- The dictionary (td_Dict)
resides in bss.
- The code space area (td_Exec)
resides in bss.
- The pre-built dictionary (td_DictPrebuilt)
resides in text.
- The pre-built code space area (td_ExecPrebuilt) resides in text.
At run time, the function pfBuildDictionary() (in the second stage
program) copies td_DictPrebuilt to td_Dict and td_ExecPrebuilt to
td_Exec. One may well wonder why this scheme is necessary, specifically
why td_Dict and td_Exec were not defined to reside in the data section,
with the GCC built-in loader used to initialize them. The answer is
that the linker would not consistently place them in the same location
in the bootstrap and final programs. When they are placed in bss, the
two-stage bootstrap process works as desired.
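The start-up sequence described in this section can be modelled in a few
lines of Python. This is only a toy model: the memory sizes, every
address, and the byte values are made up, and the names startup() and
build_dictionary() merely stand in for the GCC start-up code and
pfBuildDictionary().

```python
# Toy model of flash and external RAM; all sizes and addresses are made up.
flash = bytearray(64)
ram = bytearray(b"\xAA" * 64)        # RAM holds arbitrary values at power-up

# Load-time image of the data section, stored in flash after the text section.
DATA_LOAD_START = 16
flash[DATA_LOAD_START:DATA_LOAD_START + 4] = b"\x01\x02\x03\x04"

# Pre-built dictionary image (td_DictPrebuilt), stored in the text section.
DICT_PREBUILT = 32
DICT_IMAGE = b"DICT"
flash[DICT_PREBUILT:DICT_PREBUILT + len(DICT_IMAGE)] = DICT_IMAGE

# Run-time layout in RAM.
DATA_START, DATA_END = 0, 4          # initialized globals
BSS_START, BSS_END = 4, 20           # uninitialized globals
TD_DICT = 8                          # td_Dict lives inside bss


def startup():
    """What happens before main(): copy the data section, zero the bss section."""
    size = DATA_END - DATA_START
    ram[DATA_START:DATA_END] = flash[DATA_LOAD_START:DATA_LOAD_START + size]
    ram[BSS_START:BSS_END] = bytes(BSS_END - BSS_START)


def build_dictionary():
    """Second-stage step: copy the pre-built image from flash into td_Dict."""
    size = len(DICT_IMAGE)
    ram[TD_DICT:TD_DICT + size] = flash[DICT_PREBUILT:DICT_PREBUILT + size]


if __name__ == "__main__":
    startup()
    build_dictionary()
    print(ram[:BSS_END].hex())
```

Because td_Dict is reserved in bss in both stages, only its contents
differ between the two programs, never its address.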
Stripping Forth
The Python program tools/fthstrip.py
is a simple Forth file stripper. This program strips out all of the
following from a Forth source file:
- Unnecessary whitespace
- Blank lines
- Single-line comments beginning with \
- Comments in parentheses (e.g., " ( a b -- x ) ")
The program understands the following Forth constructs that define
untouchable strings:
- ."
- ABORT"
- S"
- C"
- .( string string string )
For example, the Forth string 'word1    word2' will be stripped to
'word1 word2', but '."    Hello there"' will be left as-is.
The Forth words ':', 'compile', '[compile]', and 'postpone' cause the
next word to be treated as an ordinary word rather than as the start of
an untouchable string. Thus, in 'postpone S"   word1    word2' the
unnecessary space between word1 and word2 will indeed be stripped out.
This hack allows the string words listed above to be defined, compiled,
or postponed without the text that follows them being mistaken for an
untouchable string.
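The stripping rules above can be sketched in Python as follows. This is
an illustrative reimplementation, not the actual tools/fthstrip.py. The
names strip_line and strip_source are hypothetical. The sketch also
assumes that comments and strings never span more than one line, and
that the word following ':' or 'postpone' is never treated as a
comment.

```python
# Words that start an untouchable string, mapped to the closing delimiter.
STRING_WORDS = {'."': '"', 'ABORT"': '"', 'S"': '"', 'C"': '"', '.(': ')'}
# Words after which the next word is copied as an ordinary word.
QUOTING_WORDS = {':', 'COMPILE', '[COMPILE]', 'POSTPONE'}


def strip_line(line):
    """Strip comments and redundant whitespace from one line of Forth."""
    out = []
    pos, n = 0, len(line)
    literal_next = False
    while True:
        while pos < n and line[pos].isspace():      # skip leading whitespace
            pos += 1
        if pos >= n:
            break
        end = pos
        while end < n and not line[end].isspace():  # find the end of the word
            end += 1
        word, pos = line[pos:end], end
        key = word.upper()
        if literal_next:                  # word following ':' , 'postpone', ...
            out.append(word)
            literal_next = False
        elif key == '\\':                 # comment to end of line
            break
        elif key == '(':                  # parenthesized comment
            close = line.find(')', pos)
            pos = n if close < 0 else close + 1
        elif key in STRING_WORDS:         # copy the string text untouched
            close = line.find(STRING_WORDS[key], pos)
            stop = n if close < 0 else close + 1
            out.append(word + line[pos:stop])
            pos = stop
        else:
            out.append(word)
            literal_next = key in QUOTING_WORDS
    return " ".join(out)


def strip_source(text):
    """Strip every line and drop the lines that end up empty."""
    stripped = (strip_line(line) for line in text.splitlines())
    return "\n".join(s for s in stripped if s)


if __name__ == "__main__":
    print(strip_source(': add2 ( n -- n+2 )   2 +  ; \\ adds two\n\n."   Hi   there"'))
```

A word-by-word scanner like this is easy to fool, which matches the
caveat that the stripper is only intended for the files in system/.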
Clearly, this program is easily fallible and was only intended to work
with the Forth files in the system/
directory, to minimize the amount of memory space required during the
bootstrap process. Typing "make" in the system/ directory causes all
*.FT files to be stripped with the results stored in *.FS files. The tools/fthwords.py Python
program
then takes all of the *.FS files and generates the "forthwords.h" C
include file for constructing the bootstrap (first-stage) program.
Memory Analysis
Typing "make analyze" in the top-level PFAVR directory runs the tools/analyzemem.py Python
program. This program reads the obj/pforth.map
linker map file and parses it to extract memory usage information. The
program reports on how many bytes are used and how many are free in
each memory region. Note that this program reports sizes in bytes, so
the number of (16-bit) FLASH words used is the size of the text section
in bytes divided by 2.
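The byte-to-word arithmetic is sketched below in Python. The function
names and the region sizes are hypothetical and are not taken from
tools/analyzemem.py or from any particular AVR device.

```python
def region_usage(used_bytes, total_bytes):
    """Return (used, free) byte counts for one memory region."""
    return used_bytes, total_bytes - used_bytes


def flash_words(text_bytes):
    """Convert a text-section size in bytes to 16-bit flash words."""
    return text_bytes // 2


if __name__ == "__main__":
    # Hypothetical numbers: 100000 bytes of text in a 131072-byte flash.
    used, free = region_usage(100000, 131072)
    print("flash: %d bytes used, %d bytes free, %d words used"
          % (used, free, flash_words(used)))
```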
Serial Driver
The serial I/O driver built-in to PFAVR is implemented in C in the src/sio.c file. There are two
implementations, selected by the PFAVR_INTERRUPT_SIO manifest in the
top-level config.h file.
When this manifest is undefined, no interrupts are used and serial I/O
is performed using USART polling. There is no handshaking in this case.
When PFAVR_INTERRUPT_SIO is defined, the USART interrupts are used to
perform interrupt-driven I/O and XON/XOFF handshaking. Some details of
the implementation are described here. The full story, of course, is
told by the code.
The receive and transmit buffers are defined as arrays of size
SIO_RX_BUFSIZE and SIO_TX_BUFSIZE, respectively (both configurable in
the top-level config.h
file). As data is received by the interrupt service routine, it is
stored in the circular receive buffer, where it is removed when PFAVR
calls the input() or inchar() C functions. If the
circular buffer overflows, a brief error message is displayed.
Similarly, when PFAVR calls the output() C function, characters are
stored in the transmit buffer, after which they are sent by the
interrupt service routine. Thus, output() can be considered to be
non-blocking, except when the transmit buffer is full, in which case
output() blocks until all of the characters to display have been stored
in the transmit buffer.
When an XOFF character is received, transmission stops immediately
until an XON character is received. Upon reception, if the receive
circular buffer has SIO_RX_XOFFLEVEL bytes in it (configurable in config.h) then an XOFF
character is transmitted to the host. When the number of bytes in this
circular buffer drops below SIO_RX_XONLEVEL, an XON character is
transmitted to the host. The default values for these levels are 3/4
and
1/4 of the buffer size, respectively.
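The receive-side handshaking can be simulated with a short Python
model. This is a sketch of the watermark logic only, not a translation
of src/sio.c. The class name, the host_paused flag, the use of ">=" for
the XOFF test, and the choice to drop bytes on overflow are all
assumptions. XON and XOFF are the conventional ASCII DC1 and DC3 codes.

```python
import collections

XON, XOFF = 0x11, 0x13      # ASCII DC1 / DC3


class RxFlowControl:
    """Model of a receive buffer with XOFF/XON watermarks."""

    def __init__(self, bufsize):
        self.bufsize = bufsize
        self.xoff_level = (3 * bufsize) // 4   # default: 3/4 of the buffer
        self.xon_level = bufsize // 4          # default: 1/4 of the buffer
        self.buf = collections.deque()         # stands in for the circular array
        self.host_paused = False               # True after XOFF has been sent
        self.sent = []                         # control characters sent to host
        self.overflows = 0

    def receive(self, byte):
        """Called for each received character (the ISR's job in the driver)."""
        if len(self.buf) >= self.bufsize:
            self.overflows += 1                # the real driver reports an error
            return
        self.buf.append(byte)
        if not self.host_paused and len(self.buf) >= self.xoff_level:
            self.sent.append(XOFF)
            self.host_paused = True

    def read(self):
        """Called by the consumer, like input() or inchar()."""
        byte = self.buf.popleft()
        if self.host_paused and len(self.buf) < self.xon_level:
            self.sent.append(XON)
            self.host_paused = False
        return byte


if __name__ == "__main__":
    fc = RxFlowControl(16)
    for i in range(12):
        fc.receive(i)
    while fc.buf:
        fc.read()
    print([hex(c) for c in fc.sent])
```

The gap between the two levels gives hysteresis, so the driver does not
send an XOFF/XON pair for every character when the buffer hovers around
a single threshold.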
Note that rewriting the serial driver in assembly language is a good
idea, as it is (most likely...but not certainly) a limiting factor in
how quickly programs can be downloaded to PFAVR.
Forth Tests
The tests/ subdirectory
contains various Forth files that test the PFAVR Forth implementation.
These tests come from various sources. The t_xxx.ft tests all expect that
the t_tools.ft file has
been downloaded first, prior to each
test. The README file in this subdirectory provides a guide for the
various tests.
Note that the fsend.py
program (see below) may be useful (on Unix-ish systems) for running the
tests as it automatically strips the files. For example:
cd tests
python ../fsend/fsend.py t_tools.ft t_core.ft
The above commands will run the core word set test. Remember that you
must download t_tools.ft
prior to any other t_xxx.ft
test as all of the words defined by t_tools.ft are forgotten by the
actual test code (to save dictionary space).
FSEND
The FSEND program (fsend/fsend.py) is a simplistic Python program for
sending Forth code to a running PFAVR session. Terminal programs like
minicom and TeraTerm can also send text files to a target board;
however, FSEND is useful because:
- FSEND automatically strips Forth source prior to download (unless
disabled with the --no-strip
option) to save download time (see Stripping
Forth above)
- FSEND displays output from PFAVR as it is generated, allowing
source errors to be identified faster (some terminal programs also do
this, but not all)
FSEND can also be used for running the Forth test programs, as
described in the previous section.
FSEND is only meant for Unix environments.
For a quick help summary, type "python fsend.py -h" or just "fsend.py
-h" if the fsend.py file
has been given the execute flag.
FSEND automatically opens the I/O devices with XON/XOFF flow control.
If this fails for whatever reason, FSEND can implement "soft" XON/XOFF
flow control, although buffering in the kernel can make this nearly
useless (i.e., by the time that FSEND has received an XOFF, there might
be thousands of bytes already queued up in the kernel). In this case,
implementing a short delay after each transmitted line (called line pacing) is useful. The --line-pacing option can be
used
to specify a pacing interval (try 0.05 to begin with and go up if you
get errors, down to make downloads faster).
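Line pacing itself is simple, as the following Python sketch shows. It
is not taken from fsend.py: the function name send_paced, the carriage
return line terminator, and the injectable sleep parameter are
assumptions, and an in-memory stream stands in for the serial device.

```python
import io
import time


def send_paced(lines, port, pacing=0.05, sleep=time.sleep):
    """Write each line to port, pausing for `pacing` seconds after each one.

    `port` is any binary stream with write() and flush(). The pause gives
    the target time to process a line before the next one arrives.
    """
    for line in lines:
        port.write(line.encode("ascii") + b"\r")   # terminator is an assumption
        port.flush()
        sleep(pacing)


if __name__ == "__main__":
    fake_port = io.BytesIO()                       # stand-in for a serial device
    send_paced([": sq dup * ;", "3 sq ."], fake_port, pacing=0.05)
    print(fake_port.getvalue())
```

Flushing after every line matters here: without it the pause would only
delay data that is still sitting in a user-space buffer.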
FSEND is a quick-and-dirty program and could use quite a bit of
improvement.
© 2003-2004, Copyright by Andrew Sterian;
All Rights Reserved. mailto: steriana@claymore.engineer.gvsu.edu