Punyforth is a simple, stack-based, Forth inspired programming language that primarily targets Internet of Things (IOT) devices, like the ESP8266. The ESP8266 is a low-cost Wi-Fi capable chip with a 80 MHz Xtensa LX3 32 bit CPU, TCP/IP stack, GPIO pins and 512 KiB to 4 MiB flash memory. It is widely used in IoT applications and home automation projects.
Punyforth also runs on x86 (Linux), ARM (Raspberry PI) but these are not the primary supported targets.
- Simple
- Highly interactive
- Extensible
- Small memory footprint and resource efficiency
The easiest way to try out Punyforth is to use a ESP8266 based development board that has USB to serial interface on board (Geekcreit/Doit, Amica, WeMos, LoLin). Connect the development board to your computer via USB. Let's assume the serial port is COM3.
$ cd arch/esp8266/bin
$ python modules.py core
$ flash com3
At first we select the modules to be installed using the modules.py python script. Then we install both Punyforth and the selected modules (this time only the core library) to the ESP8266 using the flash script.
Open a serial terminal1 on port COM3 then type:
println: "Hello world!"
1: Baud rate: 115200 bps. Local echo: on, line mode: enabled. You can find some free terminal emulators here.
4
dup
+
.
This should give you the following output.
(stack)
(stack 4)
(stack 4 4)
(stack 8)
(stack)
8
Congratulation, you've just doubled a number and printed out the result in the REPL.
Punyforth is a simple, imperative, stack-based, concatenative programming language and interactive environment with good metaprogramming support and extensibility.
The Forth environment combines the compiler with an interactive shell (REPL), where the user can define functions called words.
Punyforth does not have local variables, instead values are kept on a stack. This stack is used only for storing data. There is a separate return stack that stores information about nested subroutine calls. Both stacks are first-class in the language.
As a consequence of the stack, Punyforth uses a form of syntax known as Reverse Polish or Postfix Notation.
If you type the following code in the REPL:
1 2 +
The interpreter pushes the number 1 then the number 2 onto the data stack. It executes the word +, which pops the two top level items off the stack, calculates their sum, and pushes the result back to the stack.
The following code calculates a * a + b * b.
2 3 \ let's say a is 2, b is 3
dup * swap dup * + . \ prints out 13
The word dup duplicates the top level item of the stack. The word swap exchanges the two top level items of the stack.
Stack visualization:
2 3 3 9 2 2 4 13 2 3 2 9 2 9 2 9
dup and swap are stack shuffle words. Excessive use of words like them makes the code hard to follow, so it is advisable to use them sparingly. There are many ways to reduce the number of stack shuffles, one of them is to use quotations and combinators.
For example the above code could have been expressed the following way:
2 3 { dup * } bi@ +
See the chapter about quotations and combinators for more information.
Punyforth is heavily inspired by the Forth programming language. It uses the same compilation model (outer interpreter, compiler, modes, dictionary, immediate words, etc.) as other Forth systems. Punyforth is bootstrapped from a small set of primitives written in assembly language. The compiler targets these primitives and compiles indirect-threaded code. Higher level abstractions are built on top of the primitives therefore most of the system is written in itself (in Forth).
- Punyforth is case sensitive
- Strings are null-terminated
- Strings are created and printed differently (str: "foobar", print: "foobar" instead of s" foobar", ." foobar")
- Parsing words are ended with a colon character by convention (including variable:, constant:, create: does>)
- Defining a word in terms of itself results recursion by default (use the override word to alter this behaviour)
- Curly brackets denote quotations instead of locals
Punyforth supports exception handling, multitasking, socket and GPIO APIs and comes with a UART and a TCP REPL.
During programming, the user uses the REPL to write and test small piece of codes or to extend the languge with new words (which are called subroutines or functions in other languages).
The REPL (also known as the Forth Outer/Text Interpreter) operates in 2 modes. In interpretation mode, it immediately executes the words that the user typed in. In compilation mode (when you start a new word definition), its action depends on the compilation semantic of the current word. In most cases it compiles the execution token (pointer to the word) into the word to be defined. However, if the current word is flagged as immediate, the compiler executes the word at compile time so the word can define its own compilation semantic. This is a bit similar to Lisp macros. Control structures are implemented as immediate words in Forth.
Forth has almost no syntax. It grabs tokens separated by whitespace, looks them up in a dictionary then executes either their compilation or interpretation semantic. If the token is not found in the dictionary, it tries to convert it to a number. Eeverything in Forth is either a word or a number. Because of the postfix notation there are no precedence rules and parentheses.
This is an example of
valid Forth syntax 123 *&^%$#@2
Words are stored in a dictionary. The dictionary maps words to executable code or data structures.
You can use defining words to extend the dictionary with new definitions. The most basic defining words is the : (colon). This adds a new word to the dictionary with the behavior defined in terms of existing words. A colon definition begins with a colon and ends with a semicolon.
: square ( n -- n^2 ) dup * ;
4 square . \ prints 16
In the above example we created a new word called square that takes a number off the stack, multiplies it with itself, then leaves the result on the stack. The ( n -- n^2 ) is the optional stack effect comment indicating the input and output parameters.
Other common defining words are variable: and constant:.
variable: var1 \ create a variable 'var1' without initializing
54 init-variable: var2 \ create a variable 'var2' and initialize it to 54
42 constant: answer \ create a constant 'answer' with the value 42
var2 @ var1 ! \ assigns the value of var2 to var1
var1 ? \ prints out 54
answer . \ prints out 42
Punyforth supports the regular Forth conditional and loop words.
General form of if else then.
<bool> if <consequent> else <alternative> then
For example:
: max ( a b -- max )
2dup < if nip else drop then ;
10 100 max . \ prints 100
The else part can be omitted.
: abs ( n -- absn )
dup 0< if -1 * then ;
-10 abs . \ prints 10
Punyforth also supports switch-case like flow control logic as shown in the following example.
: day ( n -- )
case
1 of print: "Monday" endof
2 of print: "Tuesday" endof
3 of print: "Wednesday" endof
4 of print: "Thursday" endof
5 of print: "Friday" endof
6 of print: "Saturday" endof
7 of print: "Sunday" endof
print: "Unknown day: " .
endcase ;
The limit and start before the word do defines the number of times the loop will run.
<limit> <start> do <loop-body> loop
Do loops iterate through integers by starting at start and incrementing until you reach the limit. The word i pushes the loop index onto the stack. In a nested loop, the inner loop may access the loop variable of the outer loop by using the word j.
For example:
5 0 do i . loop \ prints 01234
There is an other version of the do loop where you can define the increment (which can be negative as well).
<limit> <start> do <loop-body> <increment> +loop
For example:
10 0 do i . 2 +loop \ prints 02468
If the increment is negative then limit is inclusive.
0 8 do i . -2 +loop \ prints 86420
It is important to know that Do loops store the loop index on the return stack. You can break the semantics of i and j if you use the return stack to store temporary data. Also you can't simply exit a word from inside a do loop without clearing the return stack first. See unloop for more information.
begin <loop-body> <bool> until
The begin...until loop repeats until a condition is true. This loop always executes at least one time.
For example:
: countdown ( n -- )
begin
dup .
1- dup
0 < until
drop ;
5 countdown \ prints 543210
If you replace until with again and omit the condition then the loop will run indefinitely.
begin <loop-body> again
begin .. <bool> while <loop-body> repeat
For example:
: countdown ( n -- )
begin
dup 0 >=
while
dup . 1-
repeat
drop ;
5 countdown \ prints 543210
You can use the exit word to exit from the current word as well from the loop.
But this won't work with do loops. The reason for this is because do loops store the loop index on the return stack. You can use the unloop word to clear the return stack before exiting a do loop.
: some-word ( -- )
10 0 do
i 5 = if unloop exit then
loop ;
An unloop is required for each nesting level before the definition may be exited.
: nested-exit ( -- )
5 0 do
5 i 1+ do
j i + 7 = if
i . space j . cr
unloop unloop \ clear the return stack before exiting
exit
then
loop
loop ;
Control structres are compile time words with no interpretation semantics. They can be used only in compilation mode, that is inside a word definition.
If a word faces an error condition it can throw an exception. Your can provide exception handlers to catch exceptions.
For example:
exception: EZERODIV
: div ( q d -- r | throws:EZERODIV ) \ this word throws an exception in case of division by zero
dup 0= if
EZERODIV throw
else
/
then ;
: test-div ( q d -- r )
['] div catch
case
EZERODIV of
println: '/ by zero' \ print exception in case of zero division
2drop \ drop q d
endof
throw \ rethrow if it wasn't EZERODIV, or there was no exception (code=0)
endcase ;
The word catch expects an execution token of a word that potentially throws an exception.
The exeption mechanism in Punyforth follows the "catch everything and re-throw if needed" semantics. The instruction 0 throw is essentially a no-op and indicates no error.
An uncaught exception causes the program to print out the error and the stack trace to the standard output and terminate.
You can modify this behaviour by overriding the unhandled deferred word.
: my-uncaught-exception-handler ( code -- )
cr print: "Uncaught exception: " ex-type
abort ;
' unhandled is: my-uncaught-exception-handler
The implementation of exceptions is based on the idea of William Bradley.
Immediate words are executed at compile time. Loops and control structures are implemented with immediate words that compile the required semantics.
: begin
here \ saves the absolute address of the beginning of the loop to the stack
; immediate
: until
['] branch0 , \ compiles a conditional branch
here - cell - , \ calculate then compile the relative address
; immediate
Parsing words can parse the input stream. One example of a parsing word is the comment. There are 2 types of comments.
( this is a comment )
\ this is an other comment
: ( \ comments start with ( character
begin \ consume the stream until ) character is found
key ')' =
until
; immediate
: \ \ single line comments start with \ character
begin
key dup
'cr' = swap
'lf' = or
until \ consume the stream until cr or lf character is found
; immediate
The word hex: is an other example of a parsing word.
hex: FF \ pushes 255 onto the stack
This word interprets the input as a hexadecimal number then pushes it to the stack. Parsing words are similar to reader macros in Lisp.
Punyforth relies on a Hyper Static Global Environment. This means redefining a word will create a new definition, but the words continue to refer to the definition that existed when they were defined. You can alter this behaviour by using deferred words.
For example
: myword1 ( -- )
print: 'foo' ;
: myword2 ( -- )
myword1
print: 'bar' ;
: myword1 ( -- ) \ redefining myword1 to print out baz instead of foo
print: 'baz' ;
myword2 \ myword2 will print out foobar, not bazbar
Redefinition has no effect on myword2. Let's try it again. This time using the defer:/is: words.
defer: myword1
: myword2 ( -- )
myword1 \ I can define myword2 in terms of the (yet undefined) myword1
print: 'bar' ;
: printfoo ( -- ) print: 'foo' ;
: printbaz ( -- ) print: 'baz' ;
' myword1 is: printfoo \ redefine the deferred word to print out foo
myword2 \ this prints out foobar
' myword1 is: printbaz \ redefine the deferred word to print out baz
myword2 \ this prints out bazbar
You might want to redefine a word in terms of it's older definition.
For example:
: myword ( -- )
print: 'foo' ;
: myword ( -- )
myword
print: 'bar' ;
myword \ infinite recursion
Unfortunately this won't work because the myword inside the second defintion will refer to the new word, resulting infinite recursion. You can avoid this by marking the word with override.
: myword ( -- )
print: 'foo' ;
: myword ( -- ) override
myword
print: 'bar' ;
myword \ prints out foobar
Because the usage of override, the myword in the second defintion will refer to the old myword. Therefore the execution of myword will print out foobar.
A quotation is an anonymous word inside an other word, similar than a lambda expression in other languages. Quotations don't act as lexical closures, because there are no locals in Forth to close over. The word { starts compiling the quotation body into the current word definition. The word } ends the quotation by compiling an exit word into the quotation.
: a-word-definition ( -- )
( .. )
{ ( ..quotation body.. ) }
( .. ) ;
At runtime the quotation pushes its execution token onto the stack, therefore it can be used with execute, catch or combinators.
: demo ( -- n )
3 { 1+ 5 * } execute ;
% demo
(stack 20)
{ str: 'AF01z' hex>int } catch
if
println: 'invalid hex number'
abort
then
Punyforth supports a few Factor style combinators.
Calls a quotation while temporarily hiding the top item on the stack.
1 2 4 { + } dip \ Same as: 1 2 4 >r + r>
(stack 3 4)
Calls a quotation with an item on the stack, restoring that item after the quotation returns.
1 2 4 { + } keep \ Same as: 1 2 4 dup >r + r>
(stack 1 6 4)
Applies quotation p to x, then applies quotation q to x.
\ given a rectangle(width=3, height=4)
rectangle { .width @ } { .height @ } bi * \ Same as: rectangle dup .width @ swap .height @ *
(stack 12)
Applies quotation p to x, then applies quotation q to y.
str: "john" str: ".doe" { 1+ c@ } { 2 + c@ } bi* = \ Same as: str: "john" str: ".doe" swap 1+ c@ swap 2 + c@ =
(stack -1)
Applies the quotation to x, then to y.
str: "john" str: ".doe" { strlen } bi@ = \ Same as: str: "john" str: ".doe" swap strlen swap strlen =
(stack -1)
The word create: and does> lets you combine a data structure with an action. You can create multiple instances with different data content and with the same action.
create: is a defining word like the : (colon). It creates a new dictionary entry with the header but without the body. The name of the newly created definition comes from the input stream. Then you can lay out some data using the , (comma) word. The action which will operate on this data is the sequence of words that comes after the does> part. The pointer to the data is pushed to the stack before invoking the action.
One of the simplest application of create: does> is the definition of a constant.
: constant:
create: ,
does> @ ;
80 constant: COLUMNS
COLUMNS . \ prints out 80
- First we push the value 80 to the data stack
- Then we invoke the constant: word
- The word create: reads the name of the constant (COLUMNS) from the input stream and creates a new dictionary header
- The word , stores the value on the stack (80) in the body of the newly created dictionary entry
- The does> sets the action to be the @ (fetch) word which will read the constant value from the body
: array: ( size "name" -- ) ( index -- addr )
create: cells allot
does> swap cells + ;
10 array: numbers \ create an array with 10 elements
12 3 numbers ! \ store 12 in the 3rd element
3 numbers @ \ fetch the 3rd element
: struct 0 ;
: field:
create: over , +
does> @ + ;
struct
cell field: .width
cell field: .height
constant Rect
: new-rect: ( "name" -- )
Rect create: allot ;
: area ( rect -- area )
dup .width @ swap .height @ * ;
new-rect: r1
3 r1 .width !
5 r1 .height !
r1 area .
Words with name starting with the test: prefix are treated as unit tests. Unit testing words typically use assert or =assert to validate the correctness of an other word.
: test:add 1 2 + 3 =assert ;
: test:sub 8 3 - 5 =assert ;
test
The =assert word asserts that two top-most items on the stack are equal. The assert words asserts that the top-most item of the stack is true.
The test word runs all unit tests and gives back a simple report.
2 tests, 2 passed, 0 failed, 0 errors
All passed
A unit test can either pass/fail or raise an error. Failure means an assertion wasn't met. Error occurs in case of an unhandled exception.
: test:add 1 2 + 4 =assert ;
: test:sub some-exception throw ;
test
2 tests, 0 passed, 1 failed, 1 errors
test:add(3 4 <>) FAIL
test:sub ERROR: some-exception
There were failures
The ESP8266 has a built in Wi-Fi chip that can be used both in access point and station mode (wireless client).
In station mode, the ESP8266 connects to an existing Wi-Fi access point.
str: "password" str: "existing-ssid" wifi-connect
In AP mode, the ESP8266 acts as an central connection point, which wireless clients (smartphones, laptops) can connect to. In this mode you have to choose an IP address for the ESP and an IP range for the clients. Client IP addresses are assigned by the DHCP server.
172 16 0 1 >ipv4 wifi-set-ip \ AP ip is 172.16.0.1
4 3 0 AUTH_WPA2_PSK str: "1234567890" str: "my-ssid" wifi-softap \ max connections = 4
8 172 16 0 2 >ipv4 dhcpd-start \ dhcp max_leases = 8, first client ip is 172.16.0.2
The dhcp max_leases parameter should not be smaller than the maximum allowed connections.
The Wi-Fi settings are persistently stored by the ESP8266, there is no need to setup the Wi-Fi at every startup.
2 constant: PIN
PIN GPIO_OUT gpio-mode
PIN GPIO_HIGH gpio-write
250 ms
PIN GPIO_LOW gpio-write
See Philips Hue lightswitch example for more information.
Netconn is a sequential API on top of the lightweight TCP/IP stack of [FreeRTOS] (https://en.wikipedia.org/wiki/FreeRTOS). Punyforth provides a wrapper around the Netconn API.
512 buffer: line
: fetch ( netcon -- )
begin
dup 512 line netcon-readln -1 <>
while
line type cr
repeat
drop ;
80 str: "google.com" TCP netcon-connect constant: socket
socket str: "GET / HTTP/1.1\r\n\r\n" netcon-write
socket fetch
socket netcon-dispose
str: "Lorem ipsum" constant: data
str: "192.168.0.3" constant: SERVER_IP
8005 constant: SERVER_PORT
SERVER_PORT SERVER_IP UDP netcon-connect
dup data 11 netcon-send-buf
netcon-dispose
import select, socket
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.bind(('0.0.0.0', 8005))
s.setblocking(0)
while True:
result = select.select([s],[],[])
msg = result[0][0].recv(1024)
print(msg.strip())
str: "192.168.0.15" constant: HOST
8000 constant: PORT
128 buffer: data
PORT HOST netcon-udp-server
dup 128 data netcon-readln
print: 'received bytes: ' . cr
data type
netcon-dispose
import socket
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.sendto(b'hello\r\n', ('192.168.0.15',8000))
See Simple HTTP Server for more information.
Punyforth supports cooperative multitasking which enables users to run more than one task simultaneously. For example one task may wait for input on a socket, while another one receives commands through the serial port. Punyforth never initiates a context switch by its own. Instead, tasks voluntarily yield control periodically using the word pause. Tasks are executed in a round robin fashion.
In order to run some code in the background, one must create a new task first, using the task: parsing word. A tasks can be activated inside a word. This word usually does something in a loop and calls pause periodically to yield controll to other tasks.
task: mytask
: my-word
mytask activate
[...] pause [...]
deactivate
To start the task, first you have to switch to multi tasking mode first by executing the word multi. Then simply call the word that was associated to the task.
multi
my-word
Often tasks need to communicate with each other. A mailbox is a fixed size blocking queue where messages can be left for a task. Receiving from an empty mailbox or sending to a full mailbox blocks the current task.
\ create a mailbox with size 5
5 mailbox: mailbox1
\ create a task for the consumer
task: task-consumer
\ this word is executed by the task
: consumer ( task -- )
activate \ activate task
begin
mailbox1 mailbox-receive . \ receive and print one item from the mailbox
println: "received by consumer"
pause \ allow other tasks to run
again
deactivate ; \ deactivate task
multi \ switch to multitask mode
task-consumer consumer \ run the consumer
123 mailbox1 mailbox-send \ send some numbers to the consumer
456 mailbox1 mailbox-send
\ create a task for the counter
task: task-counter
\ this word is executed by the task
: counter ( task -- )
activate \ activate task
100 0 do
i . cr
500 ms
loop
deactivate ; \ deactivate task
multi \ switch to multitask mode
task-counter counter \ run the consumer
\ Returns the available free dictionary space.
freemem ( -- bytes )
\ Returns the available free memory.
osfreemem ( -- bytes )
\ Blocks all running tasks for the specified number of millisecond.
ms ( msec -- )
\ Blocks for the specified number of microsecond. This is implemented as busy loop. Use it if you need high precision delay.
us ( usec -- )
\ Sets the baud rate of the specied uart.
uart-set-bps ( bps uart-number -- )
You can see some example code under the examples directory.
Build instructions and further information is available at punyforth wiki.
Attila Magyar