TAL Assembler
See also the UXN virtual machine
TAL is a Forth-like, two-pass assembly language that translates directly to UXN memory images.
Words
Words are up to 63 consecutive non-whitespace characters.
For instance loop, System, Mouse/x, my-routine and some_other_routine would all be examples of words.
The UXN instructions themselves (ADD, POP, LIT and so forth) are all words.
Some words have special interpretations.
Opcodes
See the UXN documentation for a full listing of opcodes, but BRK, INC, POP, NIP, SWP ... SFT as words all mean their respective opcodes.
These opcodes may be followed by the flags k, r or 2 to set the keep, return and short flags.
For instance INC2 as a word would increment a two-byte quantity at the top of the stack.
INC2k would keep the original value, resulting in x x+1 as the stack values.
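The effect of the short and keep flags can be sketched in Python. This is a toy model of INC only, not the UXN implementation: the function name and its parameters are hypothetical, the return flag is left out, and shorts are assumed to sit on the stack high byte first.

```python
def inc(stack, short=False, keep=False):
    """Apply INC to a working stack modelled as a list of byte values.

    `short` mirrors the 2 flag and `keep` mirrors the k flag (names are
    illustrative only).
    """
    width = 2 if short else 1
    operand = stack[-width:]                   # bytes read from the top of the stack
    value = int.from_bytes(bytes(operand), "big")
    result = (value + 1) % (1 << (8 * width))  # wrap around at 8 or 16 bits
    if not keep:
        del stack[-width:]                     # without k the operand is consumed
    stack.extend(result.to_bytes(width, "big"))
    return stack
```

With the stack holding the short 0x1234, `inc(stack, short=True)` models INC2 and leaves 0x1235, while `inc(stack, short=True, keep=True)` models INC2k and leaves 0x1234 followed by 0x1235.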
Numbers
Hexadecimal numbers are written with either two or four digits.
For instance 00 would be the single word 0x00.
0000 is equivalent to the two words 00 00.
UXN is big-endian: the value 0xFF00 is represented as the sequential words FF 00.
To disambiguate, numbers are usually prefixed with #, which also marks them as literals to be pushed onto the stack.
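As a rough illustration of the two-or-four-digit rule, the following Python sketch turns a number word into its bytes. The helper name is hypothetical, and it emits the high byte first, matching the FF 00 example.

```python
def parse_number(word):
    """Turn a two- or four-digit hexadecimal word into its list of bytes."""
    if len(word) not in (2, 4):
        raise ValueError(f"expected two or four hex digits, got {word!r}")
    # bytes.fromhex rejects non-hex characters and keeps the written order,
    # so the high byte of a four-digit number comes first.
    return list(bytes.fromhex(word))
```

Under these assumptions `parse_number("0000")` yields the same two bytes as parsing 00 twice, and a one- or three-digit word is rejected.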
Strings
Words may be captured as ASCII formatted strings.
Such strings are written "<word>.
For instance "foo would cause the bytes #66 #6f #6f #00 to be literally inserted into the memory image.
As " notation cannot capture whitespace, the #20 (space), #0a (newline) and #09 (tab) character constants are common.
Comments
Comments in TAL are written ( ... ) and support nesting. E.g. ( () ) is a valid comment. ( ( ) is not.
TAL does not have a way to "close all start comments" like Java and some other languages do.
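The nesting rule amounts to keeping a depth counter. The Python sketch below checks a piece of source character by character; it is a simplification with a hypothetical function name, and a real assembler may only recognise parentheses that stand alone as words.

```python
def comments_balanced(source):
    """Return True if every ( in `source` is closed by a matching )."""
    depth = 0
    for char in source:
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
            if depth < 0:      # a ) with no open comment
                return False
    return depth == 0          # anything still open is an error
```

The two examples given above behave as described: `comments_balanced("( () )")` is true and `comments_balanced("( ( )")` is false.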
Brackets
[ and ] are treated as whitespace, and may be used for visual grouping.
While they have semantics in traditional Forth, they have no semantics in TAL.
Assembler directives
Padding
|<number> "pad-absolute" pads the resulting UXN ROM to a given absolute address.
For instance |0000 would explicitly align the assembler's point to 0x0000.
$<number> "pad-relative" pads the UXN ROM by the specified number of words (bytes).
For instance $2 would move the assembler's point forwards two words.
Labels
@<word> defines a top-level label.
For instance @foo would make the word foo a valid symbol for use elsewhere.
Defining a top-level label establishes a scope within which sub-labels may be defined.
&bar following @foo would create the label foo/bar.
This can be used to create tables of named fields.
Numbers and opcodes cannot be used as label names.
Example - the system device
|00 @System &vector $2 &wst $1 &rst $1 &eaddr $2 &ecode $1 &pad $1 &r $2 &g $2 &b $2 &debug $1 &halt $1
|00 aligns the assembler to 0x0000.
This line of code creates the following symbols:
- System at 0x0000
- System/vector at 0x0000
- System/wst at 0x0002, shifted from System/vector by the $2
- System/rst at 0x0003
- System/eaddr at 0x0004
- System/ecode at 0x0006
- System/pad at 0x0007
- System/r at 0x0008
- System/g at 0x000a
- System/b at 0x000c
- System/debug at 0x000e
- System/halt at 0x000f
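The address of each symbol follows mechanically from the padding directives, which the Python sketch below reproduces. It is a toy first pass that only understands the |, $, @ and & words used in this example; the function name is hypothetical and it is not how any particular TAL assembler is written.

```python
def label_addresses(source):
    """Walk whitespace-separated words and record where each label lands."""
    labels = {}
    pointer = 0   # the assembler's point
    scope = ""    # most recent top-level label
    for word in source.split():
        sigil, rest = word[0], word[1:]
        if sigil == "|":                 # pad-absolute: jump to an address
            pointer = int(rest, 16)
        elif sigil == "$":               # pad-relative: skip forward
            pointer += int(rest, 16)
        elif sigil == "@":               # top-level label opens a new scope
            scope = rest
            labels[scope] = pointer
        elif sigil == "&":               # sub-label inside the current scope
            labels[f"{scope}/{rest}"] = pointer
        else:
            raise ValueError(f"unsupported word in this sketch: {word!r}")
    return labels


SYSTEM = ("|00 @System &vector $2 &wst $1 &rst $1 &eaddr $2 &ecode $1 "
          "&pad $1 &r $2 &g $2 &b $2 &debug $1 &halt $1")

for name, address in label_addresses(SYSTEM).items():
    print(f"{name} at 0x{address:04x}")
```

Each sub-label takes the address of the point at the moment it is defined, so a label's address is the sum of all padding that precedes it.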
Label References
Labels may be referenced in one of six ways:
- Literal byte zero-page - .label
- Raw byte relative - _label
- Literal byte relative - ,label
- Raw byte absolute - -label
- Raw short absolute - :label or =label
- Literal short absolute - ;label
Literal labels are inserted with a LIT or LIT2 as appropriate.
Raw labels are inserted directly into bytecode.
Absolute labels are two-byte (short) quantities. Relative labels are single signed byte quantities with a range of -128 to +127.
The zero page (0x00XX) is used for program globals and convenient scratch space.
Literal byte relative references such as ,foo are used for control flow.
Using only a single byte, these references have a range of -128 to +127 bytes.
A typical opcode sequence would be ,loop JMP, i.e. emit a relative offset to the loop label and perform a computed relative jump.
For bytecode compactness, UXN programs tend to use relative rather than absolute jumps.
The difference between byte and short references is critical, because the LDR instruction is a computed relative load taking a one-byte offset, whereas LDA is an absolute load taking a short address.
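To make the single-byte limit concrete, here is a small Python sketch of how a relative reference could be encoded. The function and its `origin` parameter are hypothetical: `origin` stands for whichever address the offset is measured from, which this page does not specify.

```python
def relative_byte(target, origin):
    """Encode the distance from `origin` to `target` as one signed byte."""
    offset = target - origin
    if not -128 <= offset <= 127:
        raise ValueError(f"offset {offset} does not fit in a signed byte")
    return offset & 0xFF   # two's complement, e.g. -2 becomes 0xfe
```

A label further away than a signed byte can express has to be reached with an absolute short reference instead, which is why the sketch raises an error rather than truncating.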
Includes
TAL files can include other files by writing ~<filename>.
For instance the uxnasm.tal file writes ~projects/library/string.tal to include implementations of string functions.
As with other preprocessor and assembler languages, TAL does not support namespacing, renaming or selective importing.
- All included code is assembled at the point where it is included.
- TAL does not support multiple definition or idempotent includes, and will error on repeated or recursive inclusion.
Macros
Macros are sequences of instructions which may be repeated.
Macros are defined by writing %macro-name { ... }.
The canonical UXNASM does not allow macros to exceed 64 words in size.
When macros are invoked by using the macro-name as a bare word, the contents of the macro will be inserted. Sub-macro references are supported and will be expanded with no recursion guards or limit.
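Macro expansion is plain textual substitution, which the following Python sketch imitates over lists of words. The function name and the macro names in the usage note are made up for illustration, and the `max_depth` guard is an addition of this sketch: as noted above, the assembler described here has no such guard.

```python
def expand(words, macros, depth=0, max_depth=64):
    """Replace macro names in `words` with their bodies, recursively.

    `macros` maps a macro name to its list of body words. `max_depth` is a
    hypothetical safety limit that exists only in this sketch.
    """
    if depth > max_depth:
        raise RecursionError("macro expansion is too deep; is a macro self-referential?")
    out = []
    for word in words:
        if word in macros:
            # A macro body may itself mention other macros.
            out.extend(expand(macros[word], macros, depth + 1, max_depth))
        else:
            out.append(word)
    return out
```

Given `inc-twice` defined as INC INC and `inc-four` defined as inc-twice inc-twice, expanding the single word inc-four yields four INC words. A macro that mentions itself would expand forever without a guard of some kind.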