# TAL Assembler

See also [the UXN virtual machine](./uxn.md).

TAL is a Forth-like assembly language whose two-pass assembler translates it directly to UXN memory images.

## Words
Words are up to 63 consecutive non-whitespace characters.
For instance `loop`, `System`, `Mouse/x`, `my-routine` and `some_other_routine` are all words.
The UXN instructions themselves (`ADD`, `POP`, `LIT` and so forth) are all words.

Some words have special interpretations.

### Opcodes

See [the UXN documentation](./uxn.md) for a full listing of opcodes. The words `BRK`, `INC`, `POP`, `NIP`, `SWP` ... `SFT` all mean their respective opcodes.
These opcodes may be followed by the flags `k`, `r` or `2` to set the `keep`, `return` and `short` flags respectively.
For instance `INC2` as a word increments a two-byte (16-bit) quantity at the top of the stack.
`INC2k` also keeps the original value, resulting in `x x+1` as the stack values.

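
As a minimal sketch of how the mode flags change an opcode's effect, the program below applies `INC` in byte, short and keep modes. It assumes the program is assembled at `|0100` (the conventional start address, which is not stated above) and uses `#` literals with `POP`/`POP2` only to set up and clear the stack. The stack contents in the comments are worked out by hand.

```tal
|0100
    #01 INC         ( byte mode: the stack holds 02 )
    POP
    #00ff INC2      ( short mode: the stack holds 01 00 )
    POP2
    #0010 INC2k     ( short and keep modes: the stack holds 00 10 00 11 )
    POP2 POP2
BRK
```

The keep flag is why `INC2k` leaves four bytes behind: the original short stays in place and the incremented copy is pushed on top of it.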
### Numbers

Numbers are hexadecimal and written with either two or four digits.
For instance `00` is the single word (byte) `0x00`.
`0000` is equivalent to the two words `00 00`.
UXN is big-endian: the value `0xFF00` is represented as the sequential words `FF 00`.

Numbers are usually prefixed with `#`, which makes them literals: `#12` assembles to `LIT` followed by the byte `12`, whereas a bare `12` emits only the raw byte.

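
A small sketch of how prefixed and bare numbers end up in the memory image. It assumes the standard UXN encodings `80` for `LIT`, `a0` for `LIT2`, `02` for `POP` and `22` for `POP2`, and a program origin of `|0100`. The label name `table` is made up for illustration.

```tal
|0100
    #12 POP         ( bytes 80 12 02: LIT, the byte 12, POP )
    #1234 POP2      ( bytes a0 12 34 22: LIT2, high byte 12, low byte 34, POP2 )
BRK
@table 12 34 ff00   ( raw bytes 12 34 ff 00, with no LIT inserted )
```

The `#` forms are what code normally uses, since they push the value at run time; bare numbers are mostly useful for data tables like the last line.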
### Strings

Words may be captured as ASCII strings.
Such strings are written `"<word>`.
For instance `"foo` causes the bytes `66 6f 6f` to be inserted literally into the memory image. No terminator is appended, so a trailing `00` is written explicitly where a null-terminated string is needed.

As `"` notation cannot capture whitespace, the raw byte constants `20` (space), `0a` (newline) and `09` (tab) are common.

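
One way to store text containing whitespace is to interleave string words with raw byte constants, sketched below. The label name `greeting` and the `|0100` origin are assumptions for the sake of a complete program.

```tal
|0100 BRK
@greeting
    "Hello, 20 "world! 0a   ( 48 65 6c 6c 6f 2c, then 20, then 77 6f 72 6c 64 21, then 0a )
```

Each `"` word contributes only its own characters, so the space and the newline have to be supplied as the separate bytes `20` and `0a`.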
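## Comments
Comments in TAL are written `( ... )` and support nesting. E.g. `( () )` is a valid comment, while `( ( )` is not, because the inner `(` is never closed.
TAL has no way to close every open comment at once: unlike Java's `/* ... */`, where the first `*/` ends the comment however many `/*` precede it, each `(` needs its own matching `)`.
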
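
A short sketch of plain, nested and inline comments. It assumes, as the canonical assembler appears to require, that each parenthesis stands alone as its own word, and that the program starts at `|0100`.

```tal
( a plain comment )
( an outer comment ( with a nested comment ) that continues until its own close )
|0100
    #01 ( comments may also sit between words ) POP
BRK
```

Nothing inside the parentheses reaches the memory image, so this assembles to the same bytes as `|0100 #01 POP BRK`.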
## Brackets

`[` and `]` are treated as whitespace, and may be used for visual grouping.
While they have semantics in traditional Forth, they have none in TAL.

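
For illustration, a sketch in which brackets group a sub-expression without changing what is assembled. The `|0100` origin is an assumption, and the brackets are written as separate words.

```tal
|0100
    #01 [ #02 #03 ADD ] ADD   ( assembles exactly like #01 #02 #03 ADD ADD, leaving 06 )
    POP
BRK
```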
## Assembler directives

### Padding

`|<number>` ("pad-absolute") pads the resulting UXN rom up to a given absolute address.
For instance `|0000` explicitly moves the assembler's point to `0x0000`.

`$<number>` ("pad-relative") pads the UXN rom by the specified number of words (bytes).
For instance `$2` moves the assembler's point forwards by two words.

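
The two kinds of padding can be combined to reserve storage, as in this sketch. The label names `counter` and `flag` are hypothetical, the `@` label syntax is the one described in the Labels section, and the `|0100` program origin is an assumption.

```tal
|0000               ( move the point to 0x0000 )
@counter $2         ( counter is at 0x0000, the point advances to 0x0002 )
@flag $1            ( flag is at 0x0002, the point advances to 0x0003 )
|0100               ( move the point to 0x0100, skipping the bytes in between )
BRK
```

Neither directive emits any bytes of its own; both only move the point, which is what makes them suitable for laying out variables.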
### Labels

`@<word>` defines a top-level label.
For instance `@foo` makes the word `foo` a valid symbol for use elsewhere.
Defining a top-level label establishes a scope within which sub-labels may be defined.

`&bar` following `@foo` creates the label `foo/bar`.
This can be used to lay out tables of named fields, as in the example below.

Numbers and opcodes cannot be used as label names.

#### Example - the system device

```tal
|00 @System &vector $2 &wst $1 &rst $1 &eaddr $2 &ecode $1 &pad $1 &r $2 &g $2 &b $2 &debug $1 &halt $1
```

`|00` moves the assembler's point to `0x0000`.

This line of code creates the following symbols:
- `System` at `0x0000`
- `System/vector` at `0x0000`
- `System/wst` at `0x0002`, shifted from `System/vector` by the `$2`
- `System/rst` at `0x0003`
- `System/eaddr` at `0x0004`
- `System/ecode` at `0x0006`, shifted from `System/eaddr` by the `$2`
- `System/pad` at `0x0007`
- `System/r` at `0x0008`
- `System/g` at `0x000a`
- `System/b` at `0x000c`
- `System/debug` at `0x000e`
- `System/halt` at `0x000f`

### Label References

Labels may be referenced in six ways, using one of seven prefixes:
- Literal byte zero-page - `.label`
- Raw byte relative - `_label`
- Literal byte relative - `,label`
- Raw byte absolute - `-label`
- Raw short absolute - `:label` or `=label`
- Literal short absolute - `;label`

Literal references are preceded by a `LIT` or `LIT2` as appropriate, so that the address is pushed onto the stack at run time.
Raw references are inserted directly into the bytecode.

Short absolute references are two-byte quantities.
Relative references are single signed-byte quantities with a ±127 range.

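
To make the literal and raw forms concrete, here is a hedged sketch with the emitted bytes worked out by hand. It assumes the standard encodings `80` for `LIT` and `a0` for `LIT2`, an assembler that accepts the `=` form listed above, and a `|0100` origin. The labels `slot`, `table` and `target` are invented for the example.

```tal
|0010 @slot $1            ( a zero-page label at 0x0010 )
|0100
    .slot POP             ( literal byte zero-page: bytes 80 10, then POP )
    ;target POP2          ( literal short absolute: bytes a0 01 0c, then POP2 )
BRK
@table =target =target    ( raw short absolute: bytes 01 0c 01 0c, with no LIT2 )
@target BRK
```

The code before `table` occupies `0x0100` to `0x0107` and the table itself four more bytes, which is why `target` resolves to `0x010c` in both the literal and the raw references.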
The zero page (`#00XX`) is used for system devices, among other things.
It's common to see references such as `.System/vector`, where the address `#0000` is packed into just `#00`.
Because UXN has a special `LDZ` operation for loading from the zero page, a zero-page address can be given as a single byte such as `#96`, saving a byte over the full short address.
As the last device is mapped to `#CX`, it is common to see `#DX`, `#EX` and `#FX` used for program-global variables for ease of access.

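
A sketch of a program-global variable kept in the zero page, following the convention described above. The label `score` and its address `00d0` are hypothetical, the origin `|0100` is assumed, and `STZ` is taken to be the store counterpart of `LDZ` in the standard opcode set.

```tal
|00d0 @score $1           ( hypothetical program-global variable in the zero page )
|0100
    #2a .score STZ        ( store 2a through the one-byte address d0 )
    .score LDZ POP        ( load it back through the one-byte address )
    ;score LDA POP        ( load the same byte through the full address 00d0 )
BRK
```

The `.score` reference costs two bytes in the image (`LIT` plus one address byte), while `;score` costs three (`LIT2` plus two address bytes).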
Literal byte relative references such as `,foo` are used for control flow.
Using only a single byte, these references have a range of ±127 bytes.
A typical opcode sequence would be `,loop JMP`, i.e. emit the relative offset to the `loop` label and perform a relative jump.
For bytecode compactness, UXN programs tend to use relative rather than absolute jumps.

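
As an illustration of relative control flow, the sketch below counts down from five. It uses `JCN`, the conditional sibling of `JMP` in the standard opcode set, so that the loop terminates; the `|0100` origin is an assumption.

```tal
|0100
    #05                   ( loop counter on the stack )
@loop
    #01 SUB               ( decrement the counter )
    DUP ,loop JCN         ( jump back to loop while the counter is non-zero )
    POP                   ( drop the finished counter )
BRK
```

`DUP` is needed because `JCN` consumes both the offset and the condition byte, and the counter has to survive for the next pass.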
The difference between byte and short references is critical, because the `LDR` instruction is a relative load that takes a one-byte offset, whereas `LDA` is an absolute load that takes a two-byte (short) address.

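
The pairing of reference kind and load instruction can be sketched as follows. The label `value` is made up, and the program origin `|0100` is assumed.

```tal
|0100
    ,value LDR POP   ( relative load: LIT plus a one-byte signed offset to value )
    ;value LDA POP   ( absolute load: LIT2 plus the two-byte address of value )
BRK
@value 2a
```

Both lines load the same byte `2a`. Swapping the prefixes would hand each instruction an operand of the wrong size, which is the mistake the paragraph above warns about.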
### Includes
TAL files can include other files by writing `~<filename>`.
For instance the `uxnasm.tal` file writes `~projects/library/string.tal` to include implementations of string functions.
Like most preprocessor and assembler languages, TAL does not support namespacing, renaming or selective importing.

- All included code is assembled at the point where it is included.
- TAL does not support multiple definitions or idempotent includes, and will error on repeated or recursive inclusion.

## Macros
Macros are named sequences of words which may be reused.
Macros are defined by writing `%macro-name { ... }`.
The canonical UXNASM does not allow macros to exceed 64 words in size.

When a macro is invoked by using the macro-name as a bare word, the contents of the macro are inserted in its place.
Macros may reference other macros; these references are expanded with no recursion guard or depth limit.
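
A minimal sketch of macro definition and nested expansion. The macro names `DOUBLE` and `QUADRUPLE` are invented for the example, and the `|0100` origin is assumed.

```tal
%DOUBLE { DUP ADD }
%QUADRUPLE { DOUBLE DOUBLE }
|0100
    #03 QUADRUPLE   ( expands to #03 DUP ADD DUP ADD, leaving 0c )
    POP
BRK
```

Because expansion is purely textual, each use of `QUADRUPLE` costs four opcode bytes in the image; a macro that referred to itself would never finish expanding.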