# Notes
Natural gradient: interpreter -> semi-VM -> tracing JIT/translator ->
optimizing JIT/translator -> abstract interpreter -> static compiler/translator

A semi-VM which demand-translates AST nodes into a stack of demanded evaluation
terms and then walks the evaluation stack as if it were a bytecode or
semi-bytecode evaluator. The advantage of this strategy is that the demanded
operation/control stack and paired data stack eliminate the need to leverage
the system control stack. This gets you serializable stacks for Flowmetal. But
you write two half-interpreters.
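
Roughly the shape of that semi-VM, as a toy sketch (a hypothetical
arithmetic-only evaluator, not Flowmetal code): both the control stack of
demanded terms and the data stack are plain Python lists, so the whole
evaluation state could be pickled between steps.

```python
import ast
import operator

# Maps AST operator node classes to their implementations.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}

def eval_expr(src):
    # Control stack of demanded evaluation terms, paired with a data stack.
    # Neither uses the Python call stack, so both are plain, picklable state.
    control = [("eval", ast.parse(src, mode="eval").body)]
    data = []
    while control:
        tag, term = control.pop()
        if tag == "eval":
            if isinstance(term, ast.Constant):
                data.append(term.value)
            elif isinstance(term, ast.BinOp):
                # Demand the operands, then the combining step.
                control.append(("apply", type(term.op)))
                control.append(("eval", term.right))
                control.append(("eval", term.left))
            else:
                raise NotImplementedError(ast.dump(term))
        else:  # "apply": pop two operands and push the result
            right, left = data.pop(), data.pop()
            data.append(OPS[term](left, right))
    return data.pop()

assert eval_expr("1 + 2 * 3") == 7
```
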
Now the natural question is why write a hosted VM to get serializable stacks
when Python has a perfectly good bytecode VM already? Sticking one VM atop
another is ... a bit silly, especially since the goal of doing so is to be able
to "drop down" from the one to the other to ensure compatibility.

Is there a lens through which the serialization requirements of Flowmetal can
be satisfied from "normal" Python using the "normal" Python bytecode
interpreter?

Consider - function call points and function return points are in a sense
language safe points. Rather than trying to capture the entire evaluation
"image" at some point, one could instead track the call/return evaluation log
for replay. Such a scheme would allow Flowmetal to be implemented using static
rewrites of Python ASTs. Any function call becomes a checkpoint, as does
receiving the return result.

Any `__call__` invocation needs to be evaluated as something like

    x = runtime.call(const_gen_call_id, instance, args)

This tactic specifically leans on `yield` being an expression _with a value_.
This pattern would let the "runtime", as the root evaluation routine,
'continue' any given call site with the return result. `runtime.call` would be
some incantation for producing a sentinel value to the runtime noting that a
function call had been requested - and that its result should either be computed
or replayed from a log.
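
One possible reading of that rewrite, sketched with placeholder names
(`runtime.call`, the generated call ids, and the exact placement of `yield` are
assumptions here, not a settled design): an `ast.NodeTransformer` turns every
call site into a yielded sentinel, which makes the enclosing function a
generator the runtime can drive.

```python
import ast

class CheckpointCalls(ast.NodeTransformer):
    """Rewrite each call site f(a, b) into (yield runtime.call(id, f, [a, b]))."""

    def __init__(self):
        self._counter = 0

    def visit_Call(self, node):
        self.generic_visit(node)  # rewrite nested call sites first
        self._counter += 1
        sentinel = ast.Call(
            func=ast.Attribute(
                value=ast.Name(id="runtime", ctx=ast.Load()),
                attr="call",
                ctx=ast.Load(),
            ),
            args=[
                ast.Constant(value=self._counter),         # call site id
                node.func,                                 # the callee
                ast.List(elts=node.args, ctx=ast.Load()),  # positional args only
            ],
            keywords=[],
        )
        # The value the runtime sends back becomes the value of the call.
        return ast.copy_location(ast.Yield(value=sentinel), node)

src = """
def job():
    x = f(g(1), 2)
    return x
"""
tree = ast.fix_missing_locations(CheckpointCalls().visit(ast.parse(src)))
print(ast.unparse(tree))
# def job():
#     x = (yield runtime.call(2, f, [(yield runtime.call(1, g, [1])), 2]))
#     return x
```
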
There are a lot of opportunities for optimization here. Not every function call
needs its value persisted into the log. Most function calls depend only on the
direct live state of the program. Exceptions are things like interacting with
file descriptors/sockets and clocks. But strictly data-dependent operations like
dictionary mutations are entirely safe under replay; they're only path
dependent. So really we only need to "de-optimize" or spy on "key" function
calls which occur against _unsafe_ operations, or which occur against captured
function/method instances which cannot be statically identified.
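
And a sketch of the root "runtime" loop under those same assumptions (the
`Call` sentinel shape, the `unsafe` registry, and the sequence-numbered log are
all invented here for illustration): it drives the rewritten generator with
`send()`, persists only the calls flagged unsafe, and just re-executes
everything else during replay.

```python
import time
from collections import namedtuple

Call = namedtuple("Call", ["call_id", "func", "args"])

class Runtime:
    def __init__(self, log=None, unsafe=()):
        self.log = dict(log or {})  # call sequence number -> recorded result
        self.unsafe = set(unsafe)   # callables whose results must be logged

    def call(self, call_id, func, args):
        # The "incantation": just build a sentinel for the driver to interpret.
        return Call(call_id, func, args)

    def run(self, gen):
        seq, result = 0, None
        try:
            while True:
                sentinel = gen.send(result)  # resume the call site
                if sentinel.func in self.unsafe:
                    seq += 1
                    if seq in self.log:    # replay the recorded effect
                        result = self.log[seq]
                    else:                  # perform it once and record it
                        result = self.log[seq] = sentinel.func(*sentinel.args)
                else:
                    # Purely data-dependent call: safe to just re-execute.
                    result = sentinel.func(*sentinel.args)
        except StopIteration as stop:
            return stop.value

runtime = Runtime(unsafe={time.time})

def job():  # written by hand in the form the AST rewrite would emit
    started = yield runtime.call(1, time.time, [])   # unsafe: goes to the log
    total = yield runtime.call(2, sum, [[1, 2, 3]])  # pure: re-executed freely
    return started, total

first = runtime.run(job())
assert runtime.run(job()) == first  # second run replays time.time() from the log
```
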
There may be games to be played with yield/coroutines here, but that could play
heck with normal generators. Intercepting "normal" calls with "normal" calls is
probably the easy strategy.