# Notes

The natural gradient: interpreter -> semi-VM -> tracing JIT/translator -> optimizing JIT/translator -> abstract interpreter -> static compiler/translator.

A semi-VM which demand-translates AST nodes into a stack of demanded evaluation terms and then walks that evaluation stack as if it were a bytecode or semi-bytecode evaluator. The advantage of this strategy is that the demanded operation/control stack and the paired data stack eliminate the need to lean on the system control stack. This gets you serializable stacks for Flowmetal. But you end up writing two half-interpreters.
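
To make the shape concrete, here is a minimal sketch of that kind of explicit-stack evaluation over a toy expression AST rather than Flowmetal's or Python's real nodes (the `Lit`/`Add` node types and the stack encoding are illustrative assumptions). The point is that the control and data stacks are plain lists of plain values, so the whole machine state is serializable at any step and the host call stack never grows:

```python
# Sketch only: a demand-driven evaluator over a toy AST with an explicit
# control ("demanded terms") stack and a paired data stack. Node types and
# the stack encoding are illustrative assumptions, not Flowmetal internals.
from dataclasses import dataclass

@dataclass
class Lit:
    value: object

@dataclass
class Add:
    left: object
    right: object

def eval_with_explicit_stacks(node):
    control = [("demand", node)]   # demanded evaluation terms
    data = []                      # paired data stack
    while control:
        op, arg = control.pop()
        if op == "demand":
            if isinstance(arg, Lit):
                data.append(arg.value)
            elif isinstance(arg, Add):
                # Demand both operands, then the combining step.
                control.append(("apply", Add))
                control.append(("demand", arg.right))
                control.append(("demand", arg.left))
        elif op == "apply" and arg is Add:
            right = data.pop()
            left = data.pop()
            data.append(left + right)
    return data.pop()

# Both stacks are ordinary lists of plain values at every step, so the
# "machine state" could be pickled mid-evaluation and resumed later.
print(eval_with_explicit_stacks(Add(Lit(1), Add(Lit(2), Lit(3)))))  # 6
```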

Now the natural question is: why write a hosted VM to get serializable stacks when Python already has a perfectly good bytecode VM? Stacking one VM atop another is a bit silly, especially since the goal of doing so is to be able to "drop down" from the one to the other to ensure compatibility.

Is there a lens through which the serialization requirements of Flowmetal can be satisfied from "normal" Python using the "normal" Python bytecode interpreter?

Consider: function call points and function return points are, in a sense, language safe points. Rather than trying to capture the entire evaluation "image" at some point, one could instead track the call/return evaluation log for replay. Such a scheme would allow Flowmetal to be implemented using static rewrites of Python ASTs. Any function call becomes a checkpoint, as does receiving the return result.

Any `__call__` invocation needs to be evaluated as something like

    x = runtime.call(const_gen_call_id, instance, args)
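
A sketch of what that static rewrite might look like with the standard `ast` module; the `runtime` name and the per-site call-id scheme are assumptions carried over from the line above, not an existing API:

```python
# Sketch only: rewrite every call site into a checkpoint through a
# hypothetical `runtime.call`. Call ids are just allocation order here.
import ast

class CheckpointCalls(ast.NodeTransformer):
    def __init__(self):
        self._next_id = 0

    def visit_Call(self, node):
        self.generic_visit(node)          # rewrite nested calls first
        call_id = self._next_id
        self._next_id += 1
        # f(a, b) becomes runtime.call(<call_id>, f, a, b)
        return ast.Call(
            func=ast.Attribute(
                value=ast.Name(id="runtime", ctx=ast.Load()),
                attr="call",
                ctx=ast.Load(),
            ),
            args=[ast.Constant(call_id), node.func, *node.args],
            keywords=node.keywords,
        )

tree = ast.parse("x = f(g(1), 2)")
tree = ast.fix_missing_locations(CheckpointCalls().visit(tree))
print(ast.unparse(tree))  # x = runtime.call(1, f, runtime.call(0, g, 1), 2)
```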

This tactic leans specifically on `yield` being an expression _with a value_ rather than a bare statement. That lets the "runtime", as the root evaluation routine, 'continue' any given call site with the return result. `runtime.call` would be some incantation for producing a sentinel value to the runtime, noting that a function call has been requested and that its result should either be computed or replayed from a log.
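
A sketch of the trampoline this implies, assuming the rewritten functions become generators whose `yield` hands a call request to the runtime and whose `send()` resumes the call site with the result; the `Call` sentinel, the log shape, and the driver loop are all illustrative assumptions:

```python
# Sketch only: a root "runtime" driving a rewritten function as a generator.
# Each yielded Call is a checkpoint: computed and recorded on the first run,
# replayed from the log on later runs. All names here are assumptions.
from dataclasses import dataclass, field

@dataclass
class Call:
    call_id: int
    fn: object
    args: tuple

@dataclass
class Runtime:
    log: dict = field(default_factory=dict)      # call_id -> recorded result

    def drive(self, gen):
        try:
            request = next(gen)                  # run to the first checkpoint
            while True:
                if request.call_id in self.log:
                    result = self.log[request.call_id]    # replay
                else:
                    result = request.fn(*request.args)    # compute
                    self.log[request.call_id] = result    # record
                request = gen.send(result)       # continue the call site
        except StopIteration as stop:
            return stop.value

# What a rewritten function might look like: each call site yields a Call
# sentinel and resumes with its result.
def workflow():
    a = yield Call(0, lambda: 40, ())
    b = yield Call(1, lambda x: x + 2, (a,))
    return b

rt = Runtime()
print(rt.drive(workflow()))   # 42, both calls computed and logged
print(rt.drive(workflow()))   # 42 again, both calls replayed from rt.log
```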

There are a lot of opportunities for optimization here. Not every function call needs its value persisted into the log. Most function calls depend only on the direct live state of the program; the exceptions are things like interacting with file descriptors/sockets and clocks. Strictly data-dependent operations like dictionary mutations are entirely safe under replay - they are only path-dependent. So really we only need to "de-optimize" or spy on the "key" function calls which touch _unsafe_ operations, or which go through captured function/method instances that cannot be statically identified.
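
One way to picture that split is a registry of known-unsafe callables, where only those are recorded and replayed while everything else runs directly; the registry and helper below are illustrative assumptions, not a proposed API:

```python
# Sketch only: record/replay only calls classified as effectful ("unsafe");
# purely data-dependent calls run directly and are never persisted.
import time

EFFECTFUL = {time.time}      # clocks, sockets, file descriptors, ...

def checkpointed(fn, *args, log, call_id):
    if fn not in EFFECTFUL:
        return fn(*args)                 # deterministic under replay: run it
    if call_id in log:
        return log[call_id]              # unsafe call: replay recorded result
    result = fn(*args)
    log[call_id] = result                # unsafe call: record the result
    return result

log, d = {}, {}
checkpointed(d.__setitem__, "k", 1, log=log, call_id=0)    # never logged
t = checkpointed(time.time, log=log, call_id=1)            # logged
assert d == {"k": 1} and log == {1: t}
```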

There may be games to be played with yield/coroutines here, but that could play heck with normal generators. Intercepting "normal" calls with "normal" calls is probably the easier strategy.