Import the Calf project
parent c25e825a95
commit 9feb262454
27 changed files with 1699 additions and 0 deletions
projects/calf/BUILD (new file, +10 lines)
@@ -0,0 +1,10 @@
package(default_visibility = ["//visibility:public"])

py_library(
    name = "lib",
    srcs = glob(["src/**/*.py"]),
    imports = ["src"],
    deps = [
        py_requirement("pyrsistent"),
    ]
)
projects/calf/NOTES.md (new file, +4 lines)
@@ -0,0 +1,4 @@
# Notes

https://github.com/Pyrlang/Pyrlang
https://en.wikipedia.org/wiki/Single_system_image
projects/calf/README.md (new file, +56 lines)
@@ -0,0 +1,56 @@
# Calf

> Calf: Noun.
> A young cow or ox.

Before I walked away from the Clojure space, I kept throwing around the idea of "ox", an ur-clojure.
Ox was supposed to experiment with some ideas around immutable namespaces and code-as-data, which never came to fruition.
I found the JVM environment burdensome, difficult to maintain velocity in, and my own ideas too unformed to fit well into a rigorous object model.

Calf is a testbed.
It's supposed to be a lightweight, unstable, easy-to-hack-on substrate for exploring those old ideas and some new ones.

In particular I'm interested in:
- compilers-as-databases (or compilers using databases)
- stream processing and process models of computation more akin to Erlang's
- reliability-sensitive programming models (failure, recovery, process supervision)

I previously [blogged a bit](https://www.arrdem.com/2019/04/01/the_silver_tower/) about some ideas for what this could look like.
I'm convinced that a programming environment based around [virtual resiliency](https://www.microsoft.com/en-us/research/publication/a-m-b-r-o-s-i-a-providing-performant-virtual-resiliency-for-distributed-applications/) is a worthwhile goal (having independently invented it), and worth trying to bring to a mainstream general-purpose platform like Python.

## Manifesto

In the last decade, immutability has been affirmed in the programming mainstream as an effective tool for making programs and state more manageable, one which has been repeatedly implemented at acceptable performance cost.
Especially in messaging-based rather than state-sharing environments, immutability and "data"-oriented programming are becoming more and more common.

It also seems that much of the industry is moving towards message-based reactive or network-based connective systems.
Microservices seem to have won, and functions-as-a-service seem to be a rising trend reflecting a desire to offload or avoid deployment management rather than wrangle stateful services.

In these environments, programs begin to consist entirely of messaging with other programs over shared channels: traditional HTTP or other RPC tools, or message buses such as Kafka, gRPC, ThriftMux and so forth.

Key challenges with these connective services are:
- How they handle failure
- How they achieve reliability
- The ergonomic difficulties of building and deploying connective programs
- The operational difficulties of managing N-many 'reliable' services

Tools like Argo and Airflow begin to talk about such networked or evented programs as DAGs, providing schedulers for sequencing actions and executors for performing them.

Airflow provides a programmable Python scheduler environment, but fails to provide an execution isolation boundary (such as a container or other subprocess/`fork()` boundary) that would let users bring their own dependencies.
Instead, Airflow users must build custom Airflow packagings which bundle their dependencies into the Airflow instance.
This means that Airflow deployments can only be centralized with difficulty, due to shared dependencies and disparate dependency lifecycles, and it limits the return on investment of the platform by increasing operational burden.

Argo ducks this mistake, providing a robust scheduler and leveraging k8s for its executor.
This allows Argo to be managed independently of any of the workloads it runs - a huge step forward over Airflow - but it comes at considerable ergonomic cost for trivial tasks and provides a more limited scheduler.

Previously I developed a system which provided a much stronger DSL than Airflow's, but it made the same key mistake of not decoupling execution from the scheduler/coordinator.
Calf is a sketch of a programming language and system with a nearly fully featured DSL that decouples scheduling (the control flow of programs) from the execution of "terminal" actions.

In short, think of a Py-Lisp where, instead of doing FFI directly to the parent Python instance, you do FFI by enqueuing a (potentially retryable!) request onto a shared cluster message bus, from which subscriber worker processes elsewhere provide request/response handling.
One could reasonably accuse this project of being an attempt to unify Erlang and a hosted Python to build a "BASH for distsys" tool, while providing a multi-tenant execution platform that can be centrally managed.

## License

Copyright Reid 'arrdem' McKenzie, 3/5/2017.
Distributed under the terms of the MIT license.
See the included `LICENSE` file for more.
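The message-bus FFI model described in the README above isn't implemented anywhere in this commit, but it can be illustrated concretely. A minimal, hypothetical sketch, assuming a Redis list as the shared bus and a JSON request envelope (Redis only appears below as an optional `node` extra in `setup.py`; the key names and helper functions here are invented for illustration):

```python
"""A sketch of FFI-by-message-bus: callers enqueue requests, workers answer them."""

import json
import uuid

import redis  # assumption: Redis as the bus; it is only an optional extra in setup.py


def enqueue_call(conn: redis.Redis, fn: str, *args) -> str:
    """'Call' fn by pushing a retryable request envelope onto the shared bus; returns its id."""
    request_id = str(uuid.uuid4())
    conn.rpush("calf:requests", json.dumps({"id": request_id, "fn": fn, "args": list(args)}))
    return request_id


def worker_loop(conn: redis.Redis, handlers: dict) -> None:
    """A subscriber worker process: pop requests, run the handler, publish the response."""
    while True:
        _, raw = conn.blpop("calf:requests")
        req = json.loads(raw)
        result = handlers[req["fn"]](*req["args"])
        conn.rpush("calf:responses:" + req["id"], json.dumps(result))
```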
projects/calf/pytest.ini (new file, +4 lines)
@@ -0,0 +1,4 @@
[pytest]
python_files=test_*.py
python_classes=Check
python_functions=test_*
projects/calf/setup.py (new file, +45 lines)
@@ -0,0 +1,45 @@
#!/usr/bin/env python

from os import path

from setuptools import setup, find_namespace_packages


# Fetch the README contents
rootdir = path.abspath(path.dirname(__file__))
with open(path.join(rootdir, "README.md"), encoding="utf-8") as f:
    long_description = f.read()


setup(
    name="calf",
    version="0.0.0",
    long_description=long_description,
    long_description_content_type="text/markdown",
    packages=find_namespace_packages(include=["calf.*"]),
    entry_points={
        "console_scripts": [
            # DSL testing stuff
            "calf-lex = calf.lexer:main",
            "calf-parse = calf.parser:main",
            "calf-read = calf.reader:main",
            "calf-analyze = calf.analyzer:main",
            "calf-compile = calf.compiler:main",

            # Client/server stuff
            "calf-client = calf.client:main",
            "calf-server = calf.server:main",
            "calf-worker = calf.worker:main",
        ]
    },
    install_requires=[
        "pyrsistent~=0.17.0",
    ],
    extras_require={
        "node": [
            "flask~=1.1.0",
            "pyyaml~=5.4.0",
            "redis~=3.5.0",
        ],
    },
)
projects/calf/src/calf/__init__.py (new file, +1 line)
@@ -0,0 +1 @@
#!/usr/bin/env python3
projects/calf/src/calf/analyzer.py (new file, +3 lines)
@@ -0,0 +1,3 @@
"""
The calf analyzer.
"""
projects/calf/src/calf/cursedrepl.py (new file, +86 lines)
@@ -0,0 +1,86 @@
"""
Some shared scaffolding for building terminal "REPL" drivers.
"""

import curses
from curses.textpad import Textbox, rectangle


def curse_repl(handle_buffer):

    def handle(buff, count):
        try:
            return list(handle_buffer(buff, count)), None
        except Exception as e:
            return None, e

    def _main(stdscr: curses.window):
        maxy, maxx = 0, 0

        examples = []
        count = 1
        while 1:
            # Prompt
            maxy, maxx = stdscr.getmaxyx()
            stdscr.clear()

            stdscr.addstr(0, 0, "Enter example: (hit Ctrl-G to execute, Ctrl-C to exit)", curses.A_BOLD)
            editwin = curses.newwin(5, maxx - 4,
                                    2, 2)
            rectangle(stdscr,
                      1, 1,
                      1 + 5 + 1, maxx - 2)

            # Printing is part of the prompt
            cur = 8
            def putstr(str, x=0, attr=0):
                # ya rly. I know exactly what I'm doing here
                nonlocal cur
                # This is how we handle going off the bottom of the screen lol
                if cur < maxy:
                    stdscr.addstr(cur, x, str, attr)
                    cur += (len(str.split("\n")) or 1)

            for ex, buff, vals, err in reversed(examples):
                putstr(f"Example {ex}:", attr=curses.A_BOLD)

                for l in buff.split("\n"):
                    putstr(f" | {l}")

                putstr("")

                if err:
                    err = str(err)
                    err = err.split("\n")
                    putstr(" Error:")
                    for l in err:
                        putstr(f"   {l}", attr=curses.COLOR_YELLOW)

                elif vals:
                    putstr(" Values:")
                    for x, t in zip(range(1, 1<<64), vals):
                        putstr(f"   {x:>3}) " + repr(t))

                putstr("")

            stdscr.refresh()

            # Read from the user
            box = Textbox(editwin)
            try:
                box.edit()
            except KeyboardInterrupt:
                break

            buff = box.gather().strip()
            if not buff:
                continue

            vals, err = handle(buff, count)

            examples.append((count, buff, vals, err))

            count += 1
            stdscr.refresh()

    curses.wrapper(_main)
projects/calf/src/calf/grammar.py (new file, +70 lines)
@@ -0,0 +1,70 @@
"""
The elements of the Calf grammar. Used by the lexer.
"""

WHITESPACE = r"\n\r\s,"

DELIMS = r'%s\[\]\(\)\{\}:;#^"\'' % (WHITESPACE,)

SIMPLE_SYMBOL = r"([^{ds}\-\+\d][^{ds}]*)|([^{ds}\d]+)".format(ds=DELIMS)

SYMBOL_PATTERN = r"(((?P<namespace>{ss})/)?(?P<name>{ss}))".format(ss=SIMPLE_SYMBOL)

SIMPLE_INTEGER = r"[+-]?\d*"

FLOAT_PATTERN = r"(?P<body>({i})(\.(\d*))?)?([eE](?P<exponent>{i}))?".format(
    i=SIMPLE_INTEGER
)

# HACK (arrdem 2021-03-13):
#
# The lexer is INCREMENTAL not TOTAL. It works by incrementally greedily
# building up strings that are PARTIAL matches. This means it has no support
# for the " closing anchor of a string, or the \n closing anchor of a comment.
# So we have to do this weird thing where the _required_ terminators are
# actually _optional_ here so that the parser works.
STRING_PATTERN = r'(""".*?(""")?)|("((\\"|[^"])*?)"?)'
COMMENT_PATTERN = r";(([^\n\r]*)(\n\r?)?)"

TOKENS = [
    # Paren (normal) lists
    (r"\(", "PAREN_LEFT",),
    (r"\)", "PAREN_RIGHT",),
    # Bracket lists
    (r"\[", "BRACKET_LEFT",),
    (r"\]", "BRACKET_RIGHT",),
    # Brace lists (maps)
    (r"\{", "BRACE_LEFT",),
    (r"\}", "BRACE_RIGHT",),
    (r"\^", "META",),
    (r"'", "SINGLE_QUOTE",),
    (STRING_PATTERN, "STRING",),
    (r"#", "MACRO_DISPATCH",),
    # Symbols
    (SYMBOL_PATTERN, "SYMBOL",),
    # Numbers
    (SIMPLE_INTEGER, "INTEGER",),
    (FLOAT_PATTERN, "FLOAT",),
    # Keywords
    #
    # Note: this is a dirty f'n hack in that in order for keywords to work, ":"
    # has to be defined to be a valid keyword.
    (r":" + SYMBOL_PATTERN + "?", "KEYWORD",),
    # Whitespace
    #
    # Note that the whitespace token will contain at most one newline
    (r"(\n\r?|[,\t ]*)", "WHITESPACE",),
    # Comment
    (COMMENT_PATTERN, "COMMENT",),
    # Strings
    (r'"(?P<body>(?:[^\"]|\.)*)"', "STRING"),
]

MATCHING = {
    "PAREN_LEFT": "PAREN_RIGHT",
    "BRACKET_LEFT": "BRACKET_RIGHT",
    "BRACE_LEFT": "BRACE_RIGHT",
}


WHITESPACE_TYPES = {"WHITESPACE", "COMMENT"}
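As a quick illustration of how the named groups in `SYMBOL_PATTERN` are meant to be consumed (the lexer below copies these groups onto each token's `more` dict, and the reader later builds `Symbol`/`Keyword` values from them):

```python
import re

from calf.grammar import SYMBOL_PATTERN

m = re.fullmatch(SYMBOL_PATTERN, "foo/bar")
print(m.groupdict())  # {'namespace': 'foo', 'name': 'bar'}
```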
projects/calf/src/calf/io/reader.py (new file, +101 lines)
@@ -0,0 +1,101 @@
"""
Various Reader class instances.
"""


class Position(object):
    def __init__(self, offset, line, column):
        self.offset = offset
        self.line = line
        self.column = column

    def __repr__(self):
        return "<Pos %r (%r:%r)>" % (self.offset, self.line, self.column)

    def __str__(self):
        return self.__repr__()


class PosReader(object):
    """A wrapper for anything that can be read from. Tracks offset, line and column information."""

    def __init__(self, reader):
        self.reader = reader
        self.offset = 0
        self.line = 1
        self.column = 0

    def read(self, n=1):
        """
        Returns a pair (position, text) where position is the position of the first character in the
        returned text. Text is a string of length up to or equal to `n` in length.
        """
        p = self.position

        if n == 1:
            chr = self.reader.read(n)

            if chr != "":
                self.offset += 1
                self.column += 1

            if chr == "\n":
                self.line += 1
                self.column = 0

            return (
                p,
                chr,
            )

        else:
            return (
                p,
                "".join(self.read(n=1)[1] for i in range(n)),
            )

    @property
    def position(self):
        """The position of the last character read."""
        return Position(self.offset, self.line, self.column)


class PeekPosReader(PosReader):
    """A wrapper for anything that can be read from. Provides a way to peek the next character."""

    def __init__(self, reader):
        self.reader = reader if isinstance(reader, PosReader) else PosReader(reader)
        self._peek = None

    def read(self, n=1):
        """
        Same as `PosReader.read`. Returns a pair (pos, text) where pos is the position of the first
        read character and text is a string of length up to `n`. If a peeked character exists, it
        is consumed by this operation.
        """
        if self._peek and n == 1:
            a = self._peek
            self._peek = None
            return a

        else:
            p, t = self._peek or (None, "")

            if self._peek:
                self._peek = None

            p_, t_ = self.reader.read(n=(n if not t else n - len(t)))
            p = p or p_

            return (p, t + t_)

    def peek(self):
        """Returns the (pos, text) pair which would be read next by read(n=1)."""
        if self._peek is None:
            self._peek = self.reader.read(n=1)
        return self._peek

    @property
    def position(self):
        """The position of the last character read."""
        return self.reader.position
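A quick usage sketch of the position-tracking readers above (the reprs shown follow `Position.__repr__`):

```python
import io

from calf.io.reader import PeekPosReader

r = PeekPosReader(io.StringIO("ab\nc"))
print(r.peek())     # does not consume: (<Pos 0 (1:0)>, 'a')
print(r.read())     # consumes the peeked character: (<Pos 0 (1:0)>, 'a')
print(r.read(n=3))  # 'b\nc', with the newline bumping the line counter
```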
projects/calf/src/calf/lexer.py (new file, +136 lines)
@@ -0,0 +1,136 @@
"""
Calf lexer.

Provides machinery for lexing sources of text into sequences of tokens with textual information, as
well as buffer position information appropriate for either full AST parsing, lossless syntax tree
parsing, linting or other use.
"""

import io
import re
import sys

from calf.token import CalfToken
from calf.io.reader import PeekPosReader
from calf.grammar import TOKENS
from calf.util import *


class CalfLexer:
    """
    Lexer object.

    Wraps something you can read characters from, and presents a lazy sequence of Token objects.

    Raises ValueError at any time due to either a conflict in the grammar being lexed, or incomplete
    input. Exceptions from the backing reader object are not masked.

    Rule order is used to decide conflicts. If multiple patterns would match an input, the "first"
    in token list order wins.
    """

    def __init__(self, stream, source=None, metadata=None, tokens=TOKENS):
        """FIXME"""

        self._stream = (
            PeekPosReader(stream) if not isinstance(stream, PeekPosReader) else stream
        )
        self.source = source
        self.metadata = metadata or {}
        self.tokens = tokens

    def __next__(self):
        """
        Tries to scan the next token off of the backing stream.

        Starting with a list of all available tokens, an empty buffer and a single new character
        peeked from the backing stream, reads more characters so long as adding the next character
        still leaves one or more possible matching "candidates" (token patterns).

        When adding the next character from the stream would build an invalid token, a token of the
        resulting single candidate type is generated.

        At the end of input, if we have a single candidate remaining, a final token of that type is
        generated. Otherwise we are in an incomplete input state either due to incomplete input or
        a grammar conflict.
        """

        buffer = ""
        candidates = self.tokens
        position, chr = self._stream.peek()

        while chr:
            if not candidates:
                raise ValueError("Entered invalid state - no candidates!")

            buff2 = buffer + chr
            can2 = [t for t in candidates if re.fullmatch(t[0], buff2)]

            # Try to include the last read character to support longest-wins grammars
            if not can2 and len(candidates) >= 1:
                pat, type = candidates[0]
                groups = re.match(re.compile(pat), buffer).groupdict()
                groups.update(self.metadata)
                return CalfToken(type, buffer, self.source, position, groups)

            else:
                # Update the buffers
                buffer = buff2
                candidates = can2

                # consume the 'current' character for side-effects
                self._stream.read()

                # set chr to be the next peeked character
                _, chr = self._stream.peek()

        if len(candidates) >= 1:
            pat, type = candidates[0]
            groups = re.match(re.compile(pat), buffer).groupdict()
            groups.update(self.metadata)
            return CalfToken(type, buffer, self.source, position, groups)

        else:
            raise ValueError(
                "Encountered end of buffer with incomplete token %r" % (buffer,)
            )

    def __iter__(self):
        """
        Scans tokens out of the character stream.

        May raise ValueError if there is either an issue with the grammar or the input.
        Will not mask any exceptions from the backing reader.
        """

        # While the character stream isn't empty
        while self._stream.peek()[1] != "":
            yield next(self)


def lex_file(path, metadata=None):
    """
    Returns the sequence of tokens resulting from lexing all text in the named file.
    """

    with open(path, "r") as f:
        return list(CalfLexer(f, path, {}))


def lex_buffer(buffer, source="<Buffer>", metadata=None):
    """
    Returns the lazy sequence of tokens resulting from lexing all the text in a buffer.
    """

    return CalfLexer(io.StringIO(buffer), source, metadata)


def main():
    """A CURSES application for using the lexer."""

    from calf.cursedrepl import curse_repl

    def handle_buffer(buff, count):
        return list(lex_buffer(buff, source=f"<Example {count}>"))

    curse_repl(handle_buffer)
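A small usage sketch of the lexer entry points above; the token types listed in the comment are the ones asserted by `tests/test_lexer.py` later in this commit:

```python
from calf.lexer import lex_buffer

# Lexing is lazy; iterating forces one token out per grammar rule hit.
for token in lex_buffer("(foo :bar 42)"):
    print(token.type, repr(token.value))

# Expected types, in order:
#   PAREN_LEFT, SYMBOL, WHITESPACE, KEYWORD, WHITESPACE, INTEGER, PAREN_RIGHT
```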
projects/calf/src/calf/packages.py (new file, +59 lines)
@@ -0,0 +1,59 @@
"""
The Calf package infrastructure.

Calf's packaging infrastructure is very heavily inspired by Maven, and seeks first and foremost to
provide statically understandable, repeatable builds.

However the loading infrastructure is designed to simultaneously support from-source builds
appropriate to interactive development workflows and monorepos.
"""

from collections import namedtuple


class CalfLoaderConfig(namedtuple("CalfLoaderConfig", ["paths"])):
    """
    """


class CalfDelayedPackage(
    namedtuple("CalfDelayedPackage", ["name", "version", "metadata", "path"])
):
    """
    This structure represents the delay of loading a package.

    Rather than eagerly analyze packages, it may be profitable to use lazy loading / lazy resolution
    of symbols. It may also be possible to cache analyzing some packages.
    """


class CalfPackage(
    namedtuple("CalfPackage", ["name", "version", "metadata", "modules"])
):
    """
    This structure represents the result of forcing the load of a package, and is the product of
    either loading a package directly, or a package becoming a direct dependency and being forced.
    """


def parse_package_requirement(config, env, requirement):
    """
    :param config:
    :param env:
    :param requirement:
    :returns:
    """


def analyze_package(config, env, package):
    """
    :param config:
    :param env:
    :param module:
    :returns:

    Given a loader configuration and an environment to load into, analyzes the requested package,
    returning an updated environment.
    """
projects/calf/src/calf/parser.py (new file, +249 lines)
@@ -0,0 +1,249 @@
"""
The Calf parser.
"""

from collections import namedtuple
from itertools import tee
import logging
import sys
from typing import NamedTuple, Callable

from calf.lexer import CalfLexer, lex_buffer, lex_file
from calf.grammar import MATCHING, WHITESPACE_TYPES
from calf.token import *


log = logging.getLogger(__name__)


def mk_list(contents, open=None, close=None):
    return CalfListToken(
        "LIST", contents, open.source, open.start_position, close.start_position
    )


def mk_sqlist(contents, open=None, close=None):
    return CalfListToken(
        "SQLIST", contents, open.source, open.start_position, close.start_position
    )


def pairwise(l: list) -> iter:
    "s -> (s0,s1), (s2,s3), (s4, s5), ..."
    return zip(l[::2], l[1::2])


def mk_dict(contents, open=None, close=None):
    # FIXME (arrdem 2021-03-14):
    # Raise a real SyntaxError of some sort.
    assert len(contents) % 2 == 0, "Improper dict!"
    return CalfDictToken(
        "DICT",
        list(pairwise(contents)),
        open.source,
        open.start_position,
        close.start_position,
    )


def mk_str(token):
    buff = token.value

    if buff.startswith('"""') and not buff.endswith('"""'):
        raise ValueError('Unterminated triple quote string')

    elif buff.startswith('"') and not buff.endswith('"'):
        raise ValueError('Unterminated quote string')

    elif not buff.startswith('"') or buff == '"' or buff == '"""':
        raise ValueError('Illegal string')

    if buff.startswith('"""'):
        buff = buff[3:-3]
    else:
        buff = buff[1:-1]

    buff = buff.encode("utf-8").decode("unicode_escape")  # Handle escape codes

    return CalfStrToken(token, buff)


CTORS = {
    "PAREN_LEFT": mk_list,
    "BRACKET_LEFT": mk_sqlist,
    "BRACE_LEFT": mk_dict,
    "STRING": mk_str,
    "INTEGER": CalfIntegerToken,
    "FLOAT": CalfFloatToken,
    "SYMBOL": CalfSymbolToken,
    "KEYWORD": CalfKeywordToken,
}


class CalfParseError(Exception):
    """
    Base class for representing errors encountered parsing.
    """

    def __init__(self, message: str, token: CalfToken):
        super(Exception, self).__init__(message)
        self.token = token

    def __str__(self):
        return f"Parse error at {self.token.loc()}: " + super().__str__()


class CalfUnexpectedCloseParseError(CalfParseError):
    """
    Represents encountering an unexpected close token.
    """

    def __init__(self, token, matching_open=None):
        msg = f"encountered unexpected closing {token!r}"
        if matching_open:
            msg += f" which appears to match {matching_open!r}"
        super(CalfParseError, self).__init__(msg, token)
        self.token = token
        self.matching_open = matching_open


class CalfMissingCloseParseError(CalfParseError):
    """
    Represents a failure to encounter an expected close token.
    """

    def __init__(self, expected_close_token, open_token):
        super(CalfMissingCloseParseError, self).__init__(
            f"expected {expected_close_token} starting from {open_token}, got end of file.",
            open_token
        )
        self.expected_close_token = expected_close_token


def parse_stream(stream,
                 discard_whitespace: bool = True,
                 discard_comments: bool = True,
                 stack: list = None):
    """Parses a token stream, producing a lazy sequence of all read top level forms.

    If `discard_whitespace` is truthy, then no WHITESPACE tokens will be emitted
    into the resulting parse tree. Otherwise, WHITESPACE tokens will be
    included. Whether WHITESPACE tokens are included or not, the tokens of the
    tree will reflect original source locations.
    """

    stack = stack or []

    def recur(_stack=None):
        yield from parse_stream(stream,
                                discard_whitespace,
                                discard_comments,
                                _stack or stack)

    for token in stream:
        # Whitespace discarding
        if token.type == "WHITESPACE" and discard_whitespace:
            continue

        elif token.type == "COMMENT" and discard_comments:
            continue

        # Built in reader macros
        elif token.type == "META":
            try:
                meta_t = next(recur())
            except StopIteration:
                raise CalfParseError("^ not followed by meta value", token)

            try:
                value_t = next(recur())
            except StopIteration:
                raise CalfParseError("^ not followed by value", token)

            yield CalfMetaToken(token, meta_t, value_t)

        elif token.type == "MACRO_DISPATCH":
            try:
                dispatch_t = next(recur())
            except StopIteration:
                raise CalfParseError("# not followed by dispatch value", token)

            try:
                value_t = next(recur())
            except StopIteration:
                raise CalfParseError("^ not followed by value", token)

            yield CalfDispatchToken(token, dispatch_t, value_t)

        elif token.type == "SINGLE_QUOTE":
            try:
                quoted_t = next(recur())
            except StopIteration:
                raise CalfParseError("' not followed by quoted form", token)

            yield CalfQuoteToken(token, quoted_t)

        # Compounds
        elif token.type in MATCHING.keys():
            balancing = MATCHING[token.type]
            elements = list(recur(stack + [(balancing, token)]))
            # Elements MUST have at least the close token in it
            if not elements:
                raise CalfMissingCloseParseError(balancing, token)

            elements, close = elements[:-1], elements[-1]
            if close.type != MATCHING[token.type]:
                raise CalfMissingCloseParseError(balancing, token)

            yield CTORS[token.type](elements, token, close)

        elif token.type in MATCHING.values():
            # Case of matching the immediate open
            if stack and token.type == stack[-1][0]:
                yield token
                break

            # Case of maybe matching something else, but definitely being wrong
            else:
                matching = next(reversed([t[1] for t in stack if t[0] == token.type]), None)
                raise CalfUnexpectedCloseParseError(token, matching)

        # Atoms
        elif token.type in CTORS:
            yield CTORS[token.type](token)

        else:
            yield token


def parse_buffer(buffer,
                 discard_whitespace=True,
                 discard_comments=True):
    """
    Parses a buffer, producing a lazy sequence of all parsed level forms.

    Propagates all errors.
    """

    yield from parse_stream(lex_buffer(buffer),
                            discard_whitespace,
                            discard_comments)


def parse_file(file):
    """
    Parses a file, producing a lazy sequence of all parsed level forms.
    """

    yield from parse_stream(lex_file(file))


def main():
    """A CURSES application for using the parser."""

    from calf.cursedrepl import curse_repl

    def handle_buffer(buff, count):
        return list(parse_stream(lex_buffer(buff, source=f"<Example {count}>")))

    curse_repl(handle_buffer)
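A quick sketch of driving the parser from a buffer; the compound token types shown in the comments match what `tests/test_parser.py` below asserts:

```python
from calf.parser import parse_buffer

# Top-level forms come back lazily as compound tokens.
for form in parse_buffer("{:foo 1, :bar [2 3]}"):
    print(form.type)                              # DICT
    for key_t, val_t in form.value:
        print(" ", key_t.type, "->", val_t.type)  # KEYWORD -> INTEGER, then KEYWORD -> SQLIST
```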
projects/calf/src/calf/reader.py (new file, +156 lines)
@@ -0,0 +1,156 @@
"""The Calf reader.

Unlike the lexer and parser which are mostly information preserving, the reader
is designed to be a somewhat pluggable structure for implementing transforms and
discarding information.
"""

from typing import *

from calf.lexer import lex_buffer, lex_file
from calf.parser import parse_stream
from calf.token import *
from calf.types import *


class CalfReader(object):
    def handle_keyword(self, t: CalfToken) -> Any:
        """Convert a token to an Object value for a keyword.

        Implementations could convert kws to strings, to a dataclass of some
        sort, use interning, or do none of the above.
        """

        return Keyword.of(t.more.get("name"), t.more.get("namespace"))

    def handle_symbol(self, t: CalfToken) -> Any:
        """Convert a token to an Object value for a symbol.

        Implementations could convert syms to strings, to a dataclass of some
        sort, use interning, or do none of the above.
        """

        return Symbol.of(t.more.get("name"), t.more.get("namespace"))

    def handle_dispatch(self, t: CalfDispatchToken) -> Any:
        """Handle a #foo <> dispatch token.

        Implementations may choose how dispatch is mapped to values, for
        instance by imposing a static mapping or by calling out to runtime state
        or other data sources to implement this hook. It's intended to be an
        open dispatch mechanism, unlike the others which should have relatively
        defined behavior.

        The default implementation simply preserves the dispatch token.
        """

        return t

    def handle_meta(self, t: CalfMetaToken) -> Any:
        """Handle a ^<> <> so called 'meta' token.

        Implementations may choose how to process metadata, discarding it or
        consuming it somehow.

        The default implementation simply discards the tag value.
        """

        return self.read1(t.value)

    def make_quote(self):
        """Factory. Returns the quote or equivalent symbol. May use `self.make_symbol()` to do so."""

        return Symbol.of("quote")

    def handle_quote(self, t: CalfQuoteToken) -> Any:
        """Handle a 'foo quote form."""

        return Vector.of([self.make_quote(), self.read1(t.value)])

    def read1(self, t: CalfToken) -> Any:
        # Note: 'square' and 'round' lists are treated the same. This should be
        # a hook. Should {} be a "list" too until it gets reader hooked into
        # being a mapping or a set?
        if isinstance(t, CalfListToken):
            return Vector.of(self.read(t.value))

        elif isinstance(t, CalfDictToken):
            return Map.of([(self.read1(k), self.read1(v))
                           for k, v in t.items()])

        # Magical pairwise stuff
        elif isinstance(t, CalfQuoteToken):
            return self.handle_quote(t)

        elif isinstance(t, CalfMetaToken):
            return self.handle_meta(t)

        elif isinstance(t, CalfDispatchToken):
            return self.handle_dispatch(t)

        # Stuff with real factories
        elif isinstance(t, CalfKeywordToken):
            return self.handle_keyword(t)

        elif isinstance(t, CalfSymbolToken):
            return self.handle_symbol(t)

        # Terminals
        elif isinstance(t, CalfStrToken):
            return str(t)

        elif isinstance(t, CalfIntegerToken):
            return int(t)

        elif isinstance(t, CalfFloatToken):
            return float(t)

        else:
            raise ValueError(f"Unsupported token type {t!r} ({type(t)})")

    def read(self, stream):
        """Given a sequence of tokens, read 'em."""

        for t in stream:
            yield self.read1(t)


def read_stream(stream,
                reader: CalfReader = None):
    """Read from a stream of parsed tokens."""

    reader = reader or CalfReader()
    yield from reader.read(stream)


def read_buffer(buffer):
    """Read from a buffer, producing a lazy sequence of all top level forms."""

    yield from read_stream(parse_stream(lex_buffer(buffer)))


def read_file(file):
    """Read from a file, producing a lazy sequence of all top level forms."""

    yield from read_stream(parse_stream(lex_file(file)))


def main():
    """A CURSES application for using the reader."""

    from calf.cursedrepl import curse_repl

    def handle_buffer(buff, count):
        return list(read_stream(parse_stream(lex_buffer(buff, source=f"<Example {count}>"))))

    curse_repl(handle_buffer)
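A sketch of the pluggable reader described above: `read_buffer` with the default hooks, plus a subclass that overrides a single hook. The printed shapes follow the default handlers (pyrsistent vectors, `Keyword`/`Symbol` tuples); the subclass name and its string format are invented for illustration:

```python
from calf.lexer import lex_buffer
from calf.parser import parse_stream
from calf.reader import CalfReader, read_buffer, read_stream

# Default hooks: vectors for lists, Keyword/Symbol named tuples for atoms.
print(list(read_buffer("[:foo bar 1]")))


class StringlyReader(CalfReader):
    """A hypothetical reader variant that renders keywords as plain ':ns/name' strings."""

    def handle_keyword(self, t):
        ns, name = t.more.get("namespace"), t.more.get("name")
        return f":{ns}/{name}" if ns else f":{name}"


forms = read_stream(parse_stream(lex_buffer("[:foo :bar/baz 1]")),
                    reader=StringlyReader())
print(list(forms))  # e.g. [pvector([':foo', ':bar/baz', 1])]
```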
projects/calf/src/calf/token.py (new file, +239 lines)
@@ -0,0 +1,239 @@
"""
Tokens.

The philosophy here is that to the greatest extent possible we want to preserve lexical (source)
information about indentation, position and so forth. That we have to do so mutably is just a
pain in the ass and kinda unavoidable.

Consequently, this file defines classes which wrap core Python primitives, providing all the usual
bits in terms of acting like values, while preserving fairly extensive source information.
"""


class CalfToken:
    """
    Token object.

    The result of reading a token from the source character feed.
    Encodes the source, and the position in the source from which it was read.
    """

    def __init__(self, type, value, source, start_position, more):
        self.type = type
        self.value = value
        self.source = source
        self.start_position = start_position
        self.more = more if more is not None else {}

    def __repr__(self):
        return "<%s:%s %r %s %r>" % (
            type(self).__name__,
            self.type,
            self.value,
            self.loc(),
            self.more,
        )

    def loc(self):
        return "%r@%r:%r" % (
            self.source,
            self.line,
            self.column,
        )

    def __str__(self):
        return self.value

    @property
    def offset(self):
        if self.start_position is not None:
            return self.start_position.offset

    @property
    def line(self):
        if self.start_position is not None:
            return self.start_position.line

    @property
    def column(self):
        if self.start_position is not None:
            return self.start_position.column


class CalfBlockToken(CalfToken):
    """
    (Block) Token object.

    The base result of parsing a token with a start and an end position.
    """

    def __init__(self, type, value, source, start_position, end_position, more):
        CalfToken.__init__(self, type, value, source, start_position, more)
        self.end_position = end_position


class CalfListToken(CalfBlockToken, list):
    """
    (list) Token object.

    The final result of reading a parens list through the Calf lexer stack.
    """

    def __init__(self, type, value, source, start_position, end_position):
        CalfBlockToken.__init__(
            self, type, value, source, start_position, end_position, None
        )
        list.__init__(self, value)


class CalfDictToken(CalfBlockToken, dict):
    """
    (dict) Token object.

    The final(ish) result of reading a braces list through the Calf lexer stack.
    """

    def __init__(self, type, value, source, start_position, end_position):
        CalfBlockToken.__init__(
            self, type, value, source, start_position, end_position, None
        )
        dict.__init__(self, value)


class CalfIntegerToken(CalfToken, int):
    """
    (int) Token object.

    The final(ish) result of reading an integer.
    """

    def __new__(cls, value):
        return int.__new__(cls, value.value)

    def __init__(self, value):
        CalfToken.__init__(
            self,
            value.type,
            value.value,
            value.source,
            value.start_position,
            value.more,
        )


class CalfFloatToken(CalfToken, float):
    """
    (float) Token object.

    The final(ish) result of reading a float.
    """

    def __new__(cls, value):
        return float.__new__(cls, value.value)

    def __init__(self, value):
        CalfToken.__init__(
            self,
            value.type,
            value.value,
            value.source,
            value.start_position,
            value.more,
        )


class CalfStrToken(CalfToken, str):
    """
    (str) Token object.

    The final(ish) result of reading a string.
    """

    def __new__(cls, token, buff):
        return str.__new__(cls, buff)

    def __init__(self, token, buff):
        CalfToken.__init__(
            self,
            token.type,
            buff,
            token.source,
            token.start_position,
            token.more,
        )
        str.__init__(self)


class CalfSymbolToken(CalfToken):
    """A symbol."""

    def __init__(self, token):
        CalfToken.__init__(
            self,
            token.type,
            token.value,
            token.source,
            token.start_position,
            token.more,
        )


class CalfKeywordToken(CalfToken):
    """A keyword."""

    def __init__(self, token):
        CalfToken.__init__(
            self,
            token.type,
            token.value,
            token.source,
            token.start_position,
            token.more,
        )


class CalfMetaToken(CalfToken):
    """A ^ meta token."""

    def __init__(self, token, meta, value):
        CalfToken.__init__(
            self,
            token.type,
            value,
            token.source,
            token.start_position,
            token.more,
        )
        self.meta = meta


class CalfDispatchToken(CalfToken):
    """A # macro dispatch token."""

    def __init__(self, token, tag, value):
        CalfToken.__init__(
            self,
            token.type,
            value,
            token.source,
            token.start_position,
            token.more,
        )
        self.tag = tag


class CalfQuoteToken(CalfToken):
    """A ' quotation."""

    def __init__(self, token, quoted):
        CalfToken.__init__(
            self,
            token.type,
            quoted,
            token.source,
            token.start_position,
            token.more,
        )
projects/calf/src/calf/types.py (new file, +44 lines)
@@ -0,0 +1,44 @@
"""Core types for Calf.

I don't love baking these in, but there's one place to start and there'll be a
considerable amount of bootstrappy nonsense to get through. So just start with
good ol' fashioned types and type aliases.
"""

from typing import *

import pyrsistent as p


class Symbol(NamedTuple):
    name: str
    namespace: Optional[str]

    @classmethod
    def of(cls, name: str, namespace: str = None):
        return cls(name, namespace)


class Keyword(NamedTuple):
    name: str
    namespace: Optional[str]

    @classmethod
    def of(cls, name: str, namespace: str = None):
        return cls(name, namespace)


# FIXME (arrdem 2021-03-20):
#
# Don't just go out to Pyrsistent for the datatypes. Do something somewhat
# smarter, especially given the games Pyrsistent is playing around loading
# ctype implementations for performance. God only knows about correctness tho.

Map = p.PMap
Map.of = staticmethod(p.pmap)

Vector = p.PVector
Vector.of = staticmethod(p.pvector)

Set = p.PSet
Set.of = staticmethod(p.pset)
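A small usage sketch of the type aliases above (the reprs are pyrsistent's and `NamedTuple`'s defaults):

```python
from calf.types import Keyword, Map, Symbol, Vector

print(Symbol.of("map"))          # Symbol(name='map', namespace=None)
print(Keyword.of("foo", "bar"))  # Keyword(name='foo', namespace='bar')
print(Vector.of([1, 2, 3]))      # pvector([1, 2, 3])
print(Map.of({"a": 1}))          # pmap({'a': 1})
```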
projects/calf/src/calf/util.py (new file, +23 lines)
@@ -0,0 +1,23 @@
"""
Bits and bats.

Mainly bats.
"""

import re


def memoize(f):
    memo = {}

    def helper(x):
        if x not in memo:
            memo[x] = f(x)
        return memo[x]

    return helper


@memoize
def re_mem(regex):
    return re.compile(regex)
projects/calf/tests/BUILD (new file, +20 lines)
@@ -0,0 +1,20 @@
py_library(
    name = "conftest",
    srcs = [
        "conftest.py"
    ],
    imports = [
        "."
    ],
)

py_pytest(
    name = "test",
    srcs = glob(["*.py"]),
    deps = [
        "//projects/calf:lib",
        ":conftest",
        py_requirement("pytest-cov"),
    ],
    args = ["--cov-report", "term", "--cov=calf"],
)
projects/calf/tests/conftest.py (new file, +7 lines)
@@ -0,0 +1,7 @@
"""
Fixtures for testing Calf.
"""

import pytest

parametrize = pytest.mark.parametrize
projects/calf/tests/test_grammar.py (new file, +30 lines)
@@ -0,0 +1,30 @@
"""
Tests covering the Calf grammar.
"""

import re

from calf import grammar as cg
from conftest import parametrize


@parametrize('ex', [
    # Proper strings
    '""',
    '"foo bar"',
    '"foo\n bar\n\r qux"',
    '"foo\\"bar"',

    '""""""',
    '"""foo bar baz"""',
    '"""foo "" "" "" bar baz"""',

    # Unterminated string cases
    '"',
    '"f',
    '"foo bar',
    '"foo\\" bar',
    '"""foo bar baz',
])
def test_match_string(ex):
    assert re.fullmatch(cg.STRING_PATTERN, ex)
projects/calf/tests/test_lexer.py (new file, +89 lines)
@@ -0,0 +1,89 @@
"""
Tests of calf.lexer

Tests both basic functionality, some examples and makes sure that arbitrary token sequences round
trip through the lexer.
"""

import calf.lexer as cl
from conftest import parametrize

import pytest


def lex_single_token(buffer):
    """Lexes a single token from the buffer."""

    return next(iter(cl.lex_buffer(buffer)))


@parametrize(
    "text,token_type",
    [
        ("(", "PAREN_LEFT",),
        (")", "PAREN_RIGHT",),
        ("[", "BRACKET_LEFT",),
        ("]", "BRACKET_RIGHT",),
        ("{", "BRACE_LEFT",),
        ("}", "BRACE_RIGHT",),
        ("^", "META",),
        ("#", "MACRO_DISPATCH",),
        ("'", "SINGLE_QUOTE"),
        ("foo", "SYMBOL",),
        ("foo/bar", "SYMBOL"),
        (":foo", "KEYWORD",),
        (":foo/bar", "KEYWORD",),
        (" ,,\t ,, \t", "WHITESPACE",),
        ("\n\r", "WHITESPACE"),
        ("\n", "WHITESPACE"),
        (" , ", "WHITESPACE",),
        ("; this is a sample comment\n", "COMMENT"),
        ('"foo"', "STRING"),
        ('"foo bar baz"', "STRING"),
    ],
)
def test_lex_examples(text, token_type):
    t = lex_single_token(text)
    assert t.value == text
    assert t.type == token_type


@parametrize(
    "text,token_types",
    [
        ("foo^bar", ["SYMBOL", "META", "SYMBOL"]),
        ("foo bar", ["SYMBOL", "WHITESPACE", "SYMBOL"]),
        ("foo-bar", ["SYMBOL"]),
        ("foo\nbar", ["SYMBOL", "WHITESPACE", "SYMBOL"]),
        (
            "{[^#()]}",
            [
                "BRACE_LEFT",
                "BRACKET_LEFT",
                "META",
                "MACRO_DISPATCH",
                "PAREN_LEFT",
                "PAREN_RIGHT",
                "BRACKET_RIGHT",
                "BRACE_RIGHT",
            ],
        ),
        ("+", ["SYMBOL"]),
        ("-", ["SYMBOL"]),
        ("1", ["INTEGER"]),
        ("-1", ["INTEGER"]),
        ("-1.0", ["FLOAT"]),
        ("-1e3", ["FLOAT"]),
        ("+1.3e", ["FLOAT"]),
        ("f", ["SYMBOL"]),
        ("f1", ["SYMBOL"]),
        ("f1g2", ["SYMBOL"]),
        ("foo13-bar", ["SYMBOL"]),
        ("foo+13-12bar", ["SYMBOL"]),
        ("+-+-+-+-+", ["SYMBOL"]),
    ],
)
def test_lex_compound_examples(text, token_types):
    t = cl.lex_buffer(text)
    result_types = [token.type for token in t]
    assert result_types == token_types
projects/calf/tests/test_parser.py (new file, +219 lines)
@@ -0,0 +1,219 @@
"""
Tests of calf.parser
"""

import calf.parser as cp
from conftest import parametrize

import pytest


@parametrize("text", [
    '"',
    '"foo bar',
    '"""foo bar',
    '"""foo bar"',
])
def test_bad_strings_raise(text):
    """Tests asserting we won't let obviously bad strings fly."""
    # FIXME (arrdem 2021-03-13):
    # Can we provide this behavior in the lexer rather than in the parser?
    with pytest.raises(ValueError):
        next(cp.parse_buffer(text))


@parametrize("text", [
    "[1.0",
    "(1.0",
    "{1.0",
])
def test_unterminated_raises(text):
    """Tests asserting that we don't let unterminated collections parse."""
    with pytest.raises(cp.CalfMissingCloseParseError):
        next(cp.parse_buffer(text))


@parametrize("text", [
    "[{]",
    "[(]",
    "({)",
    "([)",
    "{(}",
    "{[}",
])
def test_unbalanced_raises(text):
    """Tests asserting that we don't let mismatched collections parse."""
    with pytest.raises(cp.CalfUnexpectedCloseParseError):
        next(cp.parse_buffer(text))


@parametrize("buff, value", [
    ('"foo"', "foo"),
    ('"foo\tbar"', "foo\tbar"),
    ('"foo\n\rbar"', "foo\n\rbar"),
    ('"foo\\"bar\\""', "foo\"bar\""),
    ('"""foo"""', 'foo'),
    ('"""foo"bar"baz"""', 'foo"bar"baz'),
])
def test_strings_round_trip(buff, value):
    assert next(cp.parse_buffer(buff)) == value


@parametrize('text, element_types', [
    # Integers
    ("(1)", ["INTEGER"]),
    ("( 1 )", ["INTEGER"]),
    ("(,1,)", ["INTEGER"]),
    ("(1\n)", ["INTEGER"]),
    ("(\n1\n)", ["INTEGER"]),
    ("(1, 2, 3, 4)", ["INTEGER", "INTEGER", "INTEGER", "INTEGER"]),

    # Floats
    ("(1.0)", ["FLOAT"]),
    ("(1.0e0)", ["FLOAT"]),
    ("(1e0)", ["FLOAT"]),
    ("(1e0)", ["FLOAT"]),

    # Symbols
    ("(foo)", ["SYMBOL"]),
    ("(+)", ["SYMBOL"]),
    ("(-)", ["SYMBOL"]),
    ("(*)", ["SYMBOL"]),
    ("(foo-bar)", ["SYMBOL"]),
    ("(+foo-bar+)", ["SYMBOL"]),
    ("(+foo-bar+)", ["SYMBOL"]),
    ("( foo bar )", ["SYMBOL", "SYMBOL"]),

    # Keywords
    ("(:foo)", ["KEYWORD"]),
    ("( :foo )", ["KEYWORD"]),
    ("(\n:foo\n)", ["KEYWORD"]),
    ("(,:foo,)", ["KEYWORD"]),
    ("(:foo :bar)", ["KEYWORD", "KEYWORD"]),
    ("(:foo :bar 1)", ["KEYWORD", "KEYWORD", "INTEGER"]),

    # Strings
    ('("foo", "bar", "baz")', ["STRING", "STRING", "STRING"]),

    # Lists
    ('([] [] ())', ["SQLIST", "SQLIST", "LIST"]),
])
def test_parse_list(text, element_types):
    """Test we can parse various lists of contents."""
    l_t = next(cp.parse_buffer(text, discard_whitespace=True))
    assert l_t.type == "LIST"
    assert [t.type for t in l_t] == element_types


@parametrize('text, element_types', [
    # Integers
    ("[1]", ["INTEGER"]),
    ("[ 1 ]", ["INTEGER"]),
    ("[,1,]", ["INTEGER"]),
    ("[1\n]", ["INTEGER"]),
    ("[\n1\n]", ["INTEGER"]),
    ("[1, 2, 3, 4]", ["INTEGER", "INTEGER", "INTEGER", "INTEGER"]),

    # Floats
    ("[1.0]", ["FLOAT"]),
    ("[1.0e0]", ["FLOAT"]),
    ("[1e0]", ["FLOAT"]),
    ("[1e0]", ["FLOAT"]),

    # Symbols
    ("[foo]", ["SYMBOL"]),
    ("[+]", ["SYMBOL"]),
    ("[-]", ["SYMBOL"]),
    ("[*]", ["SYMBOL"]),
    ("[foo-bar]", ["SYMBOL"]),
    ("[+foo-bar+]", ["SYMBOL"]),
    ("[+foo-bar+]", ["SYMBOL"]),
    ("[ foo bar ]", ["SYMBOL", "SYMBOL"]),

    # Keywords
    ("[:foo]", ["KEYWORD"]),
    ("[ :foo ]", ["KEYWORD"]),
    ("[\n:foo\n]", ["KEYWORD"]),
    ("[,:foo,]", ["KEYWORD"]),
    ("[:foo :bar]", ["KEYWORD", "KEYWORD"]),
    ("[:foo :bar 1]", ["KEYWORD", "KEYWORD", "INTEGER"]),

    # Strings
    ('["foo", "bar", "baz"]', ["STRING", "STRING", "STRING"]),

    # Lists
    ('[[] [] ()]', ["SQLIST", "SQLIST", "LIST"]),
])
def test_parse_sqlist(text, element_types):
    """Test we can parse various 'square' lists of contents."""
    l_t = next(cp.parse_buffer(text, discard_whitespace=True))
    assert l_t.type == "SQLIST"
    assert [t.type for t in l_t] == element_types


@parametrize('text, element_pairs', [
    ("{}",
     []),

    ("{:foo 1}",
     [["KEYWORD", "INTEGER"]]),

    ("{:foo 1, :bar 2}",
     [["KEYWORD", "INTEGER"],
      ["KEYWORD", "INTEGER"]]),

    ("{foo 1, bar 2}",
     [["SYMBOL", "INTEGER"],
      ["SYMBOL", "INTEGER"]]),

    ("{foo 1, bar -2}",
     [["SYMBOL", "INTEGER"],
      ["SYMBOL", "INTEGER"]]),

    ("{foo 1, bar -2e0}",
     [["SYMBOL", "INTEGER"],
      ["SYMBOL", "FLOAT"]]),

    ("{foo ()}",
     [["SYMBOL", "LIST"]]),

    ("{foo []}",
     [["SYMBOL", "SQLIST"]]),

    ("{foo {}}",
     [["SYMBOL", "DICT"]]),

    ('{"foo" {}}',
     [["STRING", "DICT"]])
])
def test_parse_dict(text, element_pairs):
    """Test we can parse various mappings."""
    d_t = next(cp.parse_buffer(text, discard_whitespace=True))
    assert d_t.type == "DICT"
    assert [[t.type for t in pair] for pair in d_t.value] == element_pairs


@parametrize("text", [
    "{1}",
    "{1, 2, 3}",
    "{:foo}",
    "{:foo :bar :baz}"
])
def test_parse_bad_dict(text):
    """Assert that dicts with mismatched pairs don't parse."""
    with pytest.raises(Exception):
        next(cp.parse_buffer(text))


@parametrize("text", [
    "()",
    "(1 1.1 1e2 -2 foo :foo foo/bar :foo/bar [{},])",
    "{:foo bar, :baz [:qux]}",
    "'foo",
    "'[foo bar :baz 'qux, {}]",
    "#foo []",
    "^{} bar",
])
def test_examples(text):
    """Shotgun examples showing we can parse some stuff."""

    assert list(cp.parse_buffer(text))
projects/calf/tests/test_reader.py (new file, +22 lines)
@@ -0,0 +1,22 @@
"""
Tests of calf.reader
"""

from conftest import parametrize

from calf.reader import read_buffer


@parametrize('text', [
    "()",
    "[]",
    "[[[[[[[[[]]]]]]]]]",
    "{1 {2 {}}}",
    '"foo"',
    "foo",
    "'foo",
    "^foo bar",
    "^:foo bar",
    "{\"foo\" '([:bar ^:foo 'baz 3.14159e0])}",
    "[:foo bar 'baz lo/l, 1, 1.2. 1e-5 -1e2]",
])
def test_read(text):
    assert list(read_buffer(text))
projects/calf/tests/test_types.py (new file, +17 lines)
@@ -0,0 +1,17 @@
"""
Tests covering the Calf types.
"""

from calf import types as t


def test_maps_check():
    assert isinstance(t.Map.of([(1, 2)]), t.Map)


def test_vectors_check():
    assert isinstance(t.Vector.of([(1, 2)]), t.Vector)


def test_sets_check():
    assert isinstance(t.Set.of([(1, 2)]), t.Set)
@@ -1,3 +1,6 @@
"""A shim for executing pytest."""

import os
import sys

import pytest

@@ -9,4 +12,7 @@ if __name__ == "__main__":
    cmdline = ["--ignore=external"] + sys.argv[1:]
    print(cmdline, file=sys.stderr)

    for e in sys.path:
        print(f" - {os.path.realpath(e)}", file=sys.stderr)

    sys.exit(pytest.main(cmdline))
@@ -9,6 +9,7 @@ certifi==2020.12.5
chardet==4.0.0
click==7.1.2
commonmark==0.9.1
coverage==5.5
docutils==0.17
idna==2.10
imagesize==1.2.0

@@ -37,6 +38,7 @@ Pygments==2.8.1
pyparsing==2.4.7
pyrsistent==0.17.3
pytest==6.2.3
pytest-cov==2.11.1
pytest-pudb==0.7.0
pytz==2021.1
PyYAML==5.4.1
@@ -26,6 +26,7 @@ LICENSES_BY_LOWERNAME = {
    "apache 2.0": "License :: OSI Approved :: Apache Software License",
    "apache": "License :: OSI Approved :: Apache Software License",
    "bsd 3 clause": "License :: OSI Approved :: BSD License",
    "bsd 3-clause": "License :: OSI Approved :: BSD License",
    "bsd": "License :: OSI Approved :: BSD License",
    "gplv3": "License :: OSI Approved :: GNU General Public License v3 (GPLv3)",
    "http://www.apache.org/licenses/license-2.0": "License :: OSI Approved :: Apache Software License",