Import the Calf project
parent c25e825a95
commit 9feb262454
27 changed files with 1699 additions and 0 deletions
projects/calf/BUILD (new file, 10 lines)
@@ -0,0 +1,10 @@
package(default_visibility = ["//visibility:public"])

py_library(
    name = "lib",
    srcs = glob(["src/**/*.py"]),
    imports = ["src"],
    deps = [
        py_requirement("pyrsistent"),
    ]
)
projects/calf/NOTES.md (new file, 4 lines)
@@ -0,0 +1,4 @@
# Notes

https://github.com/Pyrlang/Pyrlang
https://en.wikipedia.org/wiki/Single_system_image
projects/calf/README.md (new file, 56 lines)
@@ -0,0 +1,56 @@
# Calf

> Calf: Noun.
> A young cow or ox.

Before I walked from the Clojure space, I kept throwing around the idea of "ox", an ur-Clojure.
Ox was supposed to experiment with some stuff around immutable namespaces and code as data which never came to fruition.
I found the JVM environment burdensome, difficult to maintain velocity in, and my own ideas too unformed to fit well into a rigorous object model.

Calf is a testbed.
It's supposed to be a lightweight, unstable substrate that's easy for me to hack on, for exploring those old ideas and some new ones.

Particularly I'm interested in:

- compilers-as-databases (or using databases)
- stream processing and process models of computation more akin to Erlang's
- reliability-sensitive programming models (failure, recovery, process supervision)

I previously [blogged a bit](https://www.arrdem.com/2019/04/01/the_silver_tower/) about some ideas for what this could look like.
I'm convinced that a programming environment based around [virtual resiliency](https://www.microsoft.com/en-us/research/publication/a-m-b-r-o-s-i-a-providing-performant-virtual-resiliency-for-distributed-applications/) is a worthwhile goal (having independently invented it) and worth trying to bring to a mainstream general-purpose platform like Python.

## Manifesto

In the last decade, immutability has been affirmed in the programming mainstream as an effective tool for making programs and state more manageable, and one which has been repeatedly implemented at acceptable performance costs.
Especially in messaging-based rather than state-sharing environments, immutability and "data"-oriented programming are becoming more and more common.

It also seems that much of the industry is moving towards message-based reactive or network-based connective systems.
Microservices seem to have won, and functions-as-a-service seem to be a rising trend reflecting a desire to offload or avoid deployment management rather than wrangle stateful services.

In these environments, programs begin to consist entirely of messaging with other programs over shared channels: traditional HTTP, RPC tools such as gRPC or ThriftMux, and message buses such as Kafka.

Key challenges with these connective services are:

- How they handle failure
- How they achieve reliability
- The ergonomic difficulties of building and deploying connective programs
- The operational difficulties of managing N-many 'reliable' services

Tools like Argo and Airflow begin to talk about such networked or evented programs as DAGs, providing schedulers for sequencing actions and executors for performing them.

Airflow provides a programmable Python scheduler environment, but fails to provide an execution isolation boundary (such as a container or other subprocess/`fork()` boundary) that would let users bring their own dependencies.
Instead, Airflow users must build custom Airflow packagings which bundle their dependencies into the Airflow instance.
This means that Airflow deployments can only be centralized with difficulty, due to shared dependencies and disparate dependency lifecycles, and it limits the return on investment of the platform by increasing operational burden.

Argo ducks this mistake, providing a robust scheduler and leveraging k8s for its executor.
This allows Argo to be managed independently of any of the workloads it runs - a huge step forward over Airflow - but it comes at considerable ergonomic cost for trivial tasks and provides a more limited scheduler.

Previously I developed a system which provided a much stronger DSL than Airflow's, but made the same key mistake of not decoupling execution from the scheduler/coordinator.
Calf is a sketch of a programming language and system with a nearly fully featured DSL, and a decoupling between scheduling (the control flow of programs) and the execution of "terminal" actions.

In short, think a Py-Lisp where instead of doing FFI directly to the parent Python instance you do FFI by enqueuing a (potentially retryable!) request onto a shared cluster message bus, from which subscriber worker processes elsewhere provide request/response handling.
One could reasonably accuse this project of being an attempt to unify Erlang and a hosted Python to build a "BASH for distsys" tool while providing a multi-tenant execution platform that can be centrally managed.
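None of the following exists in this commit; it's a minimal sketch of that enqueue-style FFI, assuming Redis as the bus (the queue names and helper functions are hypothetical):

# Hypothetical sketch only: the "language" side enqueues a request and blocks
# on a reply; a subscriber worker elsewhere performs the actual call.
import json
import uuid

import redis  # any message bus would do; Redis keeps the sketch short

bus = redis.Redis()

def remote_call(fn_name, *args):
    """Enqueue a (retryable) request, then block awaiting its response."""
    request_id = str(uuid.uuid4())
    bus.rpush("calf:requests", json.dumps(
        {"id": request_id, "fn": fn_name, "args": list(args)}))
    _, reply = bus.blpop(f"calf:response:{request_id}")
    return json.loads(reply)["value"]

def worker_loop(handlers):
    """A worker process: pop requests, execute them, publish responses."""
    while True:
        _, raw = bus.blpop("calf:requests")
        req = json.loads(raw)
        value = handlers[req["fn"]](*req["args"])
        bus.rpush(f"calf:response:{req['id']}", json.dumps({"value": value}))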
## License

Copyright Reid 'arrdem' McKenzie, 3/5/2017.
Distributed under the terms of the MIT license.
See the included `LICENSE` file for more.
projects/calf/pytest.ini (new file, 4 lines)
@@ -0,0 +1,4 @@
[pytest]
python_files=test_*.py
python_classes=Check
python_functions=test_*
projects/calf/setup.py (new file, 45 lines)
@@ -0,0 +1,45 @@
#!/usr/bin/env python

from os import path

from setuptools import setup, find_namespace_packages


# Fetch the README contents
rootdir = path.abspath(path.dirname(__file__))
with open(path.join(rootdir, "README.md"), encoding="utf-8") as f:
    long_description = f.read()


setup(
    name="calf",
    version="0.0.0",
    long_description=long_description,
    long_description_content_type="text/markdown",
    packages=find_namespace_packages(include=["calf.*"]),
    entry_points={
        "console_scripts": [
            # DSL testing stuff
            "calf-lex = calf.lexer:main",
            "calf-parse = calf.parser:main",
            "calf-read = calf.reader:main",
            "calf-analyze = calf.analyzer:main",
            "calf-compile = calf.compiler:main",

            # Client/server stuff
            "calf-client = calf.client:main",
            "calf-server = calf.server:main",
            "calf-worker = calf.worker:main",
        ]
    },
    install_requires=[
        "pyrsistent~=0.17.0",
    ],
    extras_require={
        "node": [
            "flask~=1.1.0",
            "pyyaml~=5.4.0",
            "redis~=3.5.0",
        ],
    },
)
projects/calf/src/calf/__init__.py (new file, 1 line)
@@ -0,0 +1 @@
#!/usr/bin/env python3
projects/calf/src/calf/analyzer.py (new file, 3 lines)
@@ -0,0 +1,3 @@
"""
The calf analyzer.
"""
projects/calf/src/calf/cursedrepl.py (new file, 86 lines)
@@ -0,0 +1,86 @@
"""
Some shared scaffolding for building terminal "REPL" drivers.
"""

import curses
from curses.textpad import Textbox, rectangle


def curse_repl(handle_buffer):

    def handle(buff, count):
        try:
            return list(handle_buffer(buff, count)), None
        except Exception as e:
            return None, e

    def _main(stdscr: curses.window):
        maxy, maxx = 0, 0

        examples = []
        count = 1
        while 1:
            # Prompt
            maxy, maxx = stdscr.getmaxyx()
            stdscr.clear()

            stdscr.addstr(0, 0, "Enter example: (hit Ctrl-G to execute, Ctrl-C to exit)", curses.A_BOLD)
            editwin = curses.newwin(5, maxx - 4,
                                    2, 2)
            rectangle(stdscr,
                      1, 1,
                      1 + 5 + 1, maxx - 2)

            # Printing is part of the prompt
            cur = 8

            def putstr(str, x=0, attr=0):
                # ya rly. I know exactly what I'm doing here
                nonlocal cur
                # This is how we handle going off the bottom of the screen lol
                if cur < maxy:
                    stdscr.addstr(cur, x, str, attr)
                    cur += (len(str.split("\n")) or 1)

            for ex, buff, vals, err in reversed(examples):
                putstr(f"Example {ex}:", attr=curses.A_BOLD)

                for l in buff.split("\n"):
                    putstr(f" | {l}")

                putstr("")

                if err:
                    err = str(err)
                    err = err.split("\n")
                    putstr(" Error:")
                    for l in err:
                        putstr(f" {l}", attr=curses.COLOR_YELLOW)

                elif vals:
                    putstr(" Values:")
                    for x, t in zip(range(1, 1 << 64), vals):
                        putstr(f" {x:>3}) " + repr(t))

                putstr("")

            stdscr.refresh()

            # Read from the user
            box = Textbox(editwin)
            try:
                box.edit()
            except KeyboardInterrupt:
                break

            buff = box.gather().strip()
            if not buff:
                continue

            vals, err = handle(buff, count)

            examples.append((count, buff, vals, err))

            count += 1
            stdscr.refresh()

    curses.wrapper(_main)
projects/calf/src/calf/grammar.py (new file, 70 lines)
@@ -0,0 +1,70 @@
"""
The elements of the Calf grammar. Used by the lexer.
"""

WHITESPACE = r"\n\r\s,"

DELIMS = r'%s\[\]\(\)\{\}:;#^"\'' % (WHITESPACE,)

SIMPLE_SYMBOL = r"([^{ds}\-\+\d][^{ds}]*)|([^{ds}\d]+)".format(ds=DELIMS)

SYMBOL_PATTERN = r"(((?P<namespace>{ss})/)?(?P<name>{ss}))".format(ss=SIMPLE_SYMBOL)

SIMPLE_INTEGER = r"[+-]?\d*"

FLOAT_PATTERN = r"(?P<body>({i})(\.(\d*))?)?([eE](?P<exponent>{i}))?".format(
    i=SIMPLE_INTEGER
)

# HACK (arrdem 2021-03-13):
#
# The lexer is INCREMENTAL not TOTAL. It works by incrementally greedily
# building up strings that are PARTIAL matches. This means it has no support
# for the " closing anchor of a string, or the \n closing anchor of a comment.
# So we have to do this weird thing where the _required_ terminators are
# actually _optional_ here so that the parser works.
STRING_PATTERN = r'(""".*?(""")?)|("((\\"|[^"])*?)"?)'
COMMENT_PATTERN = r";(([^\n\r]*)(\n\r?)?)"

TOKENS = [
    # Paren (normal) lists
    (r"\(", "PAREN_LEFT",),
    (r"\)", "PAREN_RIGHT",),
    # Bracket lists
    (r"\[", "BRACKET_LEFT",),
    (r"\]", "BRACKET_RIGHT",),
    # Brace lists (maps)
    (r"\{", "BRACE_LEFT",),
    (r"\}", "BRACE_RIGHT",),
    (r"\^", "META",),
    (r"'", "SINGLE_QUOTE",),
    (STRING_PATTERN, "STRING",),
    (r"#", "MACRO_DISPATCH",),
    # Symbols
    (SYMBOL_PATTERN, "SYMBOL",),
    # Numbers
    (SIMPLE_INTEGER, "INTEGER",),
    (FLOAT_PATTERN, "FLOAT",),
    # Keywords
    #
    # Note: this is a dirty f'n hack in that in order for keywords to work, ":"
    # has to be defined to be a valid keyword.
    (r":" + SYMBOL_PATTERN + "?", "KEYWORD",),
    # Whitespace
    #
    # Note that the whitespace token will contain at most one newline
    (r"(\n\r?|[,\t ]*)", "WHITESPACE",),
    # Comment
    (COMMENT_PATTERN, "COMMENT",),
    # Strings
    (r'"(?P<body>(?:[^\"]|\.)*)"', "STRING"),
]

MATCHING = {
    "PAREN_LEFT": "PAREN_RIGHT",
    "BRACKET_LEFT": "BRACKET_RIGHT",
    "BRACE_LEFT": "BRACE_RIGHT",
}


WHITESPACE_TYPES = {"WHITESPACE", "COMMENT"}
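Because those terminators are optional, the string pattern deliberately accepts unterminated input; a quick check of that behavior (the same assertions appear in tests/test_grammar.py later in this commit):

import re

from calf.grammar import STRING_PATTERN

assert re.fullmatch(STRING_PATTERN, '"foo bar"')  # a proper string
assert re.fullmatch(STRING_PATTERN, '"foo bar')   # unterminated, still a full match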
projects/calf/src/calf/io/reader.py (new file, 101 lines)
@@ -0,0 +1,101 @@
"""
Various Reader class instances.
"""


class Position(object):
    def __init__(self, offset, line, column):
        self.offset = offset
        self.line = line
        self.column = column

    def __repr__(self):
        return "<Pos %r (%r:%r)>" % (self.offset, self.line, self.column)

    def __str__(self):
        return self.__repr__()


class PosReader(object):
    """A wrapper for anything that can be read from. Tracks offset, line and column information."""

    def __init__(self, reader):
        self.reader = reader
        self.offset = 0
        self.line = 1
        self.column = 0

    def read(self, n=1):
        """
        Returns a pair (position, text) where position is the position of the first character in the
        returned text. Text is a string of length up to or equal to `n` in length.
        """
        p = self.position

        if n == 1:
            chr = self.reader.read(n)

            if chr != "":
                self.offset += 1
                self.column += 1

            if chr == "\n":
                self.line += 1
                self.column = 0

            return (
                p,
                chr,
            )

        else:
            return (
                p,
                "".join(self.read(n=1)[1] for i in range(n)),
            )

    @property
    def position(self):
        """The position of the last character read."""
        return Position(self.offset, self.line, self.column)


class PeekPosReader(PosReader):
    """A wrapper for anything that can be read from. Provides a way to peek the next character."""

    def __init__(self, reader):
        self.reader = reader if isinstance(reader, PosReader) else PosReader(reader)
        self._peek = None

    def read(self, n=1):
        """
        Same as `PosReader.read`. Returns a pair (pos, text) where pos is the position of the first
        read character and text is a string of length up to `n`. If a peeked character exists, it
        is consumed by this operation.
        """
        if self._peek and n == 1:
            a = self._peek
            self._peek = None
            return a

        else:
            p, t = self._peek or (None, "")

            if self._peek:
                self._peek = None

            p_, t_ = self.reader.read(n=(n if not t else n - len(t)))
            p = p or p_

            return (p, t + t_)

    def peek(self):
        """Returns the (pos, text) pair which would be read next by read(n=1)."""
        if self._peek is None:
            self._peek = self.reader.read(n=1)
        return self._peek

    @property
    def position(self):
        """The position of the last character read."""
        return self.reader.position
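A quick usage sketch of the wrappers above; io.StringIO stands in for any file-like object, and the printed position follows from Position.__repr__ (not captured from a run):

import io

from calf.io.reader import PeekPosReader

r = PeekPosReader(io.StringIO("ab\nc"))
pos, c = r.peek()      # look at 'a' without consuming it
assert c == "a"
_, c = r.read()        # the peeked 'a' is consumed here
assert c == "a"
_, rest = r.read(n=3)  # offset/line/column tracking advances past the "\n"
assert rest == "b\nc"
print(r.position)      # <Pos 4 (2:1)>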
projects/calf/src/calf/lexer.py (new file, 136 lines)
@@ -0,0 +1,136 @@
"""
Calf lexer.

Provides machinery for lexing sources of text into sequences of tokens with textual information, as
well as buffer position information appropriate for either full AST parsing, lossless syntax tree
parsing, linting or other use.
"""

import io
import re
import sys

from calf.token import CalfToken
from calf.io.reader import PeekPosReader
from calf.grammar import TOKENS
from calf.util import *


class CalfLexer:
    """
    Lexer object.

    Wraps something you can read characters from, and presents a lazy sequence of Token objects.

    Raises ValueError at any time due to either a conflict in the grammar being lexed, or incomplete
    input. Exceptions from the backing reader object are not masked.

    Rule order is used to decide conflicts. If multiple patterns would match an input, the "first"
    in token list order wins.
    """

    def __init__(self, stream, source=None, metadata=None, tokens=TOKENS):
        """FIXME"""

        self._stream = (
            PeekPosReader(stream) if not isinstance(stream, PeekPosReader) else stream
        )
        self.source = source
        self.metadata = metadata or {}
        self.tokens = tokens

    def __next__(self):
        """
        Tries to scan the next token off of the backing stream.

        Starting with a list of all available tokens, an empty buffer and a single new character
        peeked from the backing stream, reads more characters so long as adding the next character
        still leaves one or more possible matching "candidates" (token patterns).

        When adding the next character from the stream would build an invalid token, a token of the
        resulting single candidate type is generated.

        At the end of input, if we have a single candidate remaining, a final token of that type is
        generated. Otherwise we are in an error state, either due to incomplete input or a grammar
        conflict.
        """

        buffer = ""
        candidates = self.tokens
        position, chr = self._stream.peek()

        while chr:
            if not candidates:
                raise ValueError("Entered invalid state - no candidates!")

            buff2 = buffer + chr
            can2 = [t for t in candidates if re.fullmatch(t[0], buff2)]

            # Try to include the last read character to support longest-wins grammars
            if not can2 and len(candidates) >= 1:
                pat, type = candidates[0]
                groups = re.match(re.compile(pat), buffer).groupdict()
                groups.update(self.metadata)
                return CalfToken(type, buffer, self.source, position, groups)

            else:
                # Update the buffers
                buffer = buff2
                candidates = can2

                # consume the 'current' character for side-effects
                self._stream.read()

                # set chr to be the next peeked character
                _, chr = self._stream.peek()

        if len(candidates) >= 1:
            pat, type = candidates[0]
            groups = re.match(re.compile(pat), buffer).groupdict()
            groups.update(self.metadata)
            return CalfToken(type, buffer, self.source, position, groups)

        else:
            raise ValueError(
                "Encountered end of buffer with incomplete token %r" % (buffer,)
            )

    def __iter__(self):
        """
        Scans tokens out of the character stream.

        May raise ValueError if there is either an issue with the grammar or the input.
        Will not mask any exceptions from the backing reader.
        """

        # While the character stream isn't empty
        while self._stream.peek()[1] != "":
            yield next(self)


def lex_file(path, metadata=None):
    """
    Returns the sequence of tokens resulting from lexing all text in the named file.
    """

    with open(path, "r") as f:
        return list(CalfLexer(f, path, {}))


def lex_buffer(buffer, source="<Buffer>", metadata=None):
    """
    Returns the lazy sequence of tokens resulting from lexing all the text in a buffer.
    """

    return CalfLexer(io.StringIO(buffer), source, metadata)


def main():
    """A CURSES application for using the lexer."""

    from calf.cursedrepl import curse_repl

    def handle_buffer(buff, count):
        return list(lex_buffer(buff, source=f"<Example {count}>"))

    curse_repl(handle_buffer)
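For a sense of the token stream this produces, a small usage sketch (the token types follow from the grammar above and are the same ones exercised in tests/test_lexer.py):

from calf.lexer import lex_buffer

for token in lex_buffer("(foo :bar 1)"):
    print(token.type, repr(token.value))
# PAREN_LEFT, SYMBOL, WHITESPACE, KEYWORD, WHITESPACE, INTEGER, PAREN_RIGHT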
projects/calf/src/calf/packages.py (new file, 59 lines)
@@ -0,0 +1,59 @@
"""
The Calf package infrastructure.

Calf's packaging infrastructure is very heavily inspired by Maven, and seeks first and foremost to
provide statically understandable, repeatable builds.

However the loading infrastructure is designed to simultaneously support from-source builds
appropriate to interactive development workflows and monorepos.
"""

from collections import namedtuple


class CalfLoaderConfig(namedtuple("CalfLoaderConfig", ["paths"])):
    """Configuration for the Calf package loader."""


class CalfDelayedPackage(
    namedtuple("CalfDelayedPackage", ["name", "version", "metadata", "path"])
):
    """
    This structure represents the delay of loading a package.

    Rather than eagerly analyze packages, it may be profitable to use lazy loading / lazy resolution
    of symbols. It may also be possible to cache analyzing some packages.
    """


class CalfPackage(
    namedtuple("CalfPackage", ["name", "version", "metadata", "modules"])
):
    """
    This structure represents the result of forcing the load of a package, and is the product of
    either loading a package directly, or a package becoming a direct dependency and being forced.
    """


def parse_package_requirement(config, env, requirement):
    """
    :param config:
    :param env:
    :param requirement:
    :returns:
    """


def analyze_package(config, env, package):
    """
    :param config:
    :param env:
    :param package:
    :returns:

    Given a loader configuration and an environment to load into, analyzes the requested package,
    returning an updated environment.
    """
projects/calf/src/calf/parser.py (new file, 249 lines)
@@ -0,0 +1,249 @@
"""
The Calf parser.
"""

from collections import namedtuple
from itertools import tee
import logging
import sys
from typing import NamedTuple, Callable

from calf.lexer import CalfLexer, lex_buffer, lex_file
from calf.grammar import MATCHING, WHITESPACE_TYPES
from calf.token import *


log = logging.getLogger(__name__)


def mk_list(contents, open=None, close=None):
    return CalfListToken(
        "LIST", contents, open.source, open.start_position, close.start_position
    )


def mk_sqlist(contents, open=None, close=None):
    return CalfListToken(
        "SQLIST", contents, open.source, open.start_position, close.start_position
    )


def pairwise(l: list) -> iter:
    "s -> (s0,s1), (s2,s3), (s4, s5), ..."
    return zip(l[::2], l[1::2])


def mk_dict(contents, open=None, close=None):
    # FIXME (arrdem 2021-03-14):
    #   Raise a real SyntaxError of some sort.
    assert len(contents) % 2 == 0, "Improper dict!"
    return CalfDictToken(
        "DICT",
        list(pairwise(contents)),
        open.source,
        open.start_position,
        close.start_position,
    )


def mk_str(token):
    buff = token.value

    if buff.startswith('"""') and not buff.endswith('"""'):
        raise ValueError("Unterminated triple quote string")

    elif buff.startswith('"') and not buff.endswith('"'):
        raise ValueError("Unterminated quote string")

    elif not buff.startswith('"') or buff == '"' or buff == '"""':
        raise ValueError("Illegal string")

    if buff.startswith('"""'):
        buff = buff[3:-3]
    else:
        buff = buff[1:-1]

    buff = buff.encode("utf-8").decode("unicode_escape")  # Handle escape codes

    return CalfStrToken(token, buff)


CTORS = {
    "PAREN_LEFT": mk_list,
    "BRACKET_LEFT": mk_sqlist,
    "BRACE_LEFT": mk_dict,
    "STRING": mk_str,
    "INTEGER": CalfIntegerToken,
    "FLOAT": CalfFloatToken,
    "SYMBOL": CalfSymbolToken,
    "KEYWORD": CalfKeywordToken,
}


class CalfParseError(Exception):
    """
    Base class for representing errors encountered parsing.
    """

    def __init__(self, message: str, token: CalfToken):
        super(Exception, self).__init__(message)
        self.token = token

    def __str__(self):
        return f"Parse error at {self.token.loc()}: " + super().__str__()


class CalfUnexpectedCloseParseError(CalfParseError):
    """
    Represents encountering an unexpected close token.
    """

    def __init__(self, token, matching_open=None):
        msg = f"encountered unexpected closing {token!r}"
        if matching_open:
            msg += f" which appears to match {matching_open!r}"
        super().__init__(msg, token)
        self.token = token
        self.matching_open = matching_open


class CalfMissingCloseParseError(CalfParseError):
    """
    Represents a failure to encounter an expected close token.
    """

    def __init__(self, expected_close_token, open_token):
        super(CalfMissingCloseParseError, self).__init__(
            f"expected {expected_close_token} starting from {open_token}, got end of file.",
            open_token
        )
        self.expected_close_token = expected_close_token


def parse_stream(stream,
                 discard_whitespace: bool = True,
                 discard_comments: bool = True,
                 stack: list = None):
    """Parses a token stream, producing a lazy sequence of all read top level forms.

    If `discard_whitespace` is truthy, then no WHITESPACE tokens will be emitted
    into the resulting parse tree. Otherwise, WHITESPACE tokens will be
    included. Whether WHITESPACE tokens are included or not, the tokens of the
    tree will reflect original source locations.
    """

    stack = stack or []

    def recur(_stack=None):
        yield from parse_stream(stream,
                                discard_whitespace,
                                discard_comments,
                                _stack or stack)

    for token in stream:
        # Whitespace discarding
        if token.type == "WHITESPACE" and discard_whitespace:
            continue

        elif token.type == "COMMENT" and discard_comments:
            continue

        # Built in reader macros
        elif token.type == "META":
            try:
                meta_t = next(recur())
            except StopIteration:
                raise CalfParseError("^ not followed by meta value", token)

            try:
                value_t = next(recur())
            except StopIteration:
                raise CalfParseError("^ not followed by value", token)

            yield CalfMetaToken(token, meta_t, value_t)

        elif token.type == "MACRO_DISPATCH":
            try:
                dispatch_t = next(recur())
            except StopIteration:
                raise CalfParseError("# not followed by dispatch value", token)

            try:
                value_t = next(recur())
            except StopIteration:
                raise CalfParseError("# not followed by value", token)

            yield CalfDispatchToken(token, dispatch_t, value_t)

        elif token.type == "SINGLE_QUOTE":
            try:
                quoted_t = next(recur())
            except StopIteration:
                raise CalfParseError("' not followed by quoted form", token)

            yield CalfQuoteToken(token, quoted_t)

        # Compounds
        elif token.type in MATCHING.keys():
            balancing = MATCHING[token.type]
            elements = list(recur(stack + [(balancing, token)]))
            # Elements MUST have at least the close token in it
            if not elements:
                raise CalfMissingCloseParseError(balancing, token)

            elements, close = elements[:-1], elements[-1]
            if close.type != MATCHING[token.type]:
                raise CalfMissingCloseParseError(balancing, token)

            yield CTORS[token.type](elements, token, close)

        elif token.type in MATCHING.values():
            # Case of matching the immediate open
            if stack and token.type == stack[-1][0]:
                yield token
                break

            # Case of maybe matching something else, but definitely being wrong
            else:
                matching = next(reversed([t[1] for t in stack if t[0] == token.type]), None)
                raise CalfUnexpectedCloseParseError(token, matching)

        # Atoms
        elif token.type in CTORS:
            yield CTORS[token.type](token)

        else:
            yield token


def parse_buffer(buffer,
                 discard_whitespace=True,
                 discard_comments=True):
    """
    Parses a buffer, producing a lazy sequence of all parsed top level forms.

    Propagates all errors.
    """

    yield from parse_stream(lex_buffer(buffer),
                            discard_whitespace,
                            discard_comments)


def parse_file(file):
    """
    Parses a file, producing a lazy sequence of all parsed top level forms.
    """

    yield from parse_stream(lex_file(file))


def main():
    """A CURSES application for using the parser."""

    from calf.cursedrepl import curse_repl

    def handle_buffer(buff, count):
        return list(parse_stream(lex_buffer(buff, source=f"<Example {count}>")))

    curse_repl(handle_buffer)
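A small usage sketch of the parser; the DICT shape here is the same one tests/test_parser.py asserts on:

from calf.parser import parse_buffer

form = next(parse_buffer("{:foo 1, :bar [2 3]}"))
print(form.type)  # DICT
print([(k.type, v.type) for k, v in form.value])
# [('KEYWORD', 'INTEGER'), ('KEYWORD', 'SQLIST')]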
projects/calf/src/calf/reader.py (new file, 156 lines)
@@ -0,0 +1,156 @@
"""The Calf reader.

Unlike the lexer and parser which are mostly information preserving, the reader
is designed to be a somewhat pluggable structure for implementing transforms and
discarding information.
"""

from typing import *

from calf.lexer import lex_buffer, lex_file
from calf.parser import parse_stream
from calf.token import *
from calf.types import *


class CalfReader(object):
    def handle_keyword(self, t: CalfToken) -> Any:
        """Convert a token to an Object value for a keyword.

        Implementations could convert kws to strings, to a dataclass of some
        sort, use interning, or do none of the above.
        """

        return Keyword.of(t.more.get("name"), t.more.get("namespace"))

    def handle_symbol(self, t: CalfToken) -> Any:
        """Convert a token to an Object value for a symbol.

        Implementations could convert syms to strings, to a dataclass of some
        sort, use interning, or do none of the above.
        """

        return Symbol.of(t.more.get("name"), t.more.get("namespace"))

    def handle_dispatch(self, t: CalfDispatchToken) -> Any:
        """Handle a #foo <> dispatch token.

        Implementations may choose how dispatch is mapped to values, for
        instance by imposing a static mapping or by calling out to runtime state
        or other data sources to implement this hook. It's intended to be an
        open dispatch mechanism, unlike the others which should have relatively
        defined behavior.

        The default implementation simply preserves the dispatch token.
        """

        return t

    def handle_meta(self, t: CalfMetaToken) -> Any:
        """Handle a ^<> <> so called 'meta' token.

        Implementations may choose how to process metadata, discarding it or
        consuming it somehow.

        The default implementation simply discards the tag value.
        """

        return self.read1(t.value)

    def make_quote(self):
        """Factory. Returns the quote or equivalent symbol. May use `self.make_symbol()` to do so."""

        return Symbol.of("quote")

    def handle_quote(self, t: CalfQuoteToken) -> Any:
        """Handle a 'foo quote form."""

        return Vector.of([self.make_quote(), self.read1(t.value)])

    def read1(self, t: CalfToken) -> Any:
        # Note: 'square' and 'round' lists are treated the same. This should be
        # a hook. Should {} be a "list" too until it gets reader hooked into
        # being a mapping or a set?
        if isinstance(t, CalfListToken):
            return Vector.of(self.read(t.value))

        elif isinstance(t, CalfDictToken):
            return Map.of([(self.read1(k), self.read1(v))
                           for k, v in t.items()])

        # Magical pairwise stuff
        elif isinstance(t, CalfQuoteToken):
            return self.handle_quote(t)

        elif isinstance(t, CalfMetaToken):
            return self.handle_meta(t)

        elif isinstance(t, CalfDispatchToken):
            return self.handle_dispatch(t)

        # Stuff with real factories
        elif isinstance(t, CalfKeywordToken):
            return self.handle_keyword(t)

        elif isinstance(t, CalfSymbolToken):
            return self.handle_symbol(t)

        # Terminals
        elif isinstance(t, CalfStrToken):
            return str(t)

        elif isinstance(t, CalfIntegerToken):
            return int(t)

        elif isinstance(t, CalfFloatToken):
            return float(t)

        else:
            raise ValueError(f"Unsupported token type {t!r} ({type(t)})")

    def read(self, stream):
        """Given a sequence of tokens, read 'em."""

        for t in stream:
            yield self.read1(t)


def read_stream(stream,
                reader: CalfReader = None):
    """Read from a stream of parsed tokens."""

    reader = reader or CalfReader()
    yield from reader.read(stream)


def read_buffer(buffer):
    """Read from a buffer, producing a lazy sequence of all top level forms."""

    yield from read_stream(parse_stream(lex_buffer(buffer)))


def read_file(file):
    """Read from a file, producing a lazy sequence of all top level forms."""

    yield from read_stream(parse_stream(lex_file(file)))


def main():
    """A CURSES application for using the reader."""

    from calf.cursedrepl import curse_repl

    def handle_buffer(buff, count):
        return list(read_stream(parse_stream(lex_buffer(buff, source=f"<Example {count}>"))))

    curse_repl(handle_buffer)
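Since the reader is the intended extension point, a sketch of plugging in a transform by subclassing; StringKeywordReader is a hypothetical name, not part of this commit:

from calf.lexer import lex_buffer
from calf.parser import parse_stream
from calf.reader import CalfReader, read_stream

class StringKeywordReader(CalfReader):
    """A reader variant that turns :foo keywords into plain strings."""

    def handle_keyword(self, t):
        return t.more.get("name")

forms = list(read_stream(parse_stream(lex_buffer("[:foo 1]")),
                         reader=StringKeywordReader()))
# => [pvector(['foo', 1])] rather than a Vector holding a Keyword value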
projects/calf/src/calf/token.py (new file, 239 lines)
@@ -0,0 +1,239 @@
"""
Tokens.

The philosophy here is that to the greatest extent possible we want to preserve lexical (source)
information about indentation, position and so forth. That we have to do so mutably is just a
pain in the ass and kinda unavoidable.

Consequently, this file defines classes which wrap core Python primitives, providing all the usual
bits in terms of acting like values, while preserving fairly extensive source information.
"""


class CalfToken:
    """
    Token object.

    The result of reading a token from the source character feed.
    Encodes the source, and the position in the source from which it was read.
    """

    def __init__(self, type, value, source, start_position, more):
        self.type = type
        self.value = value
        self.source = source
        self.start_position = start_position
        self.more = more if more is not None else {}

    def __repr__(self):
        return "<%s:%s %r %s %r>" % (
            type(self).__name__,
            self.type,
            self.value,
            self.loc(),
            self.more,
        )

    def loc(self):
        return "%r@%r:%r" % (
            self.source,
            self.line,
            self.column,
        )

    def __str__(self):
        return self.value

    @property
    def offset(self):
        if self.start_position is not None:
            return self.start_position.offset

    @property
    def line(self):
        if self.start_position is not None:
            return self.start_position.line

    @property
    def column(self):
        if self.start_position is not None:
            return self.start_position.column


class CalfBlockToken(CalfToken):
    """
    (Block) Token object.

    The base result of parsing a token with a start and an end position.
    """

    def __init__(self, type, value, source, start_position, end_position, more):
        CalfToken.__init__(self, type, value, source, start_position, more)
        self.end_position = end_position


class CalfListToken(CalfBlockToken, list):
    """
    (list) Token object.

    The final result of reading a parens list through the Calf lexer stack.
    """

    def __init__(self, type, value, source, start_position, end_position):
        CalfBlockToken.__init__(
            self, type, value, source, start_position, end_position, None
        )
        list.__init__(self, value)


class CalfDictToken(CalfBlockToken, dict):
    """
    (dict) Token object.

    The final(ish) result of reading a braces list through the Calf lexer stack.
    """

    def __init__(self, type, value, source, start_position, end_position):
        CalfBlockToken.__init__(
            self, type, value, source, start_position, end_position, None
        )
        dict.__init__(self, value)


class CalfIntegerToken(CalfToken, int):
    """
    (int) Token object.

    The final(ish) result of reading an integer.
    """

    def __new__(cls, value):
        return int.__new__(cls, value.value)

    def __init__(self, value):
        CalfToken.__init__(
            self,
            value.type,
            value.value,
            value.source,
            value.start_position,
            value.more,
        )


class CalfFloatToken(CalfToken, float):
    """
    (float) Token object.

    The final(ish) result of reading a float.
    """

    def __new__(cls, value):
        return float.__new__(cls, value.value)

    def __init__(self, value):
        CalfToken.__init__(
            self,
            value.type,
            value.value,
            value.source,
            value.start_position,
            value.more,
        )


class CalfStrToken(CalfToken, str):
    """
    (str) Token object.

    The final(ish) result of reading a string.
    """

    def __new__(cls, token, buff):
        return str.__new__(cls, buff)

    def __init__(self, token, buff):
        CalfToken.__init__(
            self,
            token.type,
            buff,
            token.source,
            token.start_position,
            token.more,
        )
        str.__init__(self)


class CalfSymbolToken(CalfToken):
    """A symbol."""

    def __init__(self, token):
        CalfToken.__init__(
            self,
            token.type,
            token.value,
            token.source,
            token.start_position,
            token.more,
        )


class CalfKeywordToken(CalfToken):
    """A keyword."""

    def __init__(self, token):
        CalfToken.__init__(
            self,
            token.type,
            token.value,
            token.source,
            token.start_position,
            token.more,
        )


class CalfMetaToken(CalfToken):
    """A ^ meta token."""

    def __init__(self, token, meta, value):
        CalfToken.__init__(
            self,
            token.type,
            value,
            token.source,
            token.start_position,
            token.more,
        )
        self.meta = meta


class CalfDispatchToken(CalfToken):
    """A # macro dispatch token."""

    def __init__(self, token, tag, value):
        CalfToken.__init__(
            self,
            token.type,
            value,
            token.source,
            token.start_position,
            token.more,
        )
        self.tag = tag


class CalfQuoteToken(CalfToken):
    """A ' quotation."""

    def __init__(self, token, quoted):
        CalfToken.__init__(
            self,
            token.type,
            quoted,
            token.source,
            token.start_position,
            token.more,
        )
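A short sketch of what the value-like wrappers buy (the behavior follows from the subclassing above: a parsed atom compares equal to the plain Python value while still answering for its source location):

from calf.parser import parse_buffer

t = next(parse_buffer("42"))
assert t == 42          # CalfIntegerToken subclasses int
print(t.type, t.loc())  # "INTEGER" plus 'source'@line:column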
projects/calf/src/calf/types.py (new file, 44 lines)
@@ -0,0 +1,44 @@
"""Core types for Calf.

I don't love baking these in, but there's one place to start and there'll be a
considerable amount of bootstrappy nonsense to get through. So just start with
good ol' fashioned types and type aliases.
"""

from typing import *

import pyrsistent as p


class Symbol(NamedTuple):
    name: str
    namespace: Optional[str]

    @classmethod
    def of(cls, name: str, namespace: str = None):
        return cls(name, namespace)


class Keyword(NamedTuple):
    name: str
    namespace: Optional[str]

    @classmethod
    def of(cls, name: str, namespace: str = None):
        return cls(name, namespace)


# FIXME (arrdem 2021-03-20):
#
# Don't just go out to Pyrsistent for the datatypes. Do something somewhat
# smarter, especially given the games Pyrsistent is playing around loading
# ctype implementations for performance. God only knows about correctness tho.

Map = p.PMap
Map.of = staticmethod(p.pmap)

Vector = p.PVector
Vector.of = staticmethod(p.pvector)

Set = p.PSet
Set.of = staticmethod(p.pset)
projects/calf/src/calf/util.py (new file, 23 lines)
@@ -0,0 +1,23 @@
"""
Bits and bats.

Mainly bats.
"""

import re


def memoize(f):
    memo = {}

    def helper(x):
        if x not in memo:
            memo[x] = f(x)
        return memo[x]

    return helper


@memoize
def re_mem(regex):
    return re.compile(regex)
projects/calf/tests/BUILD (new file, 20 lines)
@@ -0,0 +1,20 @@
py_library(
    name = "conftest",
    srcs = [
        "conftest.py"
    ],
    imports = [
        "."
    ],
)

py_pytest(
    name = "test",
    srcs = glob(["*.py"]),
    deps = [
        "//projects/calf:lib",
        ":conftest",
        py_requirement("pytest-cov"),
    ],
    args = ["--cov-report", "term", "--cov=calf"],
)
projects/calf/tests/conftest.py (new file, 7 lines)
@@ -0,0 +1,7 @@
"""
Fixtures for testing Calf.
"""

import pytest

parametrize = pytest.mark.parametrize
projects/calf/tests/test_grammar.py (new file, 30 lines)
@@ -0,0 +1,30 @@
"""
Tests covering the Calf grammar.
"""

import re

from calf import grammar as cg
from conftest import parametrize


@parametrize('ex', [
    # Proper strings
    '""',
    '"foo bar"',
    '"foo\n bar\n\r qux"',
    '"foo\\"bar"',

    '""""""',
    '"""foo bar baz"""',
    '"""foo "" "" "" bar baz"""',

    # Unterminated string cases
    '"',
    '"f',
    '"foo bar',
    '"foo\\" bar',
    '"""foo bar baz',
])
def test_match_string(ex):
    assert re.fullmatch(cg.STRING_PATTERN, ex)
projects/calf/tests/test_lexer.py (new file, 89 lines)
@@ -0,0 +1,89 @@
"""
Tests of calf.lexer

Tests basic functionality and some examples, and makes sure that arbitrary token sequences round
trip through the lexer.
"""

import calf.lexer as cl
from conftest import parametrize

import pytest


def lex_single_token(buffer):
    """Lexes a single token from the buffer."""

    return next(iter(cl.lex_buffer(buffer)))


@parametrize(
    "text,token_type",
    [
        ("(", "PAREN_LEFT",),
        (")", "PAREN_RIGHT",),
        ("[", "BRACKET_LEFT",),
        ("]", "BRACKET_RIGHT",),
        ("{", "BRACE_LEFT",),
        ("}", "BRACE_RIGHT",),
        ("^", "META",),
        ("#", "MACRO_DISPATCH",),
        ("'", "SINGLE_QUOTE"),
        ("foo", "SYMBOL",),
        ("foo/bar", "SYMBOL"),
        (":foo", "KEYWORD",),
        (":foo/bar", "KEYWORD",),
        (" ,,\t ,, \t", "WHITESPACE",),
        ("\n\r", "WHITESPACE"),
        ("\n", "WHITESPACE"),
        (" , ", "WHITESPACE",),
        ("; this is a sample comment\n", "COMMENT"),
        ('"foo"', "STRING"),
        ('"foo bar baz"', "STRING"),
    ],
)
def test_lex_examples(text, token_type):
    t = lex_single_token(text)
    assert t.value == text
    assert t.type == token_type


@parametrize(
    "text,token_types",
    [
        ("foo^bar", ["SYMBOL", "META", "SYMBOL"]),
        ("foo bar", ["SYMBOL", "WHITESPACE", "SYMBOL"]),
        ("foo-bar", ["SYMBOL"]),
        ("foo\nbar", ["SYMBOL", "WHITESPACE", "SYMBOL"]),
        (
            "{[^#()]}",
            [
                "BRACE_LEFT",
                "BRACKET_LEFT",
                "META",
                "MACRO_DISPATCH",
                "PAREN_LEFT",
                "PAREN_RIGHT",
                "BRACKET_RIGHT",
                "BRACE_RIGHT",
            ],
        ),
        ("+", ["SYMBOL"]),
        ("-", ["SYMBOL"]),
        ("1", ["INTEGER"]),
        ("-1", ["INTEGER"]),
        ("-1.0", ["FLOAT"]),
        ("-1e3", ["FLOAT"]),
        ("+1.3e", ["FLOAT"]),
        ("f", ["SYMBOL"]),
        ("f1", ["SYMBOL"]),
        ("f1g2", ["SYMBOL"]),
        ("foo13-bar", ["SYMBOL"]),
        ("foo+13-12bar", ["SYMBOL"]),
        ("+-+-+-+-+", ["SYMBOL"]),
    ],
)
def test_lex_compound_examples(text, token_types):
    t = cl.lex_buffer(text)
    result_types = [token.type for token in t]
    assert result_types == token_types
projects/calf/tests/test_parser.py (new file, 219 lines)
@@ -0,0 +1,219 @@
"""
Tests of calf.parser
"""

import calf.parser as cp
from conftest import parametrize

import pytest


@parametrize("text", [
    '"',
    '"foo bar',
    '"""foo bar',
    '"""foo bar"',
])
def test_bad_strings_raise(text):
    """Tests asserting we won't let obviously bad strings fly."""
    # FIXME (arrdem 2021-03-13):
    #   Can we provide this behavior in the lexer rather than in the parser?
    with pytest.raises(ValueError):
        next(cp.parse_buffer(text))


@parametrize("text", [
    "[1.0",
    "(1.0",
    "{1.0",
])
def test_unterminated_raises(text):
    """Tests asserting that we don't let unterminated collections parse."""
    with pytest.raises(cp.CalfMissingCloseParseError):
        next(cp.parse_buffer(text))


@parametrize("text", [
    "[{]",
    "[(]",
    "({)",
    "([)",
    "{(}",
    "{[}",
])
def test_unbalanced_raises(text):
    """Tests asserting that we don't let mismatched collections parse."""
    with pytest.raises(cp.CalfUnexpectedCloseParseError):
        next(cp.parse_buffer(text))


@parametrize("buff, value", [
    ('"foo"', "foo"),
    ('"foo\tbar"', "foo\tbar"),
    ('"foo\n\rbar"', "foo\n\rbar"),
    ('"foo\\"bar\\""', "foo\"bar\""),
    ('"""foo"""', 'foo'),
    ('"""foo"bar"baz"""', 'foo"bar"baz'),
])
def test_strings_round_trip(buff, value):
    assert next(cp.parse_buffer(buff)) == value


@parametrize('text, element_types', [
    # Integers
    ("(1)", ["INTEGER"]),
    ("( 1 )", ["INTEGER"]),
    ("(,1,)", ["INTEGER"]),
    ("(1\n)", ["INTEGER"]),
    ("(\n1\n)", ["INTEGER"]),
    ("(1, 2, 3, 4)", ["INTEGER", "INTEGER", "INTEGER", "INTEGER"]),

    # Floats
    ("(1.0)", ["FLOAT"]),
    ("(1.0e0)", ["FLOAT"]),
    ("(1e0)", ["FLOAT"]),
    ("(1e0)", ["FLOAT"]),

    # Symbols
    ("(foo)", ["SYMBOL"]),
    ("(+)", ["SYMBOL"]),
    ("(-)", ["SYMBOL"]),
    ("(*)", ["SYMBOL"]),
    ("(foo-bar)", ["SYMBOL"]),
    ("(+foo-bar+)", ["SYMBOL"]),
    ("(+foo-bar+)", ["SYMBOL"]),
    ("( foo bar )", ["SYMBOL", "SYMBOL"]),

    # Keywords
    ("(:foo)", ["KEYWORD"]),
    ("( :foo )", ["KEYWORD"]),
    ("(\n:foo\n)", ["KEYWORD"]),
    ("(,:foo,)", ["KEYWORD"]),
    ("(:foo :bar)", ["KEYWORD", "KEYWORD"]),
    ("(:foo :bar 1)", ["KEYWORD", "KEYWORD", "INTEGER"]),

    # Strings
    ('("foo", "bar", "baz")', ["STRING", "STRING", "STRING"]),

    # Lists
    ('([] [] ())', ["SQLIST", "SQLIST", "LIST"]),
])
def test_parse_list(text, element_types):
    """Test we can parse various lists of contents."""
    l_t = next(cp.parse_buffer(text, discard_whitespace=True))
    assert l_t.type == "LIST"
    assert [t.type for t in l_t] == element_types


@parametrize('text, element_types', [
    # Integers
    ("[1]", ["INTEGER"]),
    ("[ 1 ]", ["INTEGER"]),
    ("[,1,]", ["INTEGER"]),
    ("[1\n]", ["INTEGER"]),
    ("[\n1\n]", ["INTEGER"]),
    ("[1, 2, 3, 4]", ["INTEGER", "INTEGER", "INTEGER", "INTEGER"]),

    # Floats
    ("[1.0]", ["FLOAT"]),
    ("[1.0e0]", ["FLOAT"]),
    ("[1e0]", ["FLOAT"]),
    ("[1e0]", ["FLOAT"]),

    # Symbols
    ("[foo]", ["SYMBOL"]),
    ("[+]", ["SYMBOL"]),
    ("[-]", ["SYMBOL"]),
    ("[*]", ["SYMBOL"]),
    ("[foo-bar]", ["SYMBOL"]),
    ("[+foo-bar+]", ["SYMBOL"]),
    ("[+foo-bar+]", ["SYMBOL"]),
    ("[ foo bar ]", ["SYMBOL", "SYMBOL"]),

    # Keywords
    ("[:foo]", ["KEYWORD"]),
    ("[ :foo ]", ["KEYWORD"]),
    ("[\n:foo\n]", ["KEYWORD"]),
    ("[,:foo,]", ["KEYWORD"]),
    ("[:foo :bar]", ["KEYWORD", "KEYWORD"]),
    ("[:foo :bar 1]", ["KEYWORD", "KEYWORD", "INTEGER"]),

    # Strings
    ('["foo", "bar", "baz"]', ["STRING", "STRING", "STRING"]),

    # Lists
    ('[[] [] ()]', ["SQLIST", "SQLIST", "LIST"]),
])
def test_parse_sqlist(text, element_types):
    """Test we can parse various 'square' lists of contents."""
    l_t = next(cp.parse_buffer(text, discard_whitespace=True))
    assert l_t.type == "SQLIST"
    assert [t.type for t in l_t] == element_types


@parametrize('text, element_pairs', [
    ("{}",
     []),

    ("{:foo 1}",
     [["KEYWORD", "INTEGER"]]),

    ("{:foo 1, :bar 2}",
     [["KEYWORD", "INTEGER"],
      ["KEYWORD", "INTEGER"]]),

    ("{foo 1, bar 2}",
     [["SYMBOL", "INTEGER"],
      ["SYMBOL", "INTEGER"]]),

    ("{foo 1, bar -2}",
     [["SYMBOL", "INTEGER"],
      ["SYMBOL", "INTEGER"]]),

    ("{foo 1, bar -2e0}",
     [["SYMBOL", "INTEGER"],
      ["SYMBOL", "FLOAT"]]),

    ("{foo ()}",
     [["SYMBOL", "LIST"]]),

    ("{foo []}",
     [["SYMBOL", "SQLIST"]]),

    ("{foo {}}",
     [["SYMBOL", "DICT"]]),

    ('{"foo" {}}',
     [["STRING", "DICT"]])
])
def test_parse_dict(text, element_pairs):
    """Test we can parse various mappings."""
    d_t = next(cp.parse_buffer(text, discard_whitespace=True))
    assert d_t.type == "DICT"
    assert [[t.type for t in pair] for pair in d_t.value] == element_pairs


@parametrize("text", [
    "{1}",
    "{1, 2, 3}",
    "{:foo}",
    "{:foo :bar :baz}"
])
def test_parse_bad_dict(text):
    """Assert that dicts with mismatched pairs don't parse."""
    with pytest.raises(Exception):
        next(cp.parse_buffer(text))


@parametrize("text", [
    "()",
    "(1 1.1 1e2 -2 foo :foo foo/bar :foo/bar [{},])",
    "{:foo bar, :baz [:qux]}",
    "'foo",
    "'[foo bar :baz 'qux, {}]",
    "#foo []",
    "^{} bar",
])
def test_examples(text):
    """Shotgun examples showing we can parse some stuff."""

    assert list(cp.parse_buffer(text))
projects/calf/tests/test_reader.py (new file, 22 lines)
@@ -0,0 +1,22 @@
"""
Tests covering the Calf reader.
"""

from conftest import parametrize

from calf.reader import read_buffer


@parametrize('text', [
    "()",
    "[]",
    "[[[[[[[[[]]]]]]]]]",
    "{1 {2 {}}}",
    '"foo"',
    "foo",
    "'foo",
    "^foo bar",
    "^:foo bar",
    "{\"foo\" '([:bar ^:foo 'baz 3.14159e0])}",
    "[:foo bar 'baz lo/l, 1, 1.2. 1e-5 -1e2]",
])
def test_read(text):
    assert list(read_buffer(text))
projects/calf/tests/test_types.py (new file, 17 lines)
@@ -0,0 +1,17 @@
"""
Tests covering the Calf types.
"""

from calf import types as t


def test_maps_check():
    assert isinstance(t.Map.of([(1, 2)]), t.Map)


def test_vectors_check():
    assert isinstance(t.Vector.of([(1, 2)]), t.Vector)


def test_sets_check():
    assert isinstance(t.Set.of([(1, 2)]), t.Set)
@@ -1,3 +1,6 @@
+"""A shim for executing pytest."""
+
+import os
 import sys
 
 import pytest
@@ -9,4 +12,7 @@ if __name__ == "__main__":
     cmdline = ["--ignore=external"] + sys.argv[1:]
     print(cmdline, file=sys.stderr)
 
+    for e in sys.path:
+        print(f" - {os.path.realpath(e)}", file=sys.stderr)
+
     sys.exit(pytest.main(cmdline))
@@ -9,6 +9,7 @@ certifi==2020.12.5
 chardet==4.0.0
 click==7.1.2
 commonmark==0.9.1
+coverage==5.5
 docutils==0.17
 idna==2.10
 imagesize==1.2.0
@@ -37,6 +38,7 @@ Pygments==2.8.1
 pyparsing==2.4.7
 pyrsistent==0.17.3
 pytest==6.2.3
+pytest-cov==2.11.1
 pytest-pudb==0.7.0
 pytz==2021.1
 PyYAML==5.4.1
@@ -26,6 +26,7 @@ LICENSES_BY_LOWERNAME = {
     "apache 2.0": "License :: OSI Approved :: Apache Software License",
     "apache": "License :: OSI Approved :: Apache Software License",
     "bsd 3 clause": "License :: OSI Approved :: BSD License",
+    "bsd 3-clause": "License :: OSI Approved :: BSD License",
     "bsd": "License :: OSI Approved :: BSD License",
     "gplv3": "License :: OSI Approved :: GNU General Public License v3 (GPLv3)",
     "http://www.apache.org/licenses/license-2.0": "License :: OSI Approved :: Apache Software License",