Hi everyone. We have great news on the ultimate λ-calculus which will possibly lead to a next-gen compiler for Formality.
Long history: initially, we compiled Formality to interaction combinators, the Levy-optimal system everyone here knows. Sadly, inets, while asymptotically fast, are slightly inefficient in practice due to pointer gloat. A few months ago, I noticed a way to represent a subset interaction combinators in a more compact format that resembled terms, except "ill-scoped" (i.e., a lambda-bound variable can appear outside of its body). I called it the ultimate λ-calculus. It had 4 main constructors:
LAM(name, body)
APP(func, argm)
PAR(val0, val1)
LET(name0, name1, expr, body)
Afterwards, I found a very efficient way to evaluate a subset of the ultimate λ-calculus, by representing variables as pointers on LAM nodes (i.e., lambdas point to variables, not the other way around). This, together with a leftmost-innermost reduction, allowed beta-reduction to become a single array write! It is, ridiculously, 40x faster than interaction nets. It is the "fast-mode" on Formality v0.1.253. Sadly, this mode didn't allow duplications at all, only affine terms. The problem was that the order
let
redexes were found didn't match the leftmost-innermost reduction. This resulted in errors, since pointers would become invalid when the terms they pointed to moved. Today, I noticed that a small change on the ultimate λ-calculus could solve that issue. Instead of a
let
with two variables, we use a
cpy
with only one. That is, instead of:
let [a, b] = 42 in [a, b]
We have:
[cpy a = 42, a]
The
cpy
constructor injects a copy of something into another location, and then returns that thing. The result of this is that 1. the
let
constructor becomes smaller (from 3 to 2 fields), 2. the order it is found matches the innermost-leftmost reduction strategy, solving the problem mentioned.
I'll start immediately working in a C runtime for that new calculus and benchmark. If it works well, I'll proceed with the addition of native numbers and mutable arrays (which can be detected from types as I mentioned previously). If all turns ok, we'll have a functional runtime that is very CPU efficient, levy-optimal, non-garbage collected and with actual mutable arrays that don't need to live inside ST or other monad.