# agentlanguages.dev — full catalogue > Complete machine-readable text of the agentlanguages.dev catalogue. 28 languages designed for AI agents to author code, organised across five buckets: three philosophical camps (syntactic, verification, orchestration) plus adjacent and unclassified. This file is the full-text companion to the short index at https://agentlanguages.dev/llms.txt — included here so that an agent that wants the entire catalogue can fetch it in one HTTP round-trip rather than 29 (homepage + one per entry). Each entry below carries a `## Name` heading and a metadata block (camp, author, implementation language, compilation target, licence, first seen, maturity, site, repo, paper, agent tooling) followed by the editorial prose and, where present, design-DNA cross-references and timeline events. Originally catalogued in a post "Three camps alike in dignity" (https://negroniventurestudios.com/2026/05/20/three-camps-alike-in-dignity/) written for the Negroni Venture Studios blog. Maintained by Alasdair Allan. Editorial principles: descriptive, not promotional. No ranking. Inclusion is based on whether a language's designers explicitly target LLMs or agents as authors. Tools that *use* an LLM at runtime are out of scope. --- # Syntactic camp (9) > If the problem is that models trip on syntax, the fix is to strip ambiguity from the syntax itself. The syntactic camp treats the problem as one of representation — models choke on tokens that mean different things in different positions, on operators that need disambiguation, on whitespace that might or might not be load-bearing. Their answer: build a syntax where every token has one job. ## B-IR > A Jason Hall blog post on three attempts at an LLM-optimised language: B-IR with unicode opcodes, TBIR with control characters the model rewrote into English keywords, and Loom with pre/postconditions and stable error codes. **Camp:** Syntactic **Author:** Jason Hall (Chainguard) **Implementation language:** Python (bootstrap) **Compilation target:** Arm64 assembly (Mach-O via clang) **Licence:** Unknown **First seen:** January 2026 **Maturity:** thought experiment **Site:** https://articles.imjasonh.com/llm-programming-language.md **Repo:** https://github.com/imjasonh/loom ### Key idea B-IR is a written narrative of three iterations. Gemini produced B-IR with multi-byte unicode opcodes, then proved too cumbersome to bootstrap. Claude Opus replaced it with TBIR using single-byte control characters in the 0x80-0x8B range, then on its own decided the unreadable characters were getting in the way and substituted short English keywords (init, fetch, emit, print, loop, exit). The final iteration, Loom, keeps token density but adds unambiguous scope, mandatory pre/postconditions, and stable error codes that the model is expected to look up rather than re-read in prose. ## What it is. B-IR is not a project. It is a blog post by Jason Hall, principal engineer at Chainguard, first published on his personal articles site on 11 January 2026. The article describes a Sunday spent prompted by a prediction made on the Oxide and Friends "Predictions for 2026" episode that 2026 would be the year LLMs got a programming language not intelligible to humans. Hall asked first Gemini and then Claude Opus to design such a language and recorded what each model produced. The experimental artefacts — manual.md, an l1-compiler.tbir clocking in at just under 700 lines, and the loom.md specification — live in the companion GitHub repository `imjasonh/loom` (the repo was originally published as `imjasonh/b-ir`, and that URL still redirects). The article is candid about the iteration arc: the first design (multi-byte unicode opcodes) was too unwieldy for the model itself to bootstrap; the second (single-byte control characters) was abandoned mid-implementation by the model, which substituted short English keywords on its own initiative; the third (Loom) keeps token density but adds the kind of structure the verification camp has long argued for. ## Why it's here. The catalogue includes B-IR as a marker of a meta-question. The interesting result is not the languages themselves but what happens when a working engineer asks two frontier models to design for their own consumption and reports on it honestly. Hall's own observation is that the third iteration ends up resembling existing languages with cleaner error codes and unambiguous scope — which the catalogue reads as the design space converging on the same concerns the verification camp arrived at independently. The catalogue does not rate B-IR against working compilers. It marks the article as a different kind of evidence: a candid record of what the model gravitates toward when given a blank page. ### Design DNA - **Sever** *(Syntactic)* — Both are catalogue-meta companions. Each captures what falls out when a frontier model is asked to design a language for itself; both authors keep the result at arm's length from any production claim. - **Laze** *(Syntactic)* — Both are small explorations by a single author working with one model over a weekend. Laze ships a compiler; B-IR ships an article. - **Vera** *(Verification)* — Loom's conclusions — unambiguous scope, mandatory pre/postconditions, stable error codes — converge on the diagnosis Vera arrived at independently in the verification camp. *Detail page: https://agentlanguages.dev/languages/b-ir/ · Markdown companion: https://agentlanguages.dev/languages/b-ir.md* ## Codong > Designed for AI to write, with one canonical way to express every operation. Nine bundled modules, structured JSON errors with fix/retry fields, ? operator for propagation. Compiles to Go IR, then native via go build. **Camp:** Syntactic **Author:** Brett (brettinhere) **Implementation language:** Go **Compilation target:** Native binary via Go IR + `go build` **Licence:** MIT **First seen:** March 2026 **Maturity:** working compiler **Site:** https://codong.org **Repo:** https://github.com/brettinhere/Codong **Codong Arena:** https://codong.org/arena/ **Agent tooling:** - SPEC_FOR_AI.md (system-prompt injection — Markdown spec with paired CORRECT/WRONG examples for every rule) - Structured JSON errors with `fix` and `retry` repair fields - Compact error format (project-reported ~39% token reduction) ### Key idea Codong is designed for AI to write, humans to review, machines to execute. The syntactic-camp move is to collapse choice paralysis: one canonical function per task, nine bundled modules covering most AI workloads with zero external dependencies, structured JSON errors with fix/retry fields, ? operator for error propagation. Compiles through Go — .cod source goes to Go IR, then `go build` produces a static native binary. ## The thesis. Codong's diagnosis is choice paralysis. Python has five ways to make an HTTP request. JavaScript has four state-management libraries. Every choice costs tokens to navigate and produces unpredictable output. The README states it directly: "Codong is designed for AI to write, humans to review, and machines to execute." The syntactic-camp move is to collapse the language and its standard library to one canonical form per task — `http.get(url)`, `web.serve(port: N)`, `db.connect(url)`, `json.parse(s)`. Nine modules bundled, zero external dependencies, no package manager required.
Codong has exactly one way to do everything.
The distinctive move is which kind of choice gets eliminated. NERD strips operators down to English keywords; Magpie strips ambiguity at the surface by making SSA the user-facing form. Codong leaves both operators and surface alone and instead collapses the standard library — one HTTP function, one JSON parser, one error shape. Compilation routes through Go: `.cod` source passes through lexer, parser, AST, Go IR, then `go build` for a static native binary. The compiler is essentially a frontend for Go's toolchain, in the same way TypeScript is a frontend for JavaScript or Kotlin is a frontend for JVM bytecode. ## What it looks like.fn load_config(path) { content = fs.read(path)? config = json.parse(content)? host = config.get("host", "localhost") port = config.get("port", 8080) return {host: host, port: port} } try { config = load_config("config.json")? print("Server: {config.host}:{config.port}") } catch err { print("Failed: {err.code} - {err.fix}") }
Two bundled stdlib calls, ? on three of them propagating structured errors up the stack, and the err.fix field doing repair-loop work in the catch branch.
raises markers, and a typed zero fix --plan --json API inside a verification project; Codong buys it with a single canonical stdlib inside a pure syntactic project. Industrial-backing contrast: Vercel Labs vs single author.
- **Magpie** *(Syntactic)* — Same camp, opposite mechanism. Magpie surfaces SSA — every value %-prefixed, ~2.3× more tokens per operation — so the model has nowhere to be wrong; Codong keeps conventional surface but ships one canonical function per task so the model never has to choose. Magpie pays in tokens for unambiguity; Codong pays in stdlib scope.
*Detail page: https://agentlanguages.dev/languages/codong/ · Markdown companion: https://agentlanguages.dev/languages/codong.md*
## Laze
> Minimal indentation-based syntax with no punctuation. A Python bootstrap compiler emits C in memory and pipes it to cc -O2. Framed by its author as a weekend experiment in what an LLM produces when asked to design a language for itself.
**Camp:** Syntactic
**Author:** kerv
**Implementation language:** Python (bootstrap)
**Compilation target:** C (via gcc/clang)
**Licence:** Unknown
**First seen:** April 2026
**Maturity:** early implementation
**Site:** https://github.com/kerv/laze
**Repo:** https://github.com/kerv/laze
### Key idea
Laze is an indentation-based language with infix operators and no punctuation. The compiler is a single Python script (laze/lazec.py) that parses .laze files into an AST, generates C in memory without writing it to disk, and pipes the result to a C compiler. The bet is that LLMs are most accurate when emitting text-shaped, readable input; ergonomic syntax for the model is treated as more important than expressive power or efficiency at the language layer.
## What it is.
Laze is a weekend experiment, not a production tool. The README opens with the warning, verbatim: "This was just an experiment in which I asked Claude Opus 4.7 to create a programming language in the most efficient way it could." The surface is indentation-based, drops most punctuation, and uses infix operators. The compiler is a Python script that emits C internally — never written to disk — and pipes it to `cc -O2` for a native macOS binary. The repository (linked from the author's LinkedIn handle millerkev) contains four commits, two example files, and a single demonstration: `nes.laze`, a 2,000-plus-line NES emulator file covering a 6502 CPU, PPU sprites and scrolling, an APU, and mappers 0, 1, and 4. The author reports Super Mario Bros. as fully playable and Legend of Zelda playable with minor glitches.
## Why it's here.
The catalogue includes Laze as a marker of a position in the syntactic camp: optimise the surface for what an LLM finds easiest to produce, not for what a compiler can analyse. The thesis, stated in the README, is that an LLM specialises in text-shaped input because that is what it is trained on, so the right target is whichever syntax it generates most correctly and fastest. The catalogue does not rate Laze against working compilers shipped by larger teams. It marks it as a different kind of evidence: a single contributor's snapshot of what falls out when an LLM is allowed to design its own language and a human supplies only the prompt.
### Design DNA
- **Magpie** *(Syntactic)* — Same camp, opposite end of the same axis. Magpie chooses an explicit SSA surface to remove ambiguity; Laze strips punctuation and indentation rules to maximise the model's generation speed.
- **B-IR** *(Syntactic)* — Both are small individual explorations — a weekend's worth of letting an LLM design its own language and recording what came out.
*Detail page: https://agentlanguages.dev/languages/laze/ · Markdown companion: https://agentlanguages.dev/languages/laze.md*
## Magpie
> SSA as the surface syntax. Every value %-prefixed and typed at definition; one canonical way to express each operation; compiles to native via LLVM.
**Camp:** Syntactic
**Author:** Magpie Language Developers
**Implementation language:** Rust
**Compilation target:** LLVM IR / native, also WebAssembly
**Licence:** MIT
**First seen:** April 2026
**Maturity:** early implementation
**Site:** https://magpie-lang.com
**Repo:** https://github.com/magpie-lang/magpie
**Agent tooling:**
- SKILL.md
- AGENTS.md
### Key idea
The textual program is already in SSA form. Every binding is %-prefixed and typed at definition, basic blocks are explicit, branches and ownership transitions are first-class operations. The bet is that removing surface ambiguity reduces LLM error rates more than added verification does.
## The thesis.
Magpie is a syntactic-camp project that takes the camp's premise to its logical end: don't add verification, remove ambiguity. The site states the goal directly — "Magpie eliminates ambiguity so LLMs can write perfect code on the first try" — and the language realises it by making the textual program identical to the compiler's intermediate representation. Every value is named at the point of definition with a `%`-prefixed identifier, typed inline, and assigned exactly once. Basic blocks are explicit (`bb0:`). Branches are explicit (`cbr`, `br`). Ownership transitions (`borrow`, `mutborrow`, `share`) are first-class operations. The premise is that the hidden semantics of conventional syntax — operator overloading, implicit conversions, invisible lifetime rules — are exactly the places LLMs hallucinate.
~2.3× more tokens per operation, but eliminates the hidden rules that cause AI retries and borrow checker failures.
The distinctive move shows up in the cross-camp comparison with Vera. Vera adds verification on top of conventional surface syntax (mandatory contracts, Z3 discharge, the `module demo.main exports { @main } imports { } digest "0000000000000000" fn @add_two(a: i64, b: i64) -> i64 { bb0: %sum: i64 = i.add { lhs=%a, rhs=%b } ret %sum }
Every value is %-prefixed, typed at definition, and computed by a named operation with named operands.
The full spec fits in 3,200 tokens.
The distinctive move is what Mog refuses to do. It is not standalone; it has no standard library; it ships only as an embedded language inside a host application that provides every capability. Compilation is in-process — no JIT, no interpreter overhead, no process startup cost; the frontend compiles a `.mog` file via rqbe (a safe-Rust port of QBE, ~15,000 lines) and produces a `.dylib` or `.so` the host can `dlopen`. The first version of Mog itself was authored by Voltropy's Volt coding agent in a single three-week continuous session, using Claude Opus 4.6, Kimi k2.5, and GLM-4.7 with Voltropy's lossless context management preserving working memory across compactions. This puts Mog in the same agent-authored cluster as AILANG. ## What it looks like.import agent; optional log; // post-compaction hook: re-inject key context pub fn on_post_compaction(session: agent.Session) { log.info("post-compaction hook: injecting reminder"); session.messages.push(agent.Message { role: agent.Role.SYSTEM, content: "IMPORTANT: Always run tests before committing.", }); }
A Mog script declaring an agent import and an optional log capability. The host decides whether to provide log; the script runs either way. The agent namespace is host-provided typed data.
The irony: cryptic symbols don't save tokens. Plain words win.
The distinctive move is what NERD does *not* ship: no type system, no error union, no contracts, no checker beyond the parser. This is the syntactic camp at its purest — the bet is that smoothing the generation surface buys more than verification would, and the difference shows up in the inference bill. Magpie reaches a similar diagnosis through the opposite mechanism (SSA form, every value named and typed at definition); NERD picks the lower-effort lever and accepts that "audit" rather than "verify" is the only safety net. ## What it looks like.fn fizzbuzz n repeat n times as i if i mod 15 eq zero out "FizzBuzz" else if i mod three eq zero out "Fizz" else if i mod five eq zero out "Buzz" else out i done fn main call fizzbuzz 15
Every operator is a word: mod, eq, repeat, done. No braces, no semicolons, no +/==.
If human readability is not the primary constraint, a language can optimise for three things at once.
Two views render the same tree: a dense **authoring view** sized for model token budgets, and an **inspection view** layered for debugging and human review. Both round-trip losslessly through a JSON sidecar (`.tacd`). Canonical text is byte-exact — there is exactly one serialisation per AST, which eliminates formatter debates and makes hashing meaningful. Definitions are identified by the BLAKE3 hash of their canonical text; display names live in the sidecar and carry no semantic weight. Variable references use DeBruijn indices in canonical form. The cross-camp move is the verification-adjacent effect lattice: Tacit-Lite tracks `IO` / `Alloc` / `Mut` / `Div` in signatures, and unit boundaries must declare type and effect rows explicitly. ## What it looks like. ```tacit unit Math { import inc : Int -> Int from blake3:0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef; private double : Int -> Int = lambda x. @add x x; export public add_two : Int -> Int = lambda x. inc (inc x); } ``` Imports name exact `blake3:<64-hex>` definition hashes, not paths or version ranges. Visibility (`public` / `package` / `private`) is part of the artefact. The display names `x`, `inc`, `double` are sidecar metadata; in canonical form references are DeBruijn indices. ## Distinctive moves. - **AST as authoritative source.** The on-disk `.tac` file is the byte-exact canonical projection of the tree; comments and free-form formatting are not in the language. Names, field order, and type/effect hints live in a `.tacd` JSON sidecar. - **Content-addressed definitions.** Imports resolve to BLAKE3 hashes, not names. Renaming a definition leaves its hash unchanged; changing its signature, body, or referenced hashes produces a new identity. Package manifests pin a hash-indexed local cache. - **Typed `Hole` nodes instead of parse failures.** Malformed code reduces to a `Hole` node with a structured diagnostic and a type slot, so downstream tools can keep operating on partial programs (ADR 0040, landed in Phase 2). - **Explicit effect rows on boundaries.** Tacit-Lite's fixed lattice is `IO` / `Alloc` / `Mut` / `Div`; effect signatures are mandatory at unit exports and inferred locally elsewhere. Tacit-Full (refinement types, capability-based security, handlers) is reserved as a roadmap, not shipped. - **Toolchain pin as a first-class artefact.** `tacit init` writes a `tacit-toolchain.toml` that pins toolchain, primer, and bundled stdlib hashes; every package-aware command refuses to run on a mismatched pin and surfaces a `toolchain-pin-*` diagnostic. ## Maturity. A Rust workspace (five crates: `tacit-canonical`, `tacit-views`, `tacit-typecheck`, `tacit-codegen`, `tacit-cli`) at v0.7.7, released 19 May 2026. Apache-2.0 / MIT dual-licensed; 237 commits, 3 stars, 2 forks at time of cataloguing. The decision log runs to ~90 ADRs; Phase 6 was frozen by [ADR 0089](https://github.com/weetster/tacit/blob/main/decisions/0089-phase-6-frozen.md) on 2026-05-17, closing modules/units, package manifests with hash-pinned lockfiles, package tests with stable `tacit-test-v1` JSON output, fixed-width integers with wrapping/checked/saturating arithmetic, typed mutable-memory handles, source-level stdlib packages (`tacit.core`, `.bytes`, `.array`, `.text`, `.collections`, `.io`), and a constrained host-interface ABI with generated C headers and Rust bindings. A Rust embedding demo links a Tacit kernel as a static library. Phase 7 is the next planned phase; debugger, diff/blame, IDE, public registry, and arbitrary FFI are explicitly out of scope until a later ADR. LLVM 19 is pinned via `inkwell`; published release artefacts target Linux x86_64 with a glibc 2.35 floor. ## Agent tooling. `AGENTS.md` (1.7 KB) carries the Codex-facing sealed-corpus guardrail and a pointer to `CLAUDE.md`. `CLAUDE.md` (~20 KB) functions as a full development guide rather than a SKILL.md — it enumerates frozen artefacts, ground rules, the file-extension contract (`.tac` / `.tacd` / `.taca`), and the per-phase delivered surface. The toolchain ships its own primer: `tacit primer` prints the byte-pinned Tacit-Lite primer, and `tacit primer --search` / `--list-sections` / `--section.tac file is a canonical projection of the AST, not the source. Both pay a token cost to strip ambiguity.
- **X07** *(Syntactic)* — Same direction along the ‘text is lossy’ axis. X07 stores programs as canonical JSON ASTs and edits them with JSON Patch; Tacit stores them as canonical text projected from the AST, with BLAKE3-addressed identity. Different surfaces, same diagnosis.
- **Vera** *(Verification)* — Cross-camp foil on names. Vera abolishes parameter names entirely in favour of typed DeBruijn slots (@Int.0); Tacit keeps display names as sidecar metadata but uses DeBruijn indices in canonical form. Both treat names as a source of model error rather than a feature.
- **Mog** *(Syntactic)* — Adjacent on the embedding angle. Mog is a small embedded language with a capability system and a sub-3,200-token spec; Tacit ships a constrained host-interface ABI for a Rust host and explicitly defers capabilities to Tacit-Full. Different bets on whether capability tracking belongs in v1.
*Detail page: https://agentlanguages.dev/languages/tacit/ · Markdown companion: https://agentlanguages.dev/languages/tacit.md*
## X07
> Eliminates text syntax. Programs are canonical JSON ASTs (x07AST); edits are RFC 6902 JSON Patch operations; diagnostics ship as stable JSON with quickfixes the toolchain applies deterministically.
**Camp:** Syntactic
**Author:** Author unknown
**Implementation language:** Rust
**Compilation target:** Native (via C codegen); WebAssembly
**Licence:** Apache-2.0 OR MIT
**First seen:** April 2026
**Maturity:** working compiler
**Site:** https://x07lang.org
**Repo:** https://github.com/x07lang/x07
**Agent tooling:**
- AGENT.md
- Agent portal at /agent with versioned JSON entrypoints (manifest, schemas, skills, examples, stdlib, packages)
- x07-mcp (build MCP servers in X07)
- x07lang-mcp bridge (typed access to the toolchain over MCP)
- Per-release skills pack
- Stable error codes; quickfixes emitted as JSON Patch
### Key idea
X07's canonical source is not text. A program is an x07AST JSON document (`*.x07.json`) with a versioned schema; edits are RFC 6902 JSON Patch operations the toolchain applies mechanically. The toolchain emits stable error codes as structured JSON, paired with quickfixes the agent can apply with `x07 fix --write` or `x07 ast apply-patch`. Side effects live in explicit capability worlds; sandboxing is policy-driven.
## The thesis.
X07's diagnosis is that text-based source is exactly where agents lose. Whitespace becomes load-bearing, identical ASTs serialise differently, patches collide on formatting noise, and diagnostics are written for humans. The syntactic-camp move is to delete the text layer: the canonical source is `*.x07.json` (the x07AST), patches are RFC 6902 JSON Patch documents, diagnostics are JSON with stable error codes, and quickfixes are themselves JSON Patches the toolchain applies deterministically via `x07 fix --write` or `x07 ast apply-patch`.
One canonical approach. No "should I use a for loop or map?" decisions.
The distinctive move is the breadth of the machine-facing surface around the language. `x07lang.org/agent` publishes versioned JSON entrypoints — `manifest.json`, `schemas/index.json`, `skills/index.json`, `examples/catalog.json`, `stdlib/index.json`, `packages/index.json` — explicitly so agents consume them directly rather than scraping the HTML. Side effects are gated by named capability worlds (`run-os`, `run-os-sandboxed`); the official tooling includes an MCP kit (`x07-mcp`) for authoring MCP servers in X07 and a separate bridge (`x07lang-mcp`) for connecting external agents to the toolchain. The closest direction-of-travel is Magpie, which surfaces SSA at the source level; X07 deletes the source level. ## What it looks like.{
"schema_version": "x07.x07ast@0.4.0",
"kind": "entry",
"module_id": "main",
"imports": ["std.bytes"],
"decls": [],
"solve": ["std.bytes.reverse", "input"]
}
A program is a JSON document. A quickfix is an array of JSON Patch operations applied to this document; the agent never edits a string of source code, only the structural tree.
For humans, a language is a tool for expression. For AIs, it's a substrate for reasoning.
The distinctive move is the no-loops decision. AILANG commits to lambda calculus, pattern matching, and ADTs as its only forms of control flow — no `for`, no `while`, no mutable accumulator. Where Vera tracks model calls as a single `module examples/hello import std/io (println) export func main() -> () ! {IO} { println("Hello from AILANG!") }
The ! {IO} after the return type is the effect row. A caller without an IO capability granted via ailang run --caps IO cannot invoke this function. Effect rows compose: a function that calls IO- and FS-effecting helpers must declare {IO, FS}.
Code is cheap to generate. Expensive to trust.
The distinctive move shows up in the comparison with Vera. The two share design DNA — mandatory verification artefacts, explicit effects, no `if`/`else`, no closures, no exceptions, no nulls, no loops — but disagree on what to drop. Vera drops names entirely via De Bruijn slot references (`@Int.0`). Aver keeps names and makes the surrounding metadata mandatory. Vera's bet is that names are the failure mode; Aver's is that absence of intent is. ## What it looks like.fn safeDivide(a: Int, b: Int) -> Result<Int, String> ? "Safe integer division. Returns Err on zero." match b 0 -> Result.Err("Division by zero") _ -> Result.Ok(a / b) verify safeDivide safeDivide(10, 2) => Result.Ok(5) safeDivide(7, 0) => Result.Err("Division by zero")
Prose intent (?), no if/else, and a colocated verify block. The function ships its specification.
The contract system is the product. The implementation is secondary.
The distinctive move is the **intent block** itself: a named natural-language goal paired with a list of `verified_by` references that name specific contracts in the codebase (`BankAccount.invariant`, `BankAccount.deposit.requires`, `BankAccount.withdraw.ensures`). The compiler resolves each reference during semantic analysis and emits an error on a dangling one; published intent blocks therefore cannot drift from the contracts they cite. Z3 discharges what it can statically via `intentc verify`; the rest becomes runtime enforcement that fires identically across all three targets — a `requires` failure panics the Rust binary, throws an `Error` in JavaScript, and traps in WebAssembly. ## What it looks like. ```intent entity BankAccount { field balance: Int; invariant self.balance >= 0; method deposit(amount: Int) returns Void requires amount > 0 ensures self.balance == old(self.balance) + amount { self.balance = self.balance + amount; } } intent "Safe withdrawal preserves non-negative balance" { goal "BankAccount.withdraw never results in balance < 0"; guarantee "if withdraw returns false then balance is unchanged"; verified_by BankAccount.invariant; verified_by BankAccount.withdraw.requires; } ``` `old(...)` captures pre-mutation state for `ensures` clauses. `forall i in 0..n: p` and `exists i in 0..n: p` quantify over integer ranges in contracts. Loops carry `invariant` and `decreases` clauses for inductive reasoning at verification time. ## Distinctive moves. - **Mandatory contracts at three levels.** Functions carry `requires`/`ensures`; entities carry `invariant`; loops carry `invariant`/`decreases`. The grammar reserves the slots; the type checker enforces that `verified_by` references resolve. - **Intent blocks as compiled artefacts.** A `verified_by` path (`Entity.member.clause` or `function.clause`) is resolved by the semantic checker, not by convention. Unresolved references are compile errors, which prevents prose-level drift between stated goals and machine-checkable contracts. - **Z3 as an optional static layer, runtime checks as the floor.** `intentc verify` translates IR contracts to SMT-LIB and invokes Z3; results are reported per contract as `verified` / `unverified` / `error` / `timeout`. When Z3 is absent, the compiler degrades gracefully and the runtime asserts still run. - **Rust as IR, not as the only backend.** An explicit IR layer (~30 node types, contracts as first-class IR nodes, `OldCapture`/`OldRef` as explicit pre-state) feeds three sibling backends: a Rust generator that hands off to `cargo`, a direct JavaScript emitter, and a direct WebAssembly binary emitter that does not require the Rust toolchain. Each enforces the same contracts at runtime. - **Property-based test generation from contracts.** `intentc test-gen` derives property-based tests from `requires`/`ensures`, complementing — not replacing — the SMT discharge. ## Maturity. A Go workspace at v0.2.0 (released 16 February 2026) with 45 commits and 5 stars at time of cataloguing. The roadmap (`docs/ROADMAP.md`) records milestones 1–6 as complete: usable language surface (loops, arrays, enums, pattern matching, `Result`/`Option`, try operator, multi-file imports), the Z3 verifier (`internal/verify/`), and three working backends (Rust via `internal/rustbe/`, JavaScript via `internal/jsbe/`, direct WASM via `internal/wasmbe/`). The `docs/DESIGN.md` specification runs to 1,764 lines and notes that traits, generics, async/await, and a package manager with semver constraints have all landed since the POC. The CLI surface — `intentc build / check / verify / fmt / lint / test-gen`, plus `intentc pkg init / add / remove / install` — is shippable; a four-target showcase (CLI binary, browser dashboard, Node server, browser WASM at 155 bytes) runs against unmodified compiler output. Last commit was 25 March 2026; an LSP, REPL, and release-mode contract stripping sit on the milestone-8 roadmap and are not yet built. ## Agent tooling. Three documents target agent authors directly. `AGENTS.md` (~18 KB) is the Codex/general-agent orientation. `CLAUDE.md` is the Claude-specific working guide. `INTENT.md` (~26 KB) is the language reference written as agent instructions — it opens “You are generating code in Intent” and ends with ten explicit guidelines for AI code generation, including “Write contracts first” and “Every function should have contracts.” `docs/REPRODUCE.md` documents reproducing the compiler with the agent of the reader's choice. Diagnostics are textual rather than structured JSON; the LSP that would expose them programmatically is on the roadmap rather than shipped. ### Design DNA - **Vera** *(Verification)* — Closest design relative. Both ship mandatory contracts on every function, both use Z3, both treat the agent as the primary author. Vera abolishes parameter names via typed DeBruijn slots and tracks LLM inference as a first-class<Inference> effect; Intent keeps names and concentrates its novelty in intent blocks that bind natural-language goals to verified contract references.
- **Aver** *(Verification)* — Same camp, different proof story. Aver exports verify blocks as Lean 4 theorems or Dafny lemmas through aver proof, lifting effectful code into proof artefacts via Oracle; Intent commits to Z3 SMT with runtime enforcement on every backend. Same diagnosis, different upper-bound check.
- **MoonBit** *(Verification)* — Closest sibling on compilation strategy. MoonBit ships four backends (WASM GC, JavaScript, native via C codegen, LLVM) on an OCaml-implemented compiler; Intent ships three (Rust, JavaScript, WASM) on a Go-implemented one. MoonBit's edge is years of training data; Intent's framing is auditability over breadth.
- **Prove** *(Verification)* — Same contract machinery, opposite politics. Prove ships refinement types and verb-based IO under the Prove Source License v1.0, which prohibits use as AI training data; Intent ships under Apache-2.0 and addresses its agent-instruction documents to the model directly.
*Detail page: https://agentlanguages.dev/languages/intent/ · Markdown companion: https://agentlanguages.dev/languages/intent.md*
## MoonBit
> AI-friendly general-purpose language. ICSE 2024 paper on real-time semantics-aware token sampling. Three years of training data.
**Camp:** Verification
**Author:** Hongbo Zhang / IDEA Shenzhen
**Implementation language:** OCaml
**Compilation target:** WASM GC, JavaScript, native (C codegen), LLVM
**Licence:** Unknown
**First seen:** January 2023
**Maturity:** working compiler
**Site:** https://www.moonbitlang.com
**Agent tooling:**
- `moon doc` AI symbol lookup
- MoonBit Pilot coding agent
- `declare` keyword for AI-native specification
### Key idea
AI-friendly general-purpose language with the deepest history in the
space — three years of training data, full toolchain across four
backends, a package registry (mooncakes.io), cloud IDE, and IDE plugins.
Published an ICSE 2024 paper on a real-time semantics-aware token
sampler. Backed by the International Digital Economy Academy (Shenzhen).
## The thesis.
MoonBit is the catalogue's exception that proves the rule. Most entries are recent (Jan–May 2026); MoonBit has been shipping since 2023. Most are single-author or small-team experiments; MoonBit is backed by the International Digital Economy Academy in Shenzhen and led by Hongbo Zhang, who created ReScript and contributed to OCaml. Most ship a thought experiment or an early implementation; MoonBit ships four backends, a package registry, a cloud IDE, and two IDE plug-ins.
The model doesn't need to be retrained. The sampler needs to know the type system.
The distinctive technical move is in how the model interacts with the compiler. The ICSE 2024 paper describes a real-time semantics-aware token sampler: as the model generates code, a fast type-checker prunes ill-typed continuations at the token level. The model can still hallucinate, but the hallucinations never get past the sampler. This is closer to the verification camp's "make it checkable" intuition than the syntactic camp's "make it easier to generate" — applied at the layer where the generation actually happens. ## Distinctive moves. - **Real-time semantics-aware sampling.** The compiler participates in token generation, not just post-hoc checking. - **`declare` keyword.** A first-class form for AI-native specification of intent and constraints, distinct from regular function signatures. - **Four backends.** WASM GC, JavaScript, native (via C codegen), and LLVM. No other entry in the catalogue targets this breadth. - **mooncakes.io.** A first-party package registry. Most catalogued languages don't have one because there's no ecosystem yet; MoonBit has the ecosystem. - **Three years of training data.** The unmatched advantage. Every other entry is racing to generate examples; MoonBit has them. ## Maturity. The most mature project in the catalogue by a clear margin. 2,115+ stars (the second-highest after Zero). Full toolchain, multiple backends in active production use, IDE integrations across both major desktop IDEs, working debugger. Documentation depth and developer experience are at a level no other entry approaches. The pragmatic question is whether MoonBit's general-purpose framing remains compelling against narrower agent-native languages as the field matures. MoonBit's bet is that general-purpose plus AI-aware tooling beats agent-native plus narrow ecosystem. The next two years will test it. ## Agent tooling. `moon doc` exposes AI-friendly symbol lookup; MoonBit Pilot is a coding agent that targets MoonBit specifically; the `declare` keyword gives agents a structured way to express intent. Less prominent than the SKILL.md/AGENTS.md pattern other catalogue entries use — MoonBit's bet is that an agent that knows the language outperforms an agent reading instructions about the language. ### Design DNA - **Vera** *(Verification)* — Both verification camp; opposite breadth. MoonBit is a full-stack general-purpose language; Vera narrows to checkability and drops names entirely. MoonBit assumes the model needs help; Vera assumes the model needs supervision. - **Zero** *(Verification)* — Closest in industrial backing (Vercel Labs vs IDEA Shenzhen) and product framing. Zero leans syntactic (one obvious way to express things); MoonBit leans toward typed sampling at the model level. - **AILANG** *(Verification)* — Both ship effect typing; MoonBit's is conventional, AILANG's is row-polymorphic with capability-based carving (IO/FS/Net/Clock/AI). MoonBit's edge is the training data depth that no other entry has. ### Timeline - **2023** — MoonBit project initiated at IDEA Shenzhen under Hongbo Zhang. Pre-LLM-craze; framing changes over the following two years. - **Jan 2024** — ICSE 2024 paper on real-time semantics-aware token sampling for MoonBit code generation. - **2024–2025** — Toolchain hardens: four backends (WASM GC, JavaScript, native via C codegen, LLVM), package registry (mooncakes.io), cloud IDE, VS Code and IntelliJ plugins, debugger. - **2026** —declare keyword and MoonBit Pilot agent ship, repositioning the language explicitly as AI-native rather than just AI-friendly.
*Detail page: https://agentlanguages.dev/languages/moonbit/ · Markdown companion: https://agentlanguages.dev/languages/moonbit.md*
## NanoLang
> Mandatory shadow test blocks on every function. The proved core (NanoCore) has 193 Coq theorems with zero axioms. Multi-target codegen across C, WebAssembly, LLVM IR, PTX, and RISC-V.
**Camp:** Verification
**Also spans:** Syntactic
**Author:** Jordan Hubbard
**Implementation language:** C
**Compilation target:** C (default), WebAssembly, LLVM IR, PTX, RISC-V
**Licence:** Apache-2.0
**First seen:** September 2025
**Maturity:** working compiler
**Site:** https://jordanhubbard.github.io/nanolang
**Repo:** https://github.com/jordanhubbard/nanolang
**Agent tooling:**
- AGENTS.md
- CLAUDE.md
- MEMORY.md ("training reference for patterns and idioms")
- spec.json (machine-readable formal specification)
- .mcp.json
- .claude/, .cursor/, .factory/ config folders
- VS Code extension with LSP (nanolang-lsp) and DAP (nanolang-dap)
- Web playground with CodeMirror-6, share permalinks, live evaluation
### Key idea
Mandatory shadow test blocks on every function (the compiler refuses
to compile without one) plus 6,170 lines of Coq proving the language's
core. NanoCore is the proved subset, not the entire surface language —
algebraic effects, async/await, FFI, the VM, and multi-target codegen
sit outside the proof set. The first-person README persona ("I refuse
to compile a function unless you provide a shadow test block") is
documented as deliberate design, not flourish.
## The thesis.
NanoLang takes the verification camp's diagnosis literally: if LLMs are going to write code, the language should refuse to accept their work without tests, and the language's core should have proofs to back its promises. Every function declaration must be paired with a `shadow` test block; the compiler rejects functions without one. The proved core (NanoCore) has 6,170 lines of Coq across 9 files, 193 theorems and lemmas, zero axioms, zero `Admitted` — the proofs cover preservation, progress, determinism, semantic equivalence of big-step and small-step, and evaluator soundness against a fuel-based reference interpreter extractable to OCaml. The README announces it in first person: "I refuse to compile a function unless you provide a shadow test block for it."
"I am a minimal programming language designed for machines to write and humans to read. I require tests, I use unambiguous syntax, and my core is formally proved."
The distinctive move is the depth of what is actually proved — and the honesty about what is not. NanoCore covers integers, booleans, strings, arithmetic, conditionals, mutable variables, while loops, lambda/application, arrays, records, recursive functions, variants, and pattern matching. Algebraic effects, async/await, FFI, the VM, and the multi-target codegen sit outside the proof set, and `formal/README.md` says so plainly. Vera proves contracts at the function level via Z3; NanoLang proves the type system itself, from below. Same camp, complementary layers. ## What it looks like.fn greet(name: string) -> string { return (+ "Hello, " name) } shadow greet { assert (== (greet "World") "Hello, World") } fn main() -> int { (println (greet "World")) return 0 } shadow main { assert true }
Every function needs a shadow block. shadow main { assert true } exists only because the compiler refuses to compile without it — and the trivial case still has to satisfy the discipline.
<Inference> effect has no NanoLang analogue.
- **Aver** *(Verification)* — Same-camp neighbour on the ship-the-verification-artefacts axis. Aver exports per-function proofs to Lean 4 and Dafny; NanoLang ships its proofs alongside the source as Coq, zero axioms, 193 theorems.
- **Magpie** *(Syntactic)* — Cross-camp foil on the syntactic axis. Magpie strips ambiguity via SSA-as-surface; NanoLang reduces it via prefix-call disambiguation, mandatory annotations, and one canonical form. Different mechanisms for the same diagnosis.
- **NERD** *(Syntactic)* — Syntactic-camp direction without the formalism. NERD ships a token-friendly surface and no type system; NanoLang ships Coq proofs. The contrast clarifies what 'unambiguous syntax' costs to back up.
*Detail page: https://agentlanguages.dev/languages/nanolang/ · Markdown companion: https://agentlanguages.dev/languages/nanolang.md*
## Pact
> Intent blocks on every function and route, pipeline syntax, explicit effects, errors as types. Single binary that ships an HTTP server, SQLite, an LSP, and an MCP server with five tools.
**Camp:** Verification
**Author:** Viktor Kikot
**Implementation language:** Rust
**Compilation target:** Interpreted (tree-walking)
**Licence:** MIT
**First seen:** April 2026
**Maturity:** working compiler
**Site:** https://github.com/KikotVit/pact-lang
**Repo:** https://github.com/KikotVit/pact-lang
**Agent tooling:**
- CLAUDE.md
- .mcp.json
- Built-in MCP server (pact mcp) with 5 tools
- LSP server (pact lsp)
- Built-in docs (pact docs)
### Key idea
Every function and route opens with an `intent` clause and a `needs` list declaring effects. Errors are part of the type signature (`-> User or NotFound`), data flows through left-to-right pipelines, and the runtime ships HTTP, SSE, SQLite, JWT, and an MCP server inside a single ~5MB binary. The bet is that surfacing intent, effects, and outcomes at the signature level lets agents skip the reverse-engineering pass.
## The thesis.
Pact's diagnosis is that most backend code is glue, and that the glue is exactly where agents waste iterations. Intent is hidden in comments that drift, effects are hidden in implementation bodies, and errors are hidden in exception hierarchies. The verification-camp move is to drag all three back into the signature: every function and every route opens with an `intent` string, declares a `needs` list of effects, and resolves to a sum type like `User or NotFound`. The compiler reads that as the contract; the type checker, formatter, LSP, and MCP server all consume the same declarations.
Every function says why it exists. Errors are data, not explosions.
The distinctive move is the breadth of what ships inside one binary. A `.pact` file declares `app Notes { port: 8080, db: "sqlite://notes.db" }` and `pact run` brings up an HTTP server with SSE streaming, SQLite in WAL mode, JWT auth, a structured logger, and a built-in MCP server — no dependencies, no ORM, tables auto-created from struct fields. This is close to Aver in design DNA (declared intent + declared effects + colocated checks), but where Aver lifts verify blocks into Lean 4 and Dafny, Pact spends its complexity budget on the runtime an agent will actually drive. ## What it looks like.intent "create a new user with default Viewer role"
fn create_user(data: NewUser) -> User or BadRequest
needs db, time, rng
{
let existing: User = find_by_email(data.email)
return BadRequest { message: "Email already taken" } if existing != nothing
let user: User = {
id: rng.uuid(),
email: data.email,
role: "Viewer",
active: true,
created_at: time.now(),
}
db.insert("users", user)
}
An intent line, an effect list, and a sum-typed return — all in the signature before the body.
Source code is covered by an anti-training licence.
Where Prove diverges from Vera and Aver is on the politics rather than the mechanics. Vera publishes a benchmark and invites models to compete. Aver exports proofs and ships an `llms.txt`. Prove ships an *anti-training* licence — the Prove Source License v1.0, applied automatically by `proof new` to every project — that prohibits use of `.prv` source code as training data, in dataset inclusion, vector stores, RAG indices, embedding databases, synthetic data generation, sublicensing for AI use, and downstream propagation. AILANG sits closest on effect typing (both ship typed effects), but Prove encodes effect category in the verb itself rather than in a row-polymorphic effect list. The project frames its own stance as "AI resistance" and states that generating semantically correct Prove code "requires genuine understanding, not pattern matching." ## What it looks like.matches apply_discount(discount Discount, amount Price) Price ensures result >= 0 ensures result <= amount requires amount >= 0 from FlatOff(off) => max(0, amount - off) PercentOff(rate) => amount * (1 - rate)
A pure verb (matches), hard postconditions, and a precondition — all enforced at compile time.
attached, detached, listens, renders backed by stackful coroutines), terminal UI, GUI via SDL2 + Nuklear, and the proof CLI wrapper.
- **Mar 2026** — v1.2.0 enforces verb semantic guarantees across 22 stdlib modules (~105 corrections); recursive variant types and Value<T> phantom types land.
- **Apr 2026** — v1.3.0 makes tree-sitter the sole parser, renames reads to derives, adds the dispatches verb, integrates linting into the check pipeline. v1.3.1 is a bugfix release.
*Detail page: https://agentlanguages.dev/languages/prove/ · Markdown companion: https://agentlanguages.dev/languages/prove.md*
## Raskell
> Not a new language. The thesis is that declarative Haskell already plays to LLM strengths and the barrier is toolchain friction. hx wraps cabal/stack/ghcup/HLS in Rust; BHC is a clean-slate Haskell 2026 compiler with per-profile runtimes.
**Camp:** Verification
**Author:** Raffael Schneider
**Implementation language:** Rust
**Compilation target:** GHC/Cabal/HLS (hx, wrapping); LLVM, WebAssembly, GPU (BHC, in development)
**Licence:** MIT (hx); BSD-3-Clause (BHC)
**First seen:** April 2026
**Maturity:** early implementation
**Site:** https://raskell.io
**Repo:** https://github.com/arcanist-sh/hx
### Key idea
Raskell is a position more than a project. Schneider argues, across a public trilogy on raskell.io, that types-as-proofs and pure-by-default already give LLMs the formal scaffolding they need; the reason Haskell loses to procedural languages in agent benchmarks is the surrounding toolchain — fragmented build tools, slow compiles, one-size-fits-all runtime. The work under arcanist.sh is the engineering response: hx (a Rust binary that wraps GHC, Cabal, GHCup, and HLS behind one interface) and BHC (an in-development Haskell 2026 compiler with multiple runtime profiles).
## The thesis.
Raskell's thesis is that the verification camp has the right diagnosis but the wrong locus of intervention. In Schneider's public writing, declarative languages with strong type systems already play to what LLMs are good at: generating expressions that satisfy formal constraints, rather than simulating execution across many mutable steps. Type-checked Haskell looks like a proof; the compiler is the proof checker; once the types align, large classes of error are eliminated by construction. The reason this is not the dominant agent-coding stack today, Schneider argues, is friction outside the language — three overlapping build tools, slow cold builds, a runtime that assumes one performance profile fits every use case.
The language was right. The surrounding infrastructure was not.
The distinctive move is the refusal to design a new language at all. Where AILANG, Vera, and Aver each ship a fresh syntax with effect typing built in, Raskell extends Haskell. The engineering lives under the arcanist-sh GitHub organisation: hx, a Rust binary that wraps GHC, Cabal, GHCup, and HLS behind one interface; and BHC, an in-development clean-slate compiler targeting the Haskell 2026 specification with profile-specific runtimes selected at compile time. The bet is that the typed substrate is already correct and the missing layer is operational, not linguistic. ## Distinctive moves. - **The position, stated.** A trilogy on raskell.io ("The Last Programming Language Might Not Be for Humans", "What Comes After the Last Programming Language", "Source Code Is the New Assembly") makes the case that the medium-term winner is declarative-plus-typed, not procedural-plus-checked. - **hx wraps before it replaces.** Cargo-workspace Rust binary built from ~14 crates (`hx-cli`, `hx-core`, `hx-toolchain`, `hx-cabal`, `hx-solver`, `hx-lsp`, `hx-plugins`, …); commands cover `build`, `run`, `test`, `lock`, `sync`, `watch`, `fmt`, `lint`, `docs`, `publish`; lockfile is `hx.lock`. The repo describes the strategy as wrap GHC/Cabal/GHCup/HLS first, replace last. - **BHC targets multiple runtime profiles.** The BHC repository README lists four profiles today — default, server, numeric, edge — selected at compile time. Schneider's essays and the arcanist-sh organisation profile describe a planned six (adding `realtime` and `embedded`); the catalogue treats the four shipped in the repo as ground truth and the additional two as stated intent. - **Conservative scope per release.** Both projects are pre-1.0. The arcanist-sh org profile advertises a 5.6× cold-build speedup over Cabal, but no methodology, benchmark suite, GHC/Cabal versions, or hardware are published anywhere reachable; treat the number as a stated marketing claim, not a verified measurement. - **No agent-specific surface yet.** No SKILL.md, AGENTS.md, MCP server, or `llms.txt` in either repo. The argument is that a well-typed Haskell program already gives an agent what it needs; tooling for agents is downstream. ## Maturity. Early. hx is MIT-licensed Rust, at v0.6.0 (Feb 2026), with 12 tagged releases, 129 commits, and 23 stars; it currently orchestrates GHC/Cabal/GHCup/HLS rather than replacing them. BHC is BSD-3-Clause, at v0.2.1 (Jan 2026), with 389 commits, 3 releases, 11 stars, and a single contributor. The roadmap in the BHC README marks the parser, type checker, Core IR, and one codegen path as substantially complete and WASM/GPU lowerings as in progress; no conformance suite or benchmark numbers ship in the repository today. The bet is on a multi-year arc, and the public surface reflects that — essays and infrastructure now, language-level claims later. ## Agent tooling. None shipped at present. The position Schneider defends is that the right intervention is upstream of agent-specific files: a faster, more coherent build, a compiler whose error messages and runtime profile match the deployment target, and a type system the agent can already use as a proof obligation. Whether that bet pays off depends on whether the medium-term arc Schneider describes — declarative-plus-typed beats procedural-plus-checked once the tooling friction is gone — actually materialises before agent-native languages with built-in MCP surfaces lock in a different shape. ### Design DNA - **AILANG** *(Verification)* — Closest design relative. Both bet on purely functional, effect-typed code as the right shape for agents to author. AILANG designed a new language; Raskell argues the language is already fine and rebuilds the tooling around Haskell. - **MoonBit** *(Verification)* — Industrial-backing foil. MoonBit pairs a sampler-level verification story with three years of training data and a Shenzhen-funded team; Raskell is a one-person Swiss effort betting that better tooling around an established language beats a new language with a new ecosystem. - **Vera** *(Verification)* — Schneider's essays cite Vera by name as the strongest example of the 'explicit language for machines' bet, then take the opposite bet: declarative types-as-proofs over imposed-contracts-plus-Z3. *Detail page: https://agentlanguages.dev/languages/raskell/ · Markdown companion: https://agentlanguages.dev/languages/raskell.md* ## Vera > Mandatory contracts on every function. Z3 SMT verification. Typed slot references replace variable names. LLM inference is a first-class typed effect. **Camp:** Verification **Also spans:** Orchestration **Author:** Alasdair Allan **Implementation language:** Python **Compilation target:** WebAssembly **Licence:** MIT **First seen:** February 2026 **Maturity:** working compiler **Site:** https://veralang.dev **Repo:** https://github.com/aallan/vera **vera-bench:** https://github.com/aallan/vera-bench **Agent tooling:** - SKILL.md - AGENTS.md - CLAUDE.md - LLM-oriented diagnostics - stable error codes (E001–E702) - JSON diagnostic output ### Key idea Mandatory requires/ensures/effects contracts on every function. Three-tier Z3 SMT verification. Typed De Bruijn slot references (@T.n) instead of variable names — the only language in the space that drops names. LLM inference is a first-class typed effect. The thesis: the model doesn't need to be right, it needs to be checkable. ## The thesis. Vera takes the verification camp's diagnosis literally. If LLMs make semantic errors faster than humans can catch them by reading code, the compiler has to do the catching. Every function declares preconditions and postconditions; the compiler discharges them via the Z3 SMT solver in a three-tier scheme (compile-time, runtime guard, runtime check) before any code runs.The model doesn't need to be right. It needs to be checkable.
The distinctive move is replacing variable names with typed slot references. A function `safe_divide(@Int, @Int -> @Int)` has no parameter names — its arguments are referred to as `@Int.0` (most recent) and `@Int.1` (next most recent) using De Bruijn indexing. The empirical literature shows models are particularly vulnerable to naming-related errors: choosing misleading names, reusing names incorrectly, losing track of which name refers to which value. Vera's answer is to remove names from the language entirely. ## What it looks like.public fn safe_divide(@Int, @Int -> @Int) requires(@Int.1 != 0) ensures(@Int.result == @Int.0 / @Int.1) effects(pure) { @Int.0 / @Int.1 }
Division by zero is not a runtime error — it is a type error. A caller that can't prove the denominator is non-zero won't compile. @Int.1 is the first parameter (next-most-recent binding); @Int.0 is the second (most-recent).
<Inference> as one effect, AILANG carves it into IO/FS/Net/Clock/AI.
- **Magpie** *(Syntactic)* — Cross-camp foil. Magpie strips ambiguity at the surface (SSA form); Vera adds a layer of mechanical checks. Different bets on where the error budget should be spent.
### Timeline
- **Feb 2026** — First public release (v0.0.4). Grammar, parser, and type checker. No verifier yet.
- **Mar 2026** — Z3 verifier lands. Three-tier verification scheme published. First externally-contributed example merges.
- **Apr 2026** — VeraBench published with 93% flagship correctness vs Python baseline on zero training data.
- **Apr 2026** — <Inference> effect added as first-class typed effect. Aver becomes first external language integrated into VeraBench.
- **May 2026** — v0.0.157 releases. 300+ stars, 76 conformance programs, 13-chapter spec.
*Detail page: https://agentlanguages.dev/languages/vera/ · Markdown companion: https://agentlanguages.dev/languages/vera.md*
## Zero
> Vercel Labs' agent-first systems language. Sub-10 KiB native binaries. Structured JSON diagnostics with stable codes and typed repair plans. One obvious path.
**Camp:** Verification
**Also spans:** Syntactic
**Author:** Chris Tate and Matt Van Horn / Vercel Labs
**Implementation language:** C (zero-c bootstrap); self-hosted compiler-zero in progress
**Compilation target:** Native binaries (direct ELF/Mach-O/PE emitters, no LLVM), WebAssembly
**Licence:** Apache-2.0
**First seen:** May 2026
**Maturity:** early implementation
**Site:** https://zerolang.ai
**Repo:** https://github.com/vercel-labs/zerolang
**Agent tooling:**
- structured JSON diagnostics
- stable error codes
- typed repair plans
- zero skills (version-matched agent guidance)
- zero explain
- zero fix --plan --json
- zero doctor
### Key idea
Zero is Vercel Labs' bet on agent-first systems programming. The compiler
emits structured JSON diagnostics with stable error codes (NAM003 means
"unknown identifier" and will keep meaning that), typed repair plans an
agent can apply without parsing prose, and version-matched guidance served
through the CLI itself rather than scraped from a docs site. The language
is intentionally explicit: capability objects on main, no hidden allocator,
no implicit async, one obvious path for most things.
## The thesis.
Zero is Vercel Labs' bet that the bottleneck in agentic coding is not the language but the channel between the compiler and the agent. The standard loop is fragile: the compiler emits prose written for human engineers, the agent parses it as text, the agent guesses at a fix, the next compile produces a new prose error in a slightly different format. Zero's answer is to replace the prose channel with a structured one. `zero check --json` emits a stable error code (`NAM003`), a human-readable message, a line number, and a typed `repair` object an agent can act on. `zero fix --plan --json` returns a machine-readable edit plan. `zero explain` returns structured explanations against the installed compiler version.
Humans read the message. Agents read the JSON.
The distinctive move sits at the language level, not the toolchain level: Zero collapses the syntactic and verification camps into a single product decision. The language documents itself as preferring "one obvious way to express most things, even when that makes code more explicit than a human might choose," which is syntactic-camp framing; but the obviousness is bought with capability objects on `main`, explicit `raises` markers, and effect-visible signatures, which is verification-camp machinery. Where MoonBit, the catalogue's other industrially backed verification entry, invests in semantic-aware token sampling, Zero invests in making the surface area small enough that an agent doesn't need help generating it in the first place. ## What it looks like.pub fun main(world: World) -> Void raises { check world.out.write("hello from zero\n") }
There is no hidden global process object. world: World is an explicit capability passed in by the runtime; raises declares the function can propagate errors; check handles a fallible operation. A function that doesn't ask for World cannot write to stdout.
` returns explanations; `zero fix --plan --json` returns edit plans; `zero skills get zero --full` returns version-matched workflows. `zero graph --json`, `zero size --json`, `zero routes --json`, and `zero doctor --json` round out the inspection surface. Vercel's separately released `skills.sh` ecosystem is consumable by Claude Code, Cursor, Codex, and other agent harnesses through the same Agent Skills spec that `zero skills` follows.
### Design DNA
- **MoonBit** *(Verification)* — Industrial backing parallel. Vercel Labs and IDEA Shenzhen are the two best-resourced bets in the catalogue; MoonBit invests in semantic-aware sampling, Zero invests in structured compiler output and version-matched skills.
- **NERD** *(Syntactic)* — Cross-camp foil. Both lean on a small keyword vocabulary and 'one obvious way' framing; NERD does it for syntactic legibility, Zero does it inside a verification project with capability-typed effects and a typed repair API.
- **Boruna** *(Orchestration)* — Structured-diagnostics parallel. Zero ships JSON diagnostics with typed repair IDs at the language level; Boruna ships hash-chained evidence bundles at the runtime level. Both reject prose as an interface for agents.
*Detail page: https://agentlanguages.dev/languages/zero/ · Markdown companion: https://agentlanguages.dev/languages/zero.md*
---
# Orchestration camp (5)
> It isn't a language problem. It's an agent-coordination problem. The orchestration camp re-frames the question — the trouble with LLM-authored code, they argue, isn't any specific defect in the code; it's that agents need to be sequenced, sandboxed, audited, and approved by humans at the right points. The language is just the substrate; the runtime is where the action is.
## Boruna
> Deterministic, capability-safe workflow execution. Every effect declared, policy-gated. Hash-chained tamper-evident evidence bundles.
**Camp:** Orchestration
**Also spans:** Verification
**Author:** escapeboy
**Implementation language:** Rust
**Compilation target:** Bytecode (custom VM)
**Licence:** MIT
**First seen:** April 2026
**Maturity:** working compiler
**Site:** https://github.com/escapeboy/boruna
**Repo:** https://github.com/escapeboy/boruna
**Agent tooling:**
- MCP server with 10 tools
- AGENTS.md
- CLAUDE.md
- diagnostics and auto-repair commands
### Key idea
Deterministic, capability-safe workflow execution for auditable AI
systems. DAG workflows where steps are `.ax` source files. Every side
effect — LLM calls, HTTP, database, filesystem — is declared and
policy-gated at the VM level. Hash-chained tamper-evident evidence
bundles. Deterministic replay. Approval gates for human-in-the-loop.
The pitch: when a regulator asks what exactly ran and what the model
returned, you can prove it.
## The thesis.
Boruna doesn't think the problem with LLM code is the code. It thinks the problem is that when an agent system does something consequential — sends an email, transfers money, modifies a database — you need to be able to prove what ran, what the model said, and who approved it. That's not a language problem. That's a runtime problem. So Boruna builds the runtime.
When a regulator asks what exactly ran, you can prove it.
The unit of computation is a DAG workflow. Each step is an `.ax` source file. Every side effect — LLM call, HTTP request, database write, filesystem mutation — is declared in the source and policy-gated at the VM level. The VM refuses to execute a step that would perform an undeclared effect; the policy layer lets administrators forbid specific declared effects per workflow or per role. Every executed step writes to a hash-chained evidence bundle that's tamper-evident; the bundle is sufficient to replay the workflow deterministically (same inputs, same model responses recorded, same outputs).
## Distinctive moves.
- **Capability-safe by construction.** A step can't reach for an effect it didn't declare. The VM is the enforcement point, not a linter.
- **Hash-chained evidence bundles.** Every step's inputs, outputs, model responses, and approvals chain into a Merkle structure. Tampering breaks the chain.
- **Deterministic replay.** Re-running a workflow against its evidence bundle produces bit-identical results. No "it worked on my machine" for LLM-driven workflows.
- **Approval gates.** Human-in-the-loop steps are a first-class workflow primitive, not bolted on. The approval becomes part of the evidence.
- **MCP server with 10 tools.** Boruna's agent-facing surface lets a coding agent author, validate, and run workflows without leaving the protocol.
## Maturity.
v0.2.0 with 34 commits and 1 release. 557+ tests passing across a 9-crate Rust workspace covering the compiler (lexer, parser, type checker, code generator), the bytecode VM, the orchestrator, and the MCP server. Single-author project; zero stars at the time of cataloguing, which dramatically understates the engineering depth here. The architecture is more carefully thought through than several entries with two orders of magnitude more attention.
The bet is that the regulated-industries angle (financial services, healthcare, government) will discover Boruna before the broader market does. The agent-system gold rush will eventually hit regulators, and when it does, "I can prove what ran" stops being a feature and starts being a requirement.
## Agent tooling.
`AGENTS.md` and `CLAUDE.md` orient agents working in the repository. The MCP server exposes ten tools an agent can call to draft workflows, run them in dry-run mode, validate effect declarations against policy, inspect evidence bundles, and trigger approvals. Diagnostics ship with auto-repair commands — when the type checker rejects a workflow, the diagnostic suggests the specific edit that would satisfy it.
### Design DNA
- **Vera** *(Verification)* — Cross-camp cousin. Both treat agent code as untrusted by default; Vera builds the trust at the type level (contracts), Boruna builds it at the runtime level (policy-gated effects + evidence bundles). Vera's <Inference> effect is conceptually close to Boruna's declared LLM call.
- **Pel** *(Orchestration)* — Same camp, different stack. Pel argues for grammar-level capability control; Boruna implements bytecode-level capability gating. Pel exists as an academic paper; Boruna ships as a 9-crate Rust workspace.
- **Quasar** *(Orchestration)* — Shares the approval-gate intuition. Quasar measured 52% fewer user-approval interactions by lifting approval into the language; Boruna lifts it into the runtime with deterministic replay so the approval can be audited after the fact.
- **Plumbing** *(Adjacent)* — Plumbing defines the wiring between agents at the type level (typed channels, structural morphisms); Boruna defines what runs inside one agent and how it's audited. Complementary rather than competing.
### Timeline
- **Apr 2026** — v0.2.0 published. 9-crate Rust workspace: compiler (lexer, parser, type checker, code generator), bytecode VM, orchestrator, MCP server.
- **Apr 2026** — 557+ tests passing. Hash-chained evidence bundle format stabilises.
- **May 2026** — Catalogued. Still 0 stars; the engineering depth runs ahead of the public profile.
*Detail page: https://agentlanguages.dev/languages/boruna/ · Markdown companion: https://agentlanguages.dev/languages/boruna.md*
## Lumen
> Markdown-native source (.lm.md). Algebraic effects, grants for tool and model calls, @deterministic compile-time enforcement, and pipeline / machine / memory process kinds. A language for humans authoring agent workflows.
**Camp:** Orchestration
**Author:** alliecatowo
**Implementation language:** Rust
**Compilation target:** LIR bytecode → register-based VM (~100 opcodes); WebAssembly via lumen-wasm
**Licence:** MIT
**First seen:** February 2026
**Maturity:** working compiler
**Site:** https://alliecatowo.github.io/lumen/
**Repo:** https://github.com/alliecatowo/lumen
**Agent tooling:**
- AGENTS.md (multi-agent dev team config)
- CLAUDE.md
- .opencode/agents/
- LSP server (lumen-lsp) with semantic search
- VS Code extension
- Tree-sitter grammar
- MCP provider crate (lumen-provider-mcp)
- emit-bytecode-as-JSON CLI (`lumen emit`)
### Key idea
Lumen is for humans authoring agent workflows, not for agents to
author general code — it earns its place in the catalogue via the
orchestration-camp criterion of first-class effect declarations for
model calls and agent-coordination primitives. Algebraic effects
appear in function signatures after a slash; grants constrain every
call to a tool with explicit caps; @deterministic enforces rejection
of nondeterministic ops at compile time; pipeline / machine / memory
are first-class process kinds. Source is markdown-native — .lm.md
files unify code and documentation in one artefact.
## The thesis.
Lumen is a language for humans, but its target is the substrate above the model rather than the model itself. The catalogue's nominal inclusion bar is "designed for LLMs to author code"; Lumen earns its place via the orchestration-camp criterion of first-class effect declarations for model calls and agent-coordination primitives. The vocabulary is the giveaway: `cell` (function), `effect`, `grant`, `agent`, `pipeline`, `machine`, `memory` are all language keywords. Functions declare effects in the return type after a slash (`cell main() -> String / {Log}`); tools must be granted with explicit caps (`grant Chat max_tokens 1024 temperature 0.7`); the runtime can be locked into `@deterministic true` mode that rejects nondeterministic operations at compile time, not at runtime.
Build deterministic agent workflows with static types, first-class AI primitives, and markdown-native source files.
The distinctive move is making the source file the same artefact as the documentation. Three source extensions ship: `.lm.md` (markdown with fenced Lumen blocks), `.lm` (raw source), and `.lumen` (markdown-native). The compiler's first pipeline stage is markdown extraction — code and prose share one file, and the model writing one writes the other. Where Boruna does deterministic-workflow enforcement at the bytecode VM via policy-gated effects and hash-chained evidence, Lumen does it at the type system via algebraic effects, grants, and the `@deterministic` annotation. Same orchestration-camp diagnosis, different layer.
## What it looks like.
effect Log
cell info(msg: String) -> Unit
end
cell main() -> String / {Log}
perform Log.info("Starting")
return "Done"
end
handle main() with Log.info(msg) -> resume(unit)
print("LOG: {msg}")
end
The / {Log} in the return type declares the effect. perform invokes it; handle ... with ... discharges it. One-shot delimited continuations under the hood. cell is the function keyword — Lumen does not use fn.
## Distinctive moves.
- **Markdown-native source.** `.lm.md` files contain markdown prose with fenced `lumen` blocks. The compiler extracts code as its first pipeline stage. Documentation and implementation are one artefact.
- **`cell` is the function keyword.** Not `fn`. Cells take typed parameters and declare effects in the return type after a slash.
- **Algebraic effects, first-class.** `effect Log` declarations, `perform` to invoke, `handle ... with ...` to discharge. The runtime uses one-shot delimited continuations; opcodes `Perform`, `HandlePush`, `HandlePop`, and `Resume` are first-class in the VM.
- **Grants as syntactic policy.** `grant Chat max_tokens 1024 temperature 0.7` constrains every call to that tool. Policy lives in source, not configuration — Boruna's effect declarations lifted to the language surface.
- **`@deterministic true` mode.** A compile-time annotation that rejects `uuid()`, `timestamp()`, and other nondeterministic operations. The static analogue of Boruna's runtime deterministic replay.
- **Three process kinds.** `pipeline` for auto-chained stages (extract → transform → load), `machine` for state graphs, `memory` for key-value stores. Each is a first-class language construct rather than a library pattern.
- **MCP as a provider crate.** `lumen-provider-mcp` ships alongside `lumen-provider-http`, `lumen-provider-json`, `lumen-provider-fs`, `lumen-provider-gemini`, `lumen-provider-crypto`, `lumen-provider-env`. MCP is one tool source among several, not the privileged one.
## Maturity.
v0.1.10 (February 2026), 352 commits, ~5,300 passing tests across all crates (AGENTS.md figure; the README's 1,365+ is at a different cut). MIT-licensed, written in Rust (96.5% of the source), compiles to LIR bytecode for a register-based VM with ~100 opcodes, 32-bit fixed-width encoding, and COW collections via `Rc::make_mut`. The workspace contains 12+ crates covering compiler, VM, runtime, CLI, LSP, JIT codegen, WebAssembly bindings, tensor operations, and provider integrations. Single-author at the human level (`alliecatowo`); AGENTS.md notes that "only the Delegator agent commits code" — the contributors listing reflects agent runs of the project's own multi-agent dev team.
## Agent tooling.
The agent-facing surface is unusually elaborate. `AGENTS.md` declares a multi-agent development team — Delegator (Gemini 3 Pro), Auditor, Debugger (Claude Opus 4.6), Coder (Claude Sonnet 4.5), Worker (Claude Haiku 4.5), Tester, Task Manager, Performance, Planner — each with a defined role and only the Delegator authorised to commit. `CLAUDE.md` and `.opencode/agents/` provide further orientation. The LSP includes semantic search; the VS Code extension covers `.lm.md` files; a tree-sitter grammar ships at `tree-sitter-lumen/`. The CLI's `lumen emit` mode outputs bytecode as JSON for downstream agent consumption.
### Design DNA
- **Boruna** *(Orchestration)* — Closest design relative. Both target deterministic AI workflows. Boruna enforces at the bytecode VM via policy-gated effects and hash-chained evidence bundles; Lumen enforces at the type system via algebraic effects, grants, and a compile-time @deterministic annotation. Both ship MCP integration. Lumen targets humans authoring workflows; Boruna targets auditable execution.
- **Plumbing** *(Adjacent)* — Plumbing wires agents (typed channels, copy-discard symmetric monoidal category); Lumen is what runs inside a single node of that wiring. Complementary substrates rather than competitors.
- **AILANG** *(Verification)* — Cross-camp on effect systems. AILANG's row-polymorphic Hindley-Milner with capability categories (IO/FS/Net/Clock/AI) is the verification cousin of Lumen's effect-row syntax. Different mechanism, same diagnosis: model calls must be visible in the signature.
*Detail page: https://agentlanguages.dev/languages/lumen/ · Markdown companion: https://agentlanguages.dev/languages/lumen.md*
## Marsha
> Functional, English-based language whose .mrsh files (declaration, description, examples) are compiled to tested Python by an LLM. Alpha implementation; last maintainer activity early Aug 2023.
**Camp:** Orchestration
**Author:** David Ellis (Alan Technologies)
**Implementation language:** Python
**Compilation target:** Python (generated by an LLM, with auto-generated tests)
**Licence:** MIT
**First seen:** July 2023
**Maturity:** early implementation
**Site:** https://github.com/alantech/marsha
**Repo:** https://github.com/alantech/marsha
### Key idea
Marsha's framing is that the LLM is the compiler. A .mrsh source file is a
Markdown-shaped specification with three sections per function: a typed
declaration, an English description, and a list of input/output examples.
The Marsha toolchain prompts an LLM to produce Python that satisfies the
declaration, uses the examples to synthesise a test suite, and iterates with
corrective feedback until the tests pass or the attempt budget is exhausted.
## The thesis.
Marsha's framing predates almost every other entry in the catalogue: the LLM is the compiler. A `.mrsh` source file is a Markdown-shaped specification with three sections per function — a typed declaration (`# func name(InputType): OutputType`), an English description of behaviour, and a bullet list of input/output examples including expected error cases. The Marsha toolchain prompts an LLM to generate Python that satisfies the declaration, uses the examples to synthesise a pytest suite, runs the suite, and iterates with corrective feedback until the tests pass or the configured attempt budget is exhausted. CLI flags (`-a` attempts, `-n` parallel "thought" threads, `-q` quick-and-dirty, `--exclude-main-helper`) expose the iteration parameters directly to the user. Generated programs ship with an auto-attached CLI wrapper and an optional REST-server mode (`-s`).
## Published results.
The repository ships an alpha implementation, an examples directory (general-purpose, web-scraping, data-mangling), and a CI workflow that times compilations. The README's only quantitative target is its own roadmap: "We aim for 80%+ accuracy on our examples", with the roadmap looking to push that "above 90%". The compiler requires `OPENAI_ORG` and `OPENAI_SECRET_KEY`; support for other or local LLMs is listed as planned but unimplemented. The `setup.py` PyPI classifier is `Development Status :: 2 - Pre-Alpha`.
## Status.
The repository is MIT-licensed, was launched alongside a Show HN post on 1 August 2023 (news.ycombinator.com item 36864021), and reached the high-hundreds in stars with around a dozen forks. The last maintainer activity on the main branch — pull requests #159–#164 from `dfellis` and issue #165 from `depombo` — is dated 1–8 August 2023 (PR #164 "Embed Llama.cpp into Marsha for local usage" opened 7 Aug 2023; issue #165 "Add LlamaCPP support" opened 8 Aug 2023); the project has received no further maintainer pull requests or issues since. Both principals subsequently moved on (David Ellis to IaSQL and later the Alan-lang project; Luis de Pombo's LinkedIn lists continued tenure at Alan Technologies). Marsha is catalogued because the "LLM is the compiler" framing it shipped in 2023 anticipates the 2025 orchestration papers in this camp, not because the alpha implementation is under active development.
### Design DNA
- **Pel** *(Orchestration)* — Same camp, two years apart. Marsha (2023) treats the LLM as a compiler emitting Python under English+examples; Pel (2025) treats the LLM as a code emitter constrained by a grammar designed for it. Both predate the now-common 'agents write code' framing.
- **Boruna** *(Orchestration)* — Opposite stance on where the LLM sits in the stack. Marsha puts the LLM at the back end of a compiler. Boruna treats every LLM call as a policy-gated effect inside a deterministic VM. Same camp, inverted topology.
- **Quasar** *(Orchestration)* — Different generation: Quasar measures execution-time and approval-interaction reductions on ViperGPT/CaMeL; Marsha measures end-to-end compile success rate against its own examples ('we aim for 80%+ accuracy').
*Detail page: https://agentlanguages.dev/languages/marsha/ · Markdown companion: https://agentlanguages.dev/languages/marsha.md*
## Pel
> Lisp-flavoured language for orchestrating LLM agents, with capability control enforced at the grammar level and a REPeL self-healing loop modelled on Common Lisp restarts.
**Camp:** Orchestration
**Author:** Behnam Mohammadi (CMU)
**Implementation language:** N/A (paper-only)
**Compilation target:** N/A (paper-only)
**Licence:** N/A (academic paper)
**First seen:** April 2025
**Maturity:** research paper
**Site:** https://arxiv.org/abs/2505.13453
**Paper:** https://arxiv.org/abs/2505.13453
### Key idea
Pel argues that orchestrating LLM agents should not rely on Python plus a sandbox.
Instead, the grammar itself is the capability surface: an LLM emits Pel under
constrained generation, and an action the grammar cannot express is an action
the agent cannot take. The runtime adds piping, first-class closures, natural
language conditions evaluated by an LLM, automatic parallelisation via static
dependency analysis, and a REPeL (Read-Eval-Print-Loop) with Common Lisp-style
restarts and helper agents for automated error correction.
## The thesis.
Pel's diagnosis is that function/tool calling and free-form Python code generation each fail the orchestration problem from opposite directions: tool calling cannot express control flow, and Python is too expressive to safely run without a sandbox. The paper introduces Pel as a Lisp-inspired, homoiconic, minimal-grammar language whose syntactic surface is the capability surface. Because the grammar is small enough to be used as a constrained-decoding target, an LLM cannot emit an action the grammar cannot express; capability control becomes a property of generation, not a property of runtime sandboxing. The design takes additional cues from Elixir (piping for linear composition), Gleam (typing discipline), and Haskell (first-class closures and partial application). A REPeL — Read-Eval-Print-Loop with Common Lisp-style restarts — couples an evaluator to LLM-powered helper agents that propose restart choices when an error is signalled, so error recovery is a language feature rather than an application concern.
## Published results.
The paper is a design and rationale document rather than a benchmark study. It specifies the grammar, data types, closure semantics, piping operators, list operations, control flow, the natural-language condition form, and automatic asynchronicity via static dependency analysis. Pel is the implementation substrate for BEACON (Business Enhancement through Adaptive COordinated Networks), Mohammadi's separate SSRN paper (abstract 5191583), which describes a hierarchical multi-agent framework distributing specialised knowledge across marketing, finance, HR, and strategic-planning agents for small and family-owned businesses; BEACON reports advantages over single-model generative AI on information retrieval accuracy, cost-efficiency, and interpretability, with Pel cited as the orchestration substrate.
## Status.
Pel exists as an arXiv preprint (v1 3 Apr 2025; v2 9 Jun 2025) by a single author who completed a PhD at CMU Tepper in 2025 (thesis "Human-AI Interaction in the Era of Large Language Models (LLMs)" posted to KiltHub on 9 Jul 2025) and joined UT Dallas's Naveen Jindal School of Management as tenure-track faculty in Quantitative Marketing. The paper reports that Pel is used inside BEACON, which is supported by a BNY Foundation of Southwestern Pennsylvania fellowship via the Center for Intelligent Business at Tepper. No public implementation, package, or repository has been released; independent evaluation would require either a reference compiler or access to the BEACON codebase.
### Design DNA
- **Boruna** *(Orchestration)* — Same camp, different layer. Pel argues for grammar-level capability control; Boruna gates the same effects at the bytecode VM. Pel is a paper; Boruna ships a 9-crate Rust workspace.
- **Quasar** *(Orchestration)* — The other 2025 academic orchestration paper. Quasar transpiles a Python subset and instruments it with conformal prediction and approval gates; Pel replaces the surface language entirely and constrains generation against its grammar.
- **Marsha** *(Orchestration)* — Two-year-earlier predecessor on the same axis. Marsha treats the LLM as a compiler back-end producing Python; Pel treats the LLM as a code emitter constrained by a grammar designed for it.
*Detail page: https://agentlanguages.dev/languages/pel/ · Markdown companion: https://agentlanguages.dev/languages/pel.md*
## Quasar
> Penn group's LLM-agent language with automatic parallelisation, conformal-prediction reliability bounds, and approval-gated security; LLMs write a Python subset that transpiles to Quasar.
**Camp:** Orchestration
**Also spans:** Verification
**Author:** Stephen Mell et al. (Penn)
**Implementation language:** N/A (paper-only)
**Compilation target:** N/A (paper-only)
**Licence:** N/A (academic paper)
**First seen:** June 2025
**Maturity:** research paper
**Site:** https://arxiv.org/abs/2506.12202
**Paper:** https://arxiv.org/abs/2506.12202
### Key idea
Quasar (a backronym for QUick And Secure And Reliable) accepts code actions
in a Python subset that is transpiled to a custom language with three built-in
properties: automatic parallelisation of independent external calls, compositional
conformal prediction for uncertainty quantification, and explicit user-approval
gates around sensitive tool invocations. The bet is that the LLM keeps writing
the Python it knows while the runtime supplies the guarantees Python lacks.
## The thesis.
Quasar starts from the observation that LLM agents increasingly act by writing code, and that Python is the default not because it is well suited but because LLMs are fluent in it. The paper enumerates Python's weaknesses for this role — limited built-in support for performance, security, and reliability — and proposes a purpose-built language that addresses all three at once. Performance comes from automatic parallelisation of independent external calls, drawing on Mell, Kallas, Zdancewic and Bastani's "Opportunistically Parallel Lambda Calculus" (arXiv:2405.11361, published as Proc. ACM Program. Lang. 9, OOPSLA2, October 2025). Reliability comes from compositional conformal prediction (Ramalingam, Park and Bastani, "Uncertainty Quantification for Neurosymbolic Programs via Compositional Conformal Prediction", arXiv:2405.15912), which converts model outputs into prediction sets with a user-chosen target error rate. Security comes from user-validated action gates that surface only when the static analysis cannot rule out a sensitive effect. To avoid asking LLMs to learn a new language, the model writes a constrained subset of Python that is transpiled to Quasar.
## Published results.
The arXiv v1 (13 Jun 2025) reports an evaluation on the ViperGPT visual question answering agent over the GQA dataset, where LLMs emitting Quasar code instead of Python retain task performance while reducing execution time when possible by 42% and reducing user-approval interactions when possible by 52%, with conformal prediction achieving a chosen target coverage. The OpenReview revision (id TvpaeQVTGQ) extends the evaluation to the CaMeL agent on the AgentDojo prompt-injection benchmark and revises the headline numbers upward to "up to 56%" execution-time reduction and "up to 53%" fewer user approvals.
## Status.
No public implementation, repository, or release has been published; the OpenReview submission is under review at the time of cataloguing, and conference acceptance has not been announced. Independent evaluation would require either the transpiler and runtime from the authors or a reimplementation against the published semantics; the ViperGPT/GQA and CaMeL/AgentDojo baselines are public and reproducible. Full author list: Stephen Mell, Botong Zhang, David Mell, Shuo Li, Ramya Ramalingam, Nathan Yu, Steve Zdancewic, Osbert Bastani.
### Design DNA
- **Boruna** *(Orchestration)* — Both lift approval gates into a first-class language primitive. Quasar reports a 52% reduction in user-approval interactions by inferring when approval is unnecessary; Boruna routes the same primitive through a deterministic VM that chains every approval into a tamper-evident evidence bundle.
- **Pel** *(Orchestration)* — The other 2025 academic orchestration paper. Pel replaces the surface language with a Lisp-shaped grammar designed for constrained generation; Quasar keeps a Python subset and inserts the guarantees underneath.
- **Vera** *(Verification)* — Cross-camp foil on what 'make it checkable' means. Vera discharges contracts via Z3 at compile time; Quasar layers conformal prediction over LLM outputs to get a target coverage probability at runtime.
*Detail page: https://agentlanguages.dev/languages/quasar/ · Markdown companion: https://agentlanguages.dev/languages/quasar.md*
---
# Adjacent (1)
> Infrastructure that operates around agent-authored code rather than being authored by agents itself. These are wiring layers, runtime substrates, and tooling that the three-camp argument depends on but doesn't directly produce.
## Plumbing
> A typed language for the wiring between agents. Symmetric monoidal category, typed channels, structural morphisms, agents as stateful morphisms with control ports.
**Camp:** Adjacent
**Author:** William Waites / Leith Document Company
**Implementation language:** Not publicly disclosed
**Compilation target:** Native binaries (Linux x86_64, macOS Apple Silicon)
**Licence:** Free for educational/personal use; commercial licence on request
**First seen:** March 2026
**Maturity:** early implementation
**Site:** https://johncarlosbaez.wordpress.com/2026/03/11/a-typed-language-for-agent-coordination/
**Paper:** https://arxiv.org/abs/2602.13275
**Agent tooling:**
- MCP server
### Key idea
Plumbing is a typed language for the wiring between agents — the substrate
that orchestration-camp languages run on top of. Objects are typed channels
carrying infinite streams. Morphisms are processes: four structural
(copy, discard, merge, barrier) and two utility (map, filter), composed
sequentially or in parallel via the tensor product. Agents are stateful
morphisms with main, control, tool, operator-in-the-loop, and telemetry
ports. Type-checking happens before the graph runs.
## The thesis.
Plumbing is the catalogue's piece of infrastructure. It is not an orchestration language in the sense Boruna, Pel, or Marsha are — it is the typed substrate that orchestration languages can be expressed *on top of*. Where existing agent frameworks (n8n, LangGraph, CrewAI) coordinate agents with ad hoc engineering, Plumbing coordinates them with morphisms in a copy-discard symmetric monoidal category. Objects are typed channels: `!A` is a stream of `A`s, `!string` a stream of strings. Morphisms are processes with typed inputs and outputs. Four structural morphisms (copy, discard, merge, barrier) plus two utilities (map, filter) compose sequentially and via the tensor product. The compiler statically checks that every wiring is well-formed before any agent runs.
Static typing prevents the waste.
The distinctive move is to refuse the orchestration camp's normal framing. Where Boruna treats the unit of computation as an `.ax` workflow with declared effects, and Pel treats it as a grammar-level capability, Plumbing treats the unit as a channel between two processes, with the agent itself reduced to a stateful morphism with a typed protocol — main input, main output, plus control ports for runtime parameter modulation (e.g. temperature), tool-call ports, operator-in-the-loop ports, and telemetry. A judge agent that wants to cool down a debate sends a `set_temp` message on the debaters' control ports; the wiring is type-checked the same as the data path.
## What it looks like.
type Verdict = { verdict: bool, commentary: string, draft: string }
type Review = { score: int, review: string, draft: string }
let composer : !string -> !string = agent { ... }
let checker : !string -> !Verdict = agent { ... }
let critic : !Verdict -> !Review = agent { ... }
let main : !string -> !string = plumb(input, output) {
input ; composer ; checker
checker ; filter(verdict = false)
; map({verdict, commentary}) ; composer
checker ; filter(verdict = true) ; critic
critic ; filter(score < 85)
; map({score, review}) ; composer
critic ; filter(score >= 85).draft ; output
}
An adversarial cover-letter composer with two feedback loops. The critic cannot see source materials — the information partition is a type-level consequence of the wiring, not a prompt instruction.
## Distinctive moves.
- **Typed channels, not typed messages.** Objects in the category are streams. `!A` is a stream of `A`s; sequential composition glues stream-producing morphisms; the tensor product runs them in parallel. Well-formedness is a category-theoretic property, checked at compile time.
- **Four structural morphisms.** Copy duplicates a stream, discard throws it away, merge interleaves two streams of the same type (after coproduct injection), barrier synchronises two streams into a pair. Barrier is the synchronisation primitive that unlocks session types.
- **Agents as stateful morphisms with control ports.** An agent has main input/output plus typed control, tool, operator-in-the-loop, and telemetry ports. Runtime parameter changes (e.g. temperature) flow through the same typed pipework as data.
- **The κ-calculus "don't care, don't write" convention.** Any output port not mentioned in the program is implicitly connected to discard. The textual surface stays small while the type system still tracks every port.
- **MCP server in the release.** The first public drop ships a compiler, an interpreter, and an MCP server — agent harnesses are first-class consumers of the language, not an afterthought.
## Maturity.
Version 0p1, first public release March 2026, binary downloads for Linux x86_64 and macOS Apple Silicon. The release ships a compiler, interpreter, and MCP server. There is no public Git repository; the licence is free for educational and personal use, with a separate commercial licence available from Leith Document Company. The author is William Waites, a Chancellor's Fellow at the University of Strathclyde; the broader programme of work is described in his arXiv paper *Artificial organisations* (arXiv:2602.13275). A second blog post, "The agent that doesn't know itself," extends the calculus with session types and context compaction.
The bet is that orchestration languages eventually need a category-theoretic substrate underneath them, and that the substrate is more valuable as a typed coordination layer than as another orchestration framework competing for workflow attention.
## Agent tooling.
The release includes an MCP server alongside the compiler and interpreter, which is the entire agent-facing surface — there is no `SKILL.md`, `AGENTS.md`, or `CLAUDE.md` in this drop, because Plumbing's framing is that the language *is* the agent tooling for everything above it. Agent harnesses consume Plumbing through MCP; what agents author *in* Plumbing is the wiring diagram for other agents.
### Design DNA
- **Boruna** *(Orchestration)* — Plumbing defines the wiring between agents; Boruna defines what runs inside one agent and how it is audited. Plumbing is substrate, Boruna is workload.
- **Pel** *(Orchestration)* — Same orchestration adjacency, different formalism. Pel is a grammar-level capability calculus on an academic paper; Plumbing is a copy-discard category with session types, with a working compiler and runtime.
*Detail page: https://agentlanguages.dev/languages/plumbing/ · Markdown companion: https://agentlanguages.dev/languages/plumbing.md*
---
# Unclassified (3)
> Candidates that haven't shipped enough machinery — or enough public evidence — to classify yet. Their presence in the catalogue is a marker of position rather than a placement claim.
## Koru
> Zig-superset systems language with event continuations, mandatory branch handling, phantom typing, and purity tracking. Pre-alpha — 'only ever been compiled on a single computer.' AI-First framing intentionally tongue-in-cheek.
**Camp:** Unclassified
**Author:** Author anonymous (korulang)
**Implementation language:** Koru (metacircular, bootstrapped through Zig)
**Compilation target:** Zig (then native via Zig's backends)
**Licence:** Unknown
**First seen:** December 2025
**Maturity:** early implementation
**Site:** https://www.korulang.org/
**Repo:** https://github.com/korulang/koru
### Key idea
Koru is a Zig-superset systems language. Every .kz file is valid Zig;
Koru constructs are marked by a ~ leader. The distinctive design move
is event continuations with mandatory branch handling — events
declare their inputs and possible output branches in advance, and
invoking an event requires explicitly handling each branch. Phantom
typing tracks compile-time resources; purity is tracked; the compiler
is metacircular (Koru compiles to Zig). The "AI-First" claim is
architectural (event boundaries aid AI reasoning) rather than
machinery-based — no SKILL.md, AGENTS.md, MCP server, or structured-
JSON diagnostics ship. The compiler itself was authored using
Claude Opus 4.1–4.5 and Sonnet 4.5.
## What it is.
Koru is a pre-alpha Zig-superset systems programming language. Every `.kz` file is valid Zig; Koru constructs are marked by a `~` leader. The distinctive design move is event continuations with mandatory branch handling — events declare their inputs and possible output branches in advance, and invoking an event requires explicitly handling each branch. Phantom typing tracks compile-time resources; purity is tracked through the type system; the compiler is metacircular (Koru compiles to Zig). The author is anonymous behind the `korulang` GitHub org and Twitter account; the project's site lists Claude Opus 4.1–4.5 and Sonnet 4.5 as the models that authored the compiler itself.
The site's tagline is intentionally tongue-in-cheek: "The Hyper-Performant AI-First Postmodern Zero-Cost Fractal Metacircular Phantom-Typed Auto-Disposing Monadic Event Continuation Language with Semantic Space Lifting and Event Taps." Underneath the buzzword cascade, the README is candid about state: *"Pre-Alpha — Koru is pre-alpha. It has only ever been compiled on a single computer. Use and testing at your own risk."* No `SKILL.md`, `AGENTS.md`, MCP server, or structured-JSON diagnostics ship. The closest planned agent-facing surface is the Compiler Control Protocol (CCP), described as "soon" — a Koru-internal proposal, not the Model Context Protocol.
## Why it's here.
The catalogue includes Koru as a marker of the position where "AI-First" became a tagline trope. The architectural claim (event boundaries provide bounded contexts that aid AI reasoning) is a real design move; the buzzword-cascade tagline is real satire; and the catalogue's unclassified bucket exists for projects whose camp placement isn't yet evidenced by shipped agent-authoring machinery. Companion to Valea and Spec: a real project with substantive design, candidly aware of its pre-alpha state, where the "AI-First" claim is a language-architecture argument rather than a tooling commitment. The pre-alpha disclosure quoted above is the editorial centre of gravity — the rest of the entry exists to give it context.
### Design DNA
- **Valea** *(Unclassified)* — Companion in the unclassified bucket. Both stake an 'AI-native systems programming language' position with substantive design proposals but limited public evidence and no agent-authoring machinery shipped. Valea is a Rust MVP announced on Hacker News with JSON diagnostics planned; Koru is a Zig-superset metacircular compiler with event continuations and an explicitly tongue-in-cheek marketing voice.
- **Spec** *(Unclassified)* — Adjacent unclassified entry on the 'architecture as AI-friendliness' axis. Spec proposes a two-domain IR for multi-agent collaboration; Koru proposes architectural primitives (event boundaries, mandatory branch handling) that aid AI reasoning at the language level. Both make architectural claims about AI-friendliness without shipping agent-authoring tooling.
*Detail page: https://agentlanguages.dev/languages/koru/ · Markdown companion: https://agentlanguages.dev/languages/koru.md*
## Spec
> v0.2 design proposal for a language-agnostic IR for agent-driven development. Six specialised agents (Product, Architect, Scrum, Developer, Tester, DevOps) collaborate on shared .spec.ir artefacts; language-specific code generation is downstream.
**Camp:** Unclassified
**Also spans:** Orchestration
**Author:** M. Abdullah Onus
**Implementation language:** TypeScript (React POC)
**Compilation target:** Not applicable — IR artefacts (.spec.ir files), not executable
**Licence:** MIT
**First seen:** April 2026
**Maturity:** thought experiment
**Site:** https://github.com/mronus/spec
**Repo:** https://github.com/mronus/spec
**Agent tooling:**
- Browser-based React/TypeScript POC at mronus.github.io/spec orchestrating six specialised agents end-to-end
- Multi-agent pipeline with feedback loops, state persistence, and resume support
- Support for Claude and GPT models; API keys remain in the browser
### Key idea
Spec separates concerns into two domains. The Spec Domain is language-agnostic: six specialised agents (Product, Architect, Scrum, Developer, Tester, DevOps) collaborate to produce a set of .spec.ir artefacts — contract, module, infrastructure, data, types, interfaces, functions, events, tests, pipeline — that describe what the system should do. The External Agents Domain is language-specific: separate language agents (Java, Go, Terraform, etc.) consume the IR and produce code. The bet is that this separation lets one specification target multiple languages and enables incremental modification with the proposal's claimed 200 tokens of context instead of the 1,500 a comparable Java change requires.
## What it is.
Spec is a draft language design proposal at v0.2, not a working compiler. The repository ships a design document (`spec-language-design-proposal-v0.2.md`), a browser-based React/TypeScript proof-of-concept that orchestrates the agent pipeline against Claude or GPT, and a README that explicitly labels the project as a draft for discussion. The IR format defines ten artefact types — contract.spec.ir, module.spec.ir, infrastructure.spec.ir, data.spec.ir, types.spec.ir, interfaces/*.spec.ir, functions/*.spec.ir, events.spec.ir, tests.spec.ir, pipeline.spec.ir — each owned by a specific agent role (Product, Architect, Scrum, Developer, Tester, DevOps). The external language agents that would translate IR to running code in Java, Go, Rust, Terraform, or other targets are listed in the future-work section as not yet implemented.
## Why it's here.
The catalogue includes Spec as a marker of an architectural position that spans into orchestration. Where the syntactic and verification camps argue about what an agent should write, Spec argues about who should write what, and in what order — Product before Architect before Developer, with explicit IR handoffs between roles. The distinctive move is to treat the multi-agent pipeline as the primary artefact and language-specific code generation as a downstream concern that can be deferred. The catalogue does not rate Spec against working compilers. It marks it as a different kind of evidence: a structured argument that "language for agents to write" might be the wrong unit of analysis, and that "IR for agents to coordinate over" is the unit that matters.
### Design DNA
- **Boruna** *(Orchestration)* — Cross-camp neighbour. Boruna runs DAG workflows with policy-gated effects and hash-chained evidence; Spec coordinates specialist agents producing shared IR artefacts. Both treat the language as one layer in a larger orchestration story.
- **Pel** *(Orchestration)* — Academic-leaning kin. Both propose architectures for agent collaboration that have not yet shipped a usable language — Pel as a CMU paper, Spec as a draft proposal with a browser-based POC.
*Detail page: https://agentlanguages.dev/languages/spec/ · Markdown companion: https://agentlanguages.dev/languages/spec.md*
## Valea
> An AI-native systems programming language announced on Hacker News in March 2026. The Rust MVP compiler ships JSON diagnostics, a JSON AST exporter, a formatter, and a C backend. Public information beyond the repository README is limited.
**Camp:** Unclassified
**Author:** Hans Voetsch (Google)
**Implementation language:** Rust
**Compilation target:** C (via the emit-c command)
**Licence:** MIT
**First seen:** March 2026
**Maturity:** early implementation
**Site:** https://github.com/hvoetsch/valea
**Repo:** https://github.com/hvoetsch/valea
**Agent tooling:**
- JSON-emitting diagnostics (`valea check --json`) with stable error codes such as E001
- JSON AST export (`valea ast --json`)
- Canonical formatter (`valea fmt`)
- AGENTS.md and CLAUDE.md present in the repository for agent orientation
### Key idea
Valea declares five properties as its design surface: deterministic syntax, explicit semantics with no hidden allocations or exceptions, machine-readable diagnostics, canonical formatting, and a small language surface to reduce edge cases. The README sketches the intended workflow as an agent receiving a goal, writing Valea code, reading JSON diagnostics from the compiler, applying fixes, and producing a program that compiles and runs.
## What it is.
We do not have enough public information to classify Valea, and that honesty is itself the entry's defining quality. What is publicly available is a Hacker News post from March 2026 titled "Valea: An AI-native systems programming language," a GitHub repository at `hvoetsch/valea`, and a README that frames the project as a community language experiment. The README lists five design principles (deterministic syntax, explicit semantics, machine-readable diagnostics, canonical formatting, small language surface) and an example function in a Rust-flavoured surface. The compiler is a Rust MVP exposing four subcommands: `check` (with JSON output), `ast` (with JSON output), `fmt`, and `emit-c`. The repository contains a SPEC.md, MANIFESTO.md, ROADMAP.md, AGENTS.md, and CLAUDE.md, with 24 commits and a single demo recording on asciinema. Licence is MIT. The Google affiliation listed on the catalogue card is not stated in the repository itself.
## Why it's here.
The catalogue includes Valea as a marker of the noise floor of the field. Projects with this much intent, this little code, and this much manifesto are now common enough that the catalogue has an unclassified bucket for them. The relevant observation is not what Valea is but that an entry like this exists at all — that "AI-native systems programming language" has become a recognisable category on Hacker News with a recognisable shape (Rust compiler, JSON diagnostics, agent-oriented README files). The catalogue does not rate Valea against working compilers with measured benchmarks. It marks it as a different kind of evidence: an early signal that the design vocabulary of the field is stabilising even where the implementations have not yet shipped enough to evaluate.
*Detail page: https://agentlanguages.dev/languages/valea/ · Markdown companion: https://agentlanguages.dev/languages/valea.md*
---
## See also
- Homepage (HTML): https://agentlanguages.dev/
- Homepage (markdown): https://agentlanguages.dev/index.md
- Short index: https://agentlanguages.dev/llms.txt
- Sitemap: https://agentlanguages.dev/sitemap.xml
- Source repository: https://github.com/aallan/agentlanguages