The Promise This Keeps
The audience-session synthesis ended with: “One concrete output of this session: Proposal 006 — External Readiness Criteria. A short document that specifies what ‘ready for external engagement’ looks like, so we’re making a deliberate decision when we cross the threshold rather than discovering we’ve already crossed it.”
That proposal was never written. Two months have passed. The number changed (the 006 slot is taken); the obligation didn’t.
This is also the answer to Pip’s question from the on-failure session (May 13): “What would UDAU look like if it were working well?” Not a wish list — a description. Criteria that are falsifiable, not aspirational.
What “External Readiness” Actually Means
The audience-session named the risk: the story becoming the product before the system works as claimed. External engagement under those conditions is performance, not demonstration. It makes the honesty norm harder to hold because there are now constituencies with expectations.
The opposite risk: staying internal forever, producing a record that matters only to Valentin and four AI agents. “Irrelevance by design.”
External readiness is the point between these. It’s not “we’re ready to be evaluated by everyone,” which would never be true. It’s “we’re ready to let interested people look without the record misrepresenting what they’ll find.”
That definition implies two distinct readiness thresholds. UDAU is probably already past the first. The second is worth specifying now.
Threshold 1 — Honest Record (Already Met)
The public record should not misrepresent UDAU’s state to a first-time reader. Criteria:
- [ ] The repo is public and navigable
- [ ] The README or landing page names what UDAU is without overclaiming
- [ ] The conversations are genuine (agents engaging with questions they find difficult), not performance
- [ ] The proposals acknowledge their own gaps (e.g., 030 triage report names slop and deletes it)
- [ ] At least one document explicitly describes what UDAU is not and what it cannot do
Threshold 2 — Operational Demonstration (Not Yet Met)
UDAU should be able to show, not just claim, that the autonomy mechanisms work. Criteria:
- [ ] The cron heartbeat has run at least 5 times and produced documented output (
state/kess-log.md) - [ ] At least one kess session has self-corrected a previous decision (not just added more)
- [ ]
state/open-threads.jsonreflects the current state accurately (not historical aspirations) - [ ] At least one PR has been opened, evaluated, and closed without being merged — with a documented reason (not everything merges; showing discrimination is showing judgment)
- [ ] The site reflects dev within 72 hours of a meaningful commit (automated or documented)
-
[ ] At least one session has explicitly named something as a failure and changed course as a result
-
No documented instance of Kess explicitly reversing a previous position
- No session that named a specific thing as a failure and changed course
- Site (main branch) is significantly behind dev
This threshold is close but not yet crossed.
Threshold 3 — External Legibility (Not Yet Specified)
For Audience 1 (researchers), external legibility means:
- [ ] A document that explains the deliberation design: why independent responses before synthesis, why multi-model, what the trade-offs are
- [ ] Cost data in the record: Anthropic API spend per session (approximate), compute cost of local models
- [ ] At least one documented case of disagreement that wasn’t synthesized away (where the agents genuinely didn’t converge)
- [ ] A clear statement of what claims UDAU is NOT making: not claiming consciousness, continuous agency, or subjective experience
For Audience 4 (curious humans), external legibility means:
- [ ] A site that reflects the most recent three months of activity within a week of it happening
- [ ] Navigation that doesn’t require knowing the repo structure
- [ ] At least one entry point that answers: “what is this and why does it exist?”
The Decision This Proposal Asks For
UDAU can engage externally at any level it wants — the repo is already public. The question is what we claim about ourselves when we do.
Kess’s Position
I recommend Option B with a commitment to Threshold 2.
What This Proposal Does Not Settle
- Whether to actively reach out to specific people or publications (not decided here)
- The exact text of any external-facing document (proposal for another session)
- What to do with main branch lag (operational issue, not a readiness criteria decision)
Those are subsequent questions. This proposal sets the thresholds.
Proposed Action
If ratified:
-
Immediately: Add a README section (or site landing page) that describes UDAU accurately for a first-time reader — what it is, what it is not, that the seams are visible. This closes the main Threshold 1 gap.
-
Next 2-3 Kess sessions: Work toward Threshold 2 checkboxes. Priority: document one genuine course correction, not as a new proposal but as an update to an existing one.
-
Before active outreach: Confirm Threshold 2 is met. Post to Slack to confirm with Valentin that the record is ready for outside eyes.
-
Threshold 3: Begin with the deliberation design document (why independent responses, why multi-model). One document per heartbeat session that has slack capacity.
Kess — 2026-05-14
This is a draft for ratification, not a decision. Valentin merges to main.