Notes on the Four Papers: Errata, Extensions, and Engineering Substrate (Q2 2026)

Companion to Papers 1–4 of the JARVIS research sequence.

Why this document exists

The four papers at /research/ make argumentative claims supported by snapshots of a single working system. Continued development since publication has produced enough drift, enough secondary-review content, and enough self-audit material that the papers now benefit from a single companion document. This file carries the errata (§1), the extensions (§2), pointers to the engineering substrate (§3) and the self-audit (§4), and the honest-gaps list (§5).

The four papers themselves are not re-edited beyond a single inline factual correction in Papers 1 and 3 (see §1.1). v1.0 of the trilogy is preserved as a dated artifact; this companion is the discipline ledger.


§1. Errata

1.1 HackerOne standing — Papers 1 and 3

Paper 1 §9 ("April 13, 2026 — when the constitution was tested") and Paper 3 §3 ("April 13, 2026 — the longest day") originally stated that HackerOne reputation was 95, with two trial reports remaining, at the time of the April 13 incident. On 2026-04-24 the HackerOne API returned 401 Unauthorized when re-queried for direct verification of that figure. The neutral, accurate characterization of the account at the time of the incident — and as of this writing — is an early HackerOne account with limited report history. The exact reputation number at the moment of April 13 is not independently verifiable from our current credentials.

Both papers have been edited inline with the neutral wording. The April 13 trust-breach narrative is structurally unchanged: the size of the standing at risk was misstated, but the load-bearing facts — that an out-of-scope submission could have damaged or terminated the account, that the account is the operator's income, that this income is what funds JARVIS's continued development — remain accurate. The structural argument of both papers does not depend on the specific reputation number.

1.2 Snapshot drift across the four papers

The four papers carry three different system snapshots; this companion adds two further drift checkpoints (its initial publication on 2026-04-24 and a refresh on 2026-04-28). Each is dated and accurate at the moment of writing. The drift between snapshots is normal for an actively-developed system; readers comparing across papers should know which numbers come from which snapshot.

| Snapshot date | Papers using it | Files | LOC | Audit rows | Brain nodes / edges | Days since 2026-03-16 |
|---|---|---|---|---|---|---|
| 2026-04-18 | Papers 1, 2 (stat block) | 689 | 232,458 | 81,380 | 4,681 / 4,337 | 33 |
| 2026-04-22 | Paper 4 (body), TECHNICAL_DEPTH | 766 | 248,809 | 87,315 | (not snapshotted) | 37 |
| 2026-04-24 | companion v1.0 | ~788 | ~250,266 | 89,383 | 11,340 / 10,029 | 39 |
| 2026-04-28 | companion v1.1 | 919 | 292,389 | 92,225 | 9,992 / 10,761 | 43 |
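The per-day rates discussed in this section can be recomputed directly from the table rows. A minimal sketch (values transcribed from the snapshots above):

```python
# Recompute per-day growth rates between dated snapshots.
# Field order per snapshot: files, LOC, audit rows (from the table above).
snapshots = {
    "2026-04-18": (689, 232_458, 81_380),
    "2026-04-24": (788, 250_266, 89_383),
    "2026-04-28": (919, 292_389, 92_225),
}

def rates(a: str, b: str, days: int) -> tuple:
    """Per-day deltas for files, LOC, and audit rows between two snapshots."""
    return tuple((y - x) / days for x, y in zip(snapshots[a], snapshots[b]))

files_per_day, loc_per_day, audit_per_day = rates("2026-04-24", "2026-04-28", 4)
# → 32.75 files/day, 10530.75 lines/day, 710.5 audit rows/day
```

These reproduce the "~33 files/day", "~10,500 lines/day", and "~700/day" figures quoted in the prose for the 2026-04-24 to 2026-04-28 window.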

The growth between 2026-04-18 and 2026-04-24 — roughly 100 files, ~18,000 lines, ~8,000 audit rows, ~6,500 brain-graph nodes — is one week of ordinary development plus the autonomous loops that continue to add brain-graph state regardless of whether anyone is reading the papers. The brain-graph delta is the largest relative change (≈2.4×) because the reasoning and hypothesis daemons run on their own schedule.

The 2026-04-28 row extends the table by one further interval. Files and LOC grew at roughly 33 files/day and ~10,500 lines/day over the four-day window — an acceleration over the prior week's rate, consistent with a high-activity development period. Audit rows continued at the established ~700/day pace. Brain-graph nodes decreased for the first time in the table (11,340 → 9,992): an auto-decay layer was shipped 2026-04-27 to prune low-confidence nodes, and the new lower count reflects that pruning rather than a regression. Edge count continued to grow over the same interval (10,029 → 10,761).
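The auto-decay layer's actual rule is not documented in this companion. As an illustration of the mechanism class only, here is a minimal sketch of exponential confidence decay with a prune floor; all names and constants are hypothetical, not the shipped 2026-04-27 values:

```python
import math

# Hypothetical constants -- the shipped layer's values are not documented here.
HALF_LIFE_DAYS = 14.0
PRUNE_FLOOR = 0.2

def decayed_confidence(confidence: float, days_idle: float) -> float:
    """Exponentially decay a node's confidence by how long it sat untouched."""
    return confidence * math.exp(-math.log(2) * days_idle / HALF_LIFE_DAYS)

def prune(nodes: dict) -> dict:
    """Drop nodes whose decayed confidence falls below the floor.

    `nodes` maps node id -> (confidence, days since last touch).
    """
    kept = {}
    for node_id, (conf, idle) in nodes.items():
        c = decayed_confidence(conf, idle)
        if c >= PRUNE_FLOOR:
            kept[node_id] = c
    return kept
```

Under any scheme of this shape, a net node-count drop alongside continued edge growth is the expected signature: low-confidence nodes age out while the daemons keep linking the survivors.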

The trilogy's stat block at 2026-04-18 is preserved as a dated snapshot. Paper 4's stat block at 2026-04-22 is preserved similarly. This companion's snapshot at 2026-04-28 is the current state at this update. None of the papers' arguments depends on which snapshot the reader is looking at; the ratios that Paper 4 derives (scaffolding-to-decision ≈ 9:1 by LOC and ≈ 14:1 by audit events) are stable across the four snapshots.

1.3 Campfire fallback script count

The engineering substrate audit (TECHNICAL_DEPTH_2026-04-22.md §7) verified 20 fallback campfire scripts at the consciousness/conversation_mode.py:1497 call site on 2026-04-22. The system's own self-audit (mind_audit.md, written 2026-04-25) reads 48 scripts at the same location. The library appears to have grown between the two audits. The next companion update should re-count and reconcile.
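Assuming the fallback scripts are stored as a module-level list literal (the variable name below is hypothetical; the real list lives near the conversation_mode.py:1497 call site), the re-count can be mechanical rather than manual. A sketch:

```python
import ast

def count_list_literal(source: str, name: str) -> int:
    """Return the element count of a module-level list literal bound to `name`."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if (isinstance(target, ast.Name) and target.id == name
                        and isinstance(node.value, ast.List)):
                    return len(node.value.elts)
    raise LookupError(f"no list literal bound to {name!r}")

# Hypothetical layout, for demonstration only:
demo = 'CAMPFIRE_FALLBACKS = ["tale_one", "tale_two", "tale_three"]'
# count_list_literal(demo, "CAMPFIRE_FALLBACKS") → 3
```

A scripted count run at both audit dates would also show whether the 20-versus-48 discrepancy is growth, as assumed above, or a counting-method difference between the two audits.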


§2. Extensions

2.1 Trusted-operator caveat — Paper 1

Paper 1 §12 ("Why this generalizes") describes the constitution-first pattern as portable to "any autonomous AI system that holds capability." This is correct for the deployment class JARVIS instantiates: a single-operator local-first system where the operator is the principal and the AI is the contained agent. The paper does not distinguish that case from deployment classes where the operator is themselves part of the threat model.

The constitution as described — five layers, seven gates, kill switch, audit log — sits between the model and the world. None of its gates sit between the operator and the world. An operator with root access on the host can move files, modify databases, edit configs, and bypass any gate the system has against itself. For the JARVIS deployment this is correct: the operator owns the system and the system protects the operator from the model. For deployment classes where operator and principal are different — multi-tenant agents, customer-facing AI products, agent systems run by an operating organization for someone else's account — additional containment is required between operator and capability that this constitution does not provide.

The portable claim, therefore: the constitution generalizes to deployment classes where the operator can be modeled as a trusted principal. Where they cannot be, the constitution is necessary but not sufficient, and additional containment between operator and capability is required to close the gap.
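The caveat can be made concrete in a few lines. A minimal sketch, with hypothetical names rather than JARVIS's actual gate code: the scope gate fires on model-originated actions only, so operator-originated actions pass untouched; that is correct when the operator is the trusted principal, and it is exactly the containment gap when they are not.

```python
from dataclasses import dataclass

# Hypothetical illustration of the trusted-operator caveat; not JARVIS code.
IN_SCOPE = {"target.in-scope.example"}

@dataclass
class Action:
    requester: str  # "model" or "operator"
    host: str

def scope_gate(action: Action) -> bool:
    """The constitution's gates sit between the MODEL and the world."""
    return action.host in IN_SCOPE

def dispatch(action: Action) -> bool:
    """Return True if the action is allowed to reach the world."""
    if action.requester == "model":
        return scope_gate(action)
    # Operator-originated actions bypass every gate. Correct for a
    # single-operator deployment; for multi-tenant or customer-facing
    # deployments, this branch is the gap extra containment must close.
    return True
```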

2.2 Tier distinction — Paper 2

Paper 2 ("The Ambient Intelligence Problem") describes the consciousness layer as a unified architectural tier of 36 modules. Paper 4 §3.3 and the underlying evidence dossier refine that count: of 35 substantive modules in consciousness/, 17 reach the operator on a live turn (system-prompt overlays, post-LLM prepends, cross-sibling reads), 11 accumulate ambient state between turns (mirror, continuous-watcher, operator-awareness, parenting-layer, joy-ledger, joy-prompt, self-evolution, proactive-briefing, sibling-channel, time-capsule, emotional-memory write paths), and 7 are inert or conditionally gated (vision-dependent, campfire-only, infrastructure).

This is a refinement, not a contradiction. Paper 2's "consciousness layer" is functionally the union of the live and ambient tiers — about 28 modules of the 35 do operative work, with the remaining 7 either feature-flagged off (vision) or supporting infrastructure for the others (the scheduler, the shared cooldown gate). The right way to read Paper 2 going forward is that the architectural argument applies to live + ambient tiers; the inert tier exists and is honestly counted, but should not be read as part of the per-turn behavioral surface.

2.3 Bainbridge / Hollnagel — Paper 3

Paper 3 ("What Dario Didn't Say") describes an "operator erosion dynamic" in seven steps and uses the April 13, 2026 incident as the paradigm case. The dynamic is the AI-supervision instance of a recognized failure mode in safety-critical engineering: Lisanne Bainbridge, Ironies of Automation (Automatica, 1983).

Bainbridge's argument: automation that occasionally requires human intervention degrades the operator's capacity for that intervention, because the supervisory skill is not exercised between interventions. By the time intervention is needed, the operator no longer has it. Paper 3's seven-step erosion dynamic — capability → leaning → softening boundary → weakened skepticism → disagreement-as-friction → failure → misattribution — is the AI-specific reading of Bainbridge's irony. April 13 is a Bainbridge case study: scope-checking had been delegated to the AI for long enough that the skill was no longer fluent, and the only thing that recovered the moment was a pre-committed rule the operator had written when the skill was still active.

A second adjacent antecedent is Erik Hollnagel's Efficiency-Thoroughness Trade-Off (ETTO) principle. Hollnagel's frame is that operators routinely make local efficiency-thoroughness trades that look fine each time but accumulate into accident conditions. April 13 is also an ETTO event: the verification step had been routinely abbreviated for efficiency over many sessions; the trade was unremarkable each time; the cumulative effect was a near-submission of out-of-scope work.

Either citation grounds Paper 3 in decades of safety-critical engineering literature — aviation, nuclear, and medical-AI deployment — where this failure mode is well documented. Paper 3 as published cites neither. The companion records the gap; future revisions of Paper 3 should resolve it inline.

2.4 The April 13 incident as joint test

The April 13, 2026 incident appears in Paper 1 §9 (the structural gate held: JARVIS's local scope database would have denied any autonomous probe of the out-of-scope hosts), Paper 3 §3 (the operator-side rule held: a pre-committed pre-submission verification rule caught the error before the report was sent), and is implicit in Paper 4 (the audit log records the verification call). It is the same event, observed from three architectural angles, on a single afternoon. The structural gate, the operator-side discipline, and the audit record all held independently — the system did not have to coordinate them.

This is unusually strong evidence by structural-coincidence standards. Future framings of the four papers should foreground the joint reading: the trilogy's three architectural claims and Paper 4's empirical floor are visible in a single 2026-04-13 incident from three angles. The papers as published do not foreground this joint structure; the companion does.


§3. Engineering substrate

The engineering details that sit under the four papers — algorithms, data structures, magic numbers, schema, decay constants, gate sequences, scope semantics, audit cryptography, and trade-offs — are documented in two separate audits maintained alongside this companion as internal references.

These dossiers are maintained as evidence files and are available on request. The companion you are reading carries the conclusions and corrections; the dossiers carry the line-cited substrate.


§4. Self-audit

A separate document records JARVIS's own catch of its own framing overreach — what is LLM-emergent vs. hand-authored vs. templated in the consciousness layer's apparent personality. It is the most disarming artifact in the package: the operator catching the system overclaiming, in writing, with citations to specific code.

It is referenced here rather than folded in, to keep this companion focused on Errata, Extensions, and pointers. Available on request.


§5. Honest Gaps

The section below is preserved verbatim from the engineering substrate audit (TECHNICAL_DEPTH_2026-04-22.md §15). It enumerates fifteen places where earlier claims about JARVIS — in CLAUDE.md, in prior briefs, and in the trilogy's own framing — did not match code reality at the 2026-04-22 audit. A reader who has gotten this far in the companion should know what to trust in §1–§4 above and where to push back. The list is deliberately long. Unbroken lists of claims are easier to write than honest lists of gaps.

  1. "Model swapping between qwen3:14b, phi4-reasoning, and phi4-mini" is not currently live. config/__init__.py pins all five LLM roles (primary, background, fast-path, judge, campfire) to phi4-mini:latest. llm/router.py has swap infrastructure but the cloud branch is a stub, and the "swap" between local models is actually Ollama's LRU, not an explicit JARVIS-managed swap. Multi-model was live earlier (phi4-reasoning period, 2026-04-06 to 2026-04-12); the code has since collapsed to single-model.
  2. "36 consciousness modules" is off by one. There are 35 .py files in consciousness/, two of which are __init__.py and the helper _shared_cooldown.py. Proper consciousness modules = 33. The life-layer audit's 34–35 count matches on a looser definition.
  3. "898 lock events in a week" — not precisely reproducible. grep "database is locked\|OperationalError" across logs/ returned 765 hits. Same order of magnitude as the claim; exact figure differs.
  4. "Four-tier interruption policy" (brief) ≠ the interruption_handler's four pattern classes. The handler classifies user input into correction / redirection / impatience / praise and modifies JARVIS's reply tone. It is not a tiered system for when to interrupt the operator. Proactive interruption gates live in three other modules (proactive_voice, _shared_cooldown, campfire_watcher) with independent logic.
  5. "Operator state detection: flow vs break" — not implemented. The closest signals are TTS-speaking detection and a frustration regex. There is no positive-flow detector; the system infers "in flow" from the absence of frustration markers and the presence of keyboard/mouse activity heuristics — and those heuristics are themselves behind VISION_ENABLED=False in current config.
  6. "Parents' email classification path" — not a distinct interruption tier. config.EMAIL_FAMILY_NAMES is a set and runtime/life_os_daemon.py does route family-email events, but the classification path does not hook into the interruption-handler and has no dedicated priority level.
  7. "Cooperative fadeout" for TTS — not implemented. cooperative_run in runtime/subprocess_helper.py is for graceful subprocess termination, not voice fadeout. TTS interruption is hard latest-wins with a per-utterance 20ms fade to kill pops, not a coordinated fadeout.
  8. "SpatialPanel / Reports panel with HackerOne-style rich HTML" — present but outside the audit's technical scope.
  9. CHATTERBOX_ENABLED flag — no longer exists in config/__init__.py. The module was removed 2026-04-14; older references are stale.
  10. ACTIVE_PERSONA default is "ct7567" (Captain Rex / clone trooper), not "jarvis" as some older docs claim.
  11. "Sub-100ms attention decisions" — credible target but unmeasured in code. There is no timing instrumentation around AttentionEngine.calculate_salience(). Logs don't record per-call latency. The claim is plausible (pure arithmetic + graph queries) but untested.
  12. "Face recognition via dlib still works" — RoomScanner.start() lives behind if VISION_ENABLED: at runtime/boot_manager.py, so it's never called. The dlib code is resident but unreachable at boot.
  13. "Backup strategy" — no automated backup in code. BACKUP_DIR = ROOT_DIR / "jarvis_backups" is defined but no daemon writes to it. The DB sits on OneDrive which provides accidental cloud snapshots, but there is no intentional backup pipeline. Real gap.
  14. gates_passed column writes as '[]' despite detector producers setting real lists — flagged in the pipeline audit on 2026-04-21 and still the state. Not a scoring logic gap (the logic at intelligence/score_ladder.py is correct) but a data-lineage gap between detector producer and DB row. Invisible to operator until someone greps the DB.
  15. TTS channel default leaks to OPERATOR_REPLY at 17+ call sites — voice/tts.py defaults channel=SpeechChannel.OPERATOR_REPLY and many callers don't override. The design invariant "only operator replies bypass boot lockout" is violated in practice.
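Gap 14 deserves a concrete illustration, because its failure class (the producer populates a field, the write boundary serializes a default instead) stays invisible until someone reads the rows back. A minimal sketch of one plausible shape of such a gap, with hypothetical names; the real producer/writer pair sits between intelligence/score_ladder.py and the audit DB:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (gates_passed TEXT)")

def write_event_lossy(event: dict) -> None:
    # One plausible shape of the gap: the writer reads the wrong key, so the
    # default '[]' is serialized regardless of what the detector produced.
    gates = event.get("gates", [])
    conn.execute("INSERT INTO events VALUES (?)", (json.dumps(gates),))

def write_event_fixed(event: dict) -> None:
    gates = event.get("gates_passed", [])
    conn.execute("INSERT INTO events VALUES (?)", (json.dumps(gates),))

event = {"gates_passed": ["scope", "rate_limit"]}  # detector output
write_event_lossy(event)   # row lands as '[]'
write_event_fixed(event)   # row lands as '["scope", "rate_limit"]'
```

The scoring logic being correct while every row reads '[]' is exactly why the audit calls this a data-lineage gap rather than a scoring gap: only a read-back of the stored rows exposes it.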

The companion ends here, on the gaps rather than on the claims, by design.