Enquiry Into Specification as Meaningful Struggle

This memorandum instantiates the Multiplier-Mirror framework (Folio Nº II, The Multiplier and the Mirror) at the specification medium for the software engineer. Subsequent references to “the framework” point to that parent work.

The FORCE Problem

The canonical FORCE dynamics equation from The Multiplier and the Mirror (Eq. 11) states:

dF/dt = \alpha S + \gamma E F - \beta R - \sigma M_{\text{absorbed}}

where dF/dt is the rate of change of FORCE over time, S(t) is productive struggle, E(t) is deliberate use of the LLM as a thinking partner, R(t) is passive reliance on the LLM, M_{\text{absorbed}}(t) is model capability gained from F-to-M transfer, \alpha, \beta, \gamma are learning coefficients, and \sigma is the rate of organizational de-investment in human capability.

The rate at which FORCE changes is a sum of four terms. \alpha S is FORCE gained from traditional struggle. \gamma E F is FORCE gained from deliberate LLM engagement; this term is multiplicative with existing FORCE, so the growth channel compounds. \beta R is FORCE lost to passive reliance. \sigma M_{\text{absorbed}} is FORCE lost because the organization reduces investment in human capability once the model appears to handle it. When the F-to-M transfer has not yet shifted organizational behavior, \sigma \approx 0 and the equation reduces to dF/dt = \alpha S + \gamma E F - \beta R.

In short: FORCE evolves under four competing pressures. Struggle builds it. Deliberate engagement compounds it. Passive reliance erodes it. Organizational de-investment erodes it.

For decades, the \alpha S term drew from a specific source: the mechanical work of implementation. Writing code, debugging code, tracking down the one-character fix at 2 AM, wrestling with framework incompatibilities, carrying intent down through layers of abstraction into running software. That work was painful, effortful, consequential, and directly experienced. It encoded durable capability. The synapse-encoding happened there.

LLMs are absorbing that work. As M_{\text{effective}}^{\text{surface}} \to \infty (Eq. 1a), the surface layer of implementation loses economic value and, with it, its function as the primary vehicle for FORCE development. The struggle that built engineers is being smoothed away.

Eq. 32 formalizes the consequence:

F_{\text{initial}}(c) = F_{\text{max}} \cdot \left(\frac{S_{\text{available}}(c)}{S_{\text{pre-LLM}}}\right)^\rho

Each successive cohort enters with a lower FORCE ceiling, not because of individual deficiency, but because the environmental conditions for building FORCE have been structurally altered. Meanwhile, the tipping point F^* is rising (Eq. 30). Successive cohorts are born below a line that is moving away from them.

The question that governs the profession’s future is whether S_{\text{available}} can be substituted: whether a different kind of effortful work can feed the \alpha S term and restore the developmental pipeline.

The Four Futures and the Specification Hypothesis

The parent framework identifies four structurally distinct trajectories, all governed by that single variable. Future 1 preserves implementation struggle by institutional design, accepting short-term productivity loss. Future 2 lets the profession bifurcate into a shrinking class of pre-LLM deep-FORCE holders and a larger class of AI operators running on borrowed time. Future 3 lets the role dissolve entirely, absorbed into adjacent disciplines. Each of these outcomes follows from a specific answer to the S_{\text{available}} question.

Future 4 offers a different answer. Computer science does not return to the tools of the 1970s. It returns to the unfinished ambition that lay beneath them: to make the design of computation primarily a matter of specification rather than manual construction.

Dijkstra, Hoare, Lamport, and the formal methods tradition understood programming as a discipline of precision. You define what the system must do, what it must never do, and what properties must hold across its behavior. For decades, humans still had to carry those intentions down into code by hand, and the profession took shape in that descent. If LLMs absorb that labor, the crucial human work moves upward: not into casual prompting, but into models, constraints, simulations, and formal artifacts that can be recursively refined into executable form.

The claim is specific. Specification, in this sense, is genuinely hard. Mathematical reasoning has real consequences. Systems models can fail in ways that matter. Debugging a formal specification, or discovering that a formally verified spec omitted a crucial edge case in the real world, is intellectually demanding in ways that can encode durable FORCE. The \alpha S term does not require that struggle come from coding. It requires that the struggle be effortful, consequential, and directly experienced. Formal specification can meet those criteria.

This memorandum is an enquiry into what it would take to make that substitution real: a discipline where the human authors formal commitments (entities, contracts, invariants, topology, phases, verification obligations) and the LLM operates as a realization engine, consuming those commitments and producing systems within them. In this arrangement, the specification document is the primary engineering artifact. Everything else, generated code, tests, documentation, is a projection of the specification.

Specification-Driven Realization in Practice

Specification-driven realization is not speculative.

A human can write analysis, architecture, constraints, implementation plans, boundary rules, test obligations, and verification expectations, then hand that stack to an LLM and receive back code that compiles, runs, and often passes unit tests. Early practice took the form of prose documents and chat sessions. A formal specification medium consolidates that practice: typed identity, semantic relations, contracts, invariants, rationale, and confidence signals in machine-processable syntax. The LLM reads the specification model and realizes within it. It becomes, in effect, a realization engine: not a collaborator in the usual vague sense, but an executor that consumes formal commitments and produces systems that conform to them.

SpecChat is such a medium. Its parser and validator run as a Model Context Protocol server that LLM-backed agents query directly. Its specification model organizes a coherent kernel of construct families, each with typed identity, rationale, and confidence signals. A quality analyzer detects the structural indicators of cargo-cult specification. A profile extending the kernel into the concepts and artifacts of enterprise architecture practice is in development, adding typed objects for stakeholders, architecturally significant requirements, decision records, governance bodies, waivers, and viewpoints. A worked exemplar at enterprise scope demonstrates the arrangement under real organizational complexity.

The significance is easy to miss. For decades, implementation was the main site where intent became reality. Today, much of that descent can be delegated. The implementation gap is no longer the exclusive domain of the human engineer. As the gap narrows, the decisive human work moves upward: framing, constraining, specifying, checking, and judging.

Whether specification-driven realization is possible is no longer the question. Three harder questions remain.

First: can specification work build FORCE, or does it merely produce output? This is the developmental question.

Second: does the formal medium support genuine specification, or does it merely organize it? This is the medium question.

Third: does the practice around the medium preserve human authority over commitments, or does it let that authority drift into the LLM through habit rather than through medium failure? This is the practice question.

The Developmental Question

The parent framework is cautiously optimistic about FORCE substitution. But optimism is not proof, and the transition contains a danger that the framework identifies precisely.

The formal methods pioneers could specify with precision in part because many of them had implemented deeply first. Dijkstra’s specifications were sharp because he understood what implementation did to intent. Hoare’s axiomatic reasoning was grounded in direct experience of what programs actually do when run. Their specification FORCE was built on top of implementation FORCE.

Whether specification FORCE can be built without an implementation foundation has never been tested at scale, because until now there was no reason to try.

Eq. 14a (hysteresis) warns that if this transition is botched, recovery is harder than the initial descent. The current generation of practitioners has implementation-derived FORCE. Future 4’s practitioners would need specification-derived FORCE. If the transition produces a generation that can neither implement (because the LLM does that) nor specify with rigor (because they never built the foundational understanding that rigorous specification requires), the result is not Future 4. It is Future 2 or Future 3 arrived at by a different path.

So the developmental question has two parts:

Can specification struggle encode durable FORCE? The framework says yes, in principle. The struggle of choosing the right abstraction, defining the actual boundary of a system, stating invariants precisely, surfacing hidden coupling, anticipating failure modes, and interpreting counterexamples in light of deeper intent is not decorative. It is difficult, consequential, and directly experienced. It can satisfy the \alpha S term.

Can specification FORCE be built without prior implementation FORCE? This is the open question. The framework does not resolve it. It identifies it as the critical dependency. If the answer is no, then Future 4 requires a sequenced approach: some implementation struggle first, specification struggle built on top. If the answer is yes, the substitution can be direct. Either way, the answer determines whether Future 4 is a clean replacement or a layered transition, and the difference matters for every educational and institutional design decision that follows.

The Medium Question

Suppose the developmental question is answered favorably. Specification can build FORCE. The next question is whether the working medium can support that discipline, or whether it is too fragile to carry the weight.

In current practice, the medium is document-and-chat. Architecture lives in one document, implementation phases in another, execution rules in a prompt, boundary constraints in naming conventions, test obligations in yet another file, and the real refinement path only in the author’s head. The LLM can often navigate this successfully, but that does not mean the medium is sound. It means the author is compensating for the medium’s weakness.

Meaning is present but distributed. Refinement happens but implicitly. Cross-view consistency is possible but brittle. Machine execution succeeds, but machine misunderstanding remains easy because too much of the specification is carried by prose, convention, and local interpretation.

This matters for the FORCE question directly. If the medium is too fragile, the human’s struggle is not specification struggle; it is medium-management struggle. The effort goes into keeping documents consistent, re-explaining context to the LLM, patching over misinterpretation, and holding the semantic web together by force of will. That work is effortful, but it is the wrong kind of effort. It builds skill at wrangling a broken medium, not skill at reasoning about systems.

What is needed is a specification medium where entities, contracts, invariants, rationale, component topology, build phases, traces, constraints, and verification obligations live together in formal syntax, so that the realization engine reads a specification model, not scattered prose. The specifier’s struggle would then shift from medium management to system reasoning.

But having a medium does not guarantee it is used well. The medium can get out of the way and the specifier can still avoid the hard reasoning. This is the cosmetic relocation risk, addressed below. For specification to serve as genuine FORCE-building struggle, the medium must force contact with what is actually hard: boundaries, invariants, contracts, failure modes, ambiguity, and refinement obligations.

What the Medium Must Carry

The carrier format is not the hard part. Human-readable and machine-processable representation has long been achievable through structured markup. The hard part is deciding what meaning must be explicit.

SpecChat is the concrete response to that question: a markdown-embedded specification language where spec blocks are typed engineering objects with formal semantics. The .spec.md document is the primary engineering artifact. The realization engine reads it, and everything it produces (code, tests, documentation, dependency graphs) is a projection of the specification model. SpecChat is documented in companion references: the SpecLang Specification (language definition and semantics) and the SpecLang Grammar (formal EBNF grammar).

SpecChat’s specification language carries seven categories of meaning. Each was identified as load-bearing during real project experience, and each addresses a specific class of failure that prose-and-chat could not prevent.

Typed identity. Every element in the specification has a declared type: entity, enum, system, authored component, consumed component, page, visualization. The specifier cannot introduce an element without saying what it is. SpecChat knows the difference between a subsystem, a data object, an external dependency, and a user-facing view.

Meaningful relations. Topology rules (allow, deny) declare which components may depend on which. Traces map domain concepts to components, pages, and tests across many-to-many relationships. Cross-links connect related pages. These are not layout adjacency; they are typed, enforceable relations.

Contracts and boundaries. Every boundary can encode commitments and assumptions through requires, ensures, and guarantees clauses. A type signature describes structure. A contract describes behavior. SpecChat makes the distinction first-class.

Invariants and constraints. Entity-level invariants state what must always hold across fields. System-level constraints state what must hold across components. Both are formal expressions, not prose notes.

Refinement links. The refines construct links a child specification to a parent. Build phases with requires, gate, and produces fields create an ordered construction sequence with proof obligations at each stage. Refinement without traceability is elaboration without accountability; SpecChat makes traceability structural.

Execution permissions. The authored/consumed distinction declares what the human builds and what the system consumes from external sources. Package policies govern which dependencies may enter the system and which are prohibited. The Tracking block lifecycle (Draft, Reviewed, Approved, Executed, Verified) gates each specification through a human-authorized state sequence.

Verification obligations. Gate clauses attach executable test commands and pass/fail expectations to build phases. Feature and bug specifications carry test obligation sections as constraints. Verification is part of the specification, not separate commentary.

The specification is text, not canvas. The formal properties that a visual medium would provide (typed identity, semantic relations, inspectability, revisability) are achieved through SpecLang syntax, parsed into a queryable specification model, and projected into secondary forms. The medium is collaborative through version control, revisable through incremental specifications, and inspectable through a quality analyzer that evaluates the specification itself.

The Division of Labor

The discipline only works if the division of labor is explicit.

The human remains the author of commitments. The human chooses abstractions, sets boundaries, defines constraints, states invariants, declares contracts, decides what must be preserved, and judges whether the system is faithful to its specification.

The LLM becomes the realization engine. It consumes the specification, elaborates permitted structure, generates code, produces tests, resolves local implementation detail, and reports failures or inconsistencies back against the authored commitments. The term is precise. A realization engine does not advise, brainstorm, or collaborate in the informal sense. It takes a formal input and produces a conforming output. The human’s authority is the specification. The machine’s scope is everything that follows from it.

This distinction protects the discipline from collapsing into two opposite failures. One failure is to treat the LLM as merely a typing aid, which understates its actual role. The other is to let the LLM become the hidden author of the semantics, which hollows out the human role and destroys the FORCE-building function the discipline is supposed to preserve.

The formulation is not “human and AI collaborate.” It is sharper: the human authors the commitments; the realization engine produces systems within them. A specification medium that enforces this boundary structurally would make the .spec.md file the authored commitment and everything generated from it (code, tests, documentation, dependency graphs) machine elaboration. The boundary is the specification model.

The Danger of Cosmetic Relocation

The deepest risk in Future 4 is not that the realization engine fails. It is that the realization engine succeeds as a production method while failing as a developmental one.

If the medium reduces the human to approving polished model fragments, accepting AI-generated refinements, or curating well-structured specifications they did not reason through, the struggle has not been preserved. It has been cosmetically relocated. The realization engine runs, but the human is no longer the author of the commitments it consumes. The human becomes a supervisor of fluent output rather than an author of truth-bearing commitments. The \alpha S term looks nonzero from the outside, but the struggle is not effortful, not consequential, not directly experienced in ways that encode durable capability.

This is the same dynamic the parent framework warns about in the Mirror’s presentation dimensions: fluent output that looks like substance but lacks it. Applied to specification, the warning is that a well-formed .spec.md can be as hollow as well-formed code if the human did not do the hard reasoning that the specification purports to represent.

SpecChat includes a quality analyzer that detects indicators of cargo-cult specification: entities with no invariants, fields with no constraints, enum values with no semantic descriptions, missing rationale, low-complexity fields carrying high-ceremony structure. These diagnostics are mitigation, not cure. They can flag the absence of reasoning. They cannot verify the presence of it.

The standard is clear. The discipline must force contact with what is hard. Without that, it may improve productivity while weakening depth. Future 4 becomes Future 2 wearing a formal-methods costume.

The Practice Question

A sound medium is necessary but not sufficient. Cosmetic relocation can occur at the practice level even when the specification document is well-formed, if authority over commitments drifts to the LLM through habit rather than through medium failure. The medium enforces what a specification is; it does not enforce who decides what the specification should say.

The practice question is whether the discipline around the medium can preserve human authority at the points where judgment is load-bearing, while permitting machine work where judgment is not. Practice here is the full discipline around a specification: the workflows, the review bodies, the cadences, and the artifacts that wrap the document.

The analysis below draws on the Business Technology Architecture Body of Knowledge (BTABoK), IASA’s published framework for architecture practice, to identify the concrete elements practice requires from AI.

Architecture practice contains work where machine contribution is structurally appropriate. Much of it is mechanical and calendrical: freshness sweeps, waiver expiries, cadence indicators, rotation tracking, maintenance-hour logging. Much of it is compositional: assembling a review packet from related artifacts, ranking candidate initiatives against declared principles, cross-referencing a canvas to its underlying competencies. Much of it is explanatory: coaching a practitioner through an unfamiliar workflow, retrieving relevant passages from the body of knowledge, drafting an artifact from prior context so the human iterates on substance rather than blank-page production. These are the AI elements practice genuinely needs. The machine is a suitable contributor here because the decisions are mechanical, the inputs are tractable, or the output is a draft the human then owns.

Other work in the same practice is categorical refusal. Judgment of mentor work products. Performance rating. Granting of certifications. Allocation of funding. Hiring decisions. These are not decisions AI may eventually become capable enough to make; they are decisions the practice has declared are constitutive of its own discipline. A practice architecture that permits AI to produce output resembling such decisions, even labeled as advisory, quietly shifts authority regardless of the label.

Between what the practice invites AI to do and what it categorically refuses sits a wide middle. The middle is the practice architecture’s actual subject.

The Shape of a Practice Architecture

A practice architecture classifies every point of machine contact against three orthogonal axes.

The first axis is authority: how hard the machine presses on a decision. Four tiers: Enforce (the machine refuses an action that violates a deterministic rule; no override), Gate (machine-checked, human-authorized), Advise (machine-proposed, human-decided), Inform (machine-executed within authored scope, human-informed). The availability of the Enforce tier depends on what the medium can deterministically check. Where the specification language has validators, machine enforcement is possible. Where the practice extends into domains the medium does not formalize, the strongest available posture is Advise. Practice authority is constrained by medium reach.

The second axis is suitability: whether the machine is a suitable proposer at all, independent of how hard it would press. Four bands: AI-strong (machine drafts, human confirms), AI-backstage (machine prepares material for a human role without surfacing in the primary relationship), AI-proposer (machine generates candidates, human adjudicates each), AI-excluded (categorical refusal, matching the stop-lines above).

The third axis is pattern: the choreography of the interaction. Drafting has the machine produce an artifact the human signs. Assembling has the machine gather inputs the human then interprets. Proposing has the machine generate options the human decides among. Monitoring has the machine watch signals the human will respond to. Coaching has the machine guide a process the human owns. Every point of machine contact inhabits one such pattern. The pattern determines what the interaction looks like; the tier determines how hard the machine presses; the band determines whether the machine is a suitable proposer at all. The three dimensions classify independently.

One further restraint is structural. A practice architecture must not widen the specification medium’s formal scope to cover domains it supports only agentically. The temptation to encode every practice concern into the specification language is itself a relocation risk: what begins as advice acquires the shape of enforcement, and the medium becomes a bureaucracy. The practice’s discipline is to leave the medium narrow, and to carry its own work in its own surfaces, such as schedules, retrievals, subagents, and interaction patterns, without absorbing them into validators and concept types.

This is the shape of a practice architecture that resists cosmetic relocation structurally: every point of machine contact declares its tier, band, and pattern; categorical exclusions are encoded as structural properties of feature design rather than runtime rules; and the medium’s formal scope is not widened to absorb what the practice supports agentically. The medium provides the surface on which commitments are expressed. The practice provides the protocol by which commitments are actually made. Without the second, the first can be well-formed and hollow.

ASAP (Architect Support Agentic Platform) is a concrete proposal for such an architecture, grounded in the BTABoK. It assigns every feature in its surface to a tier, a band, and a pattern, and declares stop-lines keyed to the roles BTABoK itself names. It operates across the full practice lifecycle without widening the specification medium’s formal scope beyond what the medium already supports. Whether the classification holds under sustained practitioner load, and whether the agentic layer resists drifting back toward default machine inclusion, are the questions practice must answer.

Existing Traditions

Several traditions contribute important primitives. None is sufficient alone.

Formal methods (Hoare, Lamport, Abrial, Jackson) contribute rigor, proof obligations, refinement calculus, and counterexample culture. They demonstrate that specification can be a site of genuine intellectual struggle.

MBSE and SysML/KerML contribute multi-view structure, model-centered organization, and the insight that the center of gravity is a semantic model with multiple projections, not a single diagram. SpecChat’s architecture follows this insight directly: the .spec.md is the semantic model; code, tests, documentation, and dependency graphs are projections generated from it.

Visual notation research (Larkin and Simon, Moody, Green and Petre) contributes hard lessons about what diagrams help people see and what they merely make attractive. Diagrams require learned reading practices and can easily become misleadingly fluent. SpecChat chose text over diagrams, accepting the readability tradeoff in exchange for formal semantics and version-control compatibility. The fluency-without-substance warning applies equally to specification text: a well-structured .spec.md file can look rigorous while carrying no genuine reasoning, just as a polished diagram can.

Program synthesis and model-driven engineering contribute executional leverage and the demonstration that realization engines, in various forms, are already real. What they have not provided is a clear boundary between human-authored commitments and machine-elaborated output. The specification medium must draw that line.

Automation and human factors literature (Bainbridge, Endsley, Skitka) contributes warnings about deskilling, over-trust, automation bias, and out-of-the-loop fragility, precisely the failure modes that the developmental question must address.

SpecChat’s design attempts the synthesis this section calls for: formal method rigor through contracts and invariants; MBSE’s model-centered approach through the specification model with projections; awareness of visual notation pitfalls through choosing text over diagrams; and deskilling countermeasures through the quality analyzer and confidence signals. Whether the synthesis succeeds without inheriting the worst limitations of each tradition (the adoption barriers of formal methods, the bureaucratic weight of MBSE, the semantic emptiness of informal notation, and the deskilling risks of uncritical automation) is a question that practice will answer.

Research Agenda

The enquiry identifies seven problems that must be solved in sequence.

1. The developmental test. Can specification struggle build durable FORCE in practitioners who have not first built implementation FORCE? This is empirical, not theoretical. It requires controlled comparison of specification-trained and implementation-trained cohorts on tasks that demand deep judgment. Without an answer here, the rest of the agenda rests on an untested assumption. No current work addresses this item. It remains the most important open question.

2. The minimal semantic kernel. What meaning must be explicit in the specification medium? SpecChat proposes a concrete kernel: seven construct families (entities with contracts and invariants, components with topology, phases with gates, traces, constraints, package policies, platform realization), each with typed identity, rationale, and confidence signals. The question becomes whether this kernel is sufficient and whether any construct in it is unnecessary. A profile extending the kernel into the concepts and artifacts of enterprise architecture practice adds roughly nineteen typed objects and is currently testing the kernel’s ability to absorb domains outside its original scope.

3. The refinement calculus. What must be preserved when a specification is elaborated to a lower level of abstraction? The refines construct links child specifications to parents. Build phases with gate clauses impose proof obligations at each stage. The Tracking lifecycle gates each specification through a human-authorized state sequence. Formal refinement-preservation proofs do not exist. The calculus is convention-based, not formally verified.

4. The human/LLM execution boundary. What is the machine permitted to elaborate, and what remains exclusively human? The authored/consumed distinction declares what the human builds and what is consumed from external sources. The Tracking lifecycle (Draft through Verified) requires human authorization at each state transition. Package policies govern what the LLM may introduce. These are structural enforcements, not formal proofs of boundary integrity.

5. The practice architecture. The execution boundary declares what the machine may touch. The practice architecture declares, for every decision in the specification lifecycle, whether the human must decide alone, must authorize a machine-checked result, must adjudicate a machine proposal, or may merely be informed of machine work completed within authored scope. ASAP proposes three orthogonal classification axes: a four-tier authority gradient (Enforce, Gate, Advise, Inform), a four-band AI-suitability classification (AI-strong, AI-backstage, AI-proposer, AI-excluded), and an interaction-pattern dimension that names the choreography of each machine contact (drafting, assembling, proposing, monitoring, coaching, and related patterns). Whether these classifications cover the full space of practitioner decisions, whether they can be applied consistently across the models of an architecture practice, and whether practitioners adopt them without drifting back toward default inclusion of the machine are open questions.

6. The workspace model. How are semantic objects edited, commented on, reused, branched, merged, checked, and projected into multiple views? The .spec.md file is edited in any text editor. Prose sections alongside spec blocks serve as commentary. Git provides branching, merging, and version history. A quality analyzer checks the specification. Projections generate code, tests, documentation, dependency graphs, and traceability matrices from the specification model. There is no visual representation of the specification. SpecChat is text-native. Whether a visual layer is necessary or whether text with projections is sufficient is an open design question.

7. Validation at scale. The discipline must survive actual system work: real scale, real ambiguity, real iteration, real machine execution pressure. SpecChat was derived from one real project (a Blazor WebAssembly visualization harness with 45 computations, 17 pages, and 225+ tests). That experience validated the construct families but does not constitute systematic validation. Broader evidence from multiple projects, teams, and domains is needed before the developmental, medium, and practice claims can be considered tested under production conditions.

Conclusion

The future under discussion is already here. Humans can specify substantial systems well enough for LLMs to implement them. The LLM is becoming a realization engine: it consumes formal commitments and produces systems that conform to them. The practice is real. The question is whether the discipline around it can be made rigorous, and whether that rigor can serve as a genuine source of FORCE.

A formal specification medium must carry typed identity, contracts, invariants, topology, refinement links, execution permissions, and verification obligations. SpecChat carries them. The medium enforces a division of labor: the human as author of commitments, the realization engine as executor within them. A practice architecture must enforce that division decision by decision. ASAP proposes one, through a classification of every machine contact by authority, suitability, and interaction pattern, and through categorical stop-lines the machine may not cross. One structural risk threatens the entire arrangement: cosmetic relocation, in which the realization engine runs but the human is no longer doing the hard reasoning the specification purports to represent. The medium resists it at the level of what a specification must contain. The practice resists it at the level of who decides what the specification should say.

Neither the medium nor the practice is the deepest problem. The deepest problem is whether the shift from implementation struggle to specification struggle can preserve the developmental pipeline that builds strong engineers. The \alpha S term needs a new source. Formal specification is a plausible candidate. The transition from one source to the other is the danger zone, and Eq. 14a warns that failure there is not easily reversed.

The realization engine exists. The commitment boundary has been drawn. The practice architecture has been proposed. Whether specification struggle can actually build what implementation struggle built is the empirical question that governs which future obtains.