Revisions for ML-Draft-004

DP13 – AI Containment


Revision History

Revision 01 Current
Approved

Published: 2026-05-04

Pages: 6 | Words: 2899

What changed:

The upgraded DP13 expands containment from a technical safeguard into a comprehensive control system covering both capability and influence. The earlier version focused primarily on bounding what agents can do (tools, scope, execution limits). The new version adds a second, equally important dimension: how agents affect perception, behavior, and collective reality. This explicitly addresses risks such as persuasion, narrative shaping, and coordinated influence, not only misuse of tools.
Another major shift is the move toward verifiable, policy-bound containment. The upgraded draft requires that containment be inspectable and tied directly to governance policies (DP12), with visible configuration, logs, and attestations (e.g., enforcement backed by a trusted execution environment, TEE). It also introduces cross-system verification, so that containment either persists or visibly degrades when agents move across platforms. This closes a critical gap: containment can no longer disappear silently when systems interconnect.
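As a rough illustration of the "persists or visibly degrades" requirement, the sketch below models a handoff check between two platforms. It is not taken from DP13 itself: the names ContainmentPolicy, HandoffReport, verify_handoff, and the control names are all hypothetical, and the sketch assumes that containment can be described as a set of named controls.

```python
from dataclasses import dataclass
from enum import Enum


class HandoffStatus(Enum):
    PERSISTED = "persisted"   # every source control is also enforced on the target
    DEGRADED = "degraded"     # at least one control is missing, and this is reported


@dataclass(frozen=True)
class ContainmentPolicy:
    # All fields are illustrative assumptions, not part of the DP13 text.
    policy_id: str            # hypothetical reference to a DP12 governance policy
    controls: frozenset       # names of the containment controls in force
    attested: bool = False    # True if enforcement is backed by an attestation


@dataclass(frozen=True)
class HandoffReport:
    status: HandoffStatus
    missing_controls: frozenset


def verify_handoff(source: ContainmentPolicy, target_supported: frozenset) -> HandoffReport:
    """Compare the source policy's controls with what the target platform enforces.

    The function never drops a control silently: anything the target cannot
    enforce is listed in the report and the status is marked DEGRADED.
    """
    missing = frozenset(source.controls - target_supported)
    status = HandoffStatus.DEGRADED if missing else HandoffStatus.PERSISTED
    return HandoffReport(status=status, missing_controls=missing)
```

Under these assumptions, a caller could refuse the handoff, or surface the list of missing controls to an operator, whenever the report comes back as DEGRADED.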
Finally, DP13 now explicitly addresses adversarial and emergent failure modes at scale, especially those originating from external agents. It expands threat modeling to include coordinated influence, incentive leakage, containment bypass via integrations, and “containment theater” (where safeguards appear present but are not enforced). It also reinforces the principle that containment must be default-on, adaptive, and resilient under interoperability and composability. The result is a shift from static guardrails to a dynamic, system-wide boundary layer intended to keep AI bounded as it scales, integrates, and interacts across environments.
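One way to picture a check for "containment theater" is to compare the safeguards a system declares with those that its logs show were actually enforced. The sketch below is a minimal illustration under that assumption. The function name find_containment_theater and the log record shape (a dict with "control" and "enforced" keys) are hypothetical and do not come from the draft.

```python
def find_containment_theater(declared: set[str], enforcement_log: list[dict]) -> set[str]:
    """Return declared safeguards with no evidence of enforcement in the log.

    Each log record is assumed to look like {"control": "<name>", "enforced": bool}.
    A safeguard counts as enforced only if at least one record says so.
    """
    enforced = {
        record["control"]
        for record in enforcement_log
        if record.get("enforced") is True
    }
    return declared - enforced


# Example: "influence_audit" is declared but never shows up as enforced.
declared_controls = {"tool_scope", "execution_limit", "influence_audit"}
log = [
    {"control": "tool_scope", "enforced": True},
    {"control": "execution_limit", "enforced": True},
    {"control": "influence_audit", "enforced": False},
]
unenforced = find_containment_theater(declared_controls, log)
```

In this example, unenforced contains only "influence_audit", which is the kind of gap the threat model describes: a safeguard that appears present but is not enforced.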

Published: 2026-04-20

Pages: 7 | Words: 2744
