AI Alignment

Definition

The research challenge of ensuring AI systems behave in ways that are aligned with human values, goals, and intentions. Alignment spans both near-term concerns (models following instructions safely) and long-term concerns (superintelligent AI pursuing goals compatible with humanity). Near-term alignment techniques include RLHF, Constitutional AI, instruction tuning, and safety training.
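To make one of the listed near-term techniques concrete, here is a minimal sketch of the preference loss commonly used when training an RLHF reward model (the Bradley-Terry formulation: the reward of the human-preferred response is pushed above the rejected one). The function name and scalar inputs are illustrative; real implementations operate on batched model outputs.

```python
import math

def preference_loss(r_chosen: float, r_rejected: float) -> float:
    """Bradley-Terry preference loss for RLHF reward modeling:
    -log sigmoid(r_chosen - r_rejected).

    The loss shrinks as the reward model scores the human-preferred
    response higher than the rejected one."""
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# A wide positive margin yields a small loss; a tie yields log(2).
print(preference_loss(2.0, 0.0))
print(preference_loss(0.0, 0.0))
```

In practice this loss is minimized over a dataset of human preference pairs, and the trained reward model then guides policy optimization (e.g. with PPO).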

Why It Matters

Alignment is fundamentally about making AI want the right things. But even a well-aligned model can still fail: it can hallucinate, misinterpret context, or encounter situations its training never covered. Alignment is therefore necessary but not sufficient. Production systems need both aligned models and deterministic enforcement at the execution boundary.

How Exogram Addresses This

Exogram doesn't solve alignment — it solves execution governance. Even an aligned model needs an execution boundary because alignment is probabilistic and context-dependent. Exogram provides the deterministic guarantee that aligned intent translates into admissible actions.
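The idea of a deterministic execution boundary can be sketched as a simple allowlist gate: the model may propose any action, but only admissible actions are executed. This is a hypothetical illustration, not Exogram's actual API; the names `ALLOWED_ACTIONS` and `gate` are invented for the example.

```python
# Hypothetical sketch of deterministic enforcement at the execution
# boundary. Whatever the model proposes, only allowlisted actions run;
# the check is a fixed rule, not a probabilistic judgment.
ALLOWED_ACTIONS = {"read_file", "send_email"}

def gate(action: str) -> bool:
    """Return True only if the proposed action is admissible."""
    return action in ALLOWED_ACTIONS

print(gate("read_file"))        # admissible action passes
print(gate("delete_database"))  # non-allowlisted action is blocked
```

Because the gate is an exact set-membership check rather than a learned classifier, it behaves identically on every input, which is what gives the enforcement layer its zero-error-rate character.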

Production Risk Level

Medium severity

Key Takeaways

  • This concept is part of the broader AI governance landscape
  • Production AI requires multiple layers of protection
  • Deterministic enforcement provides zero-error-rate guarantees
