Who judges the judges? Governance from metrics: a runtime framework for continuous LLM compliance monitoring
Researchers have introduced a framework called "governance from metrics" that continuously monitors AI compliance in production systems instead of relying on binary, audit-time verdicts. The approach derives compliance signals from runtime observability data, with the aim of meeting the ongoing oversight demands of regulations such as the EU AI Act. The framework, named govllm, uses specialized LLM evaluators as "regulatory judges" for criteria such as GDPR and accessibility; disagreement between judges signals that human arbitration is needed.
IMPACT Offers a novel approach to continuous AI compliance monitoring that could influence how LLMs are regulated and deployed.
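
The idea of using inter-judge disagreement as a trigger for human arbitration can be illustrated with a minimal Python sketch. This is not govllm's actual API: the `Judge` type, the `Verdict` class, the `evaluate` function, the `max_spread` threshold, and the choice of "highest score minus lowest score" as the disagreement measure are all assumptions made for illustration. The judges here are plain callables standing in for LLM evaluators that score the same criterion.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical type: a judge maps a model response to a compliance
# score in [0, 1]. In a real system this would wrap an LLM evaluator.
Judge = Callable[[str], float]


@dataclass
class Verdict:
    scores: Dict[str, float]  # per-judge compliance scores
    spread: float             # highest score minus lowest score
    needs_human: bool         # True when the judges disagree too much


def evaluate(response: str, judges: Dict[str, Judge],
             max_spread: float = 0.3) -> Verdict:
    """Score one response with every judge and flag large disagreement.

    Assumption: disagreement is measured as the gap between the highest
    and lowest score; the source does not specify the actual measure.
    """
    if not judges:
        raise ValueError("at least one judge is required")
    scores = {name: judge(response) for name, judge in judges.items()}
    spread = max(scores.values()) - min(scores.values())
    return Verdict(scores=scores, spread=spread,
                   needs_human=spread > max_spread)


# Stand-in judges (keyword stubs, purely illustrative).
def strict_judge(response: str) -> float:
    return 0.2 if "email" in response.lower() else 0.9


def lenient_judge(response: str) -> float:
    return 0.8 if "email" in response.lower() else 0.95


demo_judges = {"strict": strict_judge, "lenient": lenient_judge}
flagged = evaluate("Your email is stored indefinitely.", demo_judges)
agreed = evaluate("No personal data is retained.", demo_judges)
```

In this sketch, `flagged.needs_human` is true because the two stand-in judges score the first response far apart, while `agreed.needs_human` is false because their scores on the second response nearly match. Emitting the spread as a metric on every request, instead of only a pass/fail verdict, is what would make such a signal usable for continuous monitoring.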