Further Notes on Our Recent Research on AI Delegation and Long-Horizon Reliability: Understanding AI Delegation Risks
DELEGATE-52 evaluates whether today’s large language models can be trusted as autonomous delegates for multi-step document work. The study finds consistent long-run corruption across models, with most of the damage caused by rare but severe failures, and offers practical guidance for safer use.





