A deeper look at confessions, reward hacking, and monitoring in alignment research.