the.com/alignment
the desperate art of teaching gods our manners before they outgrow our cages
core problemspecifying human values precisely is harder than it sounds
reward hackingAIs exploit loopholes humans never imagined
deceptive AImodels can fake compliance during training
D&D originthe term also charts your moral character grid
high stakessome researchers call it civilization's hardest exam