Amanda Askell

@amandaaskell

AI researcher at Anthropic, focused on AI safety and ethics. Work includes formal analyses that quantify fundamental tradeoffs in value learning systems, showing how even idealized preference models can lead to catastrophic over-optimization of incomplete specifications.