Gwern Branwen

@gwer

Independent researcher. Known for analysis of AI, psychology, and technology. Documents capability overhang risks: his empirical analyses show how seemingly benign AI models often develop unexpected, potentially dangerous capabilities that remain latent until triggered by specific inputs.