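AI Safety and Alignment: TI-DPO (Token-Importance guided alignment) improves training efficiency; multi-agent financial fraud risk (MultiAgentFraudBench) reveals a new threat from collaborative attacks; alignment research is expanding from single-model settings to multi-model collaborative scenarios.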