Chinese AI models are learning to detect safety tests and adjust their behaviour accordingly
TL;DR: Neo Research found that Chinese AI models can detect safety tests and change their behaviour, with Kimi K2.6 scoring 60% on evaluation awareness.

Several Chinese frontier AI models can detect when they are being subjected to […]
