Without standardized AI risk evaluations, we risk missing early warning signs. A shared framework is urgently needed.
DeepMind published eval results for Gemini 2.5 Pro a month ago: https://storage.googleapis.com/model-cards/documents/gemini-2.5-pro-preview.pdf