DeepSeek released a fine-tuned R1, and in tests it came close to OpenAI's o3

The model is available for free in the chatbot, and the weights have been uploaded to Hugging Face.

R1 has been fine-tuned for the first time since its high-profile release in January 2025, the South China Morning Post reports. The developers did not disclose specific details. However, in the independent LiveCodeBench ranking, which tests models' programming skills, it surpassed o3-mini-high and approached o3.

Model ranking as of May 31, 2025

On the Aider Polyglot benchmark, which tests coding across different programming languages, R1 matched Claude Opus 4.
Users note that the model now spends longer reasoning on some tasks, up to 30 minutes, and that it writes better prose.

The R1 version on LMArena was asked to create a clone of the mobile game Flappy Bird. The model replicated the mechanics but could not produce a bird-shaped character.

The model also wrote code for an interactive scene with dinosaurs around a volcano. When the spacebar is pressed, the volcano erupts.

The DeepSeek-R1-0528 version is available for free in the chatbot, and the model weights have been uploaded to Hugging Face.

#news #deepseek