This appears to be some variant of Nanbeige, but it's unattributed.
Did you finetune on top of Nanbeige?
I didn't use Nanbeige as a base for GRM2. I don't like using a reasoning model as the base for building another reasoning model, and GRM2 was trained differently from Nanbeige. I wanted to explore other bases to see which one performs better.
GRM3 will use a different architecture, neither Llama nor the qwen2.5 base of GRM1, and will compete at the qwen3.5-9b level.
The structure of the config and the chat template are identical to Nanbeige, and it even has system instructions specifying that it's Nanbeige. There's also a commit from 5 days ago that says 'Initial import from Nanbeige/Nanbeige4.1-3B with updated model card'.
Hello Tibbnak, sorry for the inconvenience.
GRM2 was not based on, merged with, or fine-tuned on Nanbeige.
Previously, I had published an NVFP4-quantized version of Nanbeige to my account for personal testing, using a local folder already prepared from the quantization repository, which is what changed the model card. I reused that same code to publish GRM2 to the Hub, but I forgot to change the commit message.
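To illustrate how that kind of mistake happens: a minimal, hypothetical sketch of an upload script where the commit message is built once and carried over when the script is reused for a different model. The helper, folder paths, and repo ids below are illustrative only, not the actual commands used; the commented call shows where huggingface_hub's `upload_folder` would receive the message.

```python
# Hypothetical sketch: reusing a prepared upload script can carry over a
# stale commit message unless the message is rebuilt for each publish.

def make_commit_message(source_repo, note):
    """Build an explicit commit message instead of inheriting a stale one."""
    if source_repo:
        return f"Initial import from {source_repo} with {note}"
    return note

# The actual publish step would pass the message explicitly, e.g.:
# from huggingface_hub import HfApi
# HfApi().upload_folder(
#     folder_path="./local-model-folder",   # previously prepared folder
#     repo_id="example-user/GRM2",          # illustrative repo id
#     commit_message=make_commit_message(None, "upload GRM2 weights"),
# )

# Forgetting to update the arguments reproduces the old import message:
print(make_commit_message("Nanbeige/Nanbeige4.1-3B", "updated model card"))
```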
As for the system instructions, they were indeed intentionally taken from Nanbeige for local testing. I also tried instructions from other models such as gpt-oss and qwen3.5, but Nanbeige's worked best for GRM2.
I hope this helps!