Reasoning off feature?

#12
by hell0ks - opened

Hello,

I discovered there is a "reasoning_effort" feature in chat_template.jinja. When it is set to "low" or "minimal", it is designed to turn off reasoning by adding end tokens.
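To make the mechanism concrete, here is a minimal sketch in plain Python of how I understand it. This is not the actual template: the marker string `THINK_CLOSE`, the `user:`/`assistant:` layout, and the function name `build_generation_prompt` are all hypothetical stand-ins, and the real tokens are whatever chat_template.jinja defines.

```python
# Hypothetical end-of-reasoning marker; the real token is defined in chat_template.jinja.
THINK_CLOSE = "</think>"

# Effort levels that are meant to switch reasoning off, per the template.
NO_REASONING_LEVELS = ("low", "minimal")


def build_generation_prompt(user_message: str, reasoning_effort: str = "medium") -> str:
    """Render a single-turn prompt (illustrative layout, not the real one).

    For low effort levels the end-of-reasoning marker is appended up front,
    so generation starts after the point where reasoning would have ended.
    """
    prompt = f"user: {user_message}\nassistant: "
    if reasoning_effort in NO_REASONING_LEVELS:
        prompt += THINK_CLOSE
    return prompt


if __name__ == "__main__":
    print(repr(build_generation_prompt("Hello", reasoning_effort="low")))
    print(repr(build_generation_prompt("Hello", reasoning_effort="high")))
```

In this sketch the end marker only shapes the start of the assistant turn; it puts no constraint on what the model generates afterwards.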

However, it doesn't seem to be consistent. Sometimes the model still emits reasoning even with reasoning_effort = low.

I'd like to know whether this is a designed feature or some kind of leftover.

Thanks.

Model responses may vary from run to run, which is not intended behavior.
We recommend running inference with vLLM and logits processors.
Please refer to the vLLM section in the README.
