How to create an FP8 quant?

#8
by JochenGebhard - opened

Hello all,

Creating an FP8 quant of the 8B embedding model was easy. Creating one for the reranker with llmcompressor failed for me, though.

The result is technically loadable, but the reranking score is always 0.50. Can anybody share the recipe or code to create an FP8 quant of this model?

Thanks a lot and happy new year 😀
