FireRedTeam committed (verified)
Commit 1a53b21 · Parent(s): 614e9c1

Update README.md

Files changed (1): README.md (+69 −5)

README.md CHANGED
---
license: apache-2.0
language:
- zh
- en
base_model:
- google-bert/bert-base-multilingual-cased
tags:
- agent
---
 
<div align="center">
<h1>FireRedChat-turn-detector</h1>
</div>

<div align="center">
<a href="https://fireredteam.github.io/demos/firered_chat/">Demo</a> •
<a href="https://arxiv.org/pdf/2509.06502">Paper</a> •
<a href="https://huggingface.co/FireRedTeam">Huggingface</a>
</div>
## Description

A compact end-of-turn (end-of-utterance, EOU) detection model used in FireRedChat. A [LiveKit plugin is available here](https://github.com/fireredchat-submodules/livekit-plugins-fireredchat-turn-detector).
- chinese_best_model_q8.onnx: FireRedChat turn-detector model (Chinese only)
- multilingual_best_model_q8.onnx: FireRedChat turn-detector model (Chinese and English)
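In a voice-agent pipeline, an EOU probability like the one this model produces is typically combined with a silence timeout before the agent takes its turn. The helper below is a minimal, hypothetical sketch of that gating logic; the function name, threshold, and timeout values are illustrative assumptions, not part of FireRedChat:

```python
def should_take_turn(eou_probability: float, silence_ms: float,
                     threshold: float = 0.5,
                     max_silence_ms: float = 1000.0) -> bool:
    """Hypothetical turn-taking gate combining model output and silence.

    Thresholds are illustrative; tune them for your deployment.
    """
    # Confident end of utterance: the agent can respond promptly.
    if eou_probability > threshold:
        return True
    # Fallback: the user has been silent long enough regardless of the model.
    return silence_ms >= max_silence_ms

print(should_take_turn(0.92, 120.0))   # confident EOU
print(should_take_turn(0.10, 1500.0))  # silence timeout reached
print(should_take_turn(0.10, 200.0))   # keep listening
```

The fallback timeout keeps the agent responsive even when the detector stays uncertain, e.g. on trailing filler words.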
## Roadmap

- [x] 2025/09
  - [x] Release the ONNX checkpoints and LiveKit plugin.
## Usage
```python
import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

def softmax(x):
    exp_x = np.exp(x - np.max(x, axis=1, keepdims=True))
    return exp_x / np.sum(exp_x, axis=1, keepdims=True)

# Load the quantized ONNX model (CPU is sufficient for this compact model).
session = ort.InferenceSession(
    "chinese_best_model_q8.onnx", providers=["CPUExecutionProvider"]
)

# Left-side truncation keeps the most recent tokens of the transcript.
tokenizer = AutoTokenizer.from_pretrained(
    "./tokenizer",
    local_files_only=True,
    truncation_side="left",
)

text = "这是一句没有标点的文本"  # "A sentence of text without punctuation"
inputs = tokenizer(
    text,
    truncation=True,
    padding="max_length",
    add_special_tokens=False,
    return_tensors="np",
    max_length=128,
)

# Run inference
outputs = session.run(
    None,
    {
        "input_ids": inputs["input_ids"].astype("int64"),
        "attention_mask": inputs["attention_mask"].astype("int64"),
    },
)

# The last entry of the two-class softmax is the end-of-utterance probability.
eou_probability = softmax(outputs[0]).flatten()[-1]
print(eou_probability, eou_probability > 0.5)
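The final step above reduces the model's two class logits to a single EOU probability. That mapping can be checked in isolation with hypothetical logit values, no model or ONNX Runtime required (the helper name and the logits are illustrative; index 1 as the end-of-utterance class matches `softmax(outputs[0]).flatten()[-1]` in the snippet above):

```python
import math

def eou_probability_from_logits(logits):
    # Numerically stable two-class softmax; index 1 is assumed to be
    # the end-of-utterance class.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    return exps[1] / sum(exps)

# Hypothetical logits: the second value dominating means "turn finished".
p = eou_probability_from_logits([-1.0, 2.0])
print(round(p, 4), p > 0.5)
```

With two classes this reduces to a sigmoid of the logit difference, so only the gap between the two logits matters, not their absolute values.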

### Acknowledgment
- Base model: google-bert/bert-base-multilingual-cased (license: apache-2.0)