---
base_model:
- arcee-ai/Llama-3.1-SuperNova-Lite
- deepseek-ai/DeepSeek-R1-Distill-Llama-8B
- FuseAI/FuseChat-Llama-3.1-8B-Instruct
library_name: transformers
tags:
- mergekit
- merge
license: llama3.1
language:
- en
model-index:
- name: Llama3.1-SuperDeepFuse
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: wis-k/instruction-following-eval
      split: train
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 77.62
      name: averaged accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=agentlans%2FLlama3.1-SuperDeepFuse
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: SaylorTwift/bbh
      split: test
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 29.22
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=agentlans%2FLlama3.1-SuperDeepFuse
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: lighteval/MATH-Hard
      split: test
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 17.75
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=agentlans%2FLlama3.1-SuperDeepFuse
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      split: train
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 3.24
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=agentlans%2FLlama3.1-SuperDeepFuse
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 5.13
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=agentlans%2FLlama3.1-SuperDeepFuse
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 30.83
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=agentlans%2FLlama3.1-SuperDeepFuse
      name: Open LLM Leaderboard
---
# Llama3.1-SuperDeepFuse

An 8B-parameter language model that merges three high-performing distilled Llama 3.1 models to strengthen reasoning, instruction following, and performance on mathematics and coding tasks.

## Model Highlights

- **Size**: 8 billion parameters
- **Base**: [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct)
- **Merged Sources**:
  - [arcee-ai/Llama-3.1-**Super**Nova-Lite](https://huggingface.co/arcee-ai/Llama-3.1-SuperNova-Lite)
  - [deepseek-ai/**Deep**Seek-R1-Distill-Llama-8B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B)
  - [FuseAI/**Fuse**Chat-Llama-3.1-8B-Instruct](https://huggingface.co/FuseAI/FuseChat-Llama-3.1-8B-Instruct)
- **Merge Method**: `model_stock`

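The exact merge configuration is not published in this card; a plausible mergekit sketch for a `model_stock` merge of the three sources over the listed base (the `dtype` choice here is an assumption) could look like:

```yaml
# Hypothetical mergekit config -- illustrative only, not the config actually
# used to produce this model.
merge_method: model_stock
base_model: meta-llama/Llama-3.1-8B-Instruct
models:
  - model: arcee-ai/Llama-3.1-SuperNova-Lite
  - model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
  - model: FuseAI/FuseChat-Llama-3.1-8B-Instruct
dtype: bfloat16
```

A config like this would be run with `mergekit-yaml config.yaml ./output-model`.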
## Key Capabilities

- Enhanced multi-task reasoning
- Improved mathematical and coding performance
- Multilingual support

## Performance Notes

- Maintains Llama 3.1 safety standards
- Suitable for deployment on consumer GPUs
- Balanced performance across diverse tasks

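As a starting point for local use, here is a minimal inference sketch with the `transformers` pipeline API. The model id is taken from this card; the dtype and device settings are assumptions for a consumer GPU with roughly 16 GB of VRAM.

```python
# Inference sketch for agentlans/Llama3.1-SuperDeepFuse via transformers.
# Settings (bfloat16, device_map="auto") are assumptions, not requirements.

MODEL_ID = "agentlans/Llama3.1-SuperDeepFuse"


def build_chat(user_prompt: str,
               system_prompt: str = "You are a helpful assistant.") -> list:
    """Format a single-turn conversation in the message schema that
    Llama 3.1 chat models expect."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]


def generate(user_prompt: str, max_new_tokens: int = 256) -> str:
    """Run one generation. Requires `transformers`, `torch`, and access to
    the model weights, so the heavy imports stay inside the function."""
    import torch
    from transformers import pipeline

    pipe = pipeline(
        "text-generation",
        model=MODEL_ID,
        torch_dtype=torch.bfloat16,
        device_map="auto",
    )
    out = pipe(build_chat(user_prompt), max_new_tokens=max_new_tokens)
    # The pipeline returns the whole conversation; the last message is the reply.
    return out[0]["generated_text"][-1]["content"]


# Example call (needs a GPU and the downloaded weights):
# print(generate("Solve 12 * 17 step by step."))
```
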
## Considerations

- Benchmarking is still in progress
- Capabilities are limited compared to larger model variants
- Like all language models, it can produce inaccurate or misleading output
- Outputs should be independently verified

## Licensing

Follows standard Llama 3.1 usage terms.

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/agentlans__Llama3.1-SuperDeepFuse-details)!
Summarized results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/contents/viewer/default/train?q=agentlans%2FLlama3.1-SuperDeepFuse&sort[column]=Average%20%E2%AC%86%EF%B8%8F&sort[direction]=desc)!

| Metric              | Value (%) |
|---------------------|----------:|
| **Average**         |     27.30 |
| IFEval (0-Shot)     |     77.62 |
| BBH (3-Shot)        |     29.22 |
| MATH Lvl 5 (4-Shot) |     17.75 |
| GPQA (0-shot)       |      3.24 |
| MuSR (0-shot)       |      5.13 |
| MMLU-PRO (5-shot)   |     30.83 |