SmolLM-1.7B-Instruct_fsdp_qlora_nf4_adapter-plaba

Files changed (5)

README.md +10 -18
adapter_config.json +4 -4
adapter_model.safetensors +1 -1
runs/Sep11_14-21-53_algo-2/events.out.tfevents.1726064715.algo-2.67.0 +3 -0
training_args.bin +1 -1

README.md CHANGED

@@ -19,7 +19,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [dmariko/SmolLM-1.7B-Instruct_qlora_nf4_merged](https://huggingface.co/dmariko/SmolLM-1.7B-Instruct_qlora_nf4_merged) on the generator dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.6099
 ## Model description
@@ -50,28 +50,20 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.03
-- num_epochs: 20
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| No log        | 0.8   | 1    | 1.7411          |
-| No log        | 1.6   | 2    | 1.7359          |
-| No log        | 2.4   | 3    | 1.7081          |
-| No log        | 4.0   | 5    | 1.6632          |
-| No log        | 4.8   | 6    | 1.6608          |
-| No log        | 5.6   | 7    | 1.6495          |
-| No log        | 6.4   | 8    | 1.6376          |
-| 1.6842        | 8.0   | 10   | 1.6228          |
-| 1.6842        | 8.8   | 11   | 1.6187          |
-| 1.6842        | 9.6   | 12   | 1.6161          |
-| 1.6842        | 10.4  | 13   | 1.6142          |
-| 1.6842        | 12.0  | 15   | 1.6116          |
-| 1.6842        | 12.8  | 16   | 1.6108          |
-| 1.6842        | 13.6  | 17   | 1.6103          |
-| 1.6842        | 14.4  | 18   | 1.6100          |
-| 1.6057        | 16.0  | 20   | 1.6099          |
 ### Framework versions

 This model is a fine-tuned version of [dmariko/SmolLM-1.7B-Instruct_qlora_nf4_merged](https://huggingface.co/dmariko/SmolLM-1.7B-Instruct_qlora_nf4_merged) on the generator dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.6513
 ## Model description
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.03
+- num_epochs: 10
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| No log        | 0.8   | 1    | 1.7222          |
+| No log        | 1.6   | 2    | 1.7181          |
+| No log        | 2.4   | 3    | 1.6971          |
+| No log        | 4.0   | 5    | 1.6586          |
+| No log        | 4.8   | 6    | 1.6597          |
+| No log        | 5.6   | 7    | 1.6572          |
+| No log        | 6.4   | 8    | 1.6539          |
+| 1.6809        | 8.0   | 10   | 1.6513          |
 ### Framework versions
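
The README lists `lr_scheduler_type: cosine` with `lr_scheduler_warmup_ratio: 0.03`. As a rough illustration of what that combination implies for a run this short, here is a minimal sketch of a linear-warmup-then-cosine learning-rate multiplier. The total of 10 optimizer steps is an assumption read off the last row of the new results table, the function names are hypothetical, and the formula follows the usual shape of such schedules rather than being confirmed as exactly what the trainer computed.

```python
import math


def warmup_steps(total_steps: int, warmup_ratio: float) -> int:
    """Number of warmup steps, rounding the ratio up to a whole step."""
    return math.ceil(total_steps * warmup_ratio)


def cosine_multiplier(step: int, total_steps: int, warmup: int) -> float:
    """Learning-rate multiplier in [0, 1] for a given optimizer step."""
    if step < warmup:
        # Linear ramp from 0 up to 1 during warmup.
        return step / max(1, warmup)
    # Half-cosine decay from 1 down to 0 over the remaining steps.
    progress = (step - warmup) / max(1, total_steps - warmup)
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


TOTAL_STEPS = 10      # assumption: last logged step in the new table
WARMUP_RATIO = 0.03   # from the README hyperparameters

warmup = warmup_steps(TOTAL_STEPS, WARMUP_RATIO)
multipliers = [
    cosine_multiplier(s, TOTAL_STEPS, warmup) for s in range(TOTAL_STEPS + 1)
]

for step, m in enumerate(multipliers):
    print(f"step {step:2d}: lr multiplier = {m:.4f}")
```

Under these assumptions the warmup rounds up to a single step, so nearly the whole run sits on the decaying part of the cosine curve.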

adapter_config.json CHANGED

@@ -20,13 +20,13 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "k_proj",
-    "gate_proj",
     "q_proj",
     "up_proj",
     "v_proj",
-    "down_proj",
-    "o_proj"
   ],
   "task_type": "CAUSAL_LM",
   "use_dora": false,

   "rank_pattern": {},
   "revision": null,
   "target_modules": [
     "q_proj",
+    "o_proj",
+    "k_proj",
+    "down_proj",
     "up_proj",
     "v_proj",
+    "gate_proj"
   ],
   "task_type": "CAUSAL_LM",
   "use_dora": false,

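The `target_modules` hunk moves entries around but does not appear to add or drop any. A quick way to confirm that is to compare the two lists as sets; the sketch below copies both orderings from the diff above. That the reordering comes from how the config was serialized is an assumption, not something the diff states.

```python
# Ordering on the old side of the adapter_config.json diff.
old_target_modules = [
    "k_proj", "gate_proj", "q_proj", "up_proj",
    "v_proj", "down_proj", "o_proj",
]

# Ordering on the new side of the diff.
new_target_modules = [
    "q_proj", "o_proj", "k_proj", "down_proj",
    "up_proj", "v_proj", "gate_proj",
]

# Order does not matter for which modules receive adapters,
# so compare the lists as sets.
same_modules = set(old_target_modules) == set(new_target_modules)
added = sorted(set(new_target_modules) - set(old_target_modules))
removed = sorted(set(old_target_modules) - set(new_target_modules))

print("same modules:", same_modules)
print("added:", added, "removed:", removed)
```

If the sets match, this part of the commit is a cosmetic change, and the meaningful differences lie in the retrained weights and the README metrics.
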
adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:aee3b85d59d8083e504d3121970142b37d8ff4eae9b7aa44a3b2c60d91f8c091
 size 36220744

 version https://git-lfs.github.com/spec/v1
+oid sha256:24171a1ff07b41bc5a87d00271150a324174c2fea950c444e400cebd6b9b09e3
 size 36220744
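
The binary files in this commit are stored as Git LFS pointers, so the diff shows only the `oid` and `size` fields. The following sketch parses two pointer texts and compares them; `parse_lfs_pointer` is a hypothetical helper written for illustration, and the pointer contents are copied from the `adapter_model.safetensors` hunk above.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its version, hash, and size."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line has the form "<key> <value>".
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


old_pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:aee3b85d59d8083e504d3121970142b37d8ff4eae9b7aa44a3b2c60d91f8c091
size 36220744
""")

new_pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:24171a1ff07b41bc5a87d00271150a324174c2fea950c444e400cebd6b9b09e3
size 36220744
""")

# Same byte size but a different hash: the tensor shapes are unchanged
# while the stored values differ.
print("size unchanged:", old_pointer["size"] == new_pointer["size"])
print("content changed:", old_pointer["digest"] != new_pointer["digest"])
```

An identical size with a different digest is what one would expect when an adapter with the same layout is retrained.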

runs/Sep11_14-21-53_algo-2/events.out.tfevents.1726064715.algo-2.67.0 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c3975e692cb23b4e2d98591b17bd947982fd4220fa13cc4b96f58a8b994588fb
+size 8193

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:22296f32123fb2582f0364394e067d7535d5d21d1b4aa03abe6d6530f093fe44
 size 5240

 version https://git-lfs.github.com/spec/v1
+oid sha256:a4f3324fbd3679816b52837661e6126832c80e92189d89c21cbd109768fc9930
 size 5240