Bolian Li

lblaoke

https://lblaoke.github.io/

bolian-li-554001297

lblaoke's collections (4)

Preference Data

Dahoas/full-hh-rlhf

Viewer • Updated Feb 23, 2023 • 125k • 681 • 86
HuggingFaceH4/ultrafeedback_binarized

Viewer • Updated Oct 16, 2024 • 187k • 8.34k • 317
PKU-Alignment/PKU-SafeRLHF

Viewer • Updated Oct 18, 2024 • 164k • 6.14k • 170
Skywork/Skywork-Reward-Preference-80K-v0.2

Viewer • Updated Oct 25, 2024 • 77k • 695 • 62

Yifan's PPO Models

lblaoke/llama2-7b-ppo-human

7B • Updated Feb 3, 2025 • 1
lblaoke/llama2-7b-ppo-self

7B • Updated Feb 3, 2025 • 2
lblaoke/llama2-7b-ppo-self-human

7B • Updated Feb 3, 2025 • 1
lblaoke/mistral-v0.1-7b-ppo-human

7B • Updated Feb 4, 2025 • 2

lblaoke/qwama-0.5b-skywork-pref-dpo-llama-factory-v1

0.5B • Updated Mar 19, 2025 • 1
lblaoke/qwama-0.5b-skywork-pref-dpo-trl-v1

0.5B • Updated Mar 19, 2025
lblaoke/qwama-0.5b-skywork-pref-dpo-trl-v2

0.5B • Updated Mar 21, 2025 • 1
lblaoke/qwama-0.5b-skywork-pref-sft-rejected-trl-v3

0.5B • Updated Mar 28, 2025 • 2

lblaoke/mistral-v0.3-7b-rm-self-human

Text Classification • 7B • Updated Jan 14, 2025 • 3
lblaoke/mistral-v0.3-7b-rm-self

Text Classification • 7B • Updated Jan 14, 2025 • 1
lblaoke/mistral-v0.3-7b-rm-human

Text Classification • 7B • Updated Jan 14, 2025 • 3
lblaoke/mistral-v0.1-7b-rm-self-human

Text Classification • 7B • Updated Jan 14, 2025 • 3
