Hello, can you elaborate on conditional behavior cloning and weighted behavior cloning?

by teknium - opened Jul 9, 2023

Discussion

teknium

OpenChat org Jul 9, 2023

What are they? I looked the terms up on Google and found nothing.

If it's RLHF, what differentiates the two methods? Thanks

SijieCheng

Jul 9, 2023

Thanks for your interest. In short, we simply use different prompts like "Assistant GPT3.5" and "Assistant GPT4". We are preparing a paper to elaborate on our technical report.
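As a rough illustration of the idea described above, here is a minimal sketch of how training examples could be tagged with a different assistant prefix depending on which model produced the answer. The prefix strings come from the reply above; everything else (the `ROLE_PREFIX` table, the `build_prompt` helper, the `User:` turn format, and the sample data) is a hypothetical assumption, not OpenChat's actual template or code.

```python
# Hypothetical mapping from data source to role prefix. Only the two prefix
# strings are taken from the thread; the keys are made up for illustration.
ROLE_PREFIX = {
    "gpt-3.5": "Assistant GPT3.5",
    "gpt-4": "Assistant GPT4",
}


def build_prompt(user_message: str, source: str) -> str:
    """Build the prompt for one example, conditioned on the answer's source model."""
    prefix = ROLE_PREFIX[source]
    # The "User:" turn format is an assumption, not the real OpenChat template.
    return f"User: {user_message}\n{prefix}:"


# Made-up samples: the same question answered by two different source models.
samples = [
    {"question": "What is 2 + 2?", "answer": "4", "source": "gpt-3.5"},
    {"question": "What is 2 + 2?", "answer": "2 + 2 = 4.", "source": "gpt-4"},
]

# Each training text is the conditioned prompt followed by the answer.
training_texts = [
    build_prompt(s["question"], s["source"]) + " " + s["answer"] for s in samples
]
```

Under this sketch, the model sees which source an answer came from during training, so at inference time one would presumably prompt with the prefix of the preferred source. That last step is an assumption here and is not stated in the thread.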

lockon

Jul 13, 2023

edited Jul 13, 2023

> Thanks for your interest. In short, we simply use different prompts like "Assistant GPT3.5" and "Assistant GPT4". We are preparing a paper to elaborate on our technical report.

Would there be a significant performance drop without conditional behavior cloning, i.e., if all 80K samples were trained with a uniform "Assistant:" prompt?

imone

OpenChat org Jul 13, 2023

Yes. With a uniform prompt, the model may perform about the same as Vicuna.
