Yuandong Tian

tydsh

https://yuandong-tian.com/

AI & ML interests

Reinforcement Learning, Optimization, Representation Learning

Recent Activity

upvoted a paper about 2 months ago

SPG: Sandwiched Policy Gradient for Masked Diffusion Language Models

authored a paper 4 months ago

Deep Think with Confidence

authored a paper 9 months ago

SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks

Organizations

None yet

upvoted a paper about 2 months ago

SPG: Sandwiched Policy Gradient for Masked Diffusion Language Models

Paper • 2510.09541 • Published Oct 10 • 14

upvoted a collection over 1 year ago

Llama 2 Family

This collection hosts the transformers and original repos of the Llama 2 and Llama Guard releases • 13 items • Updated Dec 6, 2024 • 92