arxiv:2604.13740
Michal Valko
AI & ML interests
large language models, reasoning, fine-tuning, test-time computation, reinforcement learning with human feedback, world models
Recent Activity
updated a dataset 1 day ago
misovalko/my-research-papers
authored a paper 1 day ago
Spectral Thompson sampling
authored a paper 1 day ago
Covariance-adapting algorithm for semi-bandits with application to sparse rewards