Habibullah Akbar

ChavyvAkvar

https://chavyv.vercel.app

AI & ML interests

AGI, Ethical-Driven AI, OSS AI

Recent Activity

liked a dataset 1 day ago

LLM360/TxT360-3efforts

liked a dataset 1 day ago

LLM360/TxT360-Midas

liked a model 5 days ago

mistralai/Mistral-Large-3-675B-Instruct-2512

upvoted a paper 26 days ago

Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published 28 days ago • 128

upvoted 3 papers about 2 months ago

upvoted a paper 2 months ago

Soft Tokens, Hard Truths

Paper • 2509.19170 • Published Sep 23 • 15

upvoted 3 papers 3 months ago

Single-stream Policy Optimization

Paper • 2509.13232 • Published Sep 16 • 33

Reverse-Engineered Reasoning for Open-Ended Generation

Paper • 2509.06160 • Published Sep 7 • 149

Every Sample Matters: Leveraging Mixture-of-Experts and High-Quality Data for Efficient and Accurate Code LLM

Paper • 2503.17793 • Published Mar 22 • 23

upvoted a paper 4 months ago

MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning

Paper • 2507.16812 • Published Jul 22 • 63

upvoted a paper 5 months ago

EXAONE 4.0: Unified Large Language Models Integrating Non-reasoning and Reasoning Modes

Paper • 2507.11407 • Published Jul 15 • 58

upvoted an article 5 months ago

OpenReasoning-Nemotron: A Family of State-of-the-Art Distilled Reasoning Models

Article • Published Jul 18

upvoted a paper 5 months ago

Encoder-Decoder Gemma: Improving the Quality-Efficiency Trade-Off via Adaptation

Paper • 2504.06225 • Published Apr 8 • 3

upvoted an article 5 months ago

SmolLM3: smol, multilingual, long-context reasoner

Article • Published Jul 8 • 734

upvoted 7 papers 5 months ago

Kwai Keye-VL Technical Report

Paper • 2507.01949 • Published Jul 2 • 131

ZeCO: Zero Communication Overhead Sequence Parallelism for Linear Attention

Paper • 2507.01004 • Published Jul 1 • 10

Energy-Based Transformers are Scalable Learners and Thinkers

Paper • 2507.02092 • Published Jul 2 • 69

Fast and Simplex: 2-Simplicial Attention in Triton

Paper • 2507.02754 • Published Jul 3 • 26

Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy

Paper • 2507.01352 • Published Jul 2 • 56

Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers

Paper • 2506.23918 • Published Jun 30 • 89

Lost in Latent Space: An Empirical Study of Latent Diffusion Models for Physics Emulation

Paper • 2507.02608 • Published Jul 3 • 21