Bartosz Cywiński

bcywinski

https://cywinski.github.io/

AI & ML interests

Mechanistic Interpretability

Recent Activity

upvoted a collection 4 days ago

updated a collection 4 days ago

Llama-3.1-8B-Instruct-taboo

Organizations

None yet

upvoted a collection 4 days ago

Olmo 3

Artifacts for the Olmo 3 release. • 9 items • Updated 17 days ago • 159

upvoted a collection 2 months ago

Open Character Training

https://arxiv.org/abs/2511.01689 • 8 items • Updated Nov 4, 2025 • 4

upvoted a collection 5 months ago

Dream 7B

https://hkunlp.github.io/blog/2025/dream/ • 2 items • Updated Jul 16, 2025 • 6

upvoted a paper 8 months ago

Towards eliciting latent knowledge from LLMs with mechanistic interpretability

Paper • 2505.14352 • Published May 20, 2025 • 9

upvoted an article 8 months ago

Vision Language Models (Better, faster, stronger)

Article • May 12, 2025 • 583

upvoted 3 papers 11 months ago

Precise Parameter Localization for Textual Generation in Diffusion Models

Paper • 2502.09935 • Published Feb 14, 2025 • 12

No Task Left Behind: Isotropic Model Merging with Common and Task-Specific Subspaces

Paper • 2502.04959 • Published Feb 7, 2025 • 11

SAeUron: Interpretable Concept Unlearning in Diffusion Models with Sparse Autoencoders

Paper • 2501.18052 • Published Jan 29, 2025 • 8

upvoted a paper about 1 year ago

Unpacking SDXL Turbo: Interpreting Text-to-Image Models with Sparse Autoencoders

Paper • 2410.22366 • Published Oct 28, 2024 • 84

upvoted a collection about 1 year ago

🔍 Interpretability & Analysis of LMs

Outstanding research in LM interpretability and evaluation, summarized • 135 items • Updated 22 days ago • 116