YANG YAO

EVIGBYEN · https://evigbyen.github.io/

AI & ML interests

None yet

Organizations

None yet

Recent Activity

authored 3 papers about 6 hours ago

OpenRT: An Open-Source Red Teaming Framework for Multimodal LLMs

Paper • 2601.01592 • Published 10 days ago • 11

The Other Mind: How Language Models Exhibit Human Temporal Cognition

Paper • 2507.15851 • Published Jul 21, 2025

SafeWork-R1: Coevolving Safety and Intelligence under the AI-45° Law

Paper • 2507.18576 • Published Jul 24, 2025 • 8

upvoted a paper 7 days ago

OpenRT: An Open-Source Red Teaming Framework for Multimodal LLMs

Paper • 2601.01592 • Published 10 days ago • 11

authored 3 papers 3 months ago

A Rigorous Benchmark with Multidimensional Evaluation for Deep Research Agents: From Answers to Reports

Paper • 2510.02190 • Published Oct 2, 2025 • 18

A Mousetrap: Fooling Large Reasoning Models for Jailbreak with Chain of Iterative Chaos

Paper • 2502.15806 • Published Feb 19, 2025 • 2

Argus Inspection: Do Multimodal Large Language Models Possess the Eye of Panoptes?

Paper • 2506.14805 • Published Jun 3, 2025 • 3

upvoted 2 papers 3 months ago

Argus Inspection: Do Multimodal Large Language Models Possess the Eye of Panoptes?

Paper • 2506.14805 • Published Jun 3, 2025 • 3

A Mousetrap: Fooling Large Reasoning Models for Jailbreak with Chain of Iterative Chaos

Paper • 2502.15806 • Published Feb 19, 2025 • 2

updated a dataset 3 months ago

EVIGBYEN/RigorousBench

Viewer • Updated Oct 8, 2025 • 214 • 71 • 3

liked a dataset 3 months ago

EVIGBYEN/RigorousBench

Viewer • Updated Oct 8, 2025 • 214 • 71 • 3

published a dataset 3 months ago

EVIGBYEN/RigorousBench

Viewer • Updated Oct 8, 2025 • 214 • 71 • 3

upvoted a paper 3 months ago

A Rigorous Benchmark with Multidimensional Evaluation for Deep Research Agents: From Answers to Reports

Paper • 2510.02190 • Published Oct 2, 2025 • 18