OpenMOSS-Team's Collections
MOSS-Audio
MOSS-Video-Preview
MOSS-VL
MOSS-TTS
AI Can Learn Scientific Taste
Llama Scope 2
MOVA
MOSS-TTSD
MOSS Transcribe Diarize
MOSS-Speech
ABC-Bench
FutureOmni
Game-RL
FRoM-W1
DiRL
RoboOmni
MOSS Embodied Planner
Low Rank Sparse Attention
MHA2MLA-refactor
MHA2MLA
MOSS

MOSS Transcribe Diarize

updated Apr 20

A unified multimodal large language model for end-to-end speaker-attributed, time-stamped transcription.

Upvote
7

  • MOSS Transcribe Diarize: Accurate Transcription with Speaker Diarization

    Paper • 2601.01554 • Published Jan 4 • 61

  • MOSS Transcribe Diarize

    Running • Agents • Featured • 60

    Transcribe audio/video with speaker diarization
