AI & ML interests
NLP, ASR, Text-To-Speech
Recent Activity
- floschne/m5b_vlod
  Viewer • Updated • 1.42k • 12 • 1
- floschne/m5b_vgr
  Viewer • Updated • 1.43k • 11 • 1
- M5 -- A Diverse Benchmark to Assess the Performance of Large Multimodal Models Across Multilingual and Multicultural Vision-Language Tasks
  Paper • 2407.03791 • Published • 2
- Centurio: On Drivers of Multilingual Ability of Large Vision-Language Model
  Paper • 2501.05122 • Published • 19
- nvidia/stt_kab_conformer_transducer_large
  Automatic Speech Recognition • Updated • 19 • 2
- facebook/omniASR-CTC-300M
  Automatic Speech Recognition • Updated • 6
- facebook/omniASR-LLM-300M
  Automatic Speech Recognition • Updated • 4
- ayymen/stt_zgh_fastconformer_ctc_small
  Automatic Speech Recognition • Updated • 124 • 1