The Smol Training Playbook • Featured • 2.54k • The secrets to building world-class LLMs
Article OpenReasoning-Nemotron: A Family of State-of-the-Art Distilled Reasoning Models Jul 18 • 50
Hf-native ColVision Models Collection Models that can be used with the native transformers implementation instead of colpali-engine. • 4 items • Updated Sep 29 • 8
Article Training and Finetuning Reranker Models with Sentence Transformers v4 Mar 26 • 175
The Ultra-Scale Playbook • 3.55k • The ultimate guide to training LLM on large GPU Clusters
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 249
ColPali: Efficient Document Retrieval with Vision Language Models Paper • 2407.01449 • Published Jun 27, 2024 • 50
Post 5711 I have put together a notebook on Multimodal RAG, where we do not process the documents with hefty pipelines but natively use:
- vidore/colpali for retrieval: it doesn't need indexing with image-text pairs but just images!
- Qwen/Qwen2-VL-2B-Instruct for generation: directly feed images as is to a vision language model with no processing to text!
I used ColPali implementation of the new Byaldi library by @bclavie: https://github.com/answerdotai/byaldi
Link to notebook: https://github.com/merveenoyan/smol-vision/blob/main/ColPali_%2B_Qwen2_VL.ipynb
Article ColPali: Efficient Document Retrieval with Vision Language Models Jul 5, 2024 • 303