AI & ML interests

None defined yet.

danielhanchenĀ 
posted an update 4 days ago
view post
Post
3049
Mistral's new Ministral 3 models can now be Run & Fine-tuned locally! (16GB RAM)
Ministral 3 have vision support and the best-in-class performance for their sizes.
14B Instruct GGUF: unsloth/Ministral-3-14B-Instruct-2512-GGUF
14B Reasoning GGUF: unsloth/Ministral-3-14B-Reasoning-2512-GGUF

🐱 Step-by-step Guide: https://docs.unsloth.ai/new/ministral-3
All GGUFs, BnB, FP8 etc. variants uploads: https://huggingface.co/collections/unsloth/ministral-3
Ā·
danielhanchenĀ 
posted an update 9 days ago
danielhanchenĀ 
posted an update 29 days ago
view post
Post
4166
You can now run Kimi K2 Thinking locally with our Dynamic 1-bit GGUFs: unsloth/Kimi-K2-Thinking-GGUF

We shrank the 1T model to 245GB (-62%) & retained ~85% of accuracy on Aider Polyglot. Run on >247GB RAM for fast inference.

We also collaborated with the Moonshot AI Kimi team on a system prompt fix! 🄰

Guide + fix details: https://docs.unsloth.ai/models/kimi-k2-thinking-how-to-run-locally
danielhanchenĀ 
posted an update 4 months ago
view post
Post
6440
Run DeepSeek-V3.1 locally on 170GB RAM with Dynamic 1-bit GGUFs!šŸ‹
GGUFs: unsloth/DeepSeek-V3.1-GGUF

The 715GB model gets reduced to 170GB (-80% size) by smartly quantizing layers.

The 1-bit GGUF passes all our code tests & we fixed the chat template for llama.cpp supported backends.

Guide: https://docs.unsloth.ai/basics/deepseek-v3.1
danielhanchenĀ 
posted an update 4 months ago
danielhanchenĀ 
posted an update 5 months ago
MaziyarPanahiĀ 
posted an update 5 months ago
view post
Post
11961
🧬 Breaking news in Clinical AI: Introducing the OpenMed NER Model Discovery App on Hugging Face šŸ”¬

OpenMed is back! šŸ”„ Finding the right biomedical NER model just became as precise as a PCR assay!

I'm thrilled to unveil my comprehensive OpenMed Named Entity Recognition Model Discovery App that puts 384 specialized biomedical AI models at your fingertips.

šŸŽÆ Why This Matters in Healthcare AI:
Traditional clinical text mining required hours of manual model evaluation. My Discovery App instantly connects researchers, clinicians, and data scientists with the exact NER models they need for their biomedical entity extraction tasks.

šŸ”¬ What You Can Discover:
āœ… Pharmacological Models - Extract "chemical compounds", "drug interactions", and "pharmaceutical" entities from clinical notes
āœ… Genomics & Proteomics - Identify "DNA sequences", "RNA transcripts", "gene variants", "protein complexes", and "cell lines"
āœ… Pathology & Disease Detection - Recognize "pathological formations", "cancer types", and "disease entities" in medical literature
āœ… Anatomical Recognition - Map "anatomical systems", "tissue types", "organ structures", and "cellular components"
āœ… Clinical Entity Extraction - Detect "organism species", "amino acids", 'protein families", and "multi-tissue structures"

šŸ’” Advanced Features:
šŸ” Intelligent Entity Search - Find models by specific biomedical entities (e.g., "Show me models detecting CHEM + DNA + Protein")
šŸ„ Domain-Specific Filtering - Browse by Oncology, Pharmacology, Genomics, Pathology, Hematology, and more
šŸ“Š Model Architecture Insights - Compare BERT, RoBERTa, and DeBERTa implementations
⚔ Real-Time Search - Auto-filtering as you type, no search buttons needed
šŸŽØ Clinical-Grade UI - Beautiful, intuitive interface designed for medical professionals

Ready to revolutionize your biomedical NLP pipeline?

šŸ”— Try it now: OpenMed/openmed-ner-models
🧬 Built with: Gradio, Transformers, Advanced Entity Mapping
Ā·
danielhanchenĀ 
posted an update 5 months ago
danielhanchenĀ 
posted an update 5 months ago
danielhanchenĀ 
posted an update 5 months ago
arthurbresnuĀ 
posted an update 5 months ago
view post
Post
2343
ā€¼ļøSentence Transformers v5.0 is out! The biggest update yet introduces Sparse Embedding models, encode methods improvements, Router module & much more. Sparse + Dense = šŸ”„ hybrid search performance!

1ļøāƒ£ Sparse Encoder Models - New support for sparse embeddings (30k+ dims, <1% non-zero)

* Full SPLADE, Inference-free SPLADE, CSR support
* 4 new modules, 12 losses, 9 evaluators
* Integration with elastic, opensearch-project, Qdrant, ibm-granite
* Decode interpretable embeddings
* Hybrid search integration

2ļøāƒ£ Enhanced Encode Methods

* encode_query & encode_document with auto prompts
* Direct device list passing to encode()
* Cleaner multi-processing

3ļøāƒ£ Router Module & Training

* Different paths for queries vs documents
* Custom learning rates per parameter group
* Composite loss logging
* Perfect for two-tower architectures

4ļøāƒ£ Documentation & Training

* New Training/Loss Overview docs
* 6 training example pages
* Search engine integration examples

Read the comprehensive blogpost about training sparse embedding models: https://huggingface.co/blog/train-sparse-encoder

See the full release notes here: https://github.com/UKPLab/sentence-transformers/releases/v5.0.0

What's next? We would love to hear from the community! What sparse encoder models would you like to see? And what new capabilities should Sentence Transformers handle - multimodal embeddings, late interaction models, or something else? Your feedback shapes our roadmap!

I'm incredibly excited to see the community explore sparse embeddings and hybrid search! The interpretability alone makes this a game-changer for understanding what your models are actually doing.

šŸ™ Thanks to @tomaarsen for this incredible opportunity!

bartowskiĀ 
posted an update 6 months ago
view post
Post
58023
Was going to post this on /r/LocalLLaMa, but apparently it's without moderation at this time :')

bartowski/mistralai_Mistral-Small-3.2-24B-Instruct-2506-GGUF

Was able to use previous mistral chat templates, some hints from Qwen templates, and Claude to piece together a seemingly working chat template, tested it with llama.cpp server and got perfect results, though lmstudio still seems to be struggling for some reason (don't know how to specify a jinja file there)

Outlined the details of the script and results in my llama.cpp PR to add the jinja template:

https://github.com/ggml-org/llama.cpp/pull/14349

Start server with a command like this:

./llama-server -m /models/mistralai_Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M.gguf --jinja --chat-template-file /models/Mistral-Small-3.2-24B-Instruct-2506.jinja


and it should be perfect! Hoping it'll work for ALL tools if lmstudio gets an update or something, not just llama.cpp, but very happy to see it works flawlessly in llama.cpp

In the meantime, will try to open a PR to minja to make the strftime work, but no promises :)
danielhanchenĀ 
posted an update 6 months ago
danielhanchenĀ 
posted an update 6 months ago
view post
Post
2513
Mistral releases Magistral, their new reasoning models! šŸ”„
GGUFs to run: unsloth/Magistral-Small-2506-GGUF

Magistral-Small-2506 excels at mathematics and coding.

You can run the 24B model locally with just 32GB RAM by using our Dynamic GGUFs.