casey-martin's Collections
Subject-Matter-Expertise

updated Oct 17, 2025

High-quality pretraining and instruction datasets for law, mathematics, and science.

  • pile-of-law/pile-of-law

    Updated Jan 8, 2023 • 4.51k • 272

  • EleutherAI/proof-pile-2

    Updated Oct 25, 2023 • 8.02k • 214

  • gabrielaltay/pubtator-central-bigbio-kb-2022-12-18

    Viewer • Updated Jan 7, 2023 • 35.1M • 326 • 1

  • bigcode/the-stack-v2-train-smol-ids

    Viewer • Updated Apr 23, 2024 • 40.1M • 830 • 47

  • allenai/SciRIFF

    Viewer • Updated Jun 13, 2024 • 433k • 141 • 47

  • zjunlp/Mol-Instructions

    Updated Mar 3, 2024 • 1.28k • 65

  • AI-MO/NuminaMath-CoT

    Viewer • Updated Nov 25, 2024 • 860k • 13.6k • 539

  • AI-MO/NuminaMath-TIR

    Viewer • Updated Nov 25, 2024 • 72.5k • 3.29k • 142

  • Team-ACE/ToolACE

    Viewer • Updated Sep 4, 2024 • 11.3k • 905 • 166

    Note: Function calling

  • NousResearch/hermes-function-calling-v1

    Viewer • Updated Jan 3 • 11.6k • 3.88k • 380

    Note: Function calling

  • Salesforce/xlam-function-calling-60k

    Viewer • Updated Jan 24, 2025 • 60k • 6.03k • 576

    Note: Function calling

  • trendmicro-ailab/Primus-FineWeb

    Viewer • Updated Aug 9, 2025 • 3.39M • 218 • 20