ViT-JLNN: Neuro-Symbolic Vision Transformer

This model combines a Vision Transformer (ViT) backbone with a JAX Logical Neural Network (JLNN) layer. It classifies images and, for each prediction, exposes the logical rules involved and an uncertainty interval.

JLNN documentation

Key Features

  • Neuro-Symbolic Architecture: Hybrid model using Flax/JAX.
  • Interpretable Logic: Uses Łukasiewicz t-norm logic to process fuzzy predicates.
  • Uncertainty Quantification: Provides [L, U] intervals (Lower and Upper bounds) for every prediction.
  • Logical Audit: Each classification includes an audit trail of which rules were triggered.
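
To make the interval semantics concrete, here is a minimal sketch of Łukasiewicz connectives applied to [L, U] truth intervals in JAX. It is not taken from the JLNN code base: the function names and the example predicates (has_wings, has_beak) are hypothetical and chosen only for illustration.

```python
import jax.numpy as jnp

def luk_and(a, b):
    # Łukasiewicz t-norm: max(0, a + b - 1)
    return jnp.maximum(0.0, a + b - 1.0)

def luk_or(a, b):
    # Łukasiewicz t-conorm: min(1, a + b)
    return jnp.minimum(1.0, a + b)

def interval_not(a):
    # Negation flips the bounds: [L, U] -> [1 - U, 1 - L]
    return 1.0 - a[..., ::-1]

# Each predicate is stored as [L, U]. AND and OR are monotone in both
# arguments, so applying them elementwise maps lower bounds to the new
# lower bound and upper bounds to the new upper bound.
has_wings = jnp.array([0.7, 0.9])  # hypothetical predicate
has_beak = jnp.array([0.6, 0.8])   # hypothetical predicate

both = luk_and(has_wings, has_beak)     # approx. [0.3, 0.7]
either = luk_or(has_wings, has_beak)    # [1.0, 1.0]
no_wings = interval_not(has_wings)      # approx. [0.1, 0.3]
```

The width U - L of each result can be read as the remaining uncertainty: the conjunction above is less certain than either input, while the disjunction saturates at 1.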

Model Grounding

To keep training stable and prevent binary collapse (predicates saturating at 0 or 1), we use a custom FuzzyGrounding layer with the following parameters:

  • Temperature Scaling (tau): 1.4 (softens the sigmoid gradients)
  • Centered Bias: -1.2 (starts predicates in a cautious, non-binary state)
  • Logic: Łukasiewicz t-norm for fuzzy operations
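
The effect of these two parameters can be sketched in plain JAX. This is an illustration under stated assumptions, not the actual FuzzyGrounding implementation: it assumes a linear projection followed by a temperature-scaled sigmoid, with the bias added before dividing by tau. The names init_grounding and fuzzy_grounding and the feature sizes are hypothetical.

```python
import jax
import jax.numpy as jnp

TAU = 1.4         # temperature from the list above
BIAS_INIT = -1.2  # initial bias from the list above

def init_grounding(key, in_dim, num_predicates):
    # Small random weights; every bias starts at BIAS_INIT.
    w = jax.random.normal(key, (in_dim, num_predicates)) * 0.02
    b = jnp.full((num_predicates,), BIAS_INIT)
    return {"w": w, "b": b}

def fuzzy_grounding(params, features, tau=TAU):
    # Assumed form: sigmoid((x @ W + b) / tau).
    # Dividing by tau > 1 lowers the sigmoid's peak slope from 1/4 to 1/(4*tau).
    logits = features @ params["w"] + params["b"]
    return jax.nn.sigmoid(logits / tau)

key = jax.random.PRNGKey(0)
params = init_grounding(key, in_dim=8, num_predicates=4)

# With all-zero features only the bias contributes, so every predicate
# starts at sigmoid(-1.2 / 1.4), roughly 0.30.
truth = fuzzy_grounding(params, jnp.zeros((2, 8)))
```

Under these assumptions, predicates start at roughly 0.30 rather than near 0 or 1, which is one way to read "cautious, non-binary state".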

With this configuration, the Vision Transformer and the logical layer converge smoothly, as seen in the training logs.

Usage

Refer to the official JLNN Repository for inference scripts.


Dataset used to train KRadim/vit-jlnn-cifar10