Models
The Models page is a browser for all models installed in your local Ollama instance, with detailed metadata and capability detection.
Sidebar item: Models
Route: /settings?tab=models
Interface Overview
The Models page (accessed via the Settings > Models tab) displays a card or list for each installed model with key information at a glance.
Model Details
Each model card shows:
| Field | Description |
|---|---|
| Name | Model identifier (e.g., llama3.2:latest) |
| Parameter count | Number of parameters (e.g., 3B, 7B, 70B) |
| Quantization | Compression level (e.g., Q4_0, Q8_0, F16) |
| Family | Model family (e.g., llama, mistral, gemma) |
| Format | Model format (e.g., gguf) |
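Ollama exposes these fields through its `/api/show` endpoint, under a `details` object. As a minimal sketch, the card data could be extracted like this (the payload shape follows Ollama's published API; the sample values are illustrative, not taken from a real model):

```python
from dataclasses import dataclass

@dataclass
class ModelDetails:
    name: str
    parameter_size: str
    quantization: str
    family: str
    format: str

def parse_details(name: str, show_response: dict) -> ModelDetails:
    """Extract card fields from an Ollama /api/show-style response."""
    d = show_response.get("details", {})
    return ModelDetails(
        name=name,
        parameter_size=d.get("parameter_size", "unknown"),
        quantization=d.get("quantization_level", "unknown"),
        family=d.get("family", "unknown"),
        format=d.get("format", "unknown"),
    )

# Illustrative payload shaped like the "details" object in /api/show
sample = {"details": {"parameter_size": "3.2B", "quantization_level": "Q4_K_M",
                      "family": "llama", "format": "gguf"}}
card = parse_details("llama3.2:latest", sample)
print(card.family, card.quantization)  # llama Q4_K_M
```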
Architecture Diagrams
Click on a model to see an architecture diagram showing the model's internal structure — layer count, attention heads, embedding dimensions, and more. This is parsed from Ollama's model metadata.
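In GGUF metadata (which Ollama surfaces as `model_info` in `/api/show`), architecture fields are keyed by the architecture name, e.g. `llama.block_count`. A sketch of pulling out the diagram's inputs, assuming that key layout (the sample metadata below is illustrative):

```python
def architecture_summary(model_info: dict) -> dict:
    """Pull architecture fields from GGUF-style metadata, where keys are
    prefixed with the architecture name (e.g. "llama.block_count")."""
    arch = model_info.get("general.architecture", "unknown")

    def field(suffix: str):
        return model_info.get(f"{arch}.{suffix}")

    return {
        "architecture": arch,
        "layers": field("block_count"),
        "attention_heads": field("attention.head_count"),
        "embedding_dim": field("embedding_length"),
        "context_length": field("context_length"),
    }

# Illustrative metadata shaped like the "model_info" object in /api/show
info = {"general.architecture": "llama", "llama.block_count": 28,
        "llama.attention.head_count": 24, "llama.embedding_length": 3072,
        "llama.context_length": 131072}
print(architecture_summary(info))
```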
Capability Detection
LLMxRay automatically detects what each model can do:
| Capability | How it's detected | UI effect |
|---|---|---|
| Thinking | Model name patterns (e.g., deepseek-r1) or Ollama capabilities | Enables reasoning tab in Chat Diagnostics |
| Vision | Ollama capabilities or model family | Enables image attachment in chat |
| Embedding | Ollama capabilities or model family | Appears in Embeddings and Knowledge Base model selectors |
| Tool use | Ollama capabilities | Enables tool calling in chat |
Models that support only embeddings are automatically filtered out of chat model selectors.
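The detection logic in the table can be sketched as a merge of Ollama-reported capabilities with name-based heuristics, plus the embedding-only filter. The name patterns here are illustrative examples, not LLMxRay's actual list:

```python
def detect_capabilities(name: str, ollama_caps: list) -> set:
    """Combine Ollama-reported capabilities with name-based heuristics.
    Pattern lists are illustrative, not LLMxRay's actual rules."""
    caps = set(ollama_caps)
    lowered = name.lower()
    if any(p in lowered for p in ("deepseek-r1", "qwq")):
        caps.add("thinking")
    if any(p in lowered for p in ("embed", "bge", "minilm")):
        caps.add("embedding")
    return caps

def chat_selectable(caps: set) -> bool:
    # Embedding-only models are filtered out of chat model selectors
    return caps != {"embedding"}

caps = detect_capabilities("deepseek-r1:8b", [])
print(sorted(caps))  # ['thinking']
```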
Model Catalog
The catalog provides a curated view of available models with comparison tables, helping you choose the right model for your use case.
Tips
- Quantization trade-off — Lower quantization (Q4) uses less RAM but slightly reduces quality. Q8 and F16 are higher quality but need more memory.
- Pull more models from the terminal:

  ```shell
  ollama pull <model-name>
  ```

- Capability detection adapts the entire UI automatically; you don't need to configure anything manually.
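The quantization trade-off can be made concrete with a back-of-the-envelope size estimate: weight size is roughly parameters times bits-per-weight divided by 8. The bits-per-weight figures below are approximations (quantized formats carry some scale overhead), and the estimate ignores KV cache and runtime overhead, so actual RAM use is higher:

```python
# Approximate bits per weight, including quantization scale overhead
QUANT_BITS = {"Q4_0": 4.5, "Q8_0": 8.5, "F16": 16.0}

def approx_weight_gb(params_billion: float, quant: str) -> float:
    """Rough weight-file size in GB: parameters * bits-per-weight / 8."""
    return params_billion * QUANT_BITS[quant] / 8

print(round(approx_weight_gb(7, "Q4_0"), 1))  # 3.9  -- a 7B model at Q4_0
print(round(approx_weight_gb(7, "F16"), 1))   # 14.0 -- the same model at F16
```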