Model Guide

Last updated April 10, 2026

TuneSalon supports 11 models, from lightweight 4B parameter models to powerful 35B models. Smaller models train faster and respond quicker. Larger models produce more nuanced output. Use this guide to find the right model for your use case.

Find Your Model

Step 1: What do you want to build?

All Models

Full comparison of all 11 supported models. VRAM listed is for running the model locally at full precision (FP16) without quantization.

Model	Capability	Best for	GPU	Local VRAM	GGUF Export
Qwen3-4B 4B	Lightweight	Multilingual tasks, concise writing, lightweight assistants.	A100	~10 GB	✓
Mistral-7B 7B	Capable	Content writing, general knowledge, customer-facing text.	A100	~17 GB	✓
Qwen3-8B 8B	Capable	Balanced quality and speed. Strong all-rounder for most use cases.	A100	~20 GB	✓
Ministral-8B 8B	Capable	Structured output, data extraction, technical writing.	A100	~22 GB	✓
Qwen3-14B 14B	Advanced	Deep domain knowledge, nuanced writing, multilingual fluency.	A100	~36 GB	✓
Mistral-Small-24B 24B	Advanced	Professional-grade content, complex reasoning, specialist fields.	A100	~55 GB	✓
Qwen3-32B 32B	Most Powerful	Maximum quality for demanding tasks. Legal, medical, research.	H200 Only	~80 GB	✓
Qwen3.5-35B 35B	Most Powerful	Qwen3.5 MoE flagship. 35B total, 3B active params per token. Beta (Engine B).	H200 Only	~90 GB	✓
Qwen3.5-27B 27B	Advanced	Qwen's latest dense model. Strong reasoning and instruction following. Beta (Engine B).	A100	~56 GB	✓
Gemma-4-26B-A4B 26B (MoE)	Advanced	Google's Gemma 4 MoE. 26B total, 4B active. Efficient and capable. Beta (Engine B).	A100	~52 GB	✓
Gemma-4-31B 31B	Most Powerful	Google's Gemma 4 dense flagship. Top-tier instruction following. Beta (Engine B).	H200 Only	~62 GB	✓

Qwen3-4B 4BLightweight

Multilingual tasks, concise writing, lightweight assistants.

A100|VRAM: ~10 GB|GGUF: Yes

Mistral-7B 7BCapable

Content writing, general knowledge, customer-facing text.

A100|VRAM: ~17 GB|GGUF: Yes

Qwen3-8B 8BCapable

Balanced quality and speed. Strong all-rounder for most use cases.

A100|VRAM: ~20 GB|GGUF: Yes

Ministral-8B 8BCapable

Structured output, data extraction, technical writing.

A100|VRAM: ~22 GB|GGUF: Yes

Qwen3-14B 14BAdvanced

Deep domain knowledge, nuanced writing, multilingual fluency.

A100|VRAM: ~36 GB|GGUF: Yes

Mistral-Small-24B 24BAdvanced

Professional-grade content, complex reasoning, specialist fields.

A100|VRAM: ~55 GB|GGUF: Yes

Qwen3-32B 32BMost Powerful

Maximum quality for demanding tasks. Legal, medical, research.

H200 Only|VRAM: ~80 GB|GGUF: Yes

Qwen3.5-35B 35BMost Powerful

Qwen3.5 MoE flagship. 35B total, 3B active params per token. Beta (Engine B).

H200 Only|VRAM: ~90 GB|GGUF: Yes

Qwen3.5-27B 27BAdvanced

Qwen's latest dense model. Strong reasoning and instruction following. Beta (Engine B).

A100|VRAM: ~56 GB|GGUF: Yes

Gemma-4-26B-A4B 26B (MoE)Advanced

Google's Gemma 4 MoE. 26B total, 4B active. Efficient and capable. Beta (Engine B).

A100|VRAM: ~52 GB|GGUF: Yes

Gemma-4-31B 31BMost Powerful

Google's Gemma 4 dense flagship. Top-tier instruction following. Beta (Engine B).

H200 Only|VRAM: ~62 GB|GGUF: Yes

A100 vs H200 GPU Guide How LoRA Training Works After You Export

Model Guide

Find Your Model

Step 1: What do you want to build?

All Models

Related