Beta Models (Engine B)
Last updated April 17, 2026
What Are Beta Models?
Beta models are the newest additions to TuneSalon. They run on a separate training engine (Engine B) that supports architectures our standard engine cannot handle: multimodal vision-language models and Mixture-of-Experts (MoE) models. The Beta label is deliberate: these models have been tested, but the underlying libraries are moving fast, so expect occasional rough edges compared to the Standard lineup.
Why a Separate Engine?
Standard models (Engine A) use the classic HuggingFace PEFT stack with fp16 LoRA. It is battle-tested and works great for text-only dense models like Qwen3 and Mistral.
Beta models (Engine B) use Unsloth, which handles the quirks of modern architectures: MoE expert routing, fused 3D tensors, bf16 precision, and multimodal token handling. Keeping it as a separate engine means Beta improvements never destabilize the Standard path.
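The precision difference is worth a concrete look: fp16 has a 5-bit exponent and a maximum finite value of 65504, while bf16 keeps float32's 8-bit exponent range at the cost of mantissa precision, which is why it tolerates gradient spikes better during training. The following is a minimal pure-Python sketch (not TuneSalon or Unsloth code) that emulates bf16 rounding by truncating a float32 bit pattern:

```python
import struct

def to_bf16(x: float) -> float:
    # bf16 keeps float32's sign + 8-bit exponent but only 7 mantissa bits:
    # emulate it by zeroing the low 16 bits of the float32 representation.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

# fp16's largest finite value; anything bigger overflows to inf in fp16.
FP16_MAX = 65504.0

grad = 1.0e5  # a value fp16 cannot represent at all
print(grad > FP16_MAX)   # True: fp16 would overflow here
print(to_bf16(grad))     # bf16 keeps the magnitude, with coarser precision
```

The takeaway: bf16 trades precision for range, which suits training, while fp16 trades range for precision, which is fine for the text-only dense models on Engine A.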
The 4 Beta Models
Qwen3.5-27B
Dense · 27B parameters
Qwen's latest dense model. Strong reasoning and instruction following. A solid step up from Mistral-Small-24B if you want to stay on A100.
Gemma-4-26B-A4B
MoE · 26B total, 4B active
Google's Gemma 4 MoE. Runs 26B total parameters but only activates 4B per token, so inference is fast while quality stays high.
Gemma-4-31B
Dense · 31B parameters
Google's Gemma 4 dense flagship. Top-tier instruction following. Requires H200 for the VRAM headroom.
Qwen3.5-35B-A3B
MoE · 35B total, 3B active
The flagship. 35B total parameters with only 3B active per token, so it chats fast on H200. Best for demanding tasks where quality matters most.
All four are Apache 2.0 licensed and fully usable for commercial applications.
When to Pick a Beta Model
- You want faster inference with high quality: pick an MoE model (Gemma-4-26B-A4B or Qwen3.5-35B-A3B). Only a fraction of parameters activate per token, so responses are quicker than their total size suggests.
- You want the best dense model: Gemma-4-31B on H200 gives top-tier quality. Qwen3.5-27B on A100 is the sweet spot if you do not want to pay H200 rates.
- You have tried Standard and want more capability: any Beta model is a meaningful jump from the Standard 24B tier.
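To make the MoE speed claim concrete: a learned router scores all experts for each token and only the top-k highest-scoring experts actually run, so per-token compute tracks the active parameter count, not the total. Here is a toy top-k router in plain Python (illustrative only; the real Gemma and Qwen routers live inside the model weights, and `num_experts` and `k` below are made-up values, not the actual configurations):

```python
import math
import random

def topk_route(logits, k=2):
    """Pick the k highest-scoring experts for one token and return
    (expert_index, mixing_weight) pairs, weights summing to 1."""
    idx = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over just the selected logits gives the mixing weights.
    mx = max(logits[i] for i in idx)
    exps = [math.exp(logits[i] - mx) for i in idx]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(idx, exps)]

random.seed(0)
num_experts = 64                                   # hypothetical
logits = [random.gauss(0, 1) for _ in range(num_experts)]
chosen = topk_route(logits, k=2)
print(chosen)  # only 2 of 64 experts do any work for this token
```

Every other expert is skipped entirely for that token, which is why a 35B-total model can respond like a much smaller one.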
Trade-offs vs Standard
| | Standard (Engine A) | Beta (Engine B) |
|---|---|---|
| Stability | Very stable | Stable with occasional quirks |
| Architectures | Text-only dense | Multimodal + MoE |
| Training precision | fp16 | bf16 via Unsloth |
| GGUF export cost | 5-50 credits (CPU) | 200-500 credits (GPU required) |
| Adapter file size | Small to medium | Larger (MoE has many experts) |
| Chat on the site | Yes | Yes |
GGUF export is the biggest cost difference. Standard models can be converted to GGUF on a cheap CPU container, but Beta models need a GPU (Unsloth requires it), which makes the export step noticeably more expensive.
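The adapter-size row comes down to arithmetic: LoRA adds roughly rank × (d_in + d_out) parameters per adapted weight matrix, and an MoE model repeats that for every expert it adapts. A back-of-the-envelope sketch with hypothetical shapes (none of these dimensions are taken from the actual Beta models):

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    # LoRA factorizes the weight update as A (d_in x rank) @ B (rank x d_out).
    return rank * (d_in + d_out)

# Hypothetical: a dense model adapts one FFN projection per layer, while an
# MoE model adapts the same (smaller) projection in each of 64 experts.
hidden, ffn, rank, layers = 4096, 14336, 16, 48
dense = layers * lora_params(hidden, ffn, rank)
moe = layers * 64 * lora_params(hidden, ffn // 8, rank)
print(f"dense adapter: {dense / 1e6:.0f}M params")
print(f"MoE adapter:   {moe / 1e6:.0f}M params")
```

Even though each expert is small, adapting many of them multiplies the adapter footprint, which is why MoE adapter files run larger than their dense counterparts.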
Getting Started
Beta models show up in the Train tab alongside Standard models. Switch to the Beta sub-tab to see them. The training flow is identical: pick a model, upload your dataset, press Train.
Chat works the same way too. Load a Beta model in the Chat tab, apply your adapter, and talk to it. Multi-adapter loading, chat history, and GGUF export all work.