Magnum Enterprise

AI Model Inference that stays in your private cloud

Magnum 1 is our private cloud inference engine for industrial software. Built on auditable open-weight foundation models, it runs entirely inside your own environment. Your code and your prompts are never sent to a third-party model, and you pay for capacity, not per token.

Talk to Sales
See Magnum Coder

0

Bytes sent to a third-party model.

Magnum 1 is provisioned in your own private cloud. Source code, prompts, and model output stay inside your environment. There is no third-party cloud model in the loop, so no outside provider holds data that could be leaked, logged, or subpoenaed.

Fixed

Cost, not per-token metering.

With Magnum Enterprise you buy a pool of concurrent sessions, not a credit meter. Your engineers can generate as much code as they need without a per-token bill that scales with adoption. Budgeting stops being a guess.

Open

Weights you can audit.

Magnum 1 is built on open-weight foundation models. The weights can be inspected and the deployment can be validated against your own security and compliance requirements. No closed API, no vendor black box in your supply chain.

Measured, not marketed

Characterised on real reference hardware

These are our own measurements from a single reference endpoint, not vendor headline figures. After the first message in a conversation, the engine reuses the resident context, so follow-up turns start responding in about 150 milliseconds.

~150ms

Warm response time on follow-up turns

~85 tok/s

Sustained single-session generation

8+

Concurrent warm conversations, scaling with hardware

200K

Token context window per conversation
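
As a rough way to read the figures above, the warm response time and the sustained generation rate can be combined into a back-of-envelope estimate for one follow-up turn. This is a sketch under stated assumptions, not a measurement: it assumes generation runs steadily at the sustained rate once the first token arrives, and the function name `estimate_turn_seconds` is made up for illustration.

```python
# Back-of-envelope estimate only. Both constants are the approximate
# figures quoted on this page; the model (constant rate after a fixed
# warm start) is an assumption.
WARM_FIRST_TOKEN_S = 0.150    # ~150 ms warm response time on follow-up turns
SUSTAINED_TOK_PER_S = 85.0    # ~85 tok/s sustained single-session generation


def estimate_turn_seconds(output_tokens: int) -> float:
    """Approximate wall-clock seconds for one warm follow-up turn."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return WARM_FIRST_TOKEN_S + output_tokens / SUSTAINED_TOK_PER_S


if __name__ == "__main__":
    for n in (100, 500, 2000):
        print(f"{n:>5} tokens: ~{estimate_turn_seconds(n):.1f} s")
```

On these assumptions a 500-token reply works out to roughly 6 seconds end to end. Real turns will vary with hardware, prompt length, and load.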

We also tried to break it. In a deliberate overload run, the endpoint did not crash: 32 simultaneous requests, four times its concurrent capacity, all completed, and over-length requests were rejected cleanly rather than corrupting state. Under that load it degraded to higher latency and queueing, not to failure.
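
The behaviour described here, queueing excess requests and rejecting over-length ones up front, can be illustrated with a small simulation. This is a toy model and not how Magnum 1 is implemented: the semaphore, the `handle` and `overload_run` names, and the sleep that stands in for generation are all assumptions made for the sketch. Only the capacity of 8, the 32-request burst, and the 200K window come from this page.

```python
import asyncio

CAPACITY = 8             # concurrent warm conversations (figure from this page)
MAX_CONTEXT = 200_000    # token context window per conversation


async def handle(slots: asyncio.Semaphore, prompt_tokens: int) -> str:
    if prompt_tokens > MAX_CONTEXT:
        return "rejected"           # refuse up front, before taking a slot
    async with slots:               # excess requests wait here instead of failing
        await asyncio.sleep(0.01)   # stand-in for real generation work
        return "completed"


async def overload_run(sizes: list[int]) -> list[str]:
    slots = asyncio.Semaphore(CAPACITY)
    return await asyncio.gather(*(handle(slots, n) for n in sizes))


if __name__ == "__main__":
    # 32 normal requests (4x capacity) plus one over-length request.
    results = asyncio.run(overload_run([1_000] * 32 + [250_000]))
    print(results.count("completed"), results.count("rejected"))  # prints "32 1"
```

The point of the sketch is the shape of the degradation: requests beyond capacity wait for a slot, so overload shows up as added latency rather than errors.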

Two engines. One workflow.

Choose per conversation

Magnum Coder exposes a model selector. Pick Claude Sonnet 4.6 for the standard cloud path, or Magnum 1 when the engagement calls for a private cloud and fixed cost. The choice is per conversation and you can switch mid-conversation. Both write, version, and maintain industrial code in standard languages.

Server-side provider selection. Endpoints never reach the browser.
Same auditable code generation workflow, either way.
Magnum 1 unlocks with the Magnum Enterprise tier.
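
One way server-side provider selection like this could look is sketched below. It is illustrative only and is not Magnum Coder's actual code: the provider ids, the environment variable names, and the `selector_options` and `resolve_endpoint` helpers are all hypothetical. The idea it shows is that the browser only ever sees ids and labels, while endpoints are resolved on the server for each request.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    label: str                 # safe to show in the browser
    endpoint_env: str          # hypothetical server-side environment variable
    requires_enterprise: bool


# Hypothetical registry; ids and variable names are placeholders.
PROVIDERS = {
    "claude-sonnet-4.6": Provider("Claude Sonnet 4.6", "CLOUD_MODEL_ENDPOINT", False),
    "magnum-1": Provider("Magnum 1", "MAGNUM1_ENDPOINT", True),
}


def selector_options(enterprise: bool) -> list[dict]:
    """What the browser receives: ids and labels only, never endpoints."""
    return [
        {"id": pid, "label": p.label}
        for pid, p in PROVIDERS.items()
        if enterprise or not p.requires_enterprise
    ]


def resolve_endpoint(provider_id: str, enterprise: bool) -> str:
    """Server-side lookup, done per request so a conversation can switch models."""
    provider = PROVIDERS.get(provider_id)
    if provider is None:
        raise KeyError(f"unknown provider: {provider_id}")
    if provider.requires_enterprise and not enterprise:
        raise PermissionError(f"{provider.label} requires the Magnum Enterprise tier")
    endpoint = os.environ.get(provider.endpoint_env)
    if not endpoint:
        raise RuntimeError(f"{provider.endpoint_env} is not configured on the server")
    return endpoint
```

Because the lookup runs per request rather than once per session, switching models mid-conversation only changes the id the client sends.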

Standard cloud path

Claude Sonnet 4.6

Frontier hosted model, billed on usage. The default for most teams.

Private cloud, no egress

Magnum 1

Open-weight engine inside your environment, billed on concurrency.

How it compares

Frontier-class agentic coding, private cloud

Magnum 1 will not always out-score a leading hosted model on a raw benchmark. What it gives you is frontier-class agentic coding that runs entirely inside your own environment. Here is where each model lands on two standard agentic-coding evaluations, using each provider's published results.

Benchmark                                            Magnum 1    Claude Sonnet 4.6
SWE-bench Verified (real GitHub issue resolution)    73.4%       78.9%
Terminal-Bench 2.0 (agentic shell use)               51.5%       59.1%

Each score is the model provider's own published result, and evaluation conditions differ between vendors, so treat this as directional rather than a controlled head-to-head. Sources: SWE-bench Verified and Terminal-Bench 2.0 as reported by the respective model providers. For most teams the deciding factor is not the last few benchmark points; it is whether the code is allowed to leave your environment at all.

Talk to Sales

Keep your code in your private cloud

Magnum 1 is available with the Magnum Enterprise tier. Talk to us about provisioning a dedicated, private cloud inference engine for your team.

Talk to Sales
See Pricing