Introducing Magnum 1: private cloud AI for industrial code

Milorad Srdic · Jun 22, 2026 · 8 min read

Every industrial team I talk to wants AI to write more of their code. Almost none of them are allowed to send that code anywhere.

That is the tension. The plant's control logic, the recipe parameters, the safety interlocks, the integration glue that took a decade to get right: this is the most sensitive intellectual property a manufacturer owns. Pasting it into a third-party cloud model is, for a lot of these organisations, simply off the table. Procurement says no. The security team says no. Sometimes a regulator or a customer contract says no.

So the industrial sector has been stuck watching the rest of software get faster while being told that the price of entry is to hand over the one thing it cannot hand over.

Today we are shipping our answer to that. It is called Magnum 1, and it is mutexer's private cloud AI inference engine for industrial code generation.

What Magnum 1 is

Magnum 1 is the model that powers code generation inside Magnum Coder for Magnum Enterprise customers. The important part is where it runs and how you pay for it.

It is built on auditable open-weight foundation models. The weights can be inspected and the deployment can be validated against your own security and compliance requirements. There is no closed API and no vendor black box sitting in your supply chain.
It runs entirely inside your environment. Magnum 1 is provisioned on dedicated hardware in your own private cloud. Source code, prompts, and model output stay inside your environment. Nothing is sent to an external model provider, because there is no external model provider in the loop.
It is billed on fixed concurrency, not per token. A Magnum Enterprise tenancy buys a pool of concurrent sessions. Your engineers generate as much code as they need without a per-token meter that climbs every time the team adopts the tool more heavily.

This is the same idea that has always sat under mutexer: keep mission-critical work on hardware you control, on your terms. We did it for real-time control with the mutexer Agent and PREEMPT_RT Linux. Magnum 1 does it for AI.

Why we did not just resell a cloud API

The default path in Magnum Coder runs on a leading hosted model, Claude Sonnet 4.6, billed per token. For most teams that is a great default and it stays available. We are not removing it. Magnum Coder lets you pick the engine per conversation, and you can switch mid-conversation.

But "per token, in someone else's cloud" has two structural problems for our specific market.

The first is data residency. No amount of contract language changes the fact that the code left your environment. For a defence-adjacent supplier, a regulated pharma line, or a utility, "it left our environment" can be the end of the conversation.

The second is cost shape. Per-token billing punishes success. The more your team relies on AI, the bigger the bill, and the bill is variable in a way that makes industrial budgeting genuinely hard. A fixed-capacity model turns an unpredictable operating expense into a known one.

Owning the inference layer fixes both. Your data stays put, and your cost stops scaling with your own adoption.
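To make the cost-shape argument concrete, here is a minimal sketch of the arithmetic. Every price in it is a hypothetical placeholder chosen for round numbers, not a Magnum Enterprise or hosted-model price; the function names are ours for illustration only.

```python
# Illustrative only: both prices below are hypothetical placeholders,
# not real Magnum Enterprise or hosted-model pricing.
HYPOTHETICAL_PRICE_PER_MILLION_TOKENS = 10.0   # per-token path, currency units
HYPOTHETICAL_FIXED_MONTHLY_FEE = 5_000.0       # fixed-concurrency path, currency units


def per_token_cost(tokens_per_month: int, price_per_million: float) -> float:
    """Monthly bill when every token is metered: grows linearly with usage."""
    return tokens_per_month / 1_000_000 * price_per_million


def break_even_tokens(monthly_fee: float, price_per_million: float) -> float:
    """Monthly token volume above which the fixed fee is the cheaper option."""
    return monthly_fee / price_per_million * 1_000_000


if __name__ == "__main__":
    for tokens in (100_000_000, 500_000_000, 2_000_000_000):
        metered = per_token_cost(tokens, HYPOTHETICAL_PRICE_PER_MILLION_TOKENS)
        print(f"{tokens:>13,} tokens/month: metered {metered:>9,.0f}, "
              f"fixed {HYPOTHETICAL_FIXED_MONTHLY_FEE:>9,.0f}")
```

Whatever the real prices are, the shape is the same: the metered line rises with adoption and the fixed line stays flat, so there is always some usage level beyond which the fixed pool costs less.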

Measured, not marketed

We do not like AI performance claims that cannot be reproduced, so here are ours from a single reference endpoint. These are API-side measurements we took ourselves, not headline numbers from a spec sheet.

Metric	Result
Warm response time (follow-up turns)	about 150 ms
Sustained single-session generation	about 85 tokens per second
Concurrent warm conversations per endpoint	8 and up, scaling with hardware
Context window per conversation	200,000 tokens

The number that matters most for an agentic coding workload is that first one. After the opening message in a conversation, the engine keeps the conversation's context resident and processes only the new part of each turn. In practice that took our follow-up turns from roughly 13 seconds down to about 150 milliseconds, a speed-up of more than 80 times on every turn after the first. The cold cost is paid once, at the start, and then the conversation feels immediate.
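The general idea of keeping context resident can be sketched in a few lines. This is a toy model of prefix reuse, not Magnum 1's implementation: the class name is hypothetical, tokens are plain strings, and "processing" is reduced to counting how many tokens were not already cached.

```python
class WarmConversation:
    """Toy model of a conversation whose processed context stays resident.

    Hypothetical illustration: a real engine caches internal model state,
    not the token list itself.
    """

    def __init__(self) -> None:
        self.cached_tokens: list[str] = []

    def turn(self, full_context: list[str]) -> int:
        """Return how many tokens must be processed for this turn."""
        # Count the leading tokens that match what is already cached.
        shared = 0
        for old, new in zip(self.cached_tokens, full_context):
            if old != new:
                break
            shared += 1
        to_process = len(full_context) - shared
        # The whole context is now resident for the next turn.
        self.cached_tokens = list(full_context)
        return to_process


if __name__ == "__main__":
    conv = WarmConversation()
    opening = ["ctx"] * 50_000            # large opening context (cold turn)
    follow_up = opening + ["new"] * 200   # same context plus a short new message
    print("cold turn processes:", conv.turn(opening))
    print("warm turn processes:", conv.turn(follow_up))
```

The cold turn pays for the full context once; each warm turn pays only for what was appended, which is why follow-up latency stops depending on how long the conversation already is.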

It does not fall over

We also ran a deliberate overload test, because "fast on a good day" is not the same as "safe on a bad one". Across the entire stress run the endpoint never crashed, and its health check passed every one of thousands of consecutive checks.

A single request larger than the context window is rejected cleanly with an immediate error, rather than silently corrupting state.
Thirty-two simultaneous requests, four times its concurrent capacity, all completed. The extra requests queued and drained in order.
Under the heaviest load we threw at it, the system degraded to latency and queueing, never to failure.
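The behaviour in that list (reject oversized requests up front, cap concurrency, queue the rest) is a standard admission-control pattern. Below is a minimal sketch of it using Python's asyncio, assuming the figures from the table above: 8 concurrent slots and a 200,000-token window. The `Endpoint` class, `ContextTooLarge` exception, and simulated work are all hypothetical and say nothing about how Magnum 1 is actually built.

```python
import asyncio

CONTEXT_WINDOW = 200_000  # tokens, from the measurements table
MAX_CONCURRENT = 8        # warm conversations per endpoint, from the table


class ContextTooLarge(Exception):
    """Raised immediately when a request cannot fit in the context window."""


class Endpoint:
    """Hypothetical endpoint: bounded concurrency, excess requests wait."""

    def __init__(self, max_concurrent: int = MAX_CONCURRENT,
                 context_window: int = CONTEXT_WINDOW) -> None:
        self._slots = asyncio.Semaphore(max_concurrent)
        self._context_window = context_window
        self.active = 0
        self.peak_active = 0

    async def handle(self, request_id: int, prompt_tokens: int,
                     work_seconds: float = 0.01) -> int:
        # Reject before taking a slot, so a bad request never touches shared state.
        if prompt_tokens > self._context_window:
            raise ContextTooLarge(
                f"{prompt_tokens} tokens exceeds window of {self._context_window}")
        # Requests beyond the slot count wait here instead of failing.
        async with self._slots:
            self.active += 1
            self.peak_active = max(self.peak_active, self.active)
            try:
                await asyncio.sleep(work_seconds)  # stand-in for generation
            finally:
                self.active -= 1
        return request_id


async def overload_demo(n_requests: int = 32) -> tuple[list[int], int]:
    """Fire n_requests at once; return completed ids and peak concurrency."""
    endpoint = Endpoint()
    results = await asyncio.gather(
        *(endpoint.handle(i, prompt_tokens=1_000) for i in range(n_requests)))
    return list(results), endpoint.peak_active


if __name__ == "__main__":
    done, peak = asyncio.run(overload_demo(32))
    print(f"completed {len(done)} requests, peak concurrency {peak}")
```

In this sketch, overload shows up only as waiting time: all 32 requests complete, but never more than 8 are in flight at once.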

For a single-tenant, one-engine-per-customer deployment, that is exactly the posture you want: if someone over-drives their own endpoint, they see slower responses on their own hardware, not a downed service.

How it compares on agentic coding

Here is the honest part. Magnum 1 will not always out-score a leading hosted model on a raw benchmark. What it gives you is frontier-class agentic coding that runs entirely on your own hardware, and for most industrial teams that trade is the whole point.

Using each provider's own published results on two standard agentic-coding evaluations:

Benchmark	Magnum 1	Claude Sonnet 4.6
SWE-bench Verified (real GitHub issue resolution)	73.4%	78.9%
Terminal-Bench 2.0 (agentic shell use)	51.5%	59.1%

A few points of context, because numbers without conditions are noise. These are each model provider's published figures, measured under their own conditions, so read them as directional rather than as a controlled head-to-head. A SWE-bench Verified score in the low seventies puts Magnum 1 firmly in frontier-class territory for a privately hosted engine: it resolves real, unmodified GitHub issues nearly three times out of four. The hosted model is a few points ahead on both benchmarks. If your only constraint were the leaderboard, you would pick it.

But that is rarely the only constraint in industrial automation. The deciding factor is usually not the last few benchmark points. It is whether the code is allowed to leave your environment at all. If it is not, a frontier-class model you can run in your own private cloud beats a slightly stronger one you are not permitted to use.

Where this fits in the bigger picture

mutexer's bet is that as AI drives the cost of writing software toward zero, the value moves to the infrastructure that runs it: reliable, observable, secure, and under the customer's control. Magnum 1 is that bet applied to the model layer. It turns AI code generation from a metered external dependency into a fixed, owned asset that sits inside your environment alongside your control systems.

That is the same shape as the rest of the platform. One place to build industrial software, run it on hardware you choose, and now generate it with AI that stays in your private cloud.

Frequently asked questions

What is Magnum 1? Magnum 1 is mutexer's private cloud AI inference engine for generating and maintaining industrial software. It powers code generation in Magnum Coder for Magnum Enterprise customers and runs on hardware inside the customer's own environment.

Does Magnum 1 send my code to a third-party cloud? No. Magnum 1 runs inside your own private cloud. Your source code, prompts, and the model's output stay inside your environment, with no egress to an external model provider.

How is Magnum 1 priced? Magnum 1 is billed on a fixed pool of concurrent sessions through the Magnum Enterprise tier, not per token. Cost does not scale with how much your team uses it.

Is Magnum 1 as good as a leading cloud model? On published agentic-coding benchmarks it scores a few points below the leading hosted model (73.4% versus 78.9% on SWE-bench Verified), while running entirely in your own private cloud. For teams that cannot send code outside their own environment, that is a trade worth making.

Can I still use a hosted model in Magnum Coder? Yes. Magnum Coder lets you choose the engine per conversation, including Claude Sonnet 4.6 for the standard cloud path, and you can switch mid-conversation.

Magnum 1 is available now with Magnum Enterprise. If your code cannot leave your environment, talk to us about a private cloud deployment for your team.
