What's your mintok™?
Your cost per million tokens: any model, any chip, for your fleet or your customer's. 10 seconds, no signup.

Workload

live · no signup
Your mintok™
$0.487 / M-tok
See full plan → 7-flow optimization
Mode: hosted-api baseline
Bound by: Bandwidth
Fleet: 8 × TPU 8i (raw 2)

Sizing breakdown

3-constraint
Compute: 1 chip
Memory: 1 chip
Bandwidth: 2 chips
Chips required: 2 (raw requirement)
Topology (rounded): 8 (supported pod size)
Tokens / chip / day: 47.40M (effective ceiling)
Power draw: 11 kW (incl. server overhead)
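
As a rough illustration of the breakdown above: the raw requirement appears to be the largest of the three per-constraint chip counts, and that count is then rounded up to a supported pod size. The sketch below assumes that reading. The function name `size_fleet` and the `pod_sizes` values are hypothetical, chosen only so the example reproduces the numbers shown; they are not Mintok's published method.

```python
from bisect import bisect_left

def size_fleet(compute, memory, bandwidth, pod_sizes=(8, 16, 32, 64)):
    """Return (raw_chips, pod_chips, bound_by) for one workload.

    pod_sizes is an assumed, ascending list of supported pod sizes.
    """
    per_constraint = {"compute": compute, "memory": memory, "bandwidth": bandwidth}
    # The binding constraint is whichever one demands the most chips.
    bound_by = max(per_constraint, key=per_constraint.get)
    raw = per_constraint[bound_by]
    # Round up to the smallest supported pod size that fits the raw count.
    idx = bisect_left(pod_sizes, raw)
    if idx == len(pod_sizes):
        raise ValueError(f"{raw} chips exceeds the largest assumed pod size")
    return raw, pod_sizes[idx], bound_by

print(size_fleet(compute=1, memory=1, bandwidth=2))  # (2, 8, 'bandwidth')
```

With the example inputs, bandwidth binds at 2 chips and the fleet rounds up to a pod of 8, which matches "Fleet: 8 × TPU 8i (raw 2)".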

3-year TCO

cloud · lease · capex · hosted-API
Mode | Monthly | 3-yr TCO | $/M-tok
Cloud (on-demand) | $35.0K | $1.26M | $13.52
Lease (committed HaaS) | | |
CapEx (owned) | | |
Hosted-API baseline | $1.3K | $45.5K | $0.487

Hosted-API row is the "do nothing — pay per token" comparison. Output billed at this model's representative hosted rate; input estimated at 25% of output throughput (chat-typical). Input-heavy workloads (RAG, doc analysis) will land slightly higher than shown.
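
The arithmetic behind these columns can be sketched as follows. This is a minimal illustration, assuming the 3-year TCO is simply 36 flat monthly payments and that input tokens are billed at their own per-million rate. All function names and the rate parameters are hypothetical; the sketch does not claim to reproduce the calculator's exact hosted-API figure.

```python
MONTHS = 36  # three-year horizon, assumed flat (no growth or discounting)

def three_year_tco(monthly_usd):
    """Total cost over three years at a constant monthly spend."""
    return monthly_usd * MONTHS

def cost_per_mtok(monthly_usd, monthly_mtok):
    """Dollars per million tokens for a given monthly spend and volume."""
    return monthly_usd / monthly_mtok

def hosted_api_monthly(output_mtok, output_rate, input_rate, input_ratio=0.25):
    """Pay-per-token monthly cost.

    output_rate and input_rate are hypothetical $/M-token prices.
    Input volume is estimated as input_ratio times output volume,
    following the 25% chat-typical figure in the note above.
    """
    input_mtok = output_mtok * input_ratio
    return output_mtok * output_rate + input_mtok * input_rate

print(three_year_tco(35_000))  # 1260000, consistent with the $1.26M cloud row
```

Raising `input_ratio` models the input-heavy workloads (RAG, doc analysis) that the note says will land higher.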

Calculated. Now optimise.
The full Mintok platform — continuous $/M-token optimisation across every workload, every chip, every contract.
Sign up →

How it works

From workload to sized fleet in three steps

01

Pick your workload

Model, chip, throughput. Optional: precision, batch size, context length, MFU. No signup, live recompute.

02

See your mintok™

Three-constraint sizing (compute · memory · bandwidth), cluster topology, monthly + 3-year TCO across cloud, lease, and CapEx.

03

Size the full fleet

Sign up to size every workload across the fleet, run vendor RFP scenarios, and watch $/M-token shrink in real time.

What Mintok optimises

Sizing + economics across your fleet.

The calculator above covers one workload. The platform covers your entire fleet — every chip, every model, every contract.

Capacity sizing

Pin two of {workload, hardware, site}; solve the third.

Site Sizing

Power-envelope-anchored. Pin your MW budget + facility constraints; plan capacity within them.
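
A minimal sketch of power-envelope sizing, assuming the only facility constraint is total power and that chips deploy in whole pods. The per-chip figure is derived from the calculator example (11 kW for a pod of 8, including server overhead); `chips_within_budget` and its parameters are hypothetical names, and cooling or other facility limits are ignored.

```python
import math

def chips_within_budget(budget_mw, kw_per_chip, pod_size=8):
    """Largest whole-pod chip count that fits inside a power budget."""
    budget_kw = budget_mw * 1000
    pods = math.floor(budget_kw / (kw_per_chip * pod_size))
    return pods * pod_size

kw_per_chip = 11 / 8  # 11 kW per 8-chip pod, from the example fleet
print(chips_within_budget(1.0, kw_per_chip))  # 720 chips (90 pods) in 1 MW
```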

Inference Sizing

Token-throughput-anchored. Forecast utilisation per model, surface exhaustion dates, plan redeployment for retired silicon.
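
One way to picture an exhaustion date is to compound demand month by month until it exceeds fleet capacity. The sketch below assumes constant monthly growth and a fixed per-chip ceiling; `months_until_exhaustion` and the demand and growth inputs are hypothetical, and the platform's actual forecasting method is not described here.

```python
def months_until_exhaustion(chips, tokens_per_chip_day, demand_tokens_day,
                            monthly_growth, horizon=120):
    """Return the first month in which demand exceeds capacity, or None.

    Month 0 is today. Demand compounds by monthly_growth each month.
    """
    capacity = chips * tokens_per_chip_day
    demand = demand_tokens_day
    for month in range(horizon + 1):
        if demand > capacity:
            return month
        demand *= 1 + monthly_growth
    return None  # capacity is not exhausted within the horizon

# 8 chips at the 47.40M tokens/chip/day ceiling from the calculator example,
# with an assumed 90M tokens/day of demand growing 10% per month.
print(months_until_exhaustion(8, 47.4e6, 90e6, 0.10))  # 16
```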

Reference Architecture Sizing

Workload-anchored. Pick a workload; compare every silicon — NVIDIA, AMD, Google TPU, Cerebras, Groq, AWS — head-to-head.

Rack Sizing

Hardware-config-anchored. Pin chip + MW; explode into the full DC BOM with compute / memory / latency constraint analysis.

Compute Economics

Project the cost axis at every unit of granularity.

Chip Economics

$/chip-hour, depreciation, $/FLOP. Compare any silicon under CapEx vs cloud, with reseller margin.

Model Economics

$/M-token per model, fleet optimisation, what-if sensitivity to utilisation, depreciation, contract structure.

Cluster Economics

Cluster TCO, $/cluster-hour, $/MW. What-if comparator across rack and cluster shapes.

FDE — Forward Deployment Engagement

Mintok for advising teams.

Same physics, wrapped in the workflow consultancies and in-house AI platform teams use to author recommendations: workload capture, sizing, cost projection, customer-shareable brief.

Engagement Brief

Versioned, customer-shareable working doc. Draft → in-review → approved → shared. Cites the methodology version it was authored against.

Workload Library

Mintok-curated archetypes (chatbot, RAG, agentic, batch). Forkable per tenant; pre-fills Tier-3 numerics so workload spec isn’t a blank page.

Customer Portal

Token-gated read-only view of the brief, workloads, and open questions. No login required — your customer bookmarks a link.

Open Questions

Async replacement for status meetings. Customers comment, you reply, everything threaded under the engagement. Emails fire on each reply.

Every recommendation in a brief traces back to the published methodology — no black box, no proprietary “AI optimization.”

Beyond sizing

From sized plan to delivered capacity

The numbers above are the answer. Below is where you run them — orders, supply, fulfilment, and live agent watchers. One platform, end to end.

01

Size

Capacity, Inference, Reference Architecture, and Rack sizing. $/M-token across every silicon, every workload.

02

Plan

Convert sized plans into orders. ATP per order, priority allocation against committed supply.

03

Deliver

Demand, supply ledger, POs, AVL, vendors. PROPOSED → COMMITTED → RELEASED allocation lifecycle.

04

Monitor

Order Health, Supply Constraints, Coverage, Schedule. Agent watchers ping you when reality drifts from plan.
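
The PROPOSED → COMMITTED → RELEASED lifecycle named in the Deliver step can be modelled as a small forward-only state machine. This sketch assumes the three states are strictly sequential and that RELEASED is terminal; the class and function names are hypothetical, and the real platform may allow other transitions such as cancellation.

```python
from enum import Enum

class AllocationState(Enum):
    PROPOSED = 1
    COMMITTED = 2
    RELEASED = 3

# Assumed forward-only transitions; RELEASED has no successor.
_NEXT = {
    AllocationState.PROPOSED: AllocationState.COMMITTED,
    AllocationState.COMMITTED: AllocationState.RELEASED,
}

def advance(state):
    """Move an allocation to its next lifecycle state."""
    if state not in _NEXT:
        raise ValueError(f"{state.name} is terminal and cannot advance")
    return _NEXT[state]

state = AllocationState.PROPOSED
state = advance(state)  # COMMITTED
state = advance(state)  # RELEASED
print(state.name)
```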

Binding constraints: 3
Contract structures: 3+
Workflows (ops + FDE): 2
Unit of measure: $/M-tok

Stop guessing your $/M-token.

Mintok is invite-only during private alpha. Tell us about your fleet — chips you're evaluating, target $/M-token, contract mix — and we'll get you set up.