From [$/M-token] to delivered capacity

What's your mintok™? Your cost per million tokens for any model and any chip, for your fleet or your customer's. 10 seconds, no signup.

Workload

live · no signup

Model

Precision

Chip

Throughput (tok/s)

Your mintok™

$0.487 / M-tok

See full plan → 7-flow optimization

Mode: hosted-API baseline

Bound by: Bandwidth

Fleet: 8 × TPU 8i (raw 2)

Sizing breakdown

3-constraint

Compute

1 chip

Memory

1 chip

Bandwidth

2 chips

Chips required

raw requirement

Topology (rounded)

supported pod size

Tokens / chip / day

47.40M

effective ceiling

Power draw

11 kW

incl. server overhead
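
One way to read the breakdown above: each constraint yields its own chip count, the largest becomes the raw requirement (and names the binding constraint), and the raw count is then rounded up to a supported pod size. The sketch below illustrates that reading only. The function names and the list of pod sizes are hypothetical, and taking the maximum is an assumption about the method, not Mintok's published methodology.

```python
def raw_chips(per_constraint: dict[str, int]) -> tuple[int, str]:
    """Return (raw chip count, binding constraint).

    Assumption: the raw requirement is the largest per-constraint need.
    """
    binding = max(per_constraint, key=per_constraint.get)
    return per_constraint[binding], binding


def round_to_pod(raw: int, pod_sizes: list[int]) -> int:
    """Round up to the smallest supported pod size that fits the raw count."""
    for size in sorted(pod_sizes):
        if size >= raw:
            return size
    raise ValueError(f"no supported pod size fits {raw} chips")


# Per-constraint chip counts as shown in the sizing breakdown.
needs = {"compute": 1, "memory": 1, "bandwidth": 2}
raw, bound_by = raw_chips(needs)

# Hypothetical pod sizes; the real supported topologies are not listed on the page.
fleet = round_to_pod(raw, [8, 16, 32, 64])

print(raw, bound_by, fleet)  # 2 bandwidth 8
```

With these inputs the sketch lands on the same shape as the panel: a raw requirement of 2 chips, bound by bandwidth, rounded to a fleet of 8.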

3-year TCO

cloud · lease · capex · hosted-API

Mode	Monthly	3-yr TCO	$/M-tok
Cloud (on-demand)	$35.0K	$1.26M	$13.52
Lease (committed HaaS)	—	—	—
CapEx (owned)	—	—	—
Hosted-API baseline ✓	$1.3K	$45.5K	$0.487

Hosted-API row is the "do nothing — pay per token" comparison. Output billed at this model's representative hosted rate; input estimated at 25% of output throughput (chat-typical). Input-heavy workloads (RAG, doc analysis) will land slightly higher than shown.
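
The table's columns relate by simple arithmetic: the 3-year TCO is 36 monthly payments, and $/M-tok is monthly cost divided by monthly tokens in millions. The sketch below shows that relationship. The 30-day month and the 1,000 tok/s throughput are assumptions (the page does not show the throughput behind the example), so the $/M-tok result will be close to the Cloud row without matching it exactly.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: 30-day month


def three_year_tco(monthly_usd: float) -> float:
    """3-year TCO as 36 equal monthly payments."""
    return monthly_usd * 36


def usd_per_mtok(monthly_usd: float, tok_per_s: float) -> float:
    """Monthly cost divided by millions of tokens produced in a month."""
    mtok_per_month = tok_per_s * SECONDS_PER_MONTH / 1e6
    return monthly_usd / mtok_per_month


cloud_monthly = 35_000.0  # Cloud (on-demand) row: $35.0K / month

print(three_year_tco(cloud_monthly))  # 1260000.0, i.e. the $1.26M shown
print(usd_per_mtok(cloud_monthly, tok_per_s=1_000))  # hypothetical throughput
```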

How it works

From workload to sized fleet in three steps

Pick your workload

Model, chip, throughput. Optional: precision, batch size, context length, MFU. No signup, live recompute.

See your mintok™

Three-constraint sizing (compute · memory · bandwidth), cluster topology, monthly + 3-year TCO across cloud, lease, and CapEx.

Size the full fleet

What Mintok optimises

Sizing + economics across your fleet.

The calculator above covers one workload. The platform covers your entire fleet — every chip, every model, every contract.

Capacity sizing

Pin two of {workload, hardware, site}; solve the third.

Site Sizing

Power-envelope-anchored. Pin your MW budget + facility constraints; plan capacity within them.

Inference Sizing

Token-throughput-anchored. Forecast utilisation per model, surface exhaustion dates, plan redeployment for retired silicon.

Reference Architecture Sizing

Workload-anchored. Pick a workload; compare every silicon — NVIDIA, AMD, Google TPU, Cerebras, Groq, AWS — head-to-head.

Rack Sizing

Hardware-config-anchored. Pin chip + MW; explode into the full DC BOM with compute / memory / latency constraint analysis.

Compute Economics

Project the cost axis at every unit of granularity.

Chip Economics

$/chip-hour, depreciation, $/FLOP. Compare any silicon under CapEx vs cloud, with reseller margin.

Model Economics

$/M-token per model, fleet optimisation, what-if sensitivity to utilisation, depreciation, contract structure.

Cluster Economics

Cluster TCO, $/cluster-hour, $/MW. What-if comparator across rack and cluster shapes.

FDE — Forward Deployment Engagement

Mintok for advising teams.

Same physics, wrapped in the workflow consultancies and in-house AI platform teams use to author recommendations: workload capture, sizing, cost projection, customer-shareable brief.

Engagement Brief

Versioned, customer-shareable working doc. Draft → in-review → approved → shared. Cites the methodology version it was authored against.

Workload Library

Mintok-curated archetypes (chatbot, RAG, agentic, batch). Forkable per tenant; pre-fills Tier-3 numerics so workload spec isn’t a blank page.

Customer Portal

Token-gated read-only view of the brief, workloads, and open questions. No login required — your customer bookmarks a link.

Open Questions

Async replacement for status meetings. Customers comment, you reply, everything threaded under the engagement. Emails fire on each reply.

Every recommendation in a brief traces back to the published methodology — no black box, no proprietary “AI optimization.”

Beyond sizing

From sized plan to delivered capacity

The numbers above are the answer. Below is where you run them — orders, supply, fulfilment, and live agent watchers. One platform, end to end.

Size

Capacity, Inference, Reference Architecture, and Rack sizing. $/M-token across every silicon, every workload.

Plan

Convert sized plans into orders. ATP per order, priority allocation against committed supply.

Deliver

Demand, supply ledger, POs, AVL, vendors. PROPOSED → COMMITTED → RELEASED allocation lifecycle.

Monitor

Order Health, Supply Constraints, Coverage, Schedule. Agent watchers ping you when reality drifts from plan.
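
The PROPOSED → COMMITTED → RELEASED lifecycle named under Deliver can be modelled as a small forward-only state machine. This is a sketch for illustration only: the class and function names are hypothetical, and the assumption that states only move forward, one step at a time, is not confirmed by the page.

```python
from enum import Enum


class AllocationState(Enum):
    PROPOSED = "PROPOSED"
    COMMITTED = "COMMITTED"
    RELEASED = "RELEASED"


# Assumption: allocations only move forward, one step at a time.
NEXT_STATE = {
    AllocationState.PROPOSED: AllocationState.COMMITTED,
    AllocationState.COMMITTED: AllocationState.RELEASED,
}


def advance(state: AllocationState) -> AllocationState:
    """Return the next lifecycle state, or raise if the state is terminal."""
    if state not in NEXT_STATE:
        raise ValueError(f"{state.value} is terminal")
    return NEXT_STATE[state]


state = AllocationState.PROPOSED
state = advance(state)  # COMMITTED
state = advance(state)  # RELEASED
print(state.value)  # RELEASED
```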

Binding constraints

Contract structures

Workflows: ops + FDE

$/M-tok

Unit of measure

Stop guessing your $/M-token.

Mintok is invite-only during private alpha. Tell us about your fleet — chips you're evaluating, target $/M-token, contract mix — and we'll get you set up.
