Ling 1T: Open-Source Trillion-Parameter Intelligence
A production-grade foundation model from Ant Group's Bailing team for advanced reasoning workloads.
Mixture-of-Experts architecture with 1/32 routing · 50B active parameters per token · FP8 training for rapid convergence · 32K–128K context via YaRN
Deploy Ling 1T under the MIT license to power quantitative research, enterprise analytics, multilingual assistants, and secure agent ecosystems.
Or launch a specialized Ling 1T assistant
Math Tutor
Solves complex math problems with detailed steps and proofs
Code Expert
Proficient in multiple programming languages, offering high-quality implementations and optimization advice
Front-End Designer
Builds attractive, practical front-end components and page layouts
Document Analyst
Reads, summarizes, translates, and analyzes long documents
Creative Writer
Crafts copy, stories, and marketing content of all kinds
General Assistant
Handles everyday questions and general conversation
Explore curated Ling 1T prompts
Architecture Highlights
What sets Ling 1T apart from conventional large models
Engineered as a sparse Mixture-of-Experts system, Ling 1T balances trillion-parameter capacity with practical latency and enterprise governance.
Sparse MoE efficiency
Activates ~50B parameters per token with sigmoid expert gating to maintain speed without sacrificing depth.
- 1/32 expert routing keeps inference efficient without sacrificing comprehension.
- Only ~50B parameters activate per token while the full MoE remains available.
- Maintains latency comparable to 50B dense models across production workloads.
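The sigmoid gating described above can be sketched in a few lines. This is a toy illustration, not Ling 1T's actual router: the 256-expert / top-8 shape is chosen only to mirror the 1/32 routing ratio, and the function and variable names are hypothetical.

```python
import math

def sigmoid_gate(logits, top_k):
    """Toy sketch of sigmoid expert gating: each token scores every expert
    with an independent sigmoid, then routes to the top_k highest scores.
    Only the selected experts run for this token."""
    scores = [1.0 / (1.0 + math.exp(-x)) for x in logits]
    # Pick the top_k expert indices by gate score.
    chosen = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]
    total = sum(scores[i] for i in chosen)
    # Normalize the selected gates so the expert outputs combine to weight 1.
    return {i: scores[i] / total for i in chosen}

# Illustrative 1/32 routing: 8 of 256 experts activate per token.
gates = sigmoid_gate([0.1 * i for i in range(256)], top_k=8)
```

Because unselected experts never execute, per-token compute scales with the active parameters rather than the full trillion-parameter capacity.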
FP8 mixed precision
15% faster end-to-end training versus BF16 with negligible loss drift across 1T tokens.
- FP8 stack delivers ~15% faster training vs BF16 with negligible loss drift.
- YaRN scaling extends context from 32K to 128K without destabilizing gradients.
- 1F1B pipeline maximizes GPU utilization while keeping memory predictable.
Extended context
32K token default context extends to 128K with YaRN so hierarchical documents remain intact.
- QK normalization and fused kernels stabilize trillion-parameter optimisation.
- Checkpoint merging plus WSM scheduling smooths long-run convergence.
- Integrated telemetry surfaces routing and safety signals for auditing.
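The 32K-to-128K extension implies a YaRN scaling factor of 4. A minimal sketch of what such a configuration might look like, assuming a Hugging Face `transformers`-style `rope_scaling` entry (the exact field values Ling 1T ships with may differ):

```python
# Hypothetical rope_scaling fragment for extending a 32K-context model to
# 128K with YaRN; field names follow the transformers convention, and the
# values are illustrative assumptions rather than Ling 1T's published config.
rope_scaling = {
    "rope_type": "yarn",
    "factor": 4.0,                               # 32K x 4 = 128K tokens
    "original_max_position_embeddings": 32768,   # pre-training context length
}

extended_context = int(
    rope_scaling["factor"] * rope_scaling["original_max_position_embeddings"]
)
# extended_context is 131072, i.e. the advertised ~128K window.
```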
Stable optimization stack
WSM scheduling, checkpoint merging, and fused kernels maximize GPU utilization for large-scale runs.
- Sentence-level LPO alignment keeps reasoning grounded and transparent.
- Supports tool plans, long-context recall, and multilingual instruction following.
- MIT license lets teams extend or self-host with full commercial freedom.
Where Ling 1T Excels
Operational scenarios proven by evaluation and production rollouts
Pair Ling 1T's trillion-parameter capacity with efficient 50B active routing to deliver disciplined reasoning, precise generation, and resilient tool use.
Quantitative Research
Reach 70.42% on AIME 2025 with concise derivations and reliable numeric reasoning across competition-grade problems.
Software Engineering Automation
Top LiveCodeBench performance enables Ling 1T to draft, refactor, and review production services with clear rationales.
Enterprise Knowledge Analysis
128K token context with YaRN unlocks long-form document parsing, synthesis, and regulatory reporting in one pass.
Financial Risk Intelligence
In production at Ant Group, Ling 1T combines market data interpretation with scenario planning for high-frequency decision support.
Multilingual Compliance
Ling 1T scores 92.19 on C-Eval, enabling policy reviews, training content, and localization across regulated markets.
Agent & Tool Orchestration
With ~70% accuracy on BFCL V3, Ling 1T coordinates tool calls, APIs, and workflows without heavy instruction fine-tuning.
How teams ship with Ling 1T
From prompt to production in three simple steps
Provision API access, instrument usage, and keep your budgets predictable with real-time metering.
Request API credentials, drop the SDK into your stack, and tune decoding parameters per workspace.
- Issue keys for staging and production from the console.
- Set default temperature/top-p for your agents or copilots.
- Monitor live latency and success metrics in the dashboard.
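The setup steps above can be sketched as a request builder with per-workspace decoding defaults. This assumes an OpenAI-compatible chat endpoint; the URL, model identifier, and default values below are illustrative placeholders, not a documented Ling 1T API.

```python
import json

# Placeholder endpoint; substitute your provider's actual URL and key.
API_URL = "https://api.example.com/v1/chat/completions"

def build_request(prompt, temperature=0.7, top_p=0.95, max_tokens=1024):
    """Assemble a chat-completions JSON body with workspace decoding defaults.
    The model name "ling-1t" is an assumed identifier."""
    return {
        "model": "ling-1t",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,   # workspace default sampling temperature
        "top_p": top_p,               # workspace default nucleus cutoff
        "max_tokens": max_tokens,
    }

body = json.dumps(build_request("Summarize Q3 risk exposure."))
```

Keeping decoding defaults in one place makes it easy to tune staging and production workspaces independently.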
Every request logs input/output tokens, cost in USD, and request metadata for later analysis.
- Usage records stream into your billing view in real time.
- Export CSV/JSON to plug into internal analytics or invoicing.
- Trigger budget alerts or webhooks based on thresholds.
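A budget alert over streamed usage records reduces to summing metered cost against a threshold. The record shape and helper below are illustrative assumptions, not the actual webhook schema; the rates match the published $1.40 / $5.60 per million compute units.

```python
# USD cost per single compute unit, derived from the per-million rates.
RATES = {"input": 1.40 / 1_000_000, "output": 5.60 / 1_000_000}

def should_alert(records, budget_usd):
    """Sum metered cost across usage records and flag budget breaches."""
    spent = sum(
        r["input_units"] * RATES["input"] + r["output_units"] * RATES["output"]
        for r in records
    )
    return spent, spent >= budget_usd

# Two hypothetical logged requests: 1.0M input + 0.25M output units total.
records = [
    {"input_units": 400_000, "output_units": 100_000},
    {"input_units": 600_000, "output_units": 150_000},
]
spent, alert = should_alert(records, budget_usd=2.00)
# spent is $2.80 (1.0M * $1.40/M + 0.25M * $5.60/M), so alert fires.
```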
Monthly statements aggregate compute units, taxes, and payments so finance stays in control.
- Top up balance via Creem/Stripe or wire transfer.
- Download tax-compliant invoices for each billing cycle.
- Drill into per-conversation costs to optimize workloads.
Pay only for what you use
Transparent metered billing: $1.40 per million input compute units and $5.60 per million output compute units.
Ling 1T Pricing
Transparent metered billing designed for experimentation and scale. Usage is billed monthly in USD based on compute units (1M compute units ≈ 1M tokens).
Why teams choose Ling 1T metered pricing
- Single rate for chat, tool use, and API completions—no plan tiers or commitments.
- Usage is aggregated hourly with a 10K compute unit minimum (~10K tokens). Pause or resume whenever you need.
- Dashboards, usage webhooks, and budget alerts are included out of the box.
Cost snapshots
- 0.5M input compute units (~500K tokens) → $0.70.
- 0.25M input + 0.25M output compute units (~250K tokens each) → $1.75.
Prices exclude taxes. Monthly invoices are issued in USD.
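The snapshots above follow directly from the published rates. A quick check, with a hypothetical helper name:

```python
def cost_usd(input_units, output_units):
    """Metered cost at $1.40 per million input units and $5.60 per million
    output units (taxes excluded)."""
    return input_units / 1e6 * 1.40 + output_units / 1e6 * 5.60

first = cost_usd(500_000, 0)            # 0.5M input units -> $0.70
second = cost_usd(250_000, 250_000)     # 0.25M in + 0.25M out -> $1.75
```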
Ling 1T FAQ
Answers to the most common deployment and capability questions
Ready to build with Ling 1T?
Download the weights, connect to a managed provider, or embed Ling 1T into your stack today.