Ling 1T: Open-Source Trillion-Parameter Intelligence

A production-grade foundation model from Ant Group's Bailing team for advanced reasoning workloads.

Mixture-of-Experts architecture with 1/32 routing · 50B active parameters per token · FP8 training for rapid convergence · 32K–128K context via YaRN

Deploy Ling 1T under the MIT license to power quantitative research, enterprise analytics, multilingual assistants, and secure agent ecosystems.

Or launch a specialized Ling 1T assistant

Popular

Math Tutor

Excels at solving complex math problems, providing detailed steps and proofs

Popular

Code Expert

Proficient in multiple programming languages, providing high-quality code implementations and optimization advice

Front-End Designer

Creates attractive, practical front-end components and page layouts

Document Analyst

Handles reading, summarizing, translating, and analyzing long documents

Creative Writer

Produces copy, stories, and marketing content of all kinds

General Assistant

Handles general questions and everyday conversation

Explore curated Ling 1T prompts

Architecture Highlights

What sets Ling 1T apart from conventional large models

Engineered as a sparse Mixture-of-Experts system, Ling 1T balances trillion-parameter capacity with practical latency and enterprise governance.

Sparse MoE efficiency

Activates ~50B parameters per token with sigmoid expert gating to maintain speed without sacrificing depth.

  • 1/32 expert routing keeps inference efficient at trillion-parameter scale.
  • Only ~50B parameters activate per token while the full MoE remains available.
  • Maintains latency comparable to 50B dense models across production workloads.
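The sigmoid gating described above can be sketched in a few lines. This is a minimal illustration, not Ling 1T's actual router: the expert count (256) and top-k normalization are assumptions chosen to match the stated 1/32 routing ratio.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def route_token(logits, k):
    # Sigmoid gating scores each expert independently (unlike softmax,
    # scores do not compete), then the top-k experts are selected and
    # their gate values renormalized to mix the expert outputs.
    scores = sigmoid(logits)
    topk = np.argsort(scores)[-k:]
    weights = scores[topk] / scores[topk].sum()
    return topk, weights

rng = np.random.default_rng(0)
n_experts = 256                    # illustrative expert count
k = n_experts // 32                # 1/32 routing -> 8 experts per token
experts, weights = route_token(rng.normal(size=n_experts), k)
```

Because only k of the experts run per token, compute scales with the active parameters (~50B) rather than the full trillion-parameter pool.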

FP8 mixed precision

15% faster end-to-end training versus BF16 with negligible loss drift across 1T tokens.

  • FP8 stack delivers ~15% faster training vs BF16 with negligible loss drift.
  • YaRN scaling extends context from 32K to 128K without destabilizing gradients.
  • 1F1B pipeline maximizes GPU utilization while keeping memory predictable.

Extended context

A 32K-token default context extends to 128K with YaRN, so hierarchical documents remain intact.

  • QK normalization and fused kernels stabilize trillion-parameter optimization.
  • Checkpoint merging plus WSM scheduling smooths long-run convergence.
  • Integrated telemetry surfaces routing and safety signals for auditing.
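The 32K→128K YaRN extension can be pictured as selective interpolation of rotary (RoPE) frequencies: high-frequency channels are left alone, long-wavelength channels are stretched by the context ratio, with a linear blend in between. The sketch below is a deliberate simplification (thresholds and blend are assumptions), not Ling 1T's exact schedule.

```python
import numpy as np

def rope_freqs(dim, base=10000.0):
    # Standard RoPE inverse frequencies, one per channel pair.
    return base ** (-np.arange(0, dim, 2) / dim)

def yarn_scaled_freqs(dim, orig_ctx=32768, new_ctx=131072, base=10000.0):
    # Simplified YaRN-style scaling: channels whose wavelength fits many
    # times inside the original context keep their frequency; channels
    # whose wavelength spans the original context are interpolated by the
    # context ratio; a linear ramp blends the two regimes.
    scale = new_ctx / orig_ctx                 # 4x extension: 32K -> 128K
    freqs = rope_freqs(dim, base)
    wavelengths = 2 * np.pi / freqs
    lo, hi = orig_ctx / 32, orig_ctx           # assumed blend thresholds
    ramp = np.clip((wavelengths - lo) / (hi - lo), 0.0, 1.0)
    return freqs * (1 - ramp) + (freqs / scale) * ramp
```

Keeping the high-frequency channels untouched preserves short-range positional resolution, which is why this style of scaling avoids destabilizing gradients during the context extension.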

Stable optimization stack

WSM scheduling, checkpoint merging, and fused kernels maximize GPU utilization for large-scale runs.

  • Sentence-level LPO alignment keeps reasoning grounded and transparent.
  • Supports tool plans, long-context recall, and multilingual instruction following.
  • MIT license lets teams extend or self-host with full commercial freedom.

Where Ling 1T Excels

Operational scenarios proven by evaluation and production rollouts

Pair Ling 1T's trillion-parameter capacity with efficient 50B active routing to deliver disciplined reasoning, precise generation, and resilient tool use.

Quantitative Research

Reach 70.42% on AIME 2025 with concise derivations and reliable numeric reasoning across competition-grade problems.

Software Engineering Automation

Top LiveCodeBench performance enables Ling 1T to draft, refactor, and review production services with clear rationales.

Enterprise Knowledge Analysis

128K token context with YaRN unlocks long-form document parsing, synthesis, and regulatory reporting in one pass.

Financial Risk Intelligence

In production at Ant Group, Ling 1T combines market data interpretation with scenario planning for high-frequency decision support.

Multilingual Compliance

Ling 1T scores 92.19 on C-Eval, enabling policy reviews, training content, and localization across regulated markets.

Agent & Tool Orchestration

With ~70% accuracy on BFCL V3, Ling 1T coordinates tool calls, APIs, and workflows without heavy instruction fine-tuning.

How teams ship with Ling 1T

From prompt to production in three simple steps

Provision API access, instrument usage, and keep your budgets predictable with real-time metering.

1. Connect & configure

Request API credentials, drop the SDK into your stack, and tune decoding parameters per workspace.

  • Issue keys for staging and production from the console.
  • Set default temperature/top-p for your agents or copilots.
  • Monitor live latency and success metrics in the dashboard.
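A request from an agent or copilot can be assembled along these lines. The endpoint URL, model id (`ling-1t`), and body schema here are illustrative assumptions in the common chat-completions style, not the official API; substitute the values from your console.

```python
def build_request(prompt, temperature=0.7, top_p=0.95):
    # Assemble a chat-completion style request body with the workspace's
    # default decoding parameters. Field names are assumptions, not the
    # documented Ling 1T API schema.
    return {
        "model": "ling-1t",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "top_p": top_p,
    }

req = build_request("Summarise this 10-K filing.")
# Send with your HTTP client of choice, e.g. (placeholder URL):
# requests.post("https://api.example.com/v1/chat/completions",
#               json=req, headers={"Authorization": f"Bearer {API_KEY}"})
```

Keeping temperature/top-p in one builder function makes it easy to pin different defaults per workspace, as the step above suggests.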

2. Track usage automatically

Every request logs input/output tokens, cost in USD, and request metadata for later analysis.

  • Usage records stream into your billing view in real time.
  • Export CSV/JSON to plug into internal analytics or invoicing.
  • Trigger budget alerts or webhooks based on thresholds.
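A budget alert on top of those usage records can be as simple as summing logged costs against a threshold. The `cost_usd` field name and the 80% alert ratio are assumptions for illustration; the exported records may use different keys.

```python
def check_budget(usage_records, budget_usd, alert_ratio=0.8):
    # usage_records: per-request logs like those described above, each
    # assumed to carry a "cost_usd" field. Returns a simple status that
    # a webhook handler could act on.
    spent = sum(r["cost_usd"] for r in usage_records)
    if spent >= budget_usd:
        return "exceeded"
    if spent >= alert_ratio * budget_usd:
        return "alert"
    return "ok"
```

In practice you would run this against the streamed billing view and fire the webhook when the status changes.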

3. Settle invoices with confidence

Monthly statements aggregate compute units, taxes, and payments so finance stays in control.

  • Top up balance via Creem/Stripe or wire transfer.
  • Download tax-compliant invoices for each billing cycle.
  • Drill into per-conversation costs to optimize workloads.

Pay only for what you use

Transparent metered billing: $1.40 per million input compute units and $5.60 per million output compute units.

Pay-as-you-go

Ling 1T Pricing

Transparent metered billing designed for experimentation and scale. Usage is billed monthly in USD in compute units (1 compute unit ≈ 1 token).

Input usage: $1.40 per million compute units
Output usage: $5.60 per million compute units

Why teams choose Ling 1T metered pricing

  • Single rate for chat, tool use, and API completions—no plan tiers or commitments.
  • Usage is aggregated hourly with a 10K compute unit minimum (~10K tokens). Pause or resume whenever you need.
  • Dashboards, usage webhooks, and budget alerts are included out of the box.

Cost snapshots

  • 0.5M input compute units (~500K tokens) → $0.70.
  • 0.25M input + 0.25M output compute units (~250K tokens each) → $1.75.

Prices exclude taxes. Monthly invoices are issued in USD.
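The published rates make costs easy to estimate up front. This small helper reproduces the cost snapshots above from the $1.40/$5.60 per-million rates (pre-tax):

```python
RATE_IN = 1.40 / 1_000_000    # USD per input compute unit (~1 token)
RATE_OUT = 5.60 / 1_000_000   # USD per output compute unit

def cost_usd(input_units, output_units):
    # Metered cost before taxes at the published per-million rates.
    return input_units * RATE_IN + output_units * RATE_OUT

cost_usd(500_000, 0)          # 0.5M input units -> $0.70
cost_usd(250_000, 250_000)    # 0.25M input + 0.25M output -> $1.75
```

Note this ignores the 10K compute-unit hourly minimum mentioned above, which only matters for very small workloads.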

Ling 1T FAQ

Answers to the most common deployment and capability questions

Ready to build with Ling 1T?

Download the weights, connect to a managed provider, or embed Ling 1T into your stack today.

Contact the team