Analysis of Cloud Vendors' Core Ledger Architecture

Last updated: 2025-12-12 14:57:47

Internal Settlement Architecture of Cloud Service Providers: Engineering Constraints and Theoretical Realities

云服务商内部结算架构:工程约束与理论现实

I. Direct Conclusion: The Fundamental Anchor

一、直接结论(先给你定锚)

EN:

This report describes an industry reality that is public knowledge yet rarely explained in full. The conclusion comes first, followed by a layer-by-layer explanation grounded in engineering and theoretical constraints rather than in the cloud vendors' outward-facing narrative.

CN:

这是一个业内“公开但很少被完整讲清楚”的事实。我直接给出结论,然后从工程与理论约束逐层解释,而不是云厂商的对外叙述。

EN:

The cloud vendors' own "monetary ledger / core settlement / final resource billing ledger" is not, and theoretically cannot be, directly underpinned by a "purely distributed database."

CN:

云厂商自己的“钱账 / 核心结算 / 资源计费最终账本”,并不是、也不能是一个“纯分布式数据库”在直接承担。

EN:

This remains true even though these vendors:

Actively sell distributed database products to their customers;

Encourage customers to design systems that rely on eventual consistency and partition tolerance;

Emphasize scale and elasticity in their published papers.

CN:

即使它们:

向客户出售分布式数据库

鼓励客户用最终一致、分区容忍

在论文里强调规模与弹性

EN:

However, within their internal systems—specifically those where "an error of a single cent is strictly prohibited"—there inevitably exists a system characterized by:

Strong Consistency: Ensuring all nodes see the same data at the same time.

Linearizability: Guaranteeing that operations appear to occur instantaneously at some point between their invocation and response.

A Definite Total Order Commit Point: A single place where every transaction is assigned its position in one global sequence.

Clear Boundaries of Responsibility: Unambiguous definition of liability.

CN:

但它们内部真正“不能错 1 分钱”的系统,一定存在一个:

强一致

可线性化

明确的全序提交点

明确的责任边界

II. Why "Pure Distributed Databases" Are Unsuitable for Internal Settlement

二、为什么“纯分布式数据库”不适合内部结算

EN:

In this context, the term "purely distributed" refers to systems exhibiting the following characteristics:

Multi-Active / Active-Active Architectures: Allowing writes to be accepted at multiple locations simultaneously.

Eventual Consistency / Configurable Consistency: Where data may be temporarily inconsistent across nodes.

Continuous Write Availability During Network Partitions: The system continues to accept data even when communication between nodes is broken.

Absence of a Single Authoritative Commit Point: Lacking a central authority to determine the final state.

CN:

这里的“纯分布式”指的是:

多活

最终一致 / 可配置一致

网络分区下继续写

没有单一权威提交点

1. The Constraints of Settlement Systems Differ Fundamentally from User Business Logic

1️⃣ 结算系统的约束与用户业务完全不同

EN:

Internal settlement systems are governed by four non-negotiable hard constraints:

CN:

内部结算具备四个不可妥协的硬约束:

EN:

👉 These four constraints are theoretically in conflict with the principles of "high-availability distributed writes."

CN:

👉 这四点与“高可用分布式写入”在理论上是冲突的。

2. The CAP Theorem is Not a Philosophical Question, But a Legal One

2️⃣ CAP 不是哲学问题,而是法律问题

EN:

In typical user business scenarios:

"Transient inconsistency" is generally acceptable.

"Correction at a later time" is generally acceptable.

CN:

在用户业务里:

“短暂不一致”可以接受

“稍后修正”可以接受

EN:

However, within the internal operations of cloud vendors:

The ledger is not merely a representation of business state; it constitutes a legal fact.

The following anomalies are strictly prohibited:

Double Commit: Applying the same transaction twice.

Double Rollback: Reverting a transaction more than once or erroneously.

Ambiguous Attribution: Uncertainty regarding ownership or liability.

CN:

在云厂商内部:

账本不是业务状态,是法律事实

不允许:

双重提交

双重回滚

模糊归属

EN:

Once financial auditing is involved, the architectural choice within the CAP theorem is not AP (Availability + Partition Tolerance), but rather Strong CP (Consistency + Partition Tolerance) combined with a Definite Arbiter.

CN:

一旦涉及财务审计,CAP 的选择不是 AP,而是 强 CP + 明确裁决者。
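A minimal sketch of what "Strong CP plus a definite arbiter" can mean on the write path. The `Arbiter` class, the `QuorumUnavailable` exception, and the `reachable_replicas` parameter are hypothetical names chosen for illustration; they do not describe any vendor's actual system. The point is only that the arbiter either commits at a definite position in one order or refuses outright.

```python
class QuorumUnavailable(Exception):
    """Raised when a write cannot reach a majority of replicas."""


class Arbiter:
    """Hypothetical single commit authority: it commits or it refuses."""

    def __init__(self, replica_count: int) -> None:
        self.replica_count = replica_count
        self.committed: list[str] = []

    def commit(self, entry: str, reachable_replicas: int) -> int:
        majority = self.replica_count // 2 + 1
        if reachable_replicas < majority:
            # CP choice: give up availability rather than accept a write
            # that could conflict with the other side of a partition.
            raise QuorumUnavailable(
                f"need {majority} acks, only {reachable_replicas} reachable"
            )
        self.committed.append(entry)
        # The returned number is the entry's position in the total order.
        return len(self.committed)
```

With five replicas, a write that reaches three of them commits, while a write that reaches only two is rejected and leaves the committed log unchanged.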

III. The Actual Architecture Adopted by Cloud Vendors (Critical Section)

三、云厂商真实采用的架构(非常关键)

⚠️ Note

⚠️ 注意

EN:

The following description represents the consensus structure at the level of engineering facts, distinct from the marketing architecture diagrams typically presented by any specific cloud provider.

CN:

下面是工程事实层面的共识结构,不是某一家云的营销架构图。

1. Frontend Layer: Highly Distributed, Eventually Consistent, Tolerant of Failure

1️⃣ 前端:高度分布式、最终一致、可失败

EN:

This layer comprises components distributed across:

Various Regions

Various Availability Zones (AZs)

Various Services

Various Agents

CN:

各区域

各 AZ

各服务

各代理

EN:

These components are responsible for:

Collecting usage metrics

Estimating incurred costs

Caching billing events

Temporarily displaying billing information

CN:

它们负责:

采集用量

估算费用

缓存计费事件

临时展示账单

EN:

At this layer, the system architecture permits:

Backfilling lost events after the fact

Delayed delivery

Retries

Data de-duplication

CN:

这里可以:

丢包后补

延迟

重试

去重
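The retry and de-duplication behaviour of this layer can be sketched as follows. This is an illustration under stated assumptions: `FlakyCollector` and `send_with_retry` are hypothetical names, and the "lost acknowledgement" is simulated with a simple counter. The sender reuses the same event ID on every attempt, so the collector can store the event once no matter how many times it arrives.

```python
class FlakyCollector:
    """Hypothetical collector whose acknowledgements get lost a fixed number of times."""

    def __init__(self, lost_acks: int) -> None:
        self.lost_acks = lost_acks
        self.received: dict[str, int] = {}  # event_id -> usage units
        self.deliveries = 0

    def submit(self, event_id: str, units: int) -> bool:
        self.deliveries += 1
        # De-duplicate on event ID: a retried event is stored only once.
        self.received.setdefault(event_id, units)
        if self.lost_acks > 0:
            self.lost_acks -= 1
            return False  # the event arrived, but the ack was lost
        return True


def send_with_retry(
    collector: FlakyCollector, event_id: str, units: int, max_attempts: int = 5
) -> int:
    """Resend the same event ID until acknowledged; return the attempts used."""
    for attempt in range(1, max_attempts + 1):
        if collector.submit(event_id, units):
            return attempt
    raise RuntimeError("collector never acknowledged the event")
```

Because duplicates are harmless here, the frontend can afford to fail and retry freely; nothing at this layer is treated as the final account.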

2. Middleware Layer: Event Aggregation + De-duplication + Validation

2️⃣ 中间层:事件归并 + 去重 + 校验

EN:

All incoming billing events are transformed into an immutable event stream.

Each event is characterized by:

A Unique Identifier (ID)

A Timestamp

A Source Indication

A signature or checksum used for validation

The system allows for events to arrive out of order.

CN:

所有计费事件变成 不可变事件流

每条事件有:

唯一 ID

时间戳

来源

签名 / 校验

允许乱序到达

EN:

However:

This layer is not yet the "Final Ledger."

It serves merely as the "Input Material for the Ledger."

CN:

但:

这里仍然不是“最终账本”

只是“账本输入材料”
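A sketch of the middleware step described above, assuming an HMAC-based signature and integer cent amounts. The `BillingEvent` fields, the `SECRET` key, and the helper names are illustrative assumptions, not a real vendor schema. Events are immutable, carry an ID, timestamp, source, and signature, may arrive out of order, and are de-duplicated and validated before being handed on as ledger input.

```python
import hashlib
import hmac
from dataclasses import dataclass, replace

SECRET = b"demo-key"  # assumption: a shared signing key, for illustration only


@dataclass(frozen=True)  # frozen: an event cannot be modified after creation
class BillingEvent:
    event_id: str
    timestamp: int  # e.g. epoch milliseconds
    source: str
    amount_cents: int
    signature: str = ""

    def payload(self) -> bytes:
        return f"{self.event_id}|{self.timestamp}|{self.source}|{self.amount_cents}".encode()


def sign(event: BillingEvent) -> BillingEvent:
    """Return a copy of the event carrying its HMAC-SHA256 signature."""
    sig = hmac.new(SECRET, event.payload(), hashlib.sha256).hexdigest()
    return replace(event, signature=sig)


def is_valid(event: BillingEvent) -> bool:
    expected = hmac.new(SECRET, event.payload(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, event.signature)


def normalize(events: list[BillingEvent]) -> list[BillingEvent]:
    """Drop invalid and duplicate events, then order by (timestamp, event_id)."""
    seen: dict[str, BillingEvent] = {}
    for e in events:
        if is_valid(e) and e.event_id not in seen:
            seen[e.event_id] = e
    return sorted(seen.values(), key=lambda e: (e.timestamp, e.event_id))
```

The output of `normalize` is still only input material: it has been cleaned and ordered, but nothing has been committed to a balance yet.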

3. Core Layer: The Strong Consistency "Final Ledger System"

3️⃣ 核心层:强一致的“最终账本系统”

EN:

This is the critical juncture of the architecture.

CN:

这是关键点。

Characteristics are Almost Identical Across Vendors:

特征几乎一致:

EN:

Single Master / Definite Master: A clear, singular authority for writes.

Total Order Writes: All transactions are processed in a strictly defined sequence.

Strict Transactions: Adherence to ACID properties without compromise.

Strong Audit Logs: Comprehensive and immutable recording of all actions.

Replayable: The ability to reconstruct the state from logs deterministically.

CN:

单主 / 明确主

全序写入

严格事务

强审计日志

可重放

EN:

The implementation methods may vary, but typically include:

High-reliability Relational Database Management Systems (internal systems)

Customized Transaction Processing Systems

Mainframe-based architectures

Or the cloud vendor's own "extremely conservative" internal databases (distinct from the commercial products sold externally).

CN:

实现方式可能是:

高可靠关系数据库(内部系统)

定制事务系统

或大型主机体系

或云厂商自己“极其保守”的内部数据库(不是对外卖的那种)

EN:

👉 This specific step will never operate in an AP (Availability first) mode, nor will it allow multi-active writes.

CN:

👉 这一步,绝不会是 AP 模式,也不会允许多活写。
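The listed properties (single writer, total order, strict transactions, audit log, replayability) can be illustrated with a toy in-memory ledger. Everything here is an assumption made for the sketch: the `Ledger` and `Entry` names, the no-overdraft rule, and in particular the hash chain, which is one common way to make a log tamper-evident and is not claimed to be what any vendor uses.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    seq: int
    account: str
    delta_cents: int
    prev_hash: str
    entry_hash: str


def _digest(seq: int, account: str, delta_cents: int, prev_hash: str) -> str:
    return hashlib.sha256(f"{seq}|{account}|{delta_cents}|{prev_hash}".encode()).hexdigest()


class Ledger:
    """Hypothetical single-writer ledger: one append path, one total order."""

    def __init__(self) -> None:
        self.log: list[Entry] = []
        self.balances: dict[str, int] = {}

    def append(self, account: str, delta_cents: int) -> Entry:
        new_balance = self.balances.get(account, 0) + delta_cents
        if new_balance < 0:
            # Strict transaction: reject before anything is written.
            raise ValueError("would overdraw; transaction rejected")
        seq = len(self.log) + 1  # position in the total order
        prev_hash = self.log[-1].entry_hash if self.log else "genesis"
        entry = Entry(seq, account, delta_cents, prev_hash,
                      _digest(seq, account, delta_cents, prev_hash))
        self.log.append(entry)
        self.balances[account] = new_balance
        return entry


def verify_chain(log: list[Entry]) -> bool:
    """Check sequence numbers and hash links; any edited entry breaks the chain."""
    prev = "genesis"
    for i, e in enumerate(log, start=1):
        if e.seq != i or e.prev_hash != prev:
            return False
        if e.entry_hash != _digest(e.seq, e.account, e.delta_cents, e.prev_hash):
            return False
        prev = e.entry_hash
    return True


def replay(log: list[Entry]) -> dict[str, int]:
    """Rebuild balances from the log alone, deterministically."""
    balances: dict[str, int] = {}
    for e in log:
        balances[e.account] = balances.get(e.account, 0) + e.delta_cents
    return balances
```

Replaying the log always reproduces the live balances, and altering any historical entry causes `verify_chain` to fail, which is the property an auditor cares about.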

IV. Why Is "Just Using Raft / Paxos" Insufficient?

四、为什么不能“用 Raft / Paxos 就好了”?

EN:

This question represents a common intuitive misconception held by many engineers.

CN:

这是很多工程师的直觉误区。

What Do Raft / Paxos Actually Guarantee?

Raft / Paxos 能保证什么?

EN:

Provided the algorithm is correctly implemented;

Provided no more than a minority of nodes fail;

They guarantee log consistency: every replica applies the same entries in the same order.

CN:

在“假设正确实现”的前提下

在“可容忍少数节点失效”的前提下

保证日志一致
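The "minority of nodes" condition comes down to quorum arithmetic: any two majorities of the same cluster share at least one node, so two conflicting decisions can never both gather a quorum. The brute-force check below illustrates this for small clusters; the function names are chosen for this sketch and are not part of any Raft or Paxos library.

```python
from itertools import combinations


def majority(n: int) -> int:
    """Smallest group size that is more than half of n nodes."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """How many nodes may fail while a majority can still be formed."""
    return n - majority(n)


def quorums_intersect(n: int, quorum_size: int) -> bool:
    """True if every pair of quorums of the given size shares a node."""
    quorums = [set(q) for q in combinations(range(n), quorum_size)]
    return all(a & b for a, b in combinations(quorums, 2))
```

For a five-node cluster the majority is three and two failures are tolerated. With four nodes, quorums of size two can be disjoint, which is why exactly half is not enough.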

Why This Is Not Enough for Internal Cloud Settlement:

但在云厂商内部结算中,还不够:

❌ Issue 1: Network Partitioning in Reality ≠ Theoretical Partitioning

❌ 问题 1:网络分区 ≠ 理论分区

EN:

Cloud vendors must contend with catastrophic physical events:

Large-scale fiber optic cable failures

Degradation of intercontinental communication links

Anomalies at the Control Plane level

These events trigger complex failure states:

Extended periods where the leader node is unreachable

Quorum jitter (instability in establishing a majority)

CN:

云厂商要处理的是:

大规模光纤故障

跨洲链路退化

控制平面级别异常

这类事件会触发:

长时间 leader 不可达

仲裁抖动

EN:

The ledger cannot simply "wait for the network to recover before deciding who is right or wrong."

CN:

账本不能“等网络恢复再决定谁对谁错”。

❌ Issue 2: Consistency ≠ Auditability

❌ 问题 2:一致 ≠ 可审计

EN:

Distributed log consistency does not equate to:

Legally enforceable traceability of accounts

The capability to recalculate balances based on a specific point in time

The ability to prove that "this is the final, authoritative version"

CN:

分布式日志一致,不等于:

法律意义上的账目可追溯

可按时间点重算

可证明“这是最终版本”

EN:

Internal settlement systems require a higher standard:

"Even if the entire system fails, I must be able to use the logs and rules to deductively derive the single, unique, and correct result."

CN:

内部结算需要的是:

“即使全系统都挂了,我也能用日志和规则,推导出唯一正确结果。”
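Recalculating balances as of a specific point can be sketched as a pure function over the log. The `LogRecord` shape and the use of a sequence number as the "point in time" are assumptions for illustration. Because the function depends only on the records and a fixed rule, any two parties replaying the same log reach the same answer, regardless of the order in which they happened to receive the records.

```python
from typing import Iterable, NamedTuple


class LogRecord(NamedTuple):
    seq: int  # position in the ledger's total order
    account: str
    delta_cents: int


def balances_as_of(log: Iterable[LogRecord], seq_limit: int) -> dict[str, int]:
    """Recompute balances using only records with seq <= seq_limit."""
    balances: dict[str, int] = {}
    for rec in sorted(log, key=lambda r: r.seq):
        if rec.seq > seq_limit:
            break
        balances[rec.account] = balances.get(rec.account, 0) + rec.delta_cents
    return balances
```

No running service is needed for this calculation, only the log and the rule, which matches the requirement stated in the quotation above.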

❌ Issue 3: Unclear Liability Boundaries Are a Disaster

❌ 问题 3:责任边界不清晰是灾难

EN:

In AP (Availability/Partition Tolerance) or Multi-Active systems:

Who is responsible for the final adjudication of the state?

Who bears the responsibility for incorrect account entries?

What is the definitive basis for a rollback operation?

CN:

在 AP / 多活系统中:

谁负责最终裁决?

谁对错误账目负责?

回滚依据是什么?

EN:

Cloud vendors cannot expose the "gray areas of the CAP theorem" to their Legal and Finance departments.

CN:

云厂商不能把“CAP 的灰色地带”暴露给法务与财务。

V. Why Do They Sell "Distributed Databases" But Not Use Them Internally?

五、为什么对外卖“分布式数据库”,自己不用?

EN:

This is a very realistic, yet rarely explicitly stated point:

CN:

这是非常现实、但很少直说的一点:

EN:

A failure in a customer's business is classified as an "SLA (Service Level Agreement) Event";

A failure in the cloud vendor's own ledger is classified as a "Financial Accident / Legal Accident."

CN:

客户业务的失败,叫“SLA 事件”;

云厂商自己账本的失败,叫“财务事故 / 法律事故”。

EN:

The risk levels associated with these two scenarios are completely different.

CN:

风险等级完全不同。

EN:

Therefore:

External Facing: Focus on configurable consistency, eventual consistency, and cross-region capabilities.

Internal Facing: Focus on conservatism, determinism, and a single source of truth.

CN:

所以:

对外:可配置一致性、最终一致、跨区域

对内:保守、确定、单一事实源

VI. A Crucial Cognitive Turning Point

六、一个非常重要的认知转折点

EN:

"Large Scale" does not equate to "Suitable for Distributed Writes."

CN:

“规模大”不等于“适合分布式写入”。

EN:

Search Indices: Suitable

Log Analysis: Suitable

Image Storage: Suitable

User Content: Suitable

CN:

搜索索引:适合

日志分析:适合

图片存储:适合

用户内容:适合

EN:

But:

Money

Inventory

Clearing / Settlement

Core Counters

CN:

但:

钱

库存

清算

核心计数器

EN:

👉 The essence of these problems is not throughput, but "Adjudication."

CN:

👉 这些问题的本质不是吞吐,而是“裁决”。
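A toy comparison showing why inventory is an adjudication problem rather than a throughput problem. Both functions are hypothetical simplifications: the first models replicas that each check a stale local copy and merge their deltas later, the second models one decision point that sees orders one at a time.

```python
def multi_active_sell(stock: int, orders_per_replica: list[int]) -> int:
    """Each replica checks its own stale copy; accepted deltas are merged later."""
    merged = stock
    for units in orders_per_replica:
        if units <= stock:  # local check against the stale copy passes
            merged -= units  # the merge applies every locally accepted order
    return merged


def single_arbiter_sell(stock: int, orders: list[int]) -> tuple[int, list[bool]]:
    """One decision point processes the orders in a single total order."""
    decisions: list[bool] = []
    for units in orders:
        ok = units <= stock
        if ok:
            stock -= units
        decisions.append(ok)
    return stock, decisions
```

With one unit in stock and two replicas each selling one unit, the multi-active version ends at a stock of -1, while the single arbiter accepts the first order and rejects the second. No amount of extra throughput fixes the first case; someone has to decide which order wins.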

VII. One-Sentence Summary (Please Remember This)

七、一句话总结(请记住这句话)

EN:

Cloud vendors are not "ignorant of distributed systems," but rather they are acutely aware:

Some problems must have a place where "the final decision is made."

CN:

云厂商不是“不懂分布式”,而是非常清楚:

有些问题,必须有一个“最后说了算”的地方。

EN:

Therefore:

They delegate "Scale" to distributed systems.

They reserve "Truth" for the strong consistency ledger.

CN:

所以:

他们把“规模”留给分布式

把“真相”留给强一致账本

EN:

This is not a sign of backwardness, but the hallmark of mature engineering.

CN:

这不是落后,而是成熟工程的标志。

Postscript: Future Analytical Directions

后续讨论选项

EN:

If you are willing, I can proceed to delve deeper into one of the following directions:

CN:

如果你愿意,下一步我可以继续深入其中一个方向:

EN:

Why "Money / Inventory / Counters" are theoretically incapable of infinite horizontal scaling.

Why core banking ledgers remain "Centralized + Extremely Reliable" to this day.

Why many "Global Multi-Active Ledgers" ultimately revert to a single adjudication point.

CN:

1️⃣ 为什么“钱 / 库存 / 计数器”在理论上不可无限横向扩展

2️⃣ 为什么银行核心账本至今仍然是“集中式 + 极端可靠”

3️⃣ 为什么很多“全球多活账本”最终都会回退到单裁决点

EN:

You may select one, and I can continue to explain down to the "paper-grade + engineering-grade" details.

CN:

你选一个,我可以继续讲到“论文级 + 工程级”细节。