NoSQL Enterprise Application and Integration Guide

Last updated: 2025-11-28 22:45:06

Comprehensive Analysis of Converged Database Architectures and NoSQL Interoperability

融合数据库架构与NoSQL互操作性的综合分析

Executive Summary

执行摘要

[EN]

The contemporary landscape of data management is undergoing a profound structural shift, moving away from the rigid dichotomy of Relational Database Management Systems (RDBMS) and isolated NoSQL silos toward "converged" and "multi-model" architectures. This report provides an exhaustive technical analysis of this evolution, synthesizing data from Oracle, MongoDB, Redis, ArangoDB, Couchbase, DataStax, and IBM. The analysis reveals a distinct trend: established enterprise vendors are aggressively integrating NoSQL paradigms—such as JSON document storage, eventual consistency, and sharding—directly into their mature SQL engines. Simultaneously, specialized NoSQL vendors are fortifying their platforms with enterprise-grade features like ACID transactions and SQL-compatible query languages. This convergence is driven by the necessity to reduce operational complexity, eliminate data silos, and support diverse workloads (document, graph, relational, key-value) within a unified, secure infrastructure.

[CN]

当代数据管理格局正经历着深刻的结构性转变,逐渐远离关系型数据库管理系统(RDBMS)与孤立的 NoSQL 数据孤岛之间的僵化二元对立,转向“融合”和“多模型”架构。本报告对这一演变进行了详尽的技术分析,综合了来自 Oracle、MongoDB、Redis、ArangoDB、Couchbase、DataStax 和 IBM 的数据。分析揭示了一个明显的趋势:成熟的企业级供应商正在积极将 NoSQL 范式——例如 JSON 文档存储、最终一致性和分片——直接整合到其成熟的 SQL 引擎中。与此同时,专业的 NoSQL 供应商正在通过 ACID 事务和兼容 SQL 的查询语言等企业级功能来强化其平台。这种融合是由减少操作复杂性、消除数据孤岛以及在统一、安全的基础设施中支持多样化工作负载(文档、图、关系、键值)的需求所驱动的。

1. Oracle Database API for MongoDB: Architecture and Implementation

1. Oracle Database API for MongoDB:架构与实现

[EN]

The Oracle Database API for MongoDB represents a strategic architectural bridge designed to enable seamless interoperability between the developer-centric MongoDB ecosystem and the operationally robust Oracle Database environment. This is not merely a data migration utility; rather, it is a sophisticated protocol emulation layer that allows existing applications and drivers to communicate with an Oracle Database as if it were a MongoDB cluster.1 By intercepting and translating standard MongoDB wire protocol commands into Oracle's native SQL and JSON processing operations, the API preserves the developer experience while leveraging the underlying storage and security mechanisms of the Oracle Autonomous Database.

[CN]

Oracle Database API for MongoDB 代表了一座战略性的架构桥梁,旨在实现以开发者为中心的 MongoDB 生态系统与运行稳健的 Oracle 数据库环境之间的无缝互操作性。这不仅仅是一个数据迁移工具;相反,它是一个复杂的协议仿真层,允许现有的应用程序和驱动程序像与 MongoDB 集群通信一样与 Oracle 数据库进行通信 1。通过拦截标准的 MongoDB 有线协议命令并将其转换为 Oracle 原生的 SQL 和 JSON 处理操作,该 API 在保留开发者体验的同时,利用了 Oracle 自治数据库(Oracle Autonomous Database)底层存储和安全机制。

1.1 Architectural Mechanism and Protocol Translation

1.1 架构机制与协议转换

[EN]

The core functionality of the Oracle Database API for MongoDB relies on the "converged database" capabilities of the Oracle Autonomous Database, which can manage multiple data types—including Relational, JSON, Graph, and Spatial—within a single database kernel.2 This integration allows for high-fidelity SQL interoperability, enabling users to execute SQL queries or updates against JSON data that was originally ingested via MongoDB drivers.2 When a MongoDB command is received, the API translates it. For example, complex aggregation pipelines are analyzed; stages like $match, $project, $limit, and $sort are converted into equivalent SQL execution plans. However, distinct limitations exist based on the database version: if the Oracle Database parameter compatible is set to less than 23, the API cannot utilize the native translation for MongoDB aggregation pipelines and must resort to alternative, potentially less optimized execution paths.3 This highlights the deep dependency of the API on the evolving features of the Oracle Database kernel.
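The translation step described above can be pictured with a toy translator. This is a minimal sketch, not Oracle's translator: the table layout (a JSON column named `data`), the function name `pipeline_to_sql`, and the shape of the generated SQL are assumptions made for illustration. Only equality `$match`, inclusion `$project`, `$sort`, and `$limit` are handled, and stages are assumed to arrive in that order.

```python
def json_path(field):
    # SQL/JSON expression that reads one top-level field from the JSON column.
    return "JSON_VALUE(data, '$." + field + "')"


def pipeline_to_sql(table, pipeline):
    """Translate a tiny subset of aggregation stages into (sql, binds).

    Hypothetical helper: the real API builds its own execution plans.
    """
    columns, predicates, ordering, binds = [], [], [], []
    limit = None
    for stage in pipeline:
        (name, spec), = stage.items()  # each stage is a single-key dict
        if name == "$match":
            for field, value in spec.items():
                binds.append(value)  # values travel as bind variables
                predicates.append(json_path(field) + " = :" + str(len(binds)))
        elif name == "$project":
            columns = [json_path(f) + ' AS "' + f + '"'
                       for f, keep in spec.items() if keep]
        elif name == "$sort":
            ordering = [json_path(f) + (" DESC" if direction < 0 else " ASC")
                        for f, direction in spec.items()]
        elif name == "$limit":
            limit = int(spec)
        else:
            raise ValueError("unsupported stage: " + name)

    sql = "SELECT " + (", ".join(columns) or "data") + " FROM " + table
    if predicates:
        sql += " WHERE " + " AND ".join(predicates)
    if ordering:
        sql += " ORDER BY " + ", ".join(ordering)
    if limit is not None:
        sql += " FETCH FIRST " + str(limit) + " ROWS ONLY"
    return sql, binds
```

Passing values as binds rather than inlining them keeps the sketch free of quoting problems. A stage outside the supported subset raises an error, loosely mirroring the idea that some pipelines cannot take the native translation path.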

[CN]

Oracle Database API for MongoDB 的核心功能依赖于 Oracle 自治数据库的“融合数据库”能力,该数据库可以在单个数据库内核中管理多种数据类型——包括关系型、JSON、图和空间数据 2。这种集成实现了高保真的 SQL 互操作性,使用户能够对最初通过 MongoDB 驱动程序摄入的 JSON 数据执行 SQL 查询或更新 2。当接收到 MongoDB 命令时,API 会对其进行转换。例如,复杂的聚合管道会被分析;诸如 $match、$project、$limit 和 $sort 等阶段会被转换为等效的 SQL 执行计划。然而,基于数据库版本存在明显的限制:如果 Oracle 数据库参数 compatible 设置为低于 23,API 无法利用针对 MongoDB 聚合管道的原生转换,必须采用替代的、可能优化程度较低的执行路径 3。这凸显了该 API 对 Oracle 数据库内核不断演进特性的深度依赖。

[EN]

Furthermore, the architecture enables the utilization of specific Oracle optimizations that act as functional equivalents to MongoDB concepts. The API supports specialized "hints" such as $native and $service (specifically $service hint 3.6), which allow developers to influence the execution strategy of the underlying SQL queries generated by the API.4 This provides a mechanism for tuning performance that bridges the declarative nature of MongoDB queries with the cost-based optimization of the Oracle SQL engine.

[CN]

此外,该架构允许利用特定的 Oracle 优化措施,这些措施充当了 MongoDB 概念的功能等价物。API 支持诸如 $native 和 $service(特别是 $service hint 3.6)等专用“提示”(hints),允许开发者影响由 API 生成的底层 SQL 查询的执行策略 4。这提供了一种性能调优机制,将 MongoDB 查询的声明性本质与 Oracle SQL 引擎的基于成本的优化连接起来。

1.2 Configuration, Security, and Role Management

1.2 配置、安全与角色管理

[EN]

Deploying the MongoDB API within the Oracle Autonomous Database adheres to a "secure by default" philosophy, requiring explicit administrative actions to enable connectivity. Access must first be provisioned at the network level. Subsequently, the MongoDB API itself is not active out-of-the-box; administrators must navigate to the "Tool configuration" tab on the Autonomous Database details page and explicitly toggle the specific tool configuration to "Enabled".2 This separation of network access and protocol activation provides granular control over the attack surface.

[CN]

在 Oracle 自治数据库中部署 MongoDB API 遵循“默认安全”的理念,要求显式的管理操作来启用连接。首先必须在网络层面配置访问权限。随后,MongoDB API 本身并不是开箱即用的;管理员必须导航至自治数据库详情页面的“工具配置”(Tool configuration)选项卡,并将特定的工具配置显式切换为“已启用”(Enabled)2。网络访问与协议激活的分离提供了对攻击面的细粒度控制。

[EN]

Authentication and authorization leverage Oracle's robust user management system but require specific role assignments to bridge the relational and document models. A user intending to connect via the MongoDB API must be granted the SODA_APP role in addition to standard connection privileges.2 The Simple Oracle Document Access (SODA) framework underpins the JSON storage mechanism, and this role assignment effectively links the user identity to the SODA capabilities required for handling JSON collections. Furthermore, because standard MongoDB connections typically occur on port 27017, the Oracle implementation involves mapping these connections through secure OCI load balancers or dedicated ports that enforce TLS encryption, ensuring data-in-motion security.2
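As a rough illustration of the connection requirements above, the sketch below assembles a MongoDB-style connection URI for a user that has already been granted SODA_APP. The host name, the user name, and the helper `mongo_api_uri` are hypothetical, and the query options (PLAIN authentication against `$external`, TLS on, retryable writes off, load-balanced mode) are assumptions based on the pattern commonly shown for Autonomous Database. The connection string displayed in your own console is the authority.

```python
from urllib.parse import quote_plus, urlencode

# Run by an administrator beforehand (user name is hypothetical):
#   GRANT SODA_APP TO app_user;


def mongo_api_uri(host, user, password, port=27017):
    """Build a connection URI; the option values are assumptions, see lead-in."""
    options = {
        "authMechanism": "PLAIN",
        "authSource": "$external",
        "tls": "true",            # connections must be encrypted in transit
        "retryWrites": "false",
        "loadBalanced": "true",
    }
    query = urlencode(options, safe="$")  # keep "$external" readable
    # Percent-encode credentials so characters such as "@" cannot break the URI.
    return "mongodb://{u}:{p}@{h}:{port}/{u}?{q}".format(
        u=quote_plus(user), p=quote_plus(password), h=host, port=port, q=query
    )
```

The resulting string can be handed to any standard MongoDB driver. Nothing here is specific to a driver, which is the point of a wire-protocol-compatible API.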

[CN]

身份验证和授权利用了 Oracle 强大的用户管理系统,但需要特定的角色分配来桥接关系模型和文档模型。意图通过 MongoDB API 连接的用户,除了标准的连接权限外,必须被授予 SODA_APP 角色 2。Simple Oracle Document Access (SODA) 框架支撑着 JSON 存储机制,该角色分配有效地将用户身份与处理 JSON 集合所需的 SODA 能力联系起来。此外,由于标准的 MongoDB 连接通常发生在端口 27017 上,Oracle 的实现涉及通过安全的 OCI 负载均衡器或强制执行 TLS 加密的专用端口来映射这些连接,从而确保数据传输过程中的安全性 2。

1.3 Mapping MongoDB Concepts to Oracle Structures

1.3 MongoDB 概念到 Oracle 结构的映射

[EN]

To achieve compatibility, the API enforces a rigorous mapping between MongoDB's flexible hierarchy and Oracle's structured schema objects.4 A MongoDB database corresponds to an Oracle Database schema or a logical container within a schema. A MongoDB collection is mapped to an Oracle table, often implemented as a "mapped collection" or supported by "duality views" in newer iterations.4 The MongoDB document becomes a row in the table, typically stored in a high-performance binary JSON column (OSON format).

[CN]

为了实现兼容性,API 在 MongoDB 的灵活层级结构与 Oracle 的结构化模式对象之间实施了严格的映射 4。MongoDB 的 database(数据库)对应于 Oracle 数据库模式(schema)或模式内的逻辑容器。MongoDB 的 collection(集合)映射为 Oracle 表,通常实现为“映射集合”(mapped collection)或在较新迭代中由“对偶视图”(duality views)支持 4。MongoDB 的 document(文档)变成表中的一行,通常存储在高性能的二进制 JSON 列(OSON 格式)中。

[EN]

The handling of the _id field serves as a critical integration point. In MongoDB, _id is the primary key and can be of various types. In the Oracle mapping, this field is strictly managed to enforce uniqueness constraints compatible with relational primary keys. The system supports the conversion of the BSON _id field and handles various BSON scalar types to ensure data fidelity during the translation process.4 The mention of "duality views" is particularly significant; JSON Relational Duality, introduced in recent Oracle versions, allows data to be physically stored in normalized relational tables while being logically accessed and manipulated as JSON documents.4 This architecture permits the API to serve applications that expect a document model while simultaneously allowing other applications to access the same data via optimized SQL over relational tables.
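The collection-to-table mapping and the role of `_id` can be sketched with an in-memory SQLite database standing in for the relational engine. This is only an analogy under stated assumptions: the table and column names are invented, the JSON is stored as text rather than OSON, and `_id` is coerced to a string, which a real mapping that preserves BSON types would not do.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# One collection -> one table; one document -> one row; _id -> primary key.
conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, data TEXT NOT NULL)")


def insert_document(conn, collection, doc):
    # The primary-key constraint enforces _id uniqueness, as in the mapping above.
    conn.execute(
        "INSERT INTO " + collection + " (id, data) VALUES (?, ?)",
        (str(doc["_id"]), json.dumps(doc)),
    )


def find_by_id(conn, collection, doc_id):
    row = conn.execute(
        "SELECT data FROM " + collection + " WHERE id = ?", (str(doc_id),)
    ).fetchone()
    return json.loads(row[0]) if row else None
```

Inserting a second document with an existing `_id` raises `sqlite3.IntegrityError`, the relational counterpart of a duplicate-key error in a document store.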

[CN]

对 _id 字段的处理作为一个关键的集成点。在 MongoDB 中,_id 是主键,可以是多种类型。在 Oracle 映射中,该字段受到严格管理,以强制执行与关系型主键兼容的唯一性约束。系统支持 BSON _id 字段的转换,并处理各种 BSON 标量类型,以确保转换过程中的数据保真度 4。关于“对偶视图”的提及尤为重要;在 Oracle 近期版本中引入的 JSON 关系对偶性(JSON Relational Duality),允许数据物理存储在规范化的关系表中,而在逻辑上作为 JSON 文档进行访问和操作 4。这种架构允许 API 服务于期望文档模型的应用程序,同时允许其他应用程序通过针对关系表的优化 SQL 访问相同的数据。

[EN]

The integration capabilities extend to hybrid operations via specific aggregation pipeline stages. The $sql stage is a unique Oracle extension that allows developers to embed raw SQL commands directly within a MongoDB aggregation pipeline.4 This stage accepts parameters such as binds (for variable binding) and dialect, providing an "escape hatch" for developers to leverage the full analytical power of Oracle SQL without leaving the MongoDB driver context. Additionally, the $lookup stage is fully supported and translated into efficient SQL JOIN operations, enabling cross-collection data retrieval that mimics relational joins.4

[CN]

集成能力通过特定的聚合管道阶段扩展到了混合操作。$sql 阶段是 Oracle 独特的一个扩展,允许开发者将原始 SQL 命令直接嵌入到 MongoDB 聚合管道中 4。该阶段接受诸如 binds(用于变量绑定)和 dialect(方言)等参数,为开发者提供了一个“逃生舱”,使其能够在不离开 MongoDB 驱动程序上下文的情况下利用 Oracle SQL 的全部分析能力。此外,$lookup 阶段得到全面支持并被转换为高效的 SQL JOIN 操作,实现了模仿关系型连接的跨集合数据检索 4。

2. Oracle Autonomous Database Workloads: JSON vs. Transaction Processing

2. Oracle 自治数据库工作负载:JSON 与事务处理

[EN]

Oracle offers distinct workload types for its Autonomous Database service, most notably "Autonomous JSON Database" (AJD) and "Autonomous Transaction Processing" (ATP). While these services share the same core technology stack, they are differentiated by licensing models, storage restrictions, and targeted use cases.5 Understanding these nuances is essential for selecting the appropriate architecture for NoSQL-style applications.

[CN]

Oracle 为其自治数据库服务提供了不同的工作负载类型,最值得注意的是“自治 JSON 数据库”(Autonomous JSON Database, AJD)和“自治事务处理”(Autonomous Transaction Processing, ATP)。虽然这些服务共享相同的核心技术栈,但它们在许可模式、存储限制和目标用例上有所区别 5。理解这些细微差别对于为类 NoSQL 应用程序选择合适的架构至关重要。

2.1 Autonomous JSON Database (AJD) Characteristics

2.1 自治 JSON 数据库 (AJD) 特性

[EN]

Oracle Autonomous JSON Database is explicitly designed for developing NoSQL-style applications that utilize JavaScript Object Notation (JSON) documents. Despite being marketed as a distinct service, it is technically equivalent to Autonomous Transaction Processing (ATP) but comes with a critical limitation: users can store only up to 20 GB of non-JSON data (standard relational data).6 There is, however, no storage limit for JSON document collections. This constraint positions AJD as a cost-effective, specialized solution for document-centric workloads, with the option to promote the service to full ATP if the application's relational data requirements exceed the 20 GB threshold.5

[CN]

Oracle 自治 JSON 数据库专为开发使用 JavaScript 对象表示法(JSON)文档的类 NoSQL 应用程序而设计。尽管它被作为一种独特的服务进行营销,但在技术上它等同于自治事务处理(ATP),但带有一个关键限制:用户只能存储最多 20 GB 的非 JSON 数据(标准关系型数据)6。然而,对于 JSON 文档集合没有存储限制。这一限制将 AJD 定位为一种用于文档中心型工作负载的、具有成本效益的专用解决方案,如果应用程序的关系型数据需求超过 20 GB 阈值,则可以选择将服务升级为完整的 ATP 5。

[EN]

AJD leverages Oracle's native binary JSON format, known as OSON. This format is highly optimized for performance, enabling faster query execution and more efficient updates compared to standard text-based JSON. OSON allows the database engine to traverse the JSON tree structure and extract values without the overhead of parsing the entire document text.5 Oracle claims significant performance advantages, stating that AJD delivers "2X better performance than MongoDB Atlas" on the industry-standard YCSB benchmark and provides guaranteed ACID transactions without the performance trade-offs typically associated with multi-document consistency in distributed NoSQL systems.7

[CN]

AJD 利用了 Oracle 的原生二进制 JSON 格式,称为 OSON。与标准的基于文本的 JSON 相比,这种格式针对性能进行了高度优化,能够实现更快的查询执行和更高效的更新。OSON 允许数据库引擎遍历 JSON 树结构并提取值,而无需解析整个文档文本的开销 5。Oracle 声称具有显著的性能优势,指出 AJD 在行业标准的 YCSB 基准测试中提供“比 MongoDB Atlas 好 2 倍的性能”,并提供有保证的 ACID 事务,且没有分布式 NoSQL 系统中通常与多文档一致性相关的性能权衡 7。

2.2 SODA (Simple Oracle Document Access) and Development Paradigms

2.2 SODA (Simple Oracle Document Access) 与开发范式

[EN]

Development on AJD is primarily facilitated through SODA (Simple Oracle Document Access), a set of NoSQL-style APIs available for major languages including Java, Python, Node.js, and via REST.6 SODA enables a schema-less development model where developers interact with "collections" of documents rather than tables. Under the hood, these SODA collections are backed by ordinary database tables managed by the kernel. This abstraction allows for rapid iteration and flexibility in data modeling, as there is no need to define rigid schemas or normalize data upfront.6

[CN]

AJD 上的开发主要通过 SODA(Simple Oracle Document Access)来促进,这是一套适用于包括 Java、Python、Node.js 在内的主要语言以及 REST 的类 NoSQL API 6。SODA 实现了一种无模式的开发模型,开发者与文档的“集合”而非表进行交互。在底层,这些 SODA 集合由内核管理的普通数据库表支持。这种抽象允许在数据建模中进行快速迭代和灵活性,因为无需预先定义严格的模式或对数据进行规范化 6。

[EN]

A notable distinction exists regarding the content of these collections between AJD and ATP. In Autonomous JSON Database, a SODA collection is restricted to containing only JSON data. In contrast, within the full Autonomous Transaction Processing service, collections can be heterogeneous, potentially storing image documents or other binary data alongside JSON.6 This limitation reinforces AJD's focus on pure JSON document store use cases. Additionally, recent updates emphasize the integration of Artificial Intelligence into the core database architecture. The "Oracle Autonomous AI Database" initiative aims to seamlessly integrate AI capabilities across all data types, suggesting that AJD will increasingly support AI-driven workloads directly on JSON data stores.8 Furthermore, the service supports a Multi-Cloud strategy, allowing AJD to be deployed and managed within ecosystems like Microsoft Azure via the Oracle Database Service for Azure, thereby enabling architectures where the application tier resides in Azure while the data tier leverages Oracle's JSON engine.9

[CN]

关于这些集合的内容,AJD 和 ATP 之间存在显著区别。在自治 JSON 数据库中,SODA 集合被限制为仅包含 JSON 数据。相比之下,在完整的自治事务处理服务中,集合可以是异构的,可能存储图像文档或其他二进制数据以及 JSON 6。这一限制加强了 AJD 对纯 JSON 文档存储用例的关注。此外,最近的更新强调了将人工智能集成到核心数据库架构中。“Oracle Autonomous AI Database”倡议旨在无缝集成跨所有数据类型的 AI 能力,这表明 AJD 将越来越多地支持直接在 JSON 数据存储上的 AI 驱动工作负载 8。此外,该服务支持多云策略,允许通过 Oracle Database Service for Azure 在 Microsoft Azure 等生态系统中部署和管理 AJD,从而实现应用层驻留在 Azure 而数据层利用 Oracle JSON 引擎的架构 9。

3. Data Integration and Migration: Oracle GoldenGate for Big Data

3. 数据集成与迁移:Oracle GoldenGate for Big Data

[EN]

For enterprises with existing MongoDB footprints, migration and coexistence are critical operational requirements. Oracle GoldenGate for Big Data provides a robust, log-based architecture for integrating MongoDB with Oracle systems, enabling real-time data replication, zero-downtime migration, and continuous synchronization.10

[CN]

对于拥有现有 MongoDB 足迹的企业而言,迁移和共存是关键的运营需求。Oracle GoldenGate for Big Data 提供了一种强大的、基于日志的架构,用于将 MongoDB 与 Oracle 系统集成,实现实时数据复制、零停机迁移和持续同步 10。

3.1 Capture Mechanism (Extract) and Oplog Integration

3.1 捕获机制 (Extract) 与 Oplog 集成

[EN]

Oracle GoldenGate's capture process, known as "Extract," does not rely on query polling, which can degrade source performance. Instead, it utilizes the MongoDB Oplog (Operations Log) to capture data changes. The Oplog is a capped collection in MongoDB that maintains a rolling record of all operations that modify stored data. GoldenGate "tails" this log to identify and capture INSERT, UPDATE, and DELETE operations in real-time.10
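The idea of tailing a capped, rolling log can be sketched in a few lines. This is a simplified model, not GoldenGate or MongoDB code: `CappedLog`, `tail`, and the integer timestamps are hypothetical names and simplifications chosen for illustration.

```python
from collections import deque


class CappedLog:
    """Fixed-size operation log: the oldest entries drop off as new ones arrive."""

    def __init__(self, capacity):
        self.entries = deque(maxlen=capacity)
        self.next_ts = 1

    def append(self, op, doc_id, payload=None):
        self.entries.append(
            {"ts": self.next_ts, "op": op, "id": doc_id, "payload": payload}
        )
        self.next_ts += 1


def tail(log, last_seen_ts):
    """Return entries newer than last_seen_ts, in log order.

    If the log has already rolled past the reader's position, changes were
    lost and the reader cannot resume safely.
    """
    if log.entries and log.entries[0]["ts"] > last_seen_ts + 1:
        raise RuntimeError("log rolled over; a fresh initial load is needed")
    return [entry for entry in log.entries if entry["ts"] > last_seen_ts]
```

A reader that polls `tail` with the timestamp of the last entry it processed sees every insert, update, and delete exactly once, provided it keeps up with the log's rollover.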

[CN]

Oracle GoldenGate 的捕获过程,称为“Extract”(提取),不依赖于可能降低源性能的查询轮询。相反,它利用 MongoDB 的 Oplog(操作日志)来捕获数据变更。Oplog 是 MongoDB 中的一个固定大小集合,它维护着修改存储数据的所有操作的滚动记录。GoldenGate “尾随”此日志以实时识别和捕获 INSERT、UPDATE 和 DELETE 操作 10。

[EN]

A critical prerequisite for this architecture is a MongoDB replica set. Standalone MongoDB instances do not maintain an oplog, so there is no log for this type of Change Data Capture (CDC) to read. Even a single-node source must therefore be configured as a replica set to enable the replication stream.10 The solution supports a wide range of MongoDB versions, including 3.x, 4.x, 5.0, and 6.0, and is compatible with MongoDB Enterprise, Community Edition, and MongoDB Atlas.12

[CN]

此架构的一个关键前提是配置 MongoDB 副本集(Replica Set)。独立的 MongoDB 实例不维护 Oplog,因此此类变更数据捕获(CDC)没有日志可读。所以,即使是单节点源,也必须配置为副本集才能启用复制流 10。该解决方案支持广泛的 MongoDB 版本,包括 3.x、4.x、5.0 和 6.0,并兼容 MongoDB 企业版、社区版和 MongoDB Atlas 12。

3.2 Deployment Architecture and Replication Capabilities

3.2 部署架构与复制能力

[EN]

The integration is technically implemented as pluggable functionality within the Oracle GoldenGate Java Delivery framework (Java Adapters).13 This modular design allows for flexible deployment scenarios. A primary use case is enabling zero-downtime migration to the Oracle Autonomous JSON Database. The process involves an "Initial Load"—a bulk transfer of the existing dataset—followed by "Change Data Capture" (CDC) to synchronize any changes that occurred during the initial load or subsequent operations.10
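The two-phase pattern of initial load followed by change replay can be sketched as follows. This is a minimal model under stated assumptions: the dictionaries stand in for source and target collections, the change format is invented, and every change carries the full document so that replay is idempotent.

```python
def apply_change(target, change):
    """Apply one captured change to the target; safe to apply more than once."""
    op, doc_id = change["op"], change["id"]
    if op in ("insert", "update"):
        # Upsert with the full document: harmless if the snapshot already
        # contained this version of the document.
        target[doc_id] = dict(change["doc"])
    elif op == "delete":
        target.pop(doc_id, None)  # deleting an absent document is a no-op
    else:
        raise ValueError("unknown operation: " + op)


def migrate(snapshot, change_log):
    """Phase 1: bulk-copy the snapshot. Phase 2: replay captured changes."""
    target = {doc_id: dict(doc) for doc_id, doc in snapshot.items()}
    for change in change_log:
        apply_change(target, change)
    return target
```

Because capture starts before the bulk copy finishes, some changes may already be reflected in the snapshot. Idempotent application is what makes that overlap harmless.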

[CN]

该集成在技术上作为 Oracle GoldenGate Java Delivery 框架(Java 适配器)内的可插拔功能实现 13。这种模块化设计允许灵活的部署场景。一个主要的用例是实现向 Oracle 自治 JSON 数据库的零停机迁移。该过程涉及“初始加载”(Initial Load)——现有数据集的批量传输——随后是“变更数据捕获”(CDC),以同步在初始加载期间或后续操作中发生的任何变更 10。

[EN]

To ensure data consistency and recovery, the Extract process offers precise positioning controls. Administrators can configure the extract to start from EARLIEST, a specific TIMESTAMP, EOF (End of File), or a specific LSN (Log Sequence Number).14 This granularity is essential for resuming replication after network interruptions or system maintenance without data loss or duplication. Furthermore, Oracle GoldenGate for Distributed Applications and Analytics (GG for DAA) supports complex bidirectional replication topologies. In these active-active scenarios, changes made to a source collection are replicated to a target, and vice versa. This requires sophisticated loop detection and conflict resolution mechanisms to prevent infinite replication cycles, enabling high-availability architectures across distributed environments.14
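Loop detection in a bidirectional topology is often explained with origin tagging, and the sketch below shows that idea only. It is not how GoldenGate implements it: the `Site` class, the `origin` field, and the `replicate` function are hypothetical, and conflict resolution is deliberately left out.

```python
class Site:
    """One side of an active-active pair, with a queue of captured changes."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.outbox = []

    def local_write(self, key, value):
        self.data[key] = value
        self.outbox.append({"origin": self.name, "key": key, "value": value})

    def apply_replicated(self, change):
        self.data[change["key"]] = change["value"]
        # Capture also sees replicated writes; the original origin tag is kept.
        self.outbox.append(change)


def replicate(src, dst):
    """Ship captured changes from src to dst; return how many were applied."""
    shipped = 0
    while src.outbox:
        change = src.outbox.pop(0)
        if change["origin"] == dst.name:
            continue  # loop detection: never send a change back to its origin
        dst.apply_replicated(change)
        shipped += 1
    return shipped
```

Without the origin check, a write made on one site would bounce between the two sites indefinitely.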

[CN]

为了确保数据一致性和恢复,Extract 进程提供了精确的定位控制。管理员可以将提取配置为从 EARLIEST(最早)、特定 TIMESTAMP(时间戳)、EOF(文件末尾)或特定 LSN(日志序列号)开始 14。这种粒度对于在网络中断或系统维护后恢复复制且不丢失数据或产生重复至关重要。此外,Oracle GoldenGate for Distributed Applications and Analytics (GG for DAA) 支持复杂的双向复制拓扑。在这些双活场景中,对源集合所做的更改会复制到目标,反之亦然。这需要复杂的循环检测和冲突解决机制来防止无限复制循环,从而实现跨分布式环境的高可用性架构 14。

4. The Redis Landscape: Open Source vs. Enterprise Architecture

4. Redis 版图:开源与企业级架构

[EN]

Redis has evolved from a specialized caching layer into a comprehensive primary data store. However, significant architectural divergences exist between "Redis Open Source" (OSS) and "Redis Enterprise," particularly regarding clustering strategies, high availability, and memory management efficiency.15

[CN]

Redis 已经从一个专门的缓存层演变为一个综合性的主数据存储。然而,“Redis 开源版”(OSS)和“Redis 企业版”之间存在显著的架构分歧,特别是在集群策略、高可用性和内存管理效率方面 15。

4.1 Comparison of Clustering Architectures

4.1 集群架构比较

[EN]

The fundamental difference lies in how clustering is implemented and managed. In Redis Open Source, clustering relies on a client-aware protocol. The application's client library must be "smart"—it needs to understand the cluster topology, hash slots, and node addresses. The client is responsible for routing requests to the correct shard. If the cluster topology changes (e.g., node addition or removal), the client must update its internal map, which can introduce complexity and latency during reconfiguration events.15
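What a "smart" client has to compute can be shown concretely. Redis Cluster assigns each key to one of 16384 hash slots using CRC16 (the XMODEM variant) of the key, or of the hash tag between the first `{` and the following `}` when that tag is non-empty. The sketch below implements that rule with the standard library; the `route` helper and the two-node slot map used with it are hypothetical.

```python
import binascii

SLOTS = 16384


def key_slot(key):
    """Hash slot for a key, honouring {hash tags}."""
    data = key.encode()
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end > start + 1:  # only a non-empty tag replaces the whole key
            data = data[start + 1:end]
    # crc_hqx with an initial value of 0 is CRC16/XMODEM.
    return binascii.crc_hqx(data, 0) % SLOTS


def route(key, slot_map):
    """Pick the node owning the key's slot; slot_map maps (low, high) -> node."""
    slot = key_slot(key)
    for (low, high), node in slot_map.items():
        if low <= slot <= high:
            return node
    raise LookupError("no node owns slot " + str(slot))
```

When slots move between nodes, the client's copy of `slot_map` goes stale and must be refreshed. That refresh is the reconfiguration burden the paragraph above attributes to client-side routing.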

[CN]

根本的区别在于集群的实现和管理方式。在 Redis 开源版中,集群依赖于一种客户端感知的协议。应用程序的客户端库必须是“智能的”——它需要理解集群拓扑、哈希槽和节点地址。客户端负责将请求路由到正确的分片。如果集群拓扑发生变化(例如,节点添加或移除),客户端必须更新其内部映射,这可能会在重新配置事件期间引入复杂性和延迟 15。

[EN]

Redis Enterprise, in contrast, utilizes a proxy-based architecture. Applications connect to a robust, high-performance proxy layer that abstracts the underlying cluster complexity. The application acts as if it is connecting to a single Redis instance. The proxy handles the routing, sharding, and load balancing transparently. This architecture enables seamless scaling, re-sharding, and maintenance operations without requiring any changes to the client configuration or application logic.15

[CN]

相比之下,Redis 企业版利用了基于代理的架构。应用程序连接到一个健壮、高性能的代理层,该层抽象了底层的集群复杂性。应用程序表现得就像连接到一个单一的 Redis 实例。代理透明地处理路由、分片和负载均衡。这种架构实现了无缝的扩展、重新分片和维护操作,而无需更改任何客户端配置或应用程序逻辑 15。

4.2 Scalability, Persistence, and Tiered Memory

4.2 可扩展性、持久化与分层内存

[EN]

Redis Enterprise adds linear scalability and high-availability features beyond what OSS provides. OSS relies on asynchronous master-replica replication, whereas Enterprise offers an SLA-backed 99.999% uptime in Active-Active deployments, with failure detection and failover across racks, zones, and geographies within single-digit seconds.17 A distinctive Enterprise feature is Active-Active Geo-Distribution based on Conflict-Free Replicated Data Types (CRDTs): the same dataset accepts writes in multiple regions at once, and the replicas converge to the same state under strong eventual consistency, a capability not native to OSS Redis.16
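How CRDTs let replicas accept concurrent writes and still converge can be seen in the simplest such type, a grow-only counter. This is a textbook G-Counter, offered as a sketch of the general technique; it is not Redis Enterprise's implementation, and the class and replica names are illustrative.

```python
class GCounter:
    """Grow-only counter: each replica increments only its own slot."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}

    def increment(self, amount=1):
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def merge(self, other):
        # Element-wise maximum is commutative, associative, and idempotent,
        # so replicas converge regardless of merge order or repetition.
        for replica_id, count in other.counts.items():
            self.counts[replica_id] = max(self.counts.get(replica_id, 0), count)

    @property
    def value(self):
        return sum(self.counts.values())
```

Two regions can increment independently, exchange state in either order, and end up with the same total without coordinating on each write.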

[CN]

Redis 企业版提供了超越 OSS 的线性可扩展性和高可用性能力。OSS 依赖异步的主从(master-replica)复制,而企业版在双活部署中提供 SLA 支持的 99.999% 正常运行时间,可在个位数秒内完成跨机架、区域和地理位置的故障检测与故障转移 17。企业版的一个显著特征是基于无冲突复制数据类型(CRDTs)的双活地理分布(Active-Active Geo-Distribution):同一数据集可在多个区域同时接受写入,各副本在强最终一致性(strong eventual consistency)下收敛到相同状态,这是 OSS Redis 所不具备的原生能力 16。

[EN]

Furthermore, Redis Enterprise addresses the cost-prohibitive nature of pure in-memory storage for large datasets through Redis Flex (formerly Redis on Flash). This tiering mechanism combines DRAM with high-speed Solid State Drives (SSDs). "Hot" data remains in DRAM for microsecond latency, while "cold" or less frequently accessed values are automatically moved to Flash storage. This approach can reduce infrastructure costs by up to 70% while maintaining performance metrics comparable to pure DRAM for active workloads.17 Additionally, Enterprise integrates advanced modules such as RediSearch, RedisJSON, and RedisTimeSeries, transforming it into a multi-model store.19
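The hot/cold split can be sketched with a small two-tier store. This is a minimal model that assumes a least-recently-used policy and plain dictionaries for both tiers. The actual placement policy and storage engine in Redis Enterprise are not described in this report, and `TieredStore` is a hypothetical name.

```python
from collections import OrderedDict


class TieredStore:
    """Keep the most recently used values in 'RAM', demote the rest to 'flash'."""

    def __init__(self, ram_capacity):
        self.ram = OrderedDict()   # hot tier, ordered from coldest to hottest
        self.flash = {}            # cold tier
        self.ram_capacity = ram_capacity

    def set(self, key, value):
        self.flash.pop(key, None)  # a fresh write always lands in the hot tier
        self.ram[key] = value
        self.ram.move_to_end(key)
        self._demote()

    def get(self, key):
        if key in self.ram:
            self.ram.move_to_end(key)  # mark as recently used
            return self.ram[key]
        if key in self.flash:
            value = self.flash.pop(key)
            self.set(key, value)       # promote back to the hot tier
            return value
        return None

    def _demote(self):
        while len(self.ram) > self.ram_capacity:
            cold_key, cold_value = self.ram.popitem(last=False)
            self.flash[cold_key] = cold_value
```

Sizing `ram_capacity` to the active working set is what keeps most reads in the fast tier while the bulk of the data sits on cheaper storage.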

[CN]

此外,Redis 企业版通过 Redis Flex(前身为 Redis on Flash)解决了纯内存存储对于大数据集成本过高的问题。这种智能分层机制结合了 DRAM 和高速固态硬盘(SSD)。“热”数据保留在 DRAM 中以实现微秒级延迟,而“冷”数据或访问频率较低的值会自动逐出到闪存存储。这种方法可以在保持与纯 DRAM 相当的活跃工作负载性能指标的同时,降低高达 70% 的基础设施成本 17。此外,企业版集成了 RediSearch、RedisJSON 和 RedisTimeSeries 等高级模块,将其转变为多模型存储 19。

5. Multi-Model Paradigms: ArangoDB and Couchbase

5. 多模型范式:ArangoDB 与 Couchbase

[EN]

The industry is increasingly adopting native multi-model databases that support diverse data structures within a single core, challenging the traditional "best-of-breed" approach where separate databases are used for each data type. ArangoDB and Couchbase are prime examples of this trend.

[CN]

行业正越来越多地采用原生多模型数据库,在单个核心内支持多样化的数据结构,挑战了为每种数据类型使用单独数据库的传统“最佳组合”方法。ArangoDB 和 Couchbase 是这一趋势的典型代表。

5.1 ArangoDB: The Native Multi-Model Architecture

5.1 ArangoDB:原生多模型架构

[EN]

ArangoDB distinguishes itself with a single C++ core architecture designed to support three specific data models: Graph, Document, and Key-Value.20 Unlike systems that layer different engines on top of one another, ArangoDB utilizes a unified query language, AQL (ArangoDB Query Language), for all three models. AQL is a declarative, SQL-like language that enables complex queries involving joins across documents and graph traversals in a single execution context.20

[CN]

ArangoDB 的独特之处在于其单一的 C++ 核心架构,旨在支持三种特定的数据模型:图、文档和键值 20。与将不同引擎层叠在一起的系统不同,ArangoDB 对所有三种模型使用统一的查询语言 AQL(ArangoDB Query Language)。AQL 是一种声明性的、类似 SQL 的语言,能够在单个执行上下文中实现涉及跨文档连接和图遍历的复杂查询 20。

[EN]

In this architecture, documents are stored as JSON objects. The graph model is implemented natively: vertices are standard documents in collections, while edges are documents in special "edge collections" that explicitly contain _from and _to attributes to define relationships.21 The Key-Value model is treated as a specialized subset of the document model, optimized for retrieval via the primary _key attribute.21 ArangoDB Enterprise enhances this with features like "SmartGraphs" (graph-aware sharding to minimize network hops) and "Satellite Collections" (replication of small collections to all shards for fast local joins), addressing the specific performance challenges of distributed graph processing.23
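The way one storage model serves all three access patterns can be illustrated with plain Python data. The documents below follow ArangoDB's convention that a document's `_id` is `collection/_key` and that edges reference those ids in `_from` and `_to`. The traversal function itself is a hypothetical stand-in for what the database does natively, and the sample data is invented.

```python
# Vertex collection: ordinary documents, addressable by key (key-value access).
persons = {
    "alice": {"_key": "alice", "_id": "persons/alice", "age": 34},
    "bob": {"_key": "bob", "_id": "persons/bob", "age": 29},
    "carol": {"_key": "carol", "_id": "persons/carol", "age": 41},
    "dave": {"_key": "dave", "_id": "persons/dave", "age": 25},
}

# Edge collection: documents whose _from/_to attributes define relationships.
knows = [
    {"_from": "persons/alice", "_to": "persons/bob"},
    {"_from": "persons/bob", "_to": "persons/carol"},
    {"_from": "persons/alice", "_to": "persons/dave"},
]


def outbound(start, edges, min_depth=1, max_depth=1):
    """Vertex ids reachable from start within min_depth..max_depth outbound hops."""
    visited, frontier, found = {start}, [start], []
    for depth in range(1, max_depth + 1):
        next_frontier = []
        for vertex in frontier:
            for edge in edges:
                if edge["_from"] == vertex and edge["_to"] not in visited:
                    visited.add(edge["_to"])
                    next_frontier.append(edge["_to"])
                    if depth >= min_depth:
                        found.append(edge["_to"])
        frontier = next_frontier
    return found
```

A lookup such as `persons["alice"]` is the key-value case, reading the `age` attribute is the document case, and `outbound` is the graph case, all over the same stored documents.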

[CN]

在这种架构中,文档存储为 JSON 对象。图模型是原生实现的:顶点是集合中的标准文档,而边是特殊的“边集合”中的文档,显式包含 _from 和 _to 属性以定义关系 21。键值模型被视为文档模型的专用子集,针对通过主 _key 属性的检索进行了优化 21。ArangoDB 企业版通过“SmartGraphs”(图感知分片以最小化网络跳数)和“Satellite Collections”(将小集合复制到所有分片以实现快速本地连接)等功能增强了这一点,解决了分布式图处理的特定性能挑战 23。

5.2 Couchbase: SQL++ and Service Isolation

5.2 Couchbase:SQL++ 与服务隔离

[EN]

Couchbase approaches interoperability by bridging the gap between JSON flexibility and SQL familiarity. It employs SQL++ (formerly N1QL), a comprehensive extension of the SQL standard designed specifically for querying JSON data. SQL++ allows developers to perform SELECT, JOIN, and GROUP BY operations on nested structures, arrays, and denormalized data, effectively bringing relational query power to a schemaless document store.24

[CN]

Couchbase 通过弥合 JSON 灵活性与 SQL 熟悉度之间的差距来实现互操作性。它采用 SQL++(前身为 N1QL),这是 SQL 标准的全面扩展,专为查询 JSON 数据而设计。SQL++ 允许开发者对嵌套结构、数组和非规范化数据执行 SELECT、JOIN 和 GROUP BY 操作,有效地将关系查询能力引入到无模式的文档存储中 24。

[EN]

Architecturally, Couchbase implements "Multi-Dimensional Scaling" through service isolation. The platform separates distinct workloads into independent services: Data Service (Key-Value operations), Index Service, Query Service (SQL++ execution), Search Service (Full-text), and Analytics Service.24 This isolation ensures that a resource-intensive analytical query running on the Analytics node does not degrade the latency of high-throughput operational reads and writes on the Data nodes.24 The Query Service is optimized for operational OLTP queries, while the Analytics Service handles OLAP workloads on large datasets without requiring complex ETL pipelines.24

[CN]

在架构上,Couchbase 通过服务隔离实现了“多维扩展”。该平台将不同的工作负载分离为独立的服务:数据服务(键值操作)、索引服务、查询服务(SQL++ 执行)、搜索服务(全文)和分析服务 24。这种隔离确保了在分析节点上运行的资源密集型分析查询不会降低数据节点上高吞吐量操作读写的延迟 24。查询服务针对操作型 OLTP 查询进行了优化,而分析服务则处理大数据集上的 OLAP 工作负载,无需复杂的 ETL 管道 24。

6. Graph Database Specialization: Neo4j

6. 图数据库专精:Neo4j

[EN]

While multi-model databases offer graph capabilities, Neo4j maintains its position as a specialist "native" graph database, emphasizing performance for deep, complex traversals.

[CN]

虽然多模型数据库提供了图能力,但 Neo4j 保持了其作为专业“原生”图数据库的地位,强调深度、复杂遍历的性能。

[EN]

Neo4j uses "index-free adjacency" in its native graph storage. Rather than simulating a graph with table joins or document lookups, Neo4j stores each node with direct physical references to its relationships and neighboring nodes. Following a relationship is therefore a constant-time pointer hop that does not depend on the total size of the graph. This gives a significant advantage for queries spanning multiple degrees of separation (e.g., fraud detection rings or social network paths) over relational systems, where cost can grow steeply with join depth.26

[CN]

Neo4j 在其原生图存储中采用“免索引邻接”(index-free adjacency)。与使用表连接或文档查找来模拟图的系统不同,Neo4j 为每个节点存储指向其关系和相邻节点的直接物理引用。因此,沿一条关系前进是一次常数时间的指针跳转,与图的总体规模无关。对于涉及多个分离度的查询(例如欺诈检测环或社交网络路径),这相比关系系统具有显著的性能优势,后者的开销会随连接深度的增加而急剧上升 26。

[EN]

The query interface is Cypher, a declarative, pattern-matching language using ASCII-art style syntax to represent nodes and relationships visually within the code.27 The Neo4j Enterprise Edition adds critical production features, including Causal Clustering (ensuring data safety and read-your-own-write consistency), online Hot Backups, and advanced memory management. Specifically, the Enterprise edition supports a "Native Index Provider" with increased key size limits (~8KB), enabling more robust indexing strategies for complex property graphs.28

[CN]

查询接口是 Cypher,这是一种声明性的模式匹配语言,使用 ASCII 艺术风格的语法在代码中直观地表示节点和关系 27。Neo4j 企业版增加了关键的生产功能,包括因果集群(确保数据安全和“读己之写”一致性)、在线热备份和高级内存管理。具体而言,企业版支持具有增加的键大小限制(~8KB)的“原生索引提供程序”,从而为复杂的属性图实现更稳健的索引策略 28。

7. Cassandra Evolution: DataStax and IBM

7. Cassandra 的演进:DataStax 与 IBM

[EN]

Apache Cassandra is the industry standard for wide-column, high-write-throughput distributed storage. DataStax has been the primary commercial entity driving its enterprise adoption, recently culminating in a significant acquisition by IBM.

[CN]

Apache Cassandra 是宽列、高写入吞吐量分布式存储的行业标准。DataStax 一直是推动其企业级应用的主要商业实体,最近以 IBM 的重大收购告终。

[EN]

DataStax Enterprise (DSE) is a self-managed platform extending OSS Cassandra. It integrates additional engines into the same JVM or cluster, including Apache Solr for search, Apache Spark for analytics, and DataStax Graph (based on TitanDB). This allows for mixed workloads but requires careful resource management.29 In contrast, Astra DB represents the shift to a fully managed, serverless Database-as-a-Service (DBaaS). Built on Kubernetes (K8ssandra), Astra DB decouples compute from storage, allowing them to scale independently. It removes the operational overhead of managing compactions, repairs, and nodetool operations, offering a consumption-based pricing model.29

[CN]

DataStax Enterprise (DSE) 是一个扩展 OSS Cassandra 的自管理平台。它将额外的引擎集成到同一个 JVM 或集群中,包括用于搜索的 Apache Solr、用于分析的 Apache Spark 和 DataStax Graph(基于 TitanDB)。这允许混合工作负载,但需要仔细的资源管理 29。相比之下,Astra DB 代表了向全托管、无服务器数据库即服务(DBaaS)的转变。基于 Kubernetes(K8ssandra)构建,Astra DB 将计算与存储解耦,允许它们独立扩展。它消除了管理压缩、修复和 nodetool 操作的操作开销,提供了基于消耗的定价模型 29。

[EN]

The acquisition of DataStax by IBM (completed May 2025) marks a consolidation in the market. IBM intends to integrate DataStax technologies, particularly DSE and Astra DB, into the IBM watsonx.data platform. This move is strategically aligned with the rise of Generative AI, positioning DataStax's vector search capabilities as a core component of IBM's "GenAI data" stack (Vector Search + RAG), while rebranding support offerings like "Luna" under IBM Elite Support.31

[CN]

IBM 对 DataStax 的收购(于 2025 年 5 月完成)标志着市场的整合。IBM 打算将 DataStax 技术,特别是 DSE 和 Astra DB,整合到 IBM watsonx.data 平台中。这一举措与生成式 AI 的兴起在战略上保持一致,将 DataStax 的向量搜索能力定位为 IBM “GenAI 数据”栈(向量搜索 + RAG)的核心组件,同时将“Luna”等支持服务重新品牌化为 IBM Elite Support 31。

8. IBM and MongoDB: The Enterprise Partnership

8. IBM 与 MongoDB:企业级合作伙伴关系

[EN]

Before acquiring DataStax, IBM established a robust partnership with MongoDB Inc., resulting in the MongoDB Enterprise Advanced with IBM offering. This is not a technical fork but a strategic resale and support collaboration targeting hybrid cloud and mainframe modernization.32

[CN]

在收购 DataStax 之前,IBM 与 MongoDB Inc. 建立了牢固的合作伙伴关系,推出了 MongoDB Enterprise Advanced with IBM 产品。这并非技术分支,而是针对混合云和主机现代化的战略转售与支持合作 32。

[EN]

A primary use case is mainframe augmentation, where read-heavy workloads are offloaded from mainframes to MongoDB instances. This reduces the consumption of expensive MIPS on the mainframe while maintaining the core transaction systems of record.32 Through this partnership, IBM offers "Certified by MongoDB" managed services on IBM Cloud, ensuring customers have access to the latest MongoDB features (like time series collections and window functions) that are fully supported by IBM's consulting and technical teams.33 This enables enterprises to deploy standard MongoDB architectures within the compliance and security perimeter of the IBM Cloud ecosystem.

[CN]

一个主要用例是主机增强(mainframe augmentation),即从主机卸载读密集型工作负载到 MongoDB 实例。这减少了主机上昂贵的 MIPS 消耗,同时维护了核心事务记录系统 32。通过这种合作关系,IBM 在 IBM Cloud 上提供“MongoDB 认证”的托管服务,确保客户能够访问最新的 MongoDB 功能(如时间序列集合和窗口函数),并由 IBM 的咨询和技术团队提供全面支持 33。这使得企业能够在 IBM Cloud 生态系统的合规性和安全边界内部署标准的 MongoDB 架构。

9. IBM Db2 Native JSON Capabilities

9. IBM Db2 原生 JSON 能力

[EN]

Beyond partnerships, IBM's flagship Db2 database possesses native JSON capabilities, mirroring Oracle's converged database strategy.34 Db2 allows for the storage of JSON documents in either their original text format or the binary-encoded BSON format, which is more efficient for storage and traversal.

[CN]

除了合作伙伴关系,IBM 的旗舰产品 Db2 数据库拥有原生的 JSON 能力,与 Oracle 的融合数据库策略相呼应 34。Db2 允许以原始文本格式或二进制编码的 BSON 格式存储 JSON 文档,后者在存储和遍历方面更为高效。

[EN]

Interacting with JSON in Db2 is achieved via a suite of SQL functions. JSON_VALUE extracts scalar values from a document, while JSON_TABLE projects JSON data into a relational result set, enabling easy integration with standard SQL tables. JSON_QUERY is used to retrieve complex objects or arrays.35 Crucially, Db2 supports expression-based indexing using JSON_VALUE. This allows administrators to create indexes on specific fields deeply nested within a JSON document without needing to duplicate that data into separate physical columns, ensuring high-performance lookups on semi-structured data.35
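The pattern described above (scalar extraction, projecting a nested array into rows, and indexing an expression over a nested field) can be sketched in runnable form. This is only an analogy and not Db2 code: Python's built-in sqlite3 module stands in for Db2, with SQLite's json_extract playing the role of JSON_VALUE and JSON_QUERY, and json_each approximating JSON_TABLE. The orders table, its doc column, and the field names are hypothetical.

```python
import json
import sqlite3

# SQLite is a stand-in for Db2 here (assumption made so the sketch is runnable).
# The table, column, and JSON field names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, doc TEXT)")

documents = [
    {"customer": {"id": "C1", "city": "Lyon"},
     "items": [{"sku": "A", "qty": 2}, {"sku": "B", "qty": 1}]},
    {"customer": {"id": "C2", "city": "Oslo"},
     "items": [{"sku": "A", "qty": 5}]},
]
conn.executemany(
    "INSERT INTO orders (doc) VALUES (?)",
    [(json.dumps(d),) for d in documents],
)

# Expression-based index on a nested field: no separate physical column is needed.
conn.execute(
    "CREATE INDEX idx_orders_city "
    "ON orders (json_extract(doc, '$.customer.city'))"
)

# Scalar extraction (the JSON_VALUE role): filter on a nested scalar.
city_rows = conn.execute(
    "SELECT id FROM orders WHERE json_extract(doc, '$.customer.city') = ?",
    ("Oslo",),
).fetchall()

# Complex-value retrieval (the JSON_QUERY role): fetch a whole array as JSON text.
items_text = conn.execute(
    "SELECT json_extract(doc, '$.items') FROM orders WHERE id = 1"
).fetchone()[0]
items = json.loads(items_text)

# Projection into rows (the JSON_TABLE role): one relational row per array element.
item_rows = conn.execute(
    "SELECT o.id, json_extract(item.value, '$.sku'), "
    "       json_extract(item.value, '$.qty') "
    "FROM orders AS o, json_each(o.doc, '$.items') AS item "
    "ORDER BY o.id, item.key"
).fetchall()
```

The point of the index is that the filter in the first query matches the indexed expression exactly, which is what lets an engine answer it without parsing every document; the Db2 syntax for the equivalent JSON_VALUE index differs and should be taken from the Db2 documentation.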

[CN]

在 Db2 中与 JSON 的交互是通过一套 SQL 函数实现的。JSON_VALUE 从文档中提取标量值,而 JSON_TABLE 将 JSON 数据投影为关系结果集,从而能够轻松地与标准 SQL 表集成。JSON_QUERY 用于检索复杂的对象或数组 35。至关重要的是,Db2 支持使用 JSON_VALUE 的基于表达式的索引。这允许管理员在 JSON 文档深层嵌套的特定字段上创建索引,而无需将该数据复制到单独的物理列中,从而确保对半结构化数据的高性能查找 35。

10. Oracle NoSQL Database: Community vs. Enterprise

10. Oracle NoSQL 数据库:社区版与企业版

[EN]

Often overshadowed by the Autonomous Database, Oracle NoSQL Database is a distinct product lineage rooted in high-performance Key-Value architectures. It ships in two editions under different licenses: an open source Community Edition (Apache 2.0) and a commercial Enterprise Edition.36

[CN]

Oracle NoSQL 数据库常被自治数据库的光芒所掩盖,但它是一个独立的产品系列,植根于高性能键值架构。它以两种许可方式提供两个版本:开源社区版(Apache 2.0)和商业企业版 36。

[EN]

The architecture supports both a simple Key-Value model and a table model for more structured schema interaction. A standout feature of the Oracle NoSQL Database Cloud Service is "Global Active Tables." This capability enables multi-region, active-active table replication, ensuring that an update operation performed in one OCI region is automatically replicated to all participating regions.36 This provides disaster recovery resilience and low-latency local access for globally distributed applications. Furthermore, the licensing model supports "Bring Your Own License" (BYOL), allowing organizations to leverage their on-premises Enterprise licenses within the Oracle Cloud infrastructure, facilitating hybrid deployment models.37
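To make the active-active behavior concrete, the following is a toy in-memory simulation, not the Oracle NoSQL SDK. Each region accepts writes locally and later ships them to every other participating region. The Region class, the replicate helper, the key format, and the choice of region names are all hypothetical, and the conflict rule used here (the highest timestamp wins, with the origin region name as a tie-breaker) is an assumption made for illustration rather than something stated in the source.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """One participating region holding a full local copy of the table."""
    name: str
    rows: dict = field(default_factory=dict)    # key -> (timestamp, origin, value)
    outbox: list = field(default_factory=list)  # local writes not yet shipped

    def put(self, key, value, timestamp):
        """Accept a write locally, then queue it for the other regions."""
        update = (key, timestamp, self.name, value)
        self.apply(update)
        self.outbox.append(update)

    def apply(self, update):
        """Keep the update only if it is newer than the stored one (assumed rule)."""
        key, timestamp, origin, value = update
        current = self.rows.get(key)
        if current is None or (timestamp, origin) > (current[0], current[1]):
            self.rows[key] = (timestamp, origin, value)

    def get(self, key):
        """Serve a read from the local copy only."""
        entry = self.rows.get(key)
        return None if entry is None else entry[2]


def replicate(regions):
    """Ship every queued write to every other region, then clear the queues."""
    for source in regions:
        for update in source.outbox:
            for target in regions:
                if target is not source:
                    target.apply(update)
        source.outbox.clear()


ashburn = Region("us-ashburn-1")
frankfurt = Region("eu-frankfurt-1")
tokyo = Region("ap-tokyo-1")
regions = [ashburn, frankfurt, tokyo]

# Two regions update the same row before replication has run.
ashburn.put("user:1", {"plan": "free"}, timestamp=1)
frankfurt.put("user:1", {"plan": "pro"}, timestamp=2)

local_read_before = tokyo.get("user:1")   # Tokyo has received nothing yet
replicate(regions)
reads_after = [region.get("user:1") for region in regions]
```

After replicate runs, every region returns the same value for the row, which is the property that lets an application read and write against its nearest region. A real service replicates continuously and asynchronously, so the window in which regions disagree depends on network conditions rather than on an explicit call.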

[CN]

该架构支持简单的键值模型和用于更结构化模式交互的表模型。Oracle NoSQL 数据库云服务的一个突出功能是“全局活动表”(Global Active Tables)。此功能实现了多区域、双活表复制,确保在一个 OCI 区域执行的更新操作会自动复制到所有参与区域 36。这为全球分布式应用程序提供了灾难恢复弹性和低延迟的本地访问。此外,许可模式支持“自带许可证”(BYOL),允许组织在 Oracle Cloud 基础设施内利用其本地企业版许可证,促进混合部署模型 37。

Works cited

1. Oracle Database API for MongoDB - Home, accessed November 28, 2025.

2. Using Oracle Database API for MongoDB, accessed November 28, 2025.

3. 1 Overview of Oracle Database API for MongoDB, accessed November 28, 2025.

4. Oracle Database API for MongoDB, accessed November 28, 2025.

5. About Autonomous AI Database Workload Types - Oracle Help Center, accessed November 28, 2025.

6. About Autonomous JSON Database - Oracle Help Center, accessed November 28, 2025.

7. Autonomous AI JSON Database - Oracle, accessed November 28, 2025.

8. Oracle Autonomous AI Database, accessed November 28, 2025.

9. Yes, Oracle Autonomous JSON Database is Multicloud | by Hermann Bär - Medium, accessed November 28, 2025.

10. Replicate data from MongoDB to OCI GoldenGate - Oracle Help Center, accessed November 28, 2025.

11. Oracle GoldenGate for Big Data - {Hadoop} | {Cloud} | {NoSQL} | {Database}, accessed November 28, 2025.

12. Oracle GoldenGate is Certified for Online MongoDB Migrations, accessed November 28, 2025.

13. 1 Introducing Oracle GoldenGate for Big Data, accessed November 28, 2025.

14. MongoDB - GoldenGate - Oracle Help Center, accessed November 28, 2025.

15. Redis: Open Source vs. Enterprise - MetricFire, accessed November 28, 2025.

16. [Answered] What are the differences between Redis Enterprise and Redis Open Source?, accessed November 28, 2025.

17. Advantages of Redis Enterprise vs. Redis Open Source, accessed November 28, 2025.

18. Redis Software vs. Redis Open Source, Community Edition & open source forks, accessed November 28, 2025.

19. Redis Open Source and Redis Enterprise | Docs, accessed November 28, 2025.

20. accessed November 28, 2025.

21. Navigating ArangoDB's Multi-Model Magic with GoFr: An Informative and Practical Approach | by Mundhraumang | Level Up Coding, accessed November 28, 2025.

22. Data Models | ArangoDB Documentation, accessed November 28, 2025.

23. ArangoDB - Wikipedia, accessed November 28, 2025.

24. Query Data with SQL++ | Couchbase Docs, accessed November 28, 2025.

25. SQL++ - Query Language for Managing JSON - Couchbase, accessed November 28, 2025.

26. Introduction - Operations Manual - Neo4j, accessed November 28, 2025.

27. The Neo4j Graph Data - Product, accessed November 28, 2025.

28. Ready for Testing: Neo4j Enterprise Edition 4.0 Milestone Release 2, accessed November 28, 2025.

29. DataStax - Wikipedia, accessed November 28, 2025.

30. Difference between DataStax Enterprise, Astra DB and Luna for Cassandra - Reddit, accessed November 28, 2025.

31. IBM + DataStax - IBM TechXchange Community, accessed November 28, 2025.

32. MongoDB Enterprise Advanced with IBM, accessed November 28, 2025.

33. MongoDB Launches Certification Program for Cloud Partners Offering MongoDB Database Services, accessed November 28, 2025.

34. SQL access to JSON documents - Db2 - IBM, accessed November 28, 2025.

35. Working with JSON documents - Db2 - IBM, accessed November 28, 2025.

36. Oracle NoSQL Database 25.1 Enterprise Edition (EE) | Datasheet, accessed November 28, 2025.

37. Oracle NoSQL Enterprise Edition Frequently Asked Questions, accessed November 28, 2025.