
After reading Eric A. Brewer's keynote at PODC and Gilbert and Lynch's "Brewer's Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services", I have gained a preliminary understanding of CAP and have tried to work through some basic questions.

1. Why we need to know CAP





2. What is CAP


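CAP was first proposed by Eric A. Brewer at PODC. Part of Brewer's work is the design and implementation of large-scale distributed systems, such as global search engines and distributed web caches. The CAP theorem is therefore about distributed systems, and everything below has to be understood in that context; otherwise you will get lost.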


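CAP stands for Consistency, Availability and Partition tolerance, though these three concepts are not easy to understand. The CAP theorem states that a system cannot have all three at once.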

Consistency is easy to understand. On Availability and Partition tolerance, Jeff Darcy gives the following explanation in his article "Availability and Partition Tolerance":

“Availability is sacrificed if certain nodes are forced to wait for unbounded time because of a failure. This includes the common approach of forcing non-quorum nodes down, which Brewer alludes to.”

“Partition tolerance is sacrificed if certain requests are forced to wait for unbounded time because of a failure. This is most often the case when a node holding a lock cannot be reached, and quorum loss is not used to break the lock.”


3. Sample of CAP


4. CAP and MySQL

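Take MySQL Replication (Master-Slave) as an example. In a master-slave (MS) setup, both node M and node S can serve requests; when one of them fails, the other can still serve requests, so the MS setup is considered to satisfy Availability. MS replication currently still runs asynchronously: when the master commits a write, the slave has not necessarily applied it, the slave may even be briefly disconnected from the master, and the master does not need to check the slave's state when handling a request. The MS setup therefore satisfies Partition tolerance. Because of this asynchrony, however, the data on master and slave can diverge, so Consistency is not satisfied.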
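The trade-off described above can be illustrated with a toy model. This is a minimal sketch, not MySQL's replication protocol: the classes `Node` and `AsyncReplication`, the `link_up` flag, and the `demo_async` helper are all hypothetical names introduced for illustration. It shows a master that accepts writes without consulting the slave, so a read from the slave during a disconnection can return stale data.

```python
class Node:
    """A toy database node holding a single key-value store (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class AsyncReplication:
    """Toy master-slave pair in which replication is deferred (asynchronous)."""

    def __init__(self):
        self.master = Node("M")
        self.slave = Node("S")
        self.pending = []     # changes committed on M but not yet applied on S
        self.link_up = True   # False simulates a network partition

    def write(self, key, value):
        # The master commits at once and never checks the slave's state.
        self.master.data[key] = value
        self.pending.append((key, value))

    def replicate(self):
        # Apply the backlog to the slave, but only while the link is up.
        if not self.link_up:
            return
        for key, value in self.pending:
            self.slave.data[key] = value
        self.pending.clear()


def demo_async():
    pair = AsyncReplication()
    pair.write("x", 1)
    pair.replicate()                 # slave catches up: both hold x = 1

    pair.link_up = False             # partition between M and S
    pair.write("x", 2)               # still accepted by the master
    pair.replicate()                 # no effect while partitioned
    during = (pair.master.data["x"], pair.slave.data["x"])

    pair.link_up = True              # partition heals
    pair.replicate()
    after = (pair.master.data["x"], pair.slave.data["x"])
    return during, after
```

Calling `demo_async()` returns `((2, 1), (2, 2))`: during the partition the master holds 2 while the slave still serves 1, and the two converge only after the link is restored.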

Semi-sync Replication is stricter about Consistency and, in exchange, gives up part of Partition tolerance. If MySQL uses DRBD (strict mode) as its HA solution, it achieves Consistency but loses Partition tolerance.
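The opposite end of the trade-off can be sketched the same way. The following is a toy model under stated assumptions, not the behaviour of DRBD or of MySQL's semi-sync plugin: `SyncPair`, `ReplicaUnreachable`, `link_up`, and `demo_sync` are hypothetical names. It models a strictly synchronous pair in which a write is acknowledged only if the replica has it too, so the two copies never diverge, but writes are refused while the nodes are partitioned.

```python
class ReplicaUnreachable(Exception):
    """Raised when the master cannot get an acknowledgement from the replica."""


class SyncPair:
    """Toy pair with strictly synchronous replication (hypothetical model)."""

    def __init__(self):
        self.master = {}
        self.replica = {}
        self.link_up = True   # False simulates a network partition

    def write(self, key, value):
        # Refuse the write unless the replica can confirm it.
        if not self.link_up:
            raise ReplicaUnreachable("no acknowledgement from replica")
        self.replica[key] = value
        self.master[key] = value


def demo_sync():
    pair = SyncPair()
    pair.write("x", 1)

    pair.link_up = False              # partition between the two nodes
    try:
        pair.write("x", 2)
        accepted = True
    except ReplicaUnreachable:
        accepted = False

    consistent = pair.master == pair.replica
    return accepted, consistent, pair.master["x"]
```

Here `demo_sync()` returns `(False, True, 1)`: the write issued during the partition is rejected, and both copies stay identical at the last acknowledged value.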

About CAP

  1. 毛剑


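    MySQL Replication (Master-Slave)
    actually does not satisfy Partition tolerance; a dual-master setup, for example, would.
    Partition tolerance means that when the network connection is lost, the partitioned nodes can still handle both reads and writes. Of course, we can promote the slave to master and then pair it with the original master to form a dual-master setup.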

  2. 毛剑

    Partition tolerance refers to the ability of a system to continue to operate in the presence of a network partition. For example, if I have a database running on 80 nodes across 2 racks and the interconnect between the racks is lost, my database is now partitioned. If the system is tolerant of it, then the database will still be able to perform read and write operations while partitioned. If not, oftentimes the cluster is completely unusable or is read-only.

  3. Anonymous

    I disagree with 毛剑's comment. There are two concepts here: Partition tolerance and Partition. To answer what Partition tolerance is, one must first answer what a Partition is.

