Testing Serializable Isolation in CockroachDB

Observing the effects of serializable isolation

·

8 min read

As a follow-up to A Basic Guide to Transaction Isolation, this article will reproduce a handful of interesting tests described on the PostgreSQL SSI page. Only this time for CockroachDB.

Another great resource to illustrate the behaviour of serializable is Martin Kleppman's Hermitage project and the CockroachDB contribution. It goes through a rich set of anomalies ranging from dirty writes to write skew on disjoint and predicate reads.

Examples

These tests were executed using the following CockroachDB version:

$ cockroach version
Build Tag:        v22.2.8
Build Time:       2023/04/17 13:22:08
Distribution:     CCL
Platform:         linux amd64 (x86_64-pc-linux-gnu)
Go Version:       go1.19.6
C Compiler:       gcc 6.5.0
Build Commit ID:  9a7c644e565b21d29db26a0a82524a00809d0a8c
Build Type:       release

First, create a test database:

cockroach sql --insecure --host=localhost -e "CREATE database test"

Next, start three separate shell windows representing transactions T1, T2 and T3. Whenever there's a "-- T1" comment for a SQL statement, run that statement in the designated console session.

cockroach sql --insecure --host=localhost --database test // for T1
cockroach sql --insecure --host=localhost --database test // for T2
cockroach sql --insecure --host=localhost --database test // for T3

Black and White Marbles

This is a test for Write Skew (A5B), prevented by serializable. Write Skew is when two transactions overlap and one reads data that another is writing.

Schema setup:

create table if not exists marbles (
  id    bigint      not null primary key,
  color varchar(25) not null
);

delete from marbles where 1=1;

insert into marbles values
  (1,'black'),
  (2,'black'),
  (3,'black'),
  (4,'black'),
  (5,'black'),
  (6,'white'),
  (7,'white'),
  (8,'white'),
  (9,'white'),
  (10,'white');

Note: The set transaction isolation level serializable part is redundant for CockroachDB since it's the default (and only level supported).

Observe that CockroachDB serializable prevent this anomaly:

begin; set transaction isolation level serializable; -- T1
begin; set transaction isolation level serializable; -- T2
update marbles set color = 'black' where color = 'white'; -- T1
update marbles set color = 'white' where color = 'black'; -- T2
commit; -- T1. First commit wins.
commit; -- T2. ERROR: restart transaction: TransactionRetryWithProtoRefreshError: TransactionRetryError: retry txn (RETRY_SERIALIZABLE - failed preemptive refresh due to a conflict: committed value on key /Table/137/1/6/0): "sql txn" meta={id=f8ee6d8c key=/Table/137/1/1/0 pri=0.00066739 epo=0 ts=1682691951.358336984,2 min=1682691934.561490359,0 seq=5} lock=true stat=PENDING rts=1682691934.561490359,0 wto=false gul=1682691935.061490359,0
SQLSTATE: 40001
HINT: See: https://www.cockroachlabs.com/docs/v22.2/transaction-retry-error-reference.html#retry_serializable

All the colours must match:

SELECT * from marbles order by id;
  id | color
-----+--------
   1 | black
   2 | black
   3 | black
   4 | black
   5 | black
   6 | black
   7 | black
   8 | black
   9 | black
  10 | black
(10 rows)

If you would run in SNAPSHOT (which CockroachDB doesn't provide) then it would not prevent write skew and look like this instead:

+----+-------+
| id | color |
+----+-------+
|  1 | white |
|  2 | white |
|  3 | white |
|  4 | white |
|  5 | white |
|  6 | black |
|  7 | black |
|  8 | black |
|  9 | black |
| 10 | black |
+----+-------+
-- (10 rows)

The colours have been flipped due to write skew, which is expected under concurrent execution with snapshot isolation (or read committed).

Red, Green and Blue Marbles

This example is similar to the previous one, only this time involving three transactions.

Setup schema:

create table if not exist marbles (
  id    bigint      not null primary key,
  color varchar(25) not null
);

delete from marbles where 1=1;
insert into marbles (id,color) values (1,'red');
insert into marbles (id,color) values (2,'red');
insert into marbles (id,color) values (3,'red');
insert into marbles (id,color) values (4,'yellow');
insert into marbles (id,color) values (5,'yellow');
insert into marbles (id,color) values (6,'yellow');
insert into marbles (id,color) values (7,'blue');
insert into marbles (id,color) values (8,'blue');
insert into marbles (id,color) values (9,'blue');

Again, CockrochDB SERIALIZABLE isolation prevents Write Skew (A5B):

begin; set transaction isolation level SERIALIZABLE ; -- T1
update marbles set color = 'yellow' where color = 'red'; -- T1

begin; set transaction isolation level SERIALIZABLE ; -- T2
update marbles set color = 'blue' where color = 'yellow'; -- T2. Blocks on T1 intents.

begin; set transaction isolation level SERIALIZABLE ; -- T3
update marbles set color = 'red' where color = 'blue'; -- T3. Blocks.

commit; --T1
(T2 unblocks - rows affected 6)
(T3 unblocks - rows affected 3)
commit; -- T3
commit; -- T2. ERROR: restart transaction: TransactionRetryWithProtoRefreshError: TransactionRetryError: retry txn (RETRY_SERIALIZABLE - failed preemptive refresh due to a conflict

The correct outcome (only yellow and red):

select * from marbles;
+----+--------+
| id | color  |
+----+--------+
|  1 | yellow |
|  2 | yellow |
|  3 | yellow |
|  4 | yellow |
|  5 | yellow |
|  6 | yellow |
|  7 | red    |
|  8 | red    |
|  9 | red    |
+----+--------+
(9 rows)

Intersecting Data

Two concurrent transactions read data, and each uses it to update the range read by the other.

Setup schema:

create table if not exists tab (
  id bigint not null,
  value bigint not null
);

delete from tab where 1=1;

INSERT INTO tab VALUES
(1, 10), (1, 20), (2, 100), (2, 200);

Observe CockroachDB guarantees a serial execution:

begin; set transaction isolation level serializable; -- T1
SELECT SUM(value) FROM tab WHERE id = 1; -- T1
INSERT INTO tab VALUES (2, 30); -- T1

begin; set transaction isolation level serializable; -- T2
SELECT SUM(value) FROM tab WHERE id = 2; -- T2 (blocks)

commit; -- T1 (unblocks T2)

INSERT INTO tab VALUES (1, 330); -- T2

commit; -- T2

SELECT * from tab;

Yields:

SELECT * from tab;
  id | value
-----+--------
   1 |    10
   1 |    20
   2 |   100
   2 |   200
   2 |    30
   1 |   330
(6 rows)

Overdraft Protection

Here we will protect the invariant that the total of all accounts must exceed the amount requested.

Schema setup:

create table if not exists account
  (
    name VARCHAR(25) not null,
    type VARCHAR(25) not null,
    balance NUMERIC(19, 2) not null,

    primary key (name, type)
  );

delete from account where 1=1;

insert into account values
  ('alice','saving', 500),
  ('alice','checking', 500);

Let's try to play the bank under serializable isolation:

begin; set transaction isolation level serializable ; -- T1
select type, balance from account where name = 'alice'; -- T1

begin; set transaction isolation level serializable ; -- T2
select type, balance from account where name = 'alice'; -- T2

update account set balance = balance - 900.00 where name = 'alice' and type = 'saving'; -- T1
commit; -- T1

update account set balance = balance - 900.00 where name = 'alice' and type = 'checking'; -- T2
commit; -- T2 ERROR: restart transaction: TransactionRetryWithProtoRefreshError: TransactionRetryError: retry txn (RETRY_SERIALIZABLE - failed preemptive refresh due to a conflict:

Yields the following where the invariant holds:

select * from account;
  name  |   type   | balance
--------+----------+----------
  alice | checking |  500.00
  alice | saving   | -400.00
(2 rows)

Deposit Report

Setup schema (before every test run):

create table if not exists control
  (
    deposit_no int not null
  );
create table if not exists receipt
  (
    receipt_no bigint NOT NULL PRIMARY KEY DEFAULT unique_rowid(),
    deposit_no int not null,
    payee text not null,
    amount numeric(19,2) not null
  );

DELETE from control where 1=1;
DELETE from receipt where 1=1;
insert into control values (1);

insert into receipt
  (deposit_no, payee, amount)
  values ((select deposit_no from control), 'Crosby', 100.00);
insert into receipt
  (deposit_no, payee, amount)
  values ((select deposit_no from control), 'Stills', 200.00);
insert into receipt
  (deposit_no, payee, amount)
  values ((select deposit_no from control), 'Nash', 300.00);

Test sequence:

begin; set transaction isolation level serializable ; -- T1
insert into receipt (deposit_no, payee, amount) values ( (select deposit_no from control), 'Young', 100.00 ); -- T1
select * from receipt; -- T1

begin; set transaction isolation level serializable ; -- T2   
select deposit_no from control; -- T2
update control set deposit_no = 2 where 1=1; -- T2
commit; -- T2

begin; set transaction isolation level serializable ; -- T3   
select * from receipt where deposit_no = 1; -- T3. Blocks on T1
commit; -- T1
(T3 unblocks)
commit; -- T3

Yields:

select * from receipt;
      receipt_no     | deposit_no | payee  | amount
---------------------+------------+--------+---------
  860561294810873858 |          1 | Crosby | 100.00
  860561295115354114 |          1 | Stills | 200.00
  860561295382970370 |          1 | Nash   | 300.00
  860561358736326657 |          1 | Young  | 100.00
(4 rows)

Rollover

This example was created to show that PostgreSQL can roll back read-only transactions to prevent serialization conflicts. It won't happen in CockroachDB.

Schema setup:

create table if not exists rollover (
  id int primary key, 
  n int not null
  );
delete from rollover where 1=1;
insert into rollover values (1,100), (2,10);

Financial transaction under serializable isolation:

begin; set transaction isolation level serializable ; -- T1
update rollover
  set n = n + (select n from rollover where id = 2)
  where id = 1; -- T1

begin; set transaction isolation level serializable ; -- T2
update rollover set n = n + 1 where id = 2; -- T2 - blocks on T1
commit; --T2 

begin; set transaction isolation level snapshot ; -- T3
select count(*) from rollover; -- T3 - blocks on T1         
commit; -- T1

select n from rollover where id in (1,2); -- T3

Conclusion

This article showcased a few examples outlined in the PostgreSQL SSI description page. It highlights some runtime differences between PostgreSQL SSI and CockroachDB serializable.

Did you find this article valuable?

Support Kai Niemi by becoming a sponsor. Any amount is appreciated!