Spring Retry with CockroachDB

Spring Retry with CockroachDB

Using spring-retry to retry transient SQL exceptions in CockroachDB

·

5 min read

Spring Retry is a small library for retrying failed method invocations of a transient nature. Typically when interacting with another service over the network, a message broker or database.

In this tutorial, we'll look at using spring retry for serialization conflict errors denoted by the SQL state code 40001.

Maven Setup

To use Spring Retry, you need to add the Spring Retry and Spring AOP dependencies to your pom.xml.

<dependency>
    <groupId>org.springframework.retry</groupId>
    <artifactId>spring-retry</artifactId>
    <version>2.0.1</version>
</dependency>

<dependency>
    <groupId>org.springframework</groupId>
    <artifactId>spring-aspects</artifactId>
    <version>5.3.10</version>
</dependency>

Configuration

To enable Spring Retry in an application, add the @EnableRetry annotation to any of the @Configuration classes:

@EnableRetry
@Configuration
@SpringBootApplication
public class MyApplication {
    public static void main(String[] args) {
        new SpringApplicationBuilder(MyApplication.class)
                .run(args);
    }
}

Example Service

Using Spring Retry is as simple as adding the @Retryable annotation to the methods to-be-retried:

@Service
public class OrderService {
    @Transactional(propagation = Propagation.REQUIRES_NEW)
    @Retryable
    public Order updateOrderStatus(Long orderId,
ShipmentStatus status, BigDecimal amount) {
        Order order = ...;
        return order;
    }
}

In our case, however, we want to be more specific on what type of exceptions qualify for a retry and also tailor the backoff policy to use an exponentially increasing delay with jitter.

@Service
public class OrderService {
    @Transactional(propagation = Propagation.REQUIRES_NEW)
    @Retryable(exceptionExpression = "@exceptionClassifier.shouldRetry(#root)",
            maxAttempts = 5,
            backoff = @Backoff(maxDelay = 15_000, multiplier = 1.5))
    public Order updateOrderStatus(Long orderId,
ShipmentStatus status, BigDecimal amount) {
        Order order = ...;
        return order;
    }
}

The backoff annotation parameters defines a policy that results in the ExponentialRandomBackOffPolicy is used at runtime.

Next, let's look at the exception classifier:

public class CockroachExceptionClassifier {
    private final Logger logger = LoggerFactory.getLogger(getClass());

    private static final String SERIALIZATION_FAILURE = "40001";

    public boolean shouldRetry(Throwable ex) {
        if (ex == null) {
            return false;
        }
        Throwable throwable = NestedExceptionUtils.getMostSpecificCause(ex);
        if (throwable instanceof SQLException) {
            return shouldRetry((SQLException) throwable);
        }
        logger.warn("Non-transient exception {}", ex.getClass());
        return false;
    }

    public boolean shouldRetry(SQLException ex) {
        if (SERIALIZATION_FAILURE.equals(ex.getSQLState())) {
            logger.warn("Transient SQL exception detected : sql state [{}], message [{}]",
                    ex.getSQLState(), ex.toString());
            return true;
        }
        return false;
    }
}

We also add the classifier bean to the configuration:


    @Bean
    public CockroachExceptionClassifier exceptionClassifier() {
        return new CockroachExceptionClassifier();
    }

The shouldRetry method simply looks for the exception type and if it is a SQLException that it has the proper state code 40001.

We could qualify exceptions with other state codes but then there are no guarantees of multiple side effects when retried. For example, if a transaction involves multiple INSERTs and the COMMIT is successful but lost in transit in the reply back to the client. In that case, it wouldn't use state code 40001 but more likely a broken connection error code.

To be safe, only retry on the state code 40001 and nothing else, unless you are sure about the side effects of your SQL transactions and it's considered safe (or the operations are idempotent).

Demo Project

Roach Retry is a project that provides runnable examples of different transaction retry strategies for Spring Boot and the JavaEE stack. It includes Spring Retry along with a simpler AOP-driven approach and JavaEE interceptors for old-style stateless session beans.

Step 1: Startup

Create the database:

cockroach sql --insecure --host=localhost -e "CREATE database roach_retry"

Build the app:

cd spring-retry
../mvnw clean install

Run the app:

java -jar target/roach-retry.jar

Then open another shell window so you have at least two windows. In any of the shells, check that the service is up and connected to the database:

curl --verbose http://localhost:8090/api

Step 2: Get Order Request Form

Print an order form template that we will use to create orders:

curl http://localhost:8090/api/order/template > form.json

Step 3: Submit Order Form

Create at least one purchase order:

curl http://localhost:8090/api/order -H "Content-Type:application/json" -X POST -d "@form.json"

Step 4: Produce a Read/Write Conflict

Assuming that there is now an existing order with ID 1 with status PLACED. We will read that order and change the status to something else, concurrently. This is known as a read-write or unrepeatable-read conflict which is prevented by serializable isolation. As a result, there will be a SQL exception and a rollback.

When this happens, the retry mechanism will kick in and retry the failed transaction. It will then succeed since the two transactions are no longer conflicting since one of them was committed successfully.

To observe this predictably we'll use two separate sessions with a controllable delay between the read and write operations.

Overview of the SQL operations executed (what the service will execute):

BEGIN; -- T1
SELECT * FROM purchase_order WHERE id=1; -- T1 
-- T1: Assert that status is `PLACED`
-- T1: Suspend for 15s  
BEGIN; -- T2
SELECT * FROM purchase_order WHERE id=1; -- T2
-- Assert that status is still `PLACED`
UPDATE purchase_order SET order_status='PAID' WHERE id=1; -- T2 
COMMIT; -- T2 (OK)
UPDATE purchase_order SET order_status='CONFIRMED' WHERE id=1; -- T1 (ERROR!)
ROLLBACK; -- T1

Now prepare the two separate shell windows so you can run the commands concurrently.

First, check that the order with ID 1 exists and has the status PLACED (or anything else other than CONFIRMED)

curl http://localhost:8090/api/order/1

Now let's run the first transaction (T1) where there is a simulated 15-sec delay before the commit (you can increase/decrease the time):

curl http://localhost:8090/api/order/1?status=CONFIRMED\&delay=15000 -i -X PUT

In less than 15 sec and before T1 commits, run the second transaction (T2) from another session which doesn't wait and succeeds with a commit:

curl http://localhost:8090/api/order/1?status=PAID -i -X PUT

At this point, T1 has no other choice than to rollback and that will trigger a retry:

ERROR: restart transaction: TransactionRetryWithProtoRefreshError: WriteTooOldError: write for key /Table/109/1/12/0 at timestamp 1669990868.355588000,0 too old; wrote at 1669990868.778375000,3: "sql txn" meta={id=92409d02 key=/Table/109/1/12/0 pri=0.03022202 epo=0 ts=1669990868.778375000,3 min=1669990868.355588000,0 seq=0} lock=true stat=PENDING rts=1669990868.355588000,0 wto=false gul=1669990868.855588000,0

The retry mechanism will catch that SQL exception, back off for a few hundred millis and then retry until it eventually succeeds (1 attempt).

The expected outcome is a 200 OK returned to both client sessions. The final order status must be CONFIRMED since client 1 request (T1) was retried and eventually committed, thereby overwriting T2.

curl http://localhost:8090/api/order/1

Conclusion

In this tutorial, we explore using Spring Retry, a library for retrying failed method invocations of a transient nature, to handle serialization conflict errors denoted by the SQL state code 40001. We cover how to set up Maven, configure Spring Retry, create a sample service, and demonstrate a retry scenario using a demo project.