JPA Best Practices - JSONB Mapping

How to use CockroachDB JSONB data type with JPA and Hibernate

·

7 min read

Introduction

This article is part five of a series of data access best practices when using JPA and CockroachDB. Although most of these guidelines are fairly framework and database agnostic, it’s mainly targeting the following Java technology stack:

  • JPA 2 and Hibernate 5

  • Spring Boot 2.7

  • Spring Data 2.7

  • JDK 1.8

  • CockroachDB v22

Example Code

The code examples are available on GitHub.

Chapter 5: JSONB

CockroachDB supports storing, manipulating and indexing JSON documents through the JSONB data type. This is useful for storing and querying unstructured JSON documents alongside structured elements.

In this chapter, we will take a look at how to map and use the JSONB data type with Hibernate 5 and JPA. If you are using Hibernate 6, then there's a built-in type available via the @JdbcTypeCode annotation. For this article, we are going to use a custom UserType to achieve the same thing.

Let's first take a look at the table schema:

create table journal
(
    id         STRING PRIMARY KEY AS (payload ->> 'id') STORED, 
    event_type varchar(15) not null,
    payload    json,
    tag        varchar(64),
    updated    timestamptz default clock_timestamp(),
    INVERTED INDEX event_payload (payload)
);

create index idx_journal_main on journal (event_type, tag) storing (payload);

This table stores journal entries in the form of JSON event payloads. Think of it like an event store as part of an event-driven system design. There are a few things to highlight. First off, we are using a computed primary index column (id) that projects into the JSONB document in the payload column. That way, we use a common string in the JSON documents as the table's primary key.

The JSON document for each row is stored in the payload column, on which we also apply a Generalized Inverted Index or GIN index. Generalized inverted indexes store mappings from values within a container column (such as the payload JSONB document) to the row that holds that value.

Finally, we use a Covering Index storing the payload to avoid index joins when filtering on event type and tag in queries.

All this indexing isn't needed for the Hibernate mapping example, but more to show a few indexing techniques for unstructured JSONB documents.

Next, let's look at the Journal entity mapped against this table, which is modeled as an abstract type:

@Entity
@Table(name = "journal")
@Inheritance(strategy = InheritanceType.SINGLE_TABLE)
@DiscriminatorColumn(
        name = "event_type",
        discriminatorType = DiscriminatorType.STRING,
        length = 15
)
@MappedSuperClass
public abstract class Journal<T> {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY) 
    private String id;

    @Column(name = "tag")
    private String tag;

    @Basic
    @Column(name = "updated")
    private LocalDateTime updated;

    @Type(type = "jsonb")
    @Column(name = "payload")
    @Basic(fetch = FetchType.LAZY)
    private T event;
...
}

The reason it's abstract is that we are using the JPA single inheritance strategy with the event_type column as a discriminator. That means we can use a single table to host a hierarchy of entity types. In this case, there are only two: The first one is a journal of accounts and the second is a journal of monetary transactions towards accounts. We are storing the account and transaction models serialized into binary JSON format, rather than decomposing the model into a normalized table structure like account, transaction and transaction_item.

JSONB using Hibernate 4 or 5

As mentioned you can use the @JdbcTypeCode(SqlTypes.JSON) since Hibernate 6 to map arbitrary objects to JSONB columns. For earlier Hibernate versions (4 and 5), you will need to use a custom UserType which we are going to do next.

First of all, you have to implement the methods sqlTypes() and returnedClass(),which tells Hibernate what SQL type and Java class to use for the mapping. In this case, we use different entity subclasses, so the returnedClass is left abstract. Later we'll define a concrete subclass per entity level.

public abstract class AbstractJsonDataType<T> implements UserType {
    @Override
    public int[] sqlTypes() {
        return new int[] {Types.JAVA_OBJECT};
    }

    @Override
    public abstract Class<T> returnedClass();
...
}

The next methods are nullSafeGet and nullSafeSet, used for reading and writing respectively.

    @Override
    public Object nullSafeGet(ResultSet rs, String[] names, SharedSessionContractImplementor session, Object owner)
            throws HibernateException, SQLException {
        final String cellContent = rs.getString(names[0]);
        if (cellContent == null) {
            return null;
        }
        TypeDef typeDef = AnnotationUtils.findAnnotation(owner.getClass(), TypeDef.class);
        Class<?> clazz = typeDef != null ? typeDef.defaultForType() : returnedClass();
        try {
            if (isCollectionType()) {
                JavaType type = mapper.getTypeFactory().constructCollectionType(List.class, returnedClass());
                return mapper.readValue(cellContent.getBytes("UTF-8"), type);
            }
            return mapper.readValue(cellContent.getBytes("UTF-8"), clazz);
        } catch (Exception ex) {
            throw new HibernateException("Failed to deserialize json to " + clazz.getName(), ex);
        }
    }

    @Override
    public void nullSafeSet(PreparedStatement ps, Object value, int index, SharedSessionContractImplementor session)
            throws HibernateException, SQLException {
        if (value == null) {
            ps.setNull(index, Types.OTHER);
            return;
        }
        try {
            StringWriter w = new StringWriter();
            mapper.writeValue(w, value);
            w.flush();
            ps.setObject(index, w.toString(), Types.OTHER);
        } catch (Exception ex) {
            throw new HibernateException("Failed to serialize " + value.getClass().getName() + " to json", ex);
        }
    }

Finally, the deepCopy method is important to implement in such a way it creates a separate byte-level copy of supplied object references. In this case by serializing and deserializing to/from JSON.

    @Override
    public Object deepCopy(final Object value) throws HibernateException {
        try {
            if (isCollectionType()) {
                JavaType type = mapper.getTypeFactory().constructCollectionType(List.class, returnedClass());
                return mapper.readValue(mapper.writeValueAsString(value), type);
            }
            return mapper.readValue(mapper.writeValueAsString(value), returnedClass());
        } catch (IOException ex) {
            throw new HibernateException(ex);
        }
    }

There's some additional code to handle collection types which enables the use of List<MyObject> element type of fields.

The complete user type used for this example is available on GitHub.

Using the User Type

The final step is to register the user type with the @TypeDef annotation. We also create one concrete subclass for each JSON type.

@Entity
@DiscriminatorValue("ACCOUNT")
@TypeDef(name = "jsonb", typeClass = AccountJournal.JsonType.class, defaultForType = Account.class)
public class AccountJournal extends Journal<Account> {
    public static class JsonType extends AbstractJsonDataType<Account> {
        @Override
        public Class<Account> returnedClass() {
            return Account.class;
        }
    }
}

@Entity
@DiscriminatorValue("TRANSACTION")
@TypeDef(name = "jsonb", typeClass = TransactionJournal.JsonType.class, defaultForType = Transaction.class)
public class TransactionJournal extends Journal<Transaction> {
    public static class JsonType extends AbstractJsonDataType<Transaction> {
        @Override
        public Class<Transaction> returnedClass() {
            return Transaction.class;
        }
    }
}

The event payload field is defined as a generic type in the superclass using the @Type annotation. The lazy fetch type is optional and the default is eager fetching:

..   
 @Type(type = "jsonb")
 @Column(name = "payload")
 @Basic(fetch = FetchType.LAZY)
 private T event;
 ..

Application Code

Now, from an application standpoint, all we need to do to use the custom user type is to assign the object references. Here we have an account value object which will be stored in binary JSON format in the mapped payload column.

// Event value object to be stored as JSONB
Account account = Account.builder()
                .withGeneratedId()
                .withAccountType("asset")
                .withName("abc")
                .withBalance(BigDecimal.valueOf(250.00))
                .build();
// JPA entity in transient state
AccountJournal journal = new AccountJournal();
journal.setTag("asset");
journal.setEvent(account);
journal = accountJournalRepository.save(journal);

Projection and Aggregation Queries

To wrap up, let's take a quick look at how we use Spring Data JPA repositories for querying our journal entities with JSON documents.

In the following example, we are filtering journal entries for account events with a set balance between a lower and upper bound:

@Repository
public interface AccountJournalRepository extends JpaRepository<AccountJournal, UUID> {
    @Query(value = "SELECT * FROM journal WHERE event_type='ACCOUNT'"
            + " AND (payload ->> 'balance')::::decimal BETWEEN :lowerBound AND :upperBound",
            nativeQuery = true)
    List<AccountJournal> findAccountsWithBalanceBetween(
            @Param("lowerBound") BigDecimal lowerBound, @Param("upperBound") BigDecimal upperBound);
}

In the final example below, we are doing similar queries against the transaction journal event types:

@Repository
public interface TransactionJournalRepository extends JpaRepository<TransactionJournal, UUID> {
    @Query(value = "SELECT j FROM Journal j WHERE j.tag=:tag")
    List<TransactionJournal> findByTag(@Param("tag") String tag);

    @Query(value = "SELECT * FROM journal WHERE event_type='TRANSACTION'"
            + " AND payload ->> 'transferDate' BETWEEN :startDate AND :endDate",
            nativeQuery = true)
    List<TransactionJournal> findTransactionsInDateRange(@Param("startDate") String startDate,
                                                         @Param("endDate") String endDate);

    @Query(value =
            "WITH x AS(SELECT payload from journal where event_type='TRANSACTION' AND tag=:tag),"
                    + "items AS(SELECT json_array_elements(payload->'items') as y FROM x) "
                    + "SELECT sum((y->>'amount')::::decimal) FROM items",
            nativeQuery = true)
    BigDecimal sumTransactionLegAmounts(@Param("tag") String tag);
}

The last sumTransactionLegAmounts method executes an aggregation query on the amount field of the stored documents. It uses the following CTE in a more readable format:

WITH x AS (SELECT payload from journal where event_type = 'TRANSACTION' AND tag='tag'), items AS (SELECT json_array_elements(payload -> 'items') as y FROM x) SELECT sum((y ->> 'amount')::decimal)
FROM items;

If we run an explain on that query, you can see the index idx_journal_main being used without an index join (since the payload is stored in the secondary index):

distribution: local
vectorized: true

• group (scalar)
│ estimated row count: 1
│
└── • render
    │
    └── • project set
        │ estimated row count: 40,600
        │
        └── • render
            │
            └── • scan
                  estimated row count: 4,060 (100% of the table; stats collected 3 hours ago; using stats forecast for 3 hours ago)
                  table: journal@idx_journal_main
                  spans: [/'TRANSACTION'/'cashout' - /'TRANSACTION'/'cashout']

Summary

CockroachDB offers the JSONB data type to store unstructured JSON documents in binary format in the database. Hibernate 6 provides a standard annotation @JdbcTypeCode that can be used for JSONB columns. For Hibernate 4 and 5 you can implement the UserType interface and register it with a @TypeDef annotation. Mapping SQL queries against JSONB columns in Spring Data is easily done through native SQL queries.

Reference