Photo by Tingey Injury Law Firm on Unsplash
Running CockroachDB TPC-C benchmark on GKE
Running the TPC-C benchmark on a self-hosted, single-region CockroachDB cluster on GKE
Overview
This article will demonstrate how to run a TPC-C 2.5K benchmark on a self-hosted, 3-node, single-region CockroachDB cluster on Google Kubernetes Engine (GKE).
The TPC-C workload is modeled around the concept of a warehouse which is used as the throughput "knob" to measure scalability. The number of warehouses used maps to a given data volume baseline where a warehouse count of 2,500 translates to approximately a 200 GiB dataset.
The demo cluster is using 3x c2-standard-16
machines to match that volume, along with provisioned IOPS to follow CockroachDB recommended production guidelines on hardware ratios. The workload is run through an internal client pod with an option to run through either and internal or external load-balancer as well.
About the TPC-C Benchmark
The CockroachDB's built-in TPC-C workload is based on official TPC-C, the industry standard benchmark for On-line Transaction Processing (OLTP) performance. It simulates an industry-agnostic business with an OLTP database that manages, sells, or distributes a product.
The TPC-C workload measures databases across two different metrics:
- Throughput: Measured as throughput-per-minute (tpm), which in practical terms measures the number of orders processed per minute.
- Scale: Measured as the total number of warehouses supported. Each warehouse is of a fixed data size and has a max amount of tpm that it is allowed to support, so the total data size of the benchmark is scaled proportionally to throughput.
The CockroachDB TPC-C implementation can be found here and the schema can be found here.
The TPC-C workload is constructed to validate that the efficiency rate can be sustained
when aiming for an increasingly higher tpmC (max throughput). Efficiency is measured in
an explicit way. There's a limit to the number of tpmC allowed per warehouse, which is 12.86. The data amount per warehouse is about 200MiB, so for 2,500 warehouses, the maximum throughput is 2500 x 12.86
tpmC, which is 32,150
.
Because TPC-C is constrained to a maximum amount of throughput per warehouse (12.86 tpmC), we often discuss TPC-C performance as the maximum number of warehouses for which a database can maintain the maximum throughput per minute (tpmC). In TPC-C, the required minimum to qualify is P9585%.
To take a few examples, assume:
100 warehouses at 200MiB 100 gives 1240 tpmC (max is 1286), that's an efficiency rate of 96.4% or (1240/(100 12.86)).
1000 warehouses at 200MiB 1000 gives 12,500 tpmC, that's an efficiency rate of 97.2% (12500/(1000 12.86)).
2500 warehouses at 200MiB 2500 gives 30,837 tpmC, that's an efficiency rate of 95.9% (30837/(2500 12.86)).
100,000 warehouses at ~20TiB gives 1,200,000 tpmC, that's an efficiency rate of 93.3%.
The largest [published result (cockroachlabs.com/docs/v22.1/performance#be..) for CockroachDB is 1.7M tpmC with 140,000 warehouses on 81 nodes, resulting in an efficiency score of 95%.
TPC-C Test Setup
Overview of the cluster setup for a TPC-C workload size small (2.5K warehouses) on a 3 node cluster in a single region. For more details, see Performance Benchmarking with TPC-C Small. and also Deploy CockroachDB with Kubernetes.
Layout:
- Single region: europe-west-1
- Machines: c2-standard-16
- 3 CockroachDB nodes + 1 client node
- Secure cluster
- Manual StatefulSet configuration
Optional:
- External load balancer service
- 1x client outside of k8s for controlling the tpcc workload
Setup Steps
Step 1 - Start GKE cluster
gcloud container clusters create cockroachdb --machine-type c2-standard-16 --region europe-west1 --num-nodes 1
Step 2 - Create RBAC roles
kubectl create clusterrolebinding $USER-cluster-admin-binding --clusterrole=cluster-admin --user=<email>
Step 3 - Configure cluster
curl -O https://raw.githubusercontent.com/cockroachdb/cockroach/master/cloud/kubernetes/bring-your-own-certs/cockroachdb-statefulset.yaml
Edit cockroachdb-statefulset.yaml
and update:
Resource requests / limits to reflect c2-standard-16
machines:
resources:
requests:
cpu: "15"
memory: "55Gi"
limits:
cpu: "15"
memory: "55Gi"
Add a custom pd-ssd
storage class:
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: gocrazy
provisioner: kubernetes.io/gce-pd
parameters:
type: pd-ssd
Add storageClassName
and change storage size to 2TiB:
volumeClaimTemplates:
- metadata:
name: datadir
spec:
accessModes:
- "ReadWriteOnce"
storageClassName: gocrazy
resources:
requests:
storage: 2Ti
pd-ssd is recommended for pods < 32 vCPU and a minimum of 500 IOPS per vCPU is needed for optimal performance.
See also:
Step 4 - Create certificates
mkdir certs my-safe-directory
cockroach cert create-ca --certs-dir=certs --ca-key=my-safe-directory/ca.key
cockroach cert create-client root --certs-dir=certs --ca-key=my-safe-directory/ca.key
kubectl create secret generic cockroachdb.client.root --from-file=certs
cockroach cert create-node localhost 127.0.0.1 cockroachdb-public cockroachdb-public.default cockroachdb-public.default.svc.cluster.local "*.cockroachdb" "*.cockroachdb.default" "*.cockroachdb.default.svc.cluster.local" --certs-dir=certs --ca-key=my-safe-directory/ca.key
kubectl create secret generic cockroachdb.node --from-file=certs
kubectl get secrets
Step 5 - Initialize cluster
kubectl create -f cockroachdb-statefulset.yaml
kubectl get pods
kubectl get pv
kubectl exec -it cockroachdb-0 -- /cockroach/cockroach init --certs-dir=/cockroach/cockroach-certs
kubectl get pods
Step 6 - Create secure pod for SQL cli
kubectl create -f https://raw.githubusercontent.com/cockroachdb/cockroach/master/cloud/kubernetes/bring-your-own-certs/client.yaml
kubectl exec -it cockroachdb-client-secure -- ./cockroach sql --certs-dir=/cockroach-certs --host=cockroachdb-public
Create user in CLI:
CREATE USER roach WITH PASSWORD '123456'; -- maintains the rank..
GRANT admin to roach;
Step 7 - Access DB console (optional)
Setup port forwarding:
kubectl port-forward service/cockroachdb-public 8080
open https://localhost:8080/#/overview/list
Step 8 - Add external Load Balancer (optional)
kubectl get services
kubectl expose service cockroachdb --port=26257 --target-port=26257 --name=cockroachdb-external --type=LoadBalancer
kubectl get services
(wait for external ip)
Example:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
cockroachdb ClusterIP None <none> 26257/TCP,8080/TCP 11m
cockroachdb-external LoadBalancer 10.3.248.15 34.140.51.58 26257:30143/TCP 48s
cockroachdb-public ClusterIP 10.3.244.188 <none> 26257/TCP,8080/TCP 11m
kubernetes ClusterIP 10.3.240.1 <none> 443/TCP 26m
Try connecting which should fail:
cockroach sql --url "postgres://root@34.140.51.58:26257" --certs-dir=certs
ERROR: x509: certificate is valid for 127.0.0.1, not 34.140.51.58
Update certificates with external IP:
cockroach cert create-node 34.140.51.58 localhost 127.0.0.1 cockroachdb-public cockroachdb-public.default cockroachdb-public.default.svc.cluster.local "*.cockroachdb" "*.cockroachdb.default" "*.cockroachdb.default.svc.cluster.local" --certs-dir=certs --ca-key=my-safe-directory/ca.key --overwrite
kubectl delete secret cockroachdb.node --ignore-not-found
kubectl create secret generic cockroachdb.node --from-file=certs
Restart pods and reconnect with success:
cockroach sql --url "postgres://root@34.140.51.58:26257" --certs-dir=certs
Benchmark Steps
Step 1 - Import dataset
2,500 warehouses is about 200GiB of data - see jobs in db console.
Use either alternative:
Option 1 - via public ip:
cockroach workload fixtures import tpcc --warehouses=2500 'postgres://root@<external-ip>:26257?sslmode=verify-full&sslrootcert=certs/ca.crt&sslcert=certs/node.crt&sslkey=certs/node.key'
Option 2 - via client pod and public service:
kubectl exec -it cockroachdb-client-secure -- ./cockroach workload fixtures import tpcc --warehouses=2500 'postgres://root@cockroachdb-public:26257?sslmode=verify-full&sslrootcert=/cockroach-certs/ca.crt&sslcert=/cockroach-certs/client.root.crt&sslkey=/cockroach-certs/client.root.key'
Step 2 - Run TPC-C workload for 30m
Use either alternative:
Option 1 - via external lb:
ulimit -n 100000 && cockroach workload run tpcc --tolerate-errors --warehouses=2500 --ramp=1m --duration=15m 'postgres://root@34.140.51.58:26257?sslmode=verify-full&sslrootcert=certs/ca.crt&sslcert=certs/node.crt&sslkey=certs/node.key'
Option 2 - via client pod and public service:
kubectl exec -it cockroachdb-client-secure -- ./cockroach workload run tpcc --tolerate-errors --warehouses=2500 --ramp=1m --duration=30m 'postgres://root@cockroachdb-public:26257?sslmode=verify-full&sslrootcert=/cockroach-certs/ca.crt&sslcert=/cockroach-certs/client.root.crt&sslkey=/cockroach-certs/client.root.key'
Option 3 - via client pod directly to pods:
create an addrs file:
postgres://root@cockroachdb-0.cockroachdb.default.svc.cluster.local:26257?sslmode=verify-full&sslrootcert=/cockroach-certs/ca.crt&sslcert=/cockroach-certs/client.root.crt&sslkey=/cockroach-certs/client.root.key postgres://root@cockroachdb-1.cockroachdb.default.svc.cluster.local:26257?sslmode=verify-full&sslrootcert=/cockroach-certs/ca.crt&sslcert=/cockroach-certs/client.root.crt&sslkey=/cockroach-certs/client.root.key postgres://root@cockroachdb-2.cockroachdb.default.svc.cluster.local:26257?sslmode=verify-full&sslrootcert=/cockroach-certs/ca.crt&sslcert=/cockroach-certs/client.root.crt&sslkey=/cockroach-certs/client.root.key
run:
kubectl exec -it cockroachdb-client-secure -- ./cockroach workload run tpcc --tolerate-errors --warehouses=2500 --ramp=1m --duration=30m $(cat addrs)
Step 3 - Review results
Ex:
_elapsed_______tpmC____efc__avg(ms)__p50(ms)__p90(ms)__p95(ms)__p99(ms)_pMax(ms)
1800.1s 30837.5 95.9% 326.2 302.0 604.0 704.6 973.1 3623.9
Benchmark passing criteria for our derivative TPC-C results:
- P90 Latency < 5 Seconds
- Efficiency rate over 95%.
TPC-C requirements are P95<10s and efficiency rate over 85%.
Cleanup Steps
Ensure the storage claims are deleted as well since it's not automatic in GKE.
kubectl delete pods,statefulsets,services,poddisruptionbudget,jobs,rolebinding,clusterrolebinding,role,clusterrole,serviceaccount -l app=cockroachdb
kubectl delete pod cockroachdb-client-secure
gcloud container clusters delete cockroachdb --region europe-west1
Conclusion
This was a tutorial of running the TPC-C workload in CockroachDB on a self-hosted GKE cluster.
Related Resources
- cockroachlabs.com/guides/2022-cloud-report
- cockroachlabs.com/docs/v22.1/kubernetes-per..
- cockroachlabs.com/docs/v22.1/operate-cockro..
- cockroachlabs.com/docs/v22.1/recommended-pr..
- cloud.google.com/compute/docs/disks
- cockroachlabs.com/docs/v22.1/performance-be..
- cockroachlabs.com/docs/v22.1/performance#sc..
- tpc.org/tpc_documents_current_versions/pdf/..