Benchmarking rustls 0.23.15 vs OpenSSL 3.3.2 vs BoringSSL on ARM64

2024-10-31

System configuration

We ran the benchmarks on a bare-metal server with the following characteristics:

This is the Hetzner RX170.

Versions

The benchmarking tool used for both OpenSSL and BoringSSL was openssl-bench d5de57d9.

This was built from source with its makefile.

BoringSSL

The tested version of BoringSSL is 76968bb3d5, which was the most recent point on master when we started the previous measurements.

BoringSSL was built from source with CC=clang CXX=clang++ cmake -DCMAKE_BUILD_TYPE=Release. clang was used to avoid any potential performance deficit when building with GCC, and for consistency with the x86 results.

OpenSSL

The tested version of OpenSSL is 3.3.2, which was the latest release at the time of writing.

OpenSSL was built from source with ./Configure ; make -j12.

Rustls

The tested version of rustls was 0.23.15, which was the latest release at the time of writing. This was used with aws-lc-rs 1.10.0 / aws-lc-sys 0.22.0.

Additionally the following two commits were included, which affect the benchmark tool but do not affect the core crate:

Measurements

BoringSSL was tested with this command:

~/bench/openssl-bench
$ BENCH_MULTIPLIER=16 setarch -R make measure BORINGSSL=1

OpenSSL was tested with this command:

~/bench/openssl-bench
$ BENCH_MULTIPLIER=16 setarch -R make measure

rustls was tested with this command:

~/bench/rustls
$ BENCH_MULTIPLIER=16 setarch -R make -f admin/bench-measure.mk measure

Results

Transfer measurements are in megabytes per second. Handshake units are handshakes per second.

Test                                    BoringSSL 76968bb3   OpenSSL 3.3.2   rustls 0.23.15
transfer, 1.2, aes-128-gcm, sending     2211.53              2101.23         2077.19
transfer, 1.2, aes-128-gcm, receiving   2250.93              2344.94         2173.4
transfer, 1.3, aes-256-gcm, sending     1886.17              1741.07         1809.8
transfer, 1.3, aes-256-gcm, receiving   1899.72              1953.49         1935.8

Test                                    BoringSSL 76968bb3   OpenSSL 3.3.2   rustls 0.23.15
full handshakes, 1.2, rsa, client       1968.07              1588.54         4498.42
full handshakes, 1.2, rsa, server       334.077              319.886         614.27
full handshakes, 1.2, ecdsa, client     1527.73              1118.56         2154.06
full handshakes, 1.2, ecdsa, server     3636.48              2950.54         8303.67
full handshakes, 1.3, rsa, client       1861.15              1441.86         3986.81
full handshakes, 1.3, rsa, server       330.484              312.446         599.39
full handshakes, 1.3, ecdsa, client     1459.64              1045.98         2032.11
full handshakes, 1.3, ecdsa, server     3252.58              2440.25         6212.45

Test                                    BoringSSL 76968bb3   OpenSSL 3.3.2   rustls 0.23.15
resumed handshakes, 1.2, client         45452.2              18396.5         65267.61
resumed handshakes, 1.2, server         43356.5              20426.4         65313.22
resumed handshakes, 1.3, client         3969.88              3282.14         8443.11
resumed handshakes, 1.3, server         3791.21              3071.61         7841.35
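
To put these figures in relative terms, each rustls rate can be divided by the corresponding BoringSSL and OpenSSL rates. The following is a minimal Python sketch of that arithmetic. The names RESULTS, speedups and report are illustrative and not part of any benchmark tool, and only a few rows are copied in from the results above.

```python
# Rates copied from the results above, in the column order
# (BoringSSL 76968bb3, OpenSSL 3.3.2, rustls 0.23.15).
# Transfer rows are in MB/s, handshake rows in handshakes per second.
RESULTS = {
    "transfer, 1.2, aes-128-gcm, sending": (2211.53, 2101.23, 2077.19),
    "full handshakes, 1.2, rsa, client": (1968.07, 1588.54, 4498.42),
    "full handshakes, 1.2, rsa, server": (334.077, 319.886, 614.27),
    "full handshakes, 1.3, ecdsa, server": (3252.58, 2440.25, 6212.45),
    "resumed handshakes, 1.2, client": (45452.2, 18396.5, 65267.61),
    "resumed handshakes, 1.3, server": (3791.21, 3071.61, 7841.35),
}


def speedups(row):
    """Return the rustls rate as a multiple of the BoringSSL and OpenSSL rates."""
    boringssl, openssl, rustls = row
    return rustls / boringssl, rustls / openssl


def report(results=RESULTS):
    """Format one summary line per benchmark row."""
    lines = []
    for name, row in results.items():
        vs_boringssl, vs_openssl = speedups(row)
        lines.append(
            f"{name}: {vs_boringssl:.2f}x BoringSSL, {vs_openssl:.2f}x OpenSSL"
        )
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

Run as a script, this prints one line per row. A ratio above 1.00x means rustls completed more handshakes (or moved more data) per second than the library it is compared with.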

graph of transfer speeds

graph of full handshakes

graph of resumed handshakes

Observations on results

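rustls trails a little in the throughput tests: it is the slowest of the three in both TLS 1.2 transfer measurements, and sits between BoringSSL and OpenSSL in the TLS 1.3 ones. The three underlying cryptography libraries (BoringSSL, aws-lc, OpenSSL) have their own benchmarking tools, which confirm that there is little variance between them: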

~/bench/boringssl
$ LD_LIBRARY_PATH=. ./tool/bssl speed -filter AES-256-GCM
(...)
Did 139000 AES-256-GCM (16384 bytes) seal operations in 1004138us (138427.2 ops/sec): 2268.0 MB/s
(...)
~/bench/aws-lc
$ LD_LIBRARY_PATH=. ./tool/bssl speed -filter AES-256-GCM
(...)
Did 139000 EVP-AES-256-GCM encrypt (16384 bytes) operations in 1004522us (138374.3 ops/sec): 2267.1 MB/s
(...)
~/bench/openssl
$ LD_LIBRARY_PATH=. ./apps/openssl speed -aead -evp aes-256-gcm
(...)
Doing AES-256-GCM ops for 3s on 16384 size blocks: 434715 AES-256-GCM ops in 3.00s
(...)
The 'numbers' are in 1000s of bytes per second processed.
type              2 bytes     31 bytes    136 bytes   1024 bytes   8192 bytes  16384 bytes
AES-256-GCM      13570.29k   168865.38k   623050.41k  1766296.58k  2320760.83k  2374123.52k

That is 2268 MB/s for BoringSSL, 2267 MB/s for aws-lc and 2374 MB/s for OpenSSL (2264 when expressed in MiB/s), a spread of under 5%. Given these projects' shared lineage, it would not be surprising if the implementations were the same.
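
The three tools report their results differently, so it helps to convert them to a common unit before comparing. The sketch below redoes the arithmetic from the tool output quoted above. It assumes that bssl's "MB/s" means 10^6 bytes per second, which is consistent with the ops/sec and MB/s figures it prints, and it takes OpenSSL's statement that its numbers are in 1000s of bytes per second at face value. The function names are illustrative only.

```python
BLOCK = 16384  # bytes per AEAD operation in all three runs


def bssl_mb_per_s(ops, micros, block=BLOCK):
    """Throughput for a `bssl speed` line: ops completed in `micros` microseconds.

    Bytes per microsecond is numerically equal to 10^6 bytes per second.
    """
    return ops * block / micros


def openssl_bytes_per_s(thousands):
    """`openssl speed` prints its table in 1000s of bytes per second."""
    return thousands * 1000.0


boringssl_mb = bssl_mb_per_s(139000, 1004138)    # BoringSSL, AES-256-GCM seal
aws_lc_mb = bssl_mb_per_s(139000, 1004522)       # aws-lc, EVP-AES-256-GCM encrypt
openssl_bps = openssl_bytes_per_s(2374123.52)    # OpenSSL, 16384-byte column
openssl_mb = openssl_bps / 1e6                   # decimal megabytes per second
openssl_mib = openssl_bps / 2**20                # binary mebibytes per second

if __name__ == "__main__":
    print(f"BoringSSL: {boringssl_mb:.1f} MB/s")
    print(f"aws-lc:    {aws_lc_mb:.1f} MB/s")
    print(f"OpenSSL:   {openssl_mb:.1f} MB/s ({openssl_mib:.1f} MiB/s)")
```

Under these assumptions the OpenSSL figure of 2264 corresponds to MiB/s. In the same decimal unit that bssl uses, it comes to roughly 2374 MB/s.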