Thanks for that ... I got a little bit nerd sniped. I haven't gotten to the bottom of this one, but I dug a bit.
On my machine (dual-socket Intel(R) Xeon(R) CPU L5640 running FreeBSD 14.2-RELEASE-p3), I found similar differences in speed between the system openssl (3.0.16) and rustls from head (795ae1f5d0435dbc80dac04ec147e85d4970563c).
Openssl 3.0.16 (FreeBSD base 14.2-RELEASE-p3)
handshakes server TLSv1.3 TLS_AES_256_GCM_SHA384 1275.38 handshakes/s (512 / 0.401448)
Rustls: (795ae1f5d0435dbc80dac04ec147e85d4970563c)
handshakes TLSv1_3 EcdsaP256 TLS13_AES_256_GCM_SHA384 server server-auth no-resume 1998.39 handshakes/s
I looked at a lot of stuff, but found no real smoking guns. There's a difference in behavior between the two handshakes, but it's not a big one. openssl-bench generates 4 application packet wrappers for the 'first flight', whereas rustls generates one that contains all 4 messages (encrypted extensions, server cert, server cert verify, server handshake finished); this seems like it could be significant, but I couldn't easily undo it to test. Also, openssl-bench generates 2 more application packets after receiving the client handshake finished; I'm pretty sure those are tickets, but turning off ticket generation was only a ~1% improvement, so whatever.
However, one of my friends suggested aws-lc might just be super fast, so I ran openssl-bench linked against that and saw a big improvement. So I went ahead and tried all the TLS library options available from FreeBSD pkg. Here's my list of results:
aws-lc-1.48.4 (freebsd pkg)
handshakes server TLSv1.3 TLS_AES_256_GCM_SHA384 2478.93 handshakes/s (512 / 0.206541)
openssl111-1.1.1w_2 (freebsd pkg)
handshakes server TLSv1.3 TLS_AES_256_GCM_SHA384 1773.9 handshakes/s (512 / 0.28863)
openssl-3.0.16,1
handshakes server TLSv1.3 TLS_AES_256_GCM_SHA384 1333.5 handshakes/s (512 / 0.383951)
openssl31-3.1.8
handshakes server TLSv1.3 TLS_AES_256_GCM_SHA384 1387.69 handshakes/s (512 / 0.368958)
openssl32-3.2.4
handshakes server TLSv1.3 TLS_AES_256_GCM_SHA384 1353.54 handshakes/s (512 / 0.378267)
openssl33-3.3.3
handshakes server TLSv1.3 TLS_AES_256_GCM_SHA384 1406.62 handshakes/s (512 / 0.363994)
openssl34-3.4.1
handshakes server TLSv1.3 TLS_AES_256_GCM_SHA384 1393.34 handshakes/s (512 / 0.367463)
openssl35-3.5.0.b1
handshakes server TLSv1.3 TLS_AES_256_GCM_SHA384 1155 handshakes/s (512 / 0.443289)
boringssl-0.0.0.0.2025.03.27.01_1
did not manage to get a matching cipher
libressl-4.0.0_1
(does not compile, don't care to fix)
So, in my testing, on my machine: rustls is faster than openssl-bench linked against openssl, and openssl 1.1.1 is faster than openssl 3.x, but openssl-bench linked against aws-lc is faster than rustls.
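To put rough numbers on that, here's a quick sketch that recomputes handshakes/s from the (count / seconds) pairs in the openssl-bench output above and normalizes against the base openssl 3.0.16 run. The dictionary labels are just my own names for the runs, and the rustls figure is entered as reported, since its output line doesn't include the raw timing. Single runs on one machine, so treat the ratios as ballpark.

```python
# (handshakes, elapsed seconds) pairs copied from the openssl-bench output above.
runs = {
    "openssl 3.0.16 (base)": (512, 0.401448),
    "aws-lc-1.48.4": (512, 0.206541),
    "openssl111-1.1.1w_2": (512, 0.28863),
    "openssl-3.0.16,1": (512, 0.383951),
    "openssl31-3.1.8": (512, 0.368958),
    "openssl32-3.2.4": (512, 0.378267),
    "openssl33-3.3.3": (512, 0.363994),
    "openssl34-3.4.1": (512, 0.367463),
    "openssl35-3.5.0.b1": (512, 0.443289),
}

# handshakes/s is simply count divided by elapsed time.
rates = {name: count / secs for name, (count, secs) in runs.items()}

# rustls reports the rate directly, without the raw timing.
rates["rustls 795ae1f"] = 1998.39

baseline = rates["openssl 3.0.16 (base)"]

# Print fastest first, with speed relative to base openssl 3.0.16.
for name, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:24s} {rate:8.1f} handshakes/s  {rate / baseline:.2f}x")
```

On these numbers aws-lc comes out at roughly 1.9x base openssl 3.0.16, rustls at roughly 1.6x, and openssl 1.1.1 at roughly 1.4x.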
I'll try to get ahold of the authors tomorrow and suggest they add openssl-bench linked against aws-lc to their test.
https://rustls.dev/perf/2024-11-28-threading/