Safari does not support VP8 or VP9 when playing YouTube, so YouTube serves H.264 instead. H.264 is less efficient in terms of compression ratio (more to download for the same quality), but H.264 is decoded in hardware on macOS, and VP8 and VP9 aren't, which explains what you see.
This is why, for example, Safari does not have 4K video on YouTube, while being perfectly capable of playing 4K videos in general.
Depending on the machine, VP9 can be decoded in hardware on Firefox on Windows, but chip support is limited.
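To make the fallback concrete, here is a minimal sketch of how a player could pick a format from a preference list. This is only an illustration, not how YouTube actually decides: the names `PREFERRED_TYPES`, `pickVideoType` and `h264Only` are made up, and the support check is a stand-in. In a real browser it could be backed by something like `MediaSource.isTypeSupported`.

```typescript
// Hypothetical preference list, most compression-efficient first.
const PREFERRED_TYPES: string[] = [
  'video/webm; codecs="vp9"',
  'video/webm; codecs="vp8"',
  'video/mp4; codecs="avc1.42E01E"', // H.264
];

// A predicate answering "can this client play this MIME type?".
type SupportCheck = (mimeType: string) => boolean;

// Returns the first candidate the client reports it can play, or null if none.
function pickVideoType(
  candidates: string[],
  isSupported: SupportCheck
): string | null {
  for (const candidate of candidates) {
    if (isSupported(candidate)) {
      return candidate;
    }
  }
  return null;
}

// Simulated client that only accepts MP4 (assumption: stands in for a
// browser without VP8/VP9 support).
const h264Only: SupportCheck = (mimeType) => mimeType.indexOf("video/mp4") === 0;

console.log(pickVideoType(PREFERRED_TYPES, h264Only));
```

With a client that rejects WebM, the loop falls through to the H.264 entry, which is the "serves H.264 instead" case described above.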
All that said, we're working on our video playback performance as we speak, especially on macOS (because it was so bad a few releases back), but also in general.
Anyway, I imagine that's the price to pay for being cross-platform. You can't implement everything for every platform. Safari only has to work on macOS/iOS.
Speaking for Chrome's implementation, efficiently rendering video on macOS does require CALayer compositing, but it's not sufficient.
Only certain types of decoded frames can be efficiently scanned out (different from the types that can be used efficiently in OpenGL compositing). Actually entering the most efficient fullscreen video mode requires some magic. Matching macOS behavior exactly when a fallback to OpenGL compositing is required can be difficult (e.g. color space bugs can result in flickering).
I have not looked at the new code in Firefox, but I would expect that not all of the benefit would be realized in a first release. In any case it's a huge undertaking to support a single platform; congrats to the team for making it happen!