
But… this does drop data? Only the start and end timestamps are preserved; the middle ones have no time. How can this be called lossless?

Genuinely lossless compression algorithms like gzip work pretty well.
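
For example, a round trip with Python's stdlib gzip module preserves every byte, timestamps included (the sample log lines here are invented for illustration):

    # Round-trip check: gzip is genuinely lossless, timestamps and all.
    import gzip

    log = b"\n".join(
        b"2021-06-01T12:00:%02d.000Z txn=42 step=%d status=ok" % (i, i)
        for i in range(60)
    )

    compressed = gzip.compress(log)
    assert gzip.decompress(compressed) == log  # every byte comes back
    print(f"{len(log)} -> {len(compressed)} bytes")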



Exactly my thoughts; the order of these events by timestamp is itself necessary for debugging.

If I want something like a per-transaction rollup of events into one log message, I build it and use it explicitly.


Was going to point out the same thing: the original article's solution loses timestamps and possibly ordering. They're also losing some compressibility by converting to a structured format (JSON). And if they actually include a lot of UUIDs (their diagram is vague on what transaction IDs look like), then good luck; those don't compress very well.

I worked at a Magnificent 7 company that compressed a lot of logs; after a lot of testing back in 2021, we found that zstd did the best all-around job.
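
A rough sketch of that kind of comparison, using the third-party zstandard Python bindings (pip install zstandard); the JSON logs here are synthetic:

    # Compare gzip vs. zstd ratios on repetitive JSON log lines.
    import gzip
    import json

    import zstandard

    lines = "\n".join(
        json.dumps({"ts": 1624000000 + i, "event": "request", "txn": i % 100})
        for i in range(10_000)
    ).encode()

    gz = gzip.compress(lines)
    zst = zstandard.ZstdCompressor(level=3).compress(lines)
    print(f"raw {len(lines)} bytes, gzip {len(gz)}, zstd {len(zst)}")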


We have a process monitor that basically polls ps output and writes it to JSON. We see ~30:1 compression using zstd on a ZFS dataset that stores these logs.

I laugh every time I see it.
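
Roughly the pattern, for the curious; the column set, output filename, and poll interval below are guesses, not the actual monitor:

    # Sketch of a ps-polling monitor writing JSON lines. Repeated field
    # names and near-identical records are why it compresses so well.
    import json
    import subprocess
    import time

    def snapshot():
        # comm goes last so split(None, 3) keeps command names with spaces
        rows = subprocess.run(
            ["ps", "-eo", "pid,%cpu,%mem,comm"],
            capture_output=True, text=True, check=True,
        ).stdout.splitlines()[1:]  # drop the header row
        ts = time.time()
        for row in rows:
            pid, cpu, mem, comm = row.split(None, 3)
            yield {"ts": ts, "pid": int(pid), "cpu": float(cpu),
                   "mem": float(mem), "comm": comm}

    while True:
        with open("ps.jsonl", "a") as f:
            for rec in snapshot():
                f.write(json.dumps(rec) + "\n")
        time.sleep(30)  # poll interval is a guess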


Agreed.

If you use something like sequential IDs (even in a UUID format), they can compress pretty well.
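
A quick stdlib illustration of the gap, on synthetic IDs of similar length:

    # Random UUIDv4s vs. sequential hex IDs: very different ratios.
    import uuid
    import zlib

    random_ids = "\n".join(str(uuid.uuid4()) for _ in range(10_000)).encode()
    sequential = "\n".join(f"{i:032x}" for i in range(10_000)).encode()

    print("random UUIDs:  ", len(zlib.compress(random_ids)))
    print("sequential IDs:", len(zlib.compress(sequential)))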


As a member of the UUIDv7 cheering squad let me say 'rah rah'! :D


Which zstd compression level gave the best balance between compression ratio and run time?
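
For anyone who wants to measure this themselves, a sketch of a level sweep with the zstandard bindings on synthetic input; real logs will shift the curve:

    # Level sweep: compression ratio vs. wall time per zstd level.
    import time

    import zstandard

    data = b"".join(
        b'{"ts":%d,"event":"request"}\n' % (1624000000 + i)
        for i in range(50_000)
    )

    for level in (1, 3, 6, 9, 12, 19):
        t0 = time.perf_counter()
        out = zstandard.ZstdCompressor(level=level).compress(data)
        dt = time.perf_counter() - t0
        print(f"level {level:>2}: {len(data) / len(out):5.1f}x "
              f"in {dt * 1000:6.1f} ms")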



