I love and use both clickhouse and duckdb all the time, but it's become almost comical how senior clickhouse leadership comment on any duckdb-related HN thread like clockwork, and only very rarely bring up their affiliation :)
...says the completely anonymous internet guy. I'm laughing as much as you are.
I wasn't trying to imply anything negative about DuckDB with my post - was just sharing how ClickHouse does the same thing. FWIW: the blog author added my query to his blog, so my non-combative comment was politely received.
but what the person said is true: it seems like clickhouse comments descend upon every recent duckdb post as if it’s some sort of competition or born out of an inferiority complex.
clickhouse is really cool tech. duckdb is really cool tech. i grow weary of the CH infiltration to the point where it’s working against the intent to make me love CH
I ran the same queries and got similar results, but the bandwidth utilization I measured was significantly different. On the same fly.io instance with 1 vCPU/256MB, both queries completed successfully, but ClickHouse/chdb reached 10MB/s (max) and logically completed the count faster, while DuckDB only peaked at around 2.5MB/s.
This might be due to the tiny resources, but I like rock-bottom measurements. Did anyone else notice a similar bandwidth utilization gap?
If that's raw network transfer, it's probably just a difference in headers or MTU size. Larger MTU -> fewer headers required. Maybe a difference in network configuration that requires more or less data in the header.
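For a rough sense of scale, here's a back-of-the-envelope sketch of how much of each packet is payload at a given MTU. It assumes plain TCP over IPv4 over Ethernet with no IP/TCP options and no VLAN tag; the function name `payload_efficiency` is just made up for illustration, and none of this is taken from the measurements above.

```python
# Rough per-packet overhead for TCP over IPv4 over Ethernet.
# Assumptions (not from the thread): no IP or TCP options, no VLAN tag.
IP_HEADER = 20      # IPv4 header without options, in bytes
TCP_HEADER = 20     # TCP header without options, in bytes
ETH_OVERHEAD = 38   # preamble 8 + Ethernet header 14 + FCS 4 + inter-frame gap 12


def payload_efficiency(mtu: int) -> float:
    """Fraction of bytes on the wire that are application payload."""
    payload = mtu - IP_HEADER - TCP_HEADER  # bytes left for data in one packet
    on_wire = mtu + ETH_OVERHEAD            # bytes the link actually carries
    return payload / on_wire


for mtu in (1500, 9000):
    print(f"MTU {mtu}: {payload_efficiency(mtu):.1%} payload")
# MTU 1500: 94.9% payload
# MTU 9000: 99.1% payload
```

Under those assumptions, going from a 1500-byte MTU to 9000-byte jumbo frames changes the payload share by roughly four percentage points, so header overhead alone would likely account for only a small part of a larger gap.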
I suppose if you had data in a format that DuckDB doesn't work with, like Protobuf, Avro, ORC, Arrow, etc. ClickHouse reads and writes data in over 70 formats.
That's interesting! I don't have much experience with ClickHouse, especially not in the last two years. I'll have to try this out myself. That's pretty incredible if it can handle batching internally.
It's definitely cool to be able to query data in place instead of inserting it into a table. You can use clickhouse-local to do the same thing with JSON files (and with dozens of other data formats): https://clickhouse.com/blog/worlds-fastest-json-querying-too...