Hacker News

Scanning through a CSV can be quite close in performance to querying a SQL database that has no indices. The primary benefits of using a SQL database for querying are (1) indices and (2) a declarative query language. Using DuckDB or SQLite's CSV/JSON support gets you the best of both worlds (minus indices): you get the declarative query language and query planner, but your data is still just CSV/JSON files.

For a dataset that size, I'd probably use SQLite to avoid having to manage a persistent MySQL process, especially when it's being used as an alternative to CSV files. That is, unless there's a MySQL/Postgres server already running that I can just create a new database on.
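A minimal sketch of the SQLite route, using only Python's standard library and hypothetical inline CSV data standing in for a file on disk; the database is in-memory, so there's no server process to manage:

```python
import csv
import io
import sqlite3

# Hypothetical CSV data standing in for a file on disk.
csv_text = """city,temp
Oslo,3
Lagos,31
Lima,18
"""

conn = sqlite3.connect(":memory:")  # no persistent server to manage
conn.execute("CREATE TABLE readings (city TEXT, temp REAL)")

# Load the CSV rows; DictReader yields dicts keyed by the header row,
# which sqlite3 binds to the :city / :temp named placeholders.
reader = csv.DictReader(io.StringIO(csv_text))
conn.executemany(
    "INSERT INTO readings (city, temp) VALUES (:city, :temp)",
    reader,
)

# A declarative query instead of a hand-rolled scan over the CSV.
rows = conn.execute(
    "SELECT city FROM readings WHERE temp > 10 ORDER BY city"
).fetchall()
print(rows)  # [('Lagos',), ('Lima',)]
```

The column affinity on `temp REAL` coerces the CSV's string values to numbers on insert, so the comparison in the WHERE clause behaves numerically.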



> Using DuckDB or SQLite's CSV/JSON support gets you the best of both worlds (minus indices)

DuckDB automatically creates zonemaps (min-max indexes) for columns of all general-purpose data types. However, they're not persisted.

https://duckdb.org/docs/sql/indexes.html
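Where an explicit index is wanted (SQLite, for instance, doesn't build one automatically on non-key columns), it's one statement away. A sketch with a hypothetical `readings` table, using `EXPLAIN QUERY PLAN` to confirm the planner switches from a full scan to the index:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (city TEXT, temp REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("Oslo", 3.0), ("Lagos", 31.0), ("Lima", 18.0)],
)

# Without an index, this query is a full table scan.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM readings WHERE city = 'Lima'"
).fetchall()
print(plan[0][-1])  # e.g. "SCAN readings" (wording varies by SQLite version)

# One statement to add an explicit index on the filtered column.
conn.execute("CREATE INDEX idx_readings_city ON readings (city)")

plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM readings WHERE city = 'Lima'"
).fetchall()
print(plan[0][-1])  # now mentions "USING INDEX idx_readings_city"
```

The last column of each `EXPLAIN QUERY PLAN` row is a human-readable description of the chosen access path, which is an easy way to verify an index is actually being used.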



