Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

No, SQLite's dynamic data types would silently coerce data just like opening a CSV directly with Excel does.

The advantage of CSV is that it's as accurate as your plain text representation of your data can be. Since binary data can be represented by character data, that's 100% accurate. As soon as you introduce a storage format that has made assumptions about the type of data being stored, you've lost flexibility.

SQLite is not intended for data serialization. It's intended for data storage to be read back by essentially the same application in the same environment.



Excel's default insistence on trying to coerce the data into what it expects the data to be is the source of many of the problems of people using CSV. Excel will even ignore quotes around values if it thinks that the value inside might be a number. I often work with CSV files that contain 17 digit IDs that use numeric characters "01233998742021925" which Excel will convert lossily into scientific notation rather than leaving it as a string. There are ways to override that but they are tedious and people don't do it by default.

I'd be satisfied if Excel was just a little less aggressive about it's data coercion on import.


Excel trying to be helpful and silently converting data is a source of endless pain. In fact it is so bad, that geneticists have renamed some genes: https://www.theverge.com/2020/8/6/21355674/human-genes-renam...


> No, SQLite's dynamic data types would silently coerce data just like opening a CSV directly with Excel does.

SQLite's "dynamic data types" coerce data on input, not output. Once the data is in sqlite the way you wanted it, excel has no interpretation to perform, except insofar as really really wanting dates.

> The advantage of CSV is that it's as accurate as your plain text representation of your data can be.

Yeah nah.

> SQLite is not intended for data serialization. It's intended for data storage to be read back by essentially the same application in the same environment.

That's completely, absolutely, utterly, objectively, nonsensical.

Data analysis, exchange, container, and archiving, are literally all use cases listed as "appropriate uses for sqlite" in "Situations Where SQLite Works Well" on the official sqlite website: https://www.sqlite.org/whentouse.html


How is a csv not the most accurate representation of the data? If you trust the other agent encoded it properly in the db, then sure. Your flippant dismissal was inappropriate in tone and detracted from the rest of your opinion.

Cockiness tells me that you’re insecure about your knowledge, not that you know more than GP.


There's plenty of CSVs that have been produced or will be parsed by

for line in input: ','.join(line)

It's not exactly a problem with "CSV" specifically, but the environment in which it exists.


True, but if you don't trust the code generating the data, then any and every data format is suspect.


It's a thousand times easier to make a malformed "working" CSV than to make a malformed "working" sqlite file.


> How is a csv not the most accurate representation of the data? If you trust the other agent encoded it properly in the db, then sure.

The idea that a CSV would be more likely to be correctly encoded than a DB is hilarious, thanks for the laugh. But that you were confident enough to seriously put it in writing shows how little experience you have with CSV.


Yep, you got me. I'm actually a trapeze artist moonlighting on these forums.

A CSV file represents the exact digits that are to be stored. You have unlimited precision. You could even store irrational numbers, equations, or mix data types in a column. OTOH, you have to make sure the delimiting character is not present in the data - that can be pretty easy, if you use the field-separator character in ASCII, or even just a \t. I've even seen people terminate fields with ^8675309| because they felt confident no data would contain Jenny's number.

A database, like Excel, likes to conform data. This is usually awesome! But sometimes, it's not.


Excel isn't a database.


Which is exactly what GP didn't write.

"A bird, like homo sapiens, has two feet" doesn't mean that birds are humans or humans are birds either. It means that in this respect, birds and humans are alike. Which is what GP meant: With respect to conforming data, databases behave just like Excel.

Language note: I think it's the fact that there are two commas around the inserted-clause [terminology?] ", like Excel," that does it. If there were only one comma, before or after "like Excel" it would read the way you read it. Can be tricky for non-native (and sometimes also native) speakers.


> SQLite's "dynamic data types" coerce data on input, not output.

That's not relevant. If a data format coerces data when it's stored, it's still not a data format suitable for serialization.


It's strictly dependent on the table definitions you use. A column with no type (the simplest way to use them) will take data in any of the natively supported types—whatever you actually insert in your SQL statement—and will not attempt any translation.¹

If you do add column types to your tables as in traditional databases, then of course you should expect some varying behavior based on that.

I can imagine some potential for issues from people expecting to insert exact decimal numbers and choosing floating-point by accident, perhaps, or using integer types for actually-text numeric ID fields. If you go all-strings on input with no column affinity, then you'll get the same strings on output, though, so that's isomorphic to good-CSV.

¹ https://www.sqlite.org/datatype3.html – §3.1 “If the declared type for a column contains the string "BLOB" or if no type is specified then the column has affinity BLOB.” + §3 “A column with affinity BLOB does not prefer one storage class over another and no attempt is made to coerce data from one storage class into another.”




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: