As long as it is text-based. At least you can actually look at a CSV and see what is going on (or wrong), as well as use all the text tools we have. Not that there aren't problems, as the article points out. But some binary file based on the whims of a proprietary program...no thanks.
Humans can't efficiently write or parse XML or JSON, though. In some scenarios CSV hits the sweet spot of being accessible to both humans and computers, and the table can be laid out so that one can sort/grep/awk it to quickly gain some insight.
Late to respond, but: JSON naturally encourages data to be structured in ways that are often nested. In your example, if there are one or two additional layers of context (say, the affiliation of the two persons), maintaining a CSV-like layout becomes cumbersome and unnecessary. I don't see much CSV-like or tabular JSON/XML in the wild, and those formats really shouldn't be used that way.
Well, to work as CSV-like JSON, i.e. greppable similarly to a CSV, it needs formatting with line breaks as above. It is still completely valid JSON; the line breaks are just a formatting convention.
Yeah, it's never really come up as it's generally trivial to convert into this format from either a text editor or sed (Look for the `],[` and put a new line after the comma).
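The same "newline after the `],[`" trick works from a few lines of Python as well as from sed; a quick sketch (the variable names are mine):

```python
import json

# Compact array-of-arrays JSON, one row per inner array.
compact = "[[1,2,3],[4,5,6],[7,8,9]]"

# Insert a newline after every "],": each row now sits on its own
# line, so grep/sort-style tools can work on it line by line.
per_line = compact.replace("],[", "],\n[")
print(per_line)

# The whitespace change doesn't alter the data: it's still valid JSON.
assert json.loads(per_line) == json.loads(compact)
```

Whitespace between JSON tokens is insignificant, which is what makes this reformatting safe.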
Also it's way more difficult to parse than CSV. I'm not saying that CSV is better (or worse), just: to each format its job! Wtf is this about retiring a format because it doesn't fit someone's expectations!
I disagree. JSON, in particular, can be nearly as compact as CSV by storing the data as an array of arrays:
[[1,2,3],
[4,5,6]]
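As a rough illustration of the compactness claim, here is a small comparison sketch (plain Python, names are mine): the array-of-arrays form only adds brackets per row over the bare CSV lines.

```python
import json

rows = [[1, 2, 3], [4, 5, 6]]

# Compact JSON: per row, only the surrounding "[" and "]" are extra
# compared to the CSV line, plus the outer brackets.
as_json = json.dumps(rows, separators=(",", ":"))
as_csv = "\n".join(",".join(str(v) for v in row) for row in rows)

print(as_json)  # [[1,2,3],[4,5,6]]
print(as_csv)
print(len(as_json), len(as_csv))
```

For these two rows the JSON form is 17 characters against 11 for CSV; the gap stays at a few bytes per row as the table grows.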
It's easy to make a structured data interchange format mimic an unstructured format. It's impossible to go the other way around without severe problems.
It can be, but what's ensuring that format when you read in a JSON file? JSON is one of the formats Pandas can read, but it has to be structured in a shape the Python library can read in as tabular data. Excel would have the same issue, as would any program consuming tabular data. At least with CSVs, you know the data is tabular.
> It can be, but what's ensuring that format when you read in a JSON file?
What's ensuring the format of data in a CSV file?
Format comes from the same place it does for CSV. It's part of whatever data contract you are making with whatever system is providing the data. If someone shoves YAML into a JSON file you've got problems, just like you have problems if someone starts talking about cars when you expected a file about airplanes.
At some point, you've got to turn file data into application data regardless of format. CSV offers no help in shaping the data and has a bunch of footguns to boot.
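One of the classic footguns, for concreteness: a comma inside a quoted field silently breaks any naive comma-split, which is why a real CSV parser is needed even for "simple" data (example data is mine):

```python
import csv
import io

# A quoted field containing a comma -- perfectly legal CSV.
line = '1,"Smith, John",NYC'

# Naive approach: split on commas. This cuts the name field in half.
naive = line.split(",")

# Proper approach: use an actual CSV parser, which honors the quoting.
proper = next(csv.reader(io.StringIO(line)))

print(naive)   # 4 pieces, name mangled
print(proper)  # ['1', 'Smith, John', 'NYC']
```

Embedded newlines, stray quotes, and locale-dependent delimiters are variations on the same theme.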
> JSON is one of the formats Pandas can read, but it has to be structured in a format the python library can read in as tabular data.
Pandas is highly flexible in its ability to read in JSON data. What I showed would be trivially loadable by Pandas, as would other layouts (including more traditional JSON records).
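To make that concrete, a sketch of both layouts going through `pandas.read_json` (assuming pandas is installed; the data is made up):

```python
import io
import pandas as pd

# The array-of-arrays layout from upthread loads directly as a table.
df = pd.read_json(io.StringIO("[[1,2,3],[4,5,6]]"))
print(df.shape)  # 2 rows, 3 columns

# More traditional record-oriented JSON loads just as easily,
# with the keys becoming column names.
records = io.StringIO('[{"a": 1, "b": 2}, {"a": 4, "b": 5}]')
df2 = pd.read_json(records)
print(list(df2.columns))  # ['a', 'b']
```

Recent pandas versions want a file-like object rather than a raw JSON string, hence the `io.StringIO` wrapping.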
Turning structured data into tabular data is trivial. What isn't trivial is turning tabular data into structured data.
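A sketch of the asymmetry, using the "affiliation" example from upthread (pandas assumed, data made up): flattening nested records into a table is one call, while the reverse requires hand-written rules about what the flat column names mean.

```python
import pandas as pd

# Nested records: each person carries an affiliation sub-object.
people = [
    {"name": "Ada", "affiliation": {"org": "RS", "country": "UK"}},
    {"name": "Grace", "affiliation": {"org": "USN", "country": "US"}},
]

# Structured -> tabular: one call flattens the nesting into columns.
flat = pd.json_normalize(people)
print(list(flat.columns))  # ['name', 'affiliation.org', 'affiliation.country']

# Tabular -> structured: no inverse call exists; you'd have to decide
# by hand which dots are separators and which are part of a name.
```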
Excel actually has really good JSON import… it's just "hidden" in the Data tab on the ribbon and users don't want to learn how to use it.
I feel so much of what keeps CSV in use is that it's a format you can generate without pulling in a lot of library dependencies, and quickly produce something that users can just double-click and it opens in Excel.
What the Excel team could really gift to developer humanity at this point is some dumb file extension that Excel registers, like XLJSON, so that you could just rename a JSON file and Excel would open it like a spreadsheet on double click.
Correct. That'd be moving the goalposts. The original claim was that JSON doesn't work well with columnar data.
If the primary use case is handing data to a non-programmer to evaluate, then CSV isn't terrible. Add a transformer that's run when someone manually requests to see the data.
However, for machine to machine communication, CSV should never be used.