
As long as it is text-based. At least you can actually look at a CSV and see what is going (wrong), as well as use all the text tools we have. Not that there aren't problems as the article points out. But some binary file based on the whims of a proprietary program...no thanks.


Xlsx files (the Office format for almost a decade now) are zip and XML all the way down. Not fun to look at, but totally readable by a human.


Humans can't efficiently write or parse XML or JSON, though. In some scenarios CSV hits the right spot of being accessible to both humans and computers, and the table can be laid out so that one can sort/grep/awk to quickly gain some insight.


As someone that works with maven and npm.... what?

This is valid JSON:

    [["bob", "jones", 1, 22],
     ["frank", "was", 32, 45]]

That's unreadable and unparsable by a human?

Not only is JSON often more parsable; because it's structured, it also becomes a lot easier to query.

I grep and awk xml and json stuff all the time. I also have the added bonus of being able to use `jq` for json content.


Late to respond, but: JSON naturally encourages data to be structured in ways that are often nested. In your example, if there are one or two additional layers of context (say, the affiliations of the two people), maintaining a CSV-like layout becomes cumbersome and unnecessary. I don't see much CSV-like or tabular JSON/XML in the wild, and they really shouldn't be used that way.


That's just a CSV with extra steps.


Nope, that's CSV without the drawbacks of CSV. That's CSV that can have special characters and doesn't suffer from delimiter problems.

When someone says "Maybe we can fix CSV" this is what you should do instead of trying to "fix" CSV.
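
A minimal sketch of what I mean by "without the drawbacks", using only Python's standard `json` module. The rows are made up for illustration; the point is that commas, quotes, and newlines inside a field survive, and each row still sits on one line.

```python
import json

# Hypothetical rows: fields contain a comma, a double quote, and a newline.
rows = [
    ["bob", "jones, jr.", 1, 22],
    ["frank", 'said "hi"', 32, 45],
    ["ann", "line one\nline two", 7, 8],
]

# One JSON array per line; json.dumps escapes quotes and newlines for us.
lines = [json.dumps(row) for row in rows]
text = "[" + ",\n".join(lines) + "]"

# The whole thing is still ordinary JSON and round-trips exactly,
# including the numeric types.
assert json.loads(text) == rows

# Each row stays on one physical line, because a newline inside a string
# is written as the two characters backslash and n.
assert len(text.splitlines()) == len(rows)

# For contrast, naively joining and splitting on commas (the hand-rolled
# CSV approach) gets the field count wrong for the first row.
naive = ",".join(str(field) for field in rows[0])
assert len(naive.split(",")) == 5  # 4 fields went in, 5 came out
```

A proper CSV library handles the quoting too, of course; the problem is all the producers and consumers that don't use one.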


Interesting. This is JSON++ somehow. What should it be called? Line-oriented JSON? Row-JSON?


It's just JSON. It's not an extension to the spec, it's a subset of the spec. If anything, you could say it's JSON--.


Well, to work as a CSV-like JSON, i.e. greppable like a CSV, it needs to be formatted with line breaks as above. It is still completely valid JSON. The ++ is the formatting convention.


Gotcha.

Yeah, it's never really come up, as it's generally trivial to convert into this format from either a text editor or sed (look for the `],[` and put a newline after the comma).
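
If you'd rather not do it as a text replace, here is a rough Python equivalent, assuming the input is a JSON array of arrays. The function name `to_row_per_line` is just something I made up for the sketch. Going through a real parser also means a `],[` that happens to sit inside a string value won't trip it up.

```python
import json

def to_row_per_line(document: str) -> str:
    """Re-serialize a JSON array of arrays with one inner array per line."""
    rows = json.loads(document)
    body = ",\n".join(json.dumps(row) for row in rows)
    return "[" + body + "]"

# Hypothetical input, all on one line as a serializer would emit it.
minified = '[["bob","jones",1,22],["frank","was",32,45]]'
pretty = to_row_per_line(minified)
print(pretty)

# Same data, only the whitespace differs.
assert json.loads(pretty) == json.loads(minified)

# A grep-style filter now returns whole rows.
matches = [line for line in pretty.splitlines() if "frank" in line]
```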


Readable but not comprehensible. CSV is hard to beat in that sense, being somewhat "natural" like a table written on paper.


Also it's way more difficult to parse than CSV. I'm not saying that CSV is better (or worse), just that to each its job! Wtf is this about retiring a format because it doesn't fit someone's expectations!


JSON and XML both make it easy to see what is going wrong and don't have nearly as many drawbacks as CSV.


Both are also not a good fit for columnar data at all.


I disagree. JSON, in particular, can be nearly as compact as CSV by storing the data as an array of arrays.

    [[1,2,3],
     [4,5,6]]

It's easy to make a structured data interchange format mimic an unstructured format. It's impossible to go the other way around without severe problems.


It can be, but what's ensuring that format when you read in a JSON file? JSON is one of the formats Pandas can read, but it has to be structured in a format the python library can read in as tabular data. Excel would have the same issue as would any program that is consuming tabular data. At least with CSVs, you know the data is tabular.


> It can be, but what's ensuring that format when you read in a JSON file?

What's ensuring the format of data in a CSV file?

Format comes from the same place it comes from for CSV. It's part of whatever data contract you are making with whatever system is providing the data. If someone shoves YAML into a JSON file, you've got problems, just like you have problems if someone starts talking about cars when you expected a file about airplanes.

At some point, you've got to turn file data into application data regardless of format. CSV offers no help in shaping the data and has a bunch of footguns to boot.
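
To make the "data contract" point concrete, here's a small sketch of the check you'd run at the boundary, assuming the contract is "an array of rows, each with a fixed number of scalar values". `load_table` and its `columns` parameter are hypothetical names, not from any library.

```python
import json

def load_table(document: str, columns: int) -> list[list]:
    """Parse JSON and check it is an array of equal-length arrays of scalars."""
    data = json.loads(document)
    if not isinstance(data, list):
        raise ValueError("top level must be an array")
    for index, row in enumerate(data):
        if not isinstance(row, list) or len(row) != columns:
            raise ValueError(f"row {index} is not an array of {columns} values")
        for value in row:
            if isinstance(value, (list, dict)):
                raise ValueError(f"row {index} contains a nested value")
    return data

table = load_table("[[1,2,3],\n [4,5,6]]", columns=3)

# Something that is valid JSON but breaks the contract is rejected.
try:
    load_table('{"cars": true}', columns=3)
    rejected = False
except ValueError:
    rejected = True
```

You'd need the equivalent check for a CSV file as well (right number of columns, right types), so the format isn't saving you this step.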

> JSON is one of the formats Pandas can read, but it has to be structured in a format the python library can read in as tabular data.

Pandas is highly flexible in its ability to read in JSON data. What I showed would be trivially loadable by Pandas, and so would other formats (including more traditional JSON data).

Turning structured data into tabular data is trivial. What isn't trivial is turning tabular data into structured data.
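
A quick sketch of the easy direction in plain Python, with made-up records and a hypothetical `flatten` helper. Nested objects become dotted column names, and the result is a header row plus data rows.

```python
# Hypothetical nested records, with the affiliation as an extra layer.
people = [
    {"first": "bob", "last": "jones", "org": {"name": "Acme", "size": 22}},
    {"first": "frank", "last": "was", "org": {"name": "Initech", "size": 45}},
]

def flatten(record: dict, prefix: str = "") -> dict:
    """Turn nested dicts into one flat dict with dotted column names."""
    flat = {}
    for key, value in record.items():
        name = prefix + key
        if isinstance(value, dict):
            flat.update(flatten(value, name + "."))
        else:
            flat[name] = value
    return flat

flat_rows = [flatten(person) for person in people]
header = list(flat_rows[0])
table = [header] + [[row[column] for column in header] for row in flat_rows]
```

Going back from `table` to `people` only works here because the dotted names happen to record the nesting. With an arbitrary CSV header you'd have to guess which columns belonged together.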


Except that won't open in Excel


Excel actually has really good JSON import… it's just "hidden" in the Data tab on the ribbon and users don't want to learn how to use it.

I feel so much of what keeps CSV in use is that it's a format you can generate relatively easily and quickly, without pulling in a lot of library dependencies, and users can just "double click it, opens in Excel".

What the Excel team could really gift to developer humanity at this point is some dumb file extension Excel registers like XLJSON that you could just rename a JSON file to and Excel opens it like a spreadsheet on double click.


Correct. That'd be a moving goalpost. The original claim was that JSON doesn't work well with columnar data.

If the primary use case is to take data and give it to a non-programmer to evaluate, then CSV isn't terrible. Add a transformer that's run when someone manually requests to see the data.
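
Such a transformer can be tiny. A sketch with Python's standard `csv` module, assuming the stored data is a JSON array of arrays; `json_rows_to_csv` is a made-up name.

```python
import csv
import io
import json

def json_rows_to_csv(document: str) -> str:
    """Render a JSON array of arrays as CSV text, quoting where needed."""
    rows = json.loads(document)
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()

# Hypothetical stored data with a comma inside a field.
stored = '[["name", "note"],\n ["bob", "likes commas, a lot"]]'
csv_text = json_rows_to_csv(stored)

# Reading it back with the csv module recovers the same fields.
assert list(csv.reader(io.StringIO(csv_text, newline=""))) == json.loads(stored)
```

The JSON stays the source of truth, and the CSV is a throwaway view generated on request.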

However, for machine to machine communication, CSV should never be used.



