This is a good question to ask, especially in the age of everything pulling in every possible dependency just to get one library function or an `isNumeric()` convenience function.
The answer is that there is observability functionality which provides its results as JSON output via a UNIX socket[0]. As far as I can see, there's no other JSON functionality anywhere else in the code, so this is just to allow for easily querying (and parsing) the daemon's internal state.
(I'm not convinced that JSON is the way to go here, but that's the answer to the question)
If the pieces of state are all well known at build time - and trusted in terms of their content - it may be feasible to print the JSON 'manually', as it were, instead of needing to pull in a JSON library.
> You're gonna have a hard time exploiting a text file output that happens to be JSON.
If you’re not escaping double quotes in strings in your hand-rolled JSON output, and some string you’re outputting happens to be something an attacker can control, then the attacker can inject arbitrary JSON. Which probably won’t compromise the program doing the outputting, but it could cause whatever reads the JSON to do something unexpected, which might be a vulnerability, depending on the design of the system.
If you are escaping double quotes, then you avoid most problems, but you also need to escape control characters to ensure the JSON isn’t invalid. And also check for invalid UTF-8, if you’re using a language where strings aren’t guaranteed to be valid UTF-8. If an attacker can make the output invalid JSON, then they can cause a denial of service, which is typically not considered a severe vulnerability but is still a problem. Realistically, this is more likely to happen by accident than because of an attacker, but then it’s still an annoying bug.
Oh, and if you happen to be using C and writing the JSON to a fixed-size buffer with snprintf (I’ve seen this specific pattern more than once), then the output can be silently truncated, which could also potentially allow JSON injection.
Handling all that correctly doesn’t require that much code, but it’s not completely trivial either. In practice, when I see code hand-roll JSON output, it usually doesn’t even bother escaping anything. Which is usually fine, because the data being written is usually not attacker-controlled at all. For now. But code has a tendency to get adapted and reused in unexpected ways.
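For reference, the escaping rules described above come down to only a few cases. A minimal sketch in Python (the same logic ports directly to C); the round-trip check at the bottom uses the stdlib parser to confirm the output is valid JSON:

```python
import json

def escape_json_string(s: str) -> str:
    """Minimal JSON string escaper: double quotes, backslashes, and
    C0 control characters, which must not appear raw in JSON strings."""
    out = ['"']
    for ch in s:
        if ch == '"':
            out.append('\\"')
        elif ch == '\\':
            out.append('\\\\')
        elif ch < ' ':  # control characters get \u00XX escapes
            out.append('\\u%04x' % ord(ch))
        else:
            out.append(ch)
    out.append('"')
    return ''.join(out)

# The stdlib parser accepts every output and round-trips the value:
for s in ['plain', 'he said "hi"', 'tab\there', 'null\x00byte', 'back\\slash']:
    assert json.loads(escape_json_string(s)) == s
```

This covers quoting and control characters; the invalid-UTF-8 case doesn't arise in Python, since its strings are already valid Unicode, but in C you'd need an explicit validation pass on top.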
Hand-rolling TSV is no better. The average TSV generator pays no mind to data cleaning, and quoting/escaping is non-standard, so what the other side will do with it is basically Russian roulette.
Using C0 control codes is likely safer, at least in the sense that you will probably think to check for them, and there is no reason whatsoever for them to appear in user data.
> If the pieces of state are all well known at build time - and trusted in terms of their content
... then use a library, because you should not rely on the assumption that the next developer adding one more piece of state to this code will magically remember to validate it against the JSON spec.
No magic necessary. Factor your hand-rolling into a function that returns a string (instead of printing, as in the example), and write a test that parses its return value with a proper JSON library. Assert that the parsing was successful and that the extracted values are correct. Ideally you'd use a property test.
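That pattern might look like this (a Python sketch with hypothetical field names; the emitter is hand-rolled, the test parses its output with the stdlib JSON library):

```python
import json

def render_stats(peers: int, offset_ms: float, refid: str) -> str:
    # Hand-rolled JSON output, no library on the producing side.
    return '{"peers": %d, "offset_ms": %.3f, "refid": "%s"}' % (peers, offset_ms, refid)

def test_render_stats_is_valid_json():
    out = render_stats(4, 1.25, "GPS")
    parsed = json.loads(out)  # must not raise
    assert parsed == {"peers": 4, "offset_ms": 1.25, "refid": "GPS"}

test_render_stats_is_valid_json()
```

A property test would additionally throw arbitrary strings at `refid`, which is exactly where this naive emitter fails: any input containing a quote breaks the output, and the test catches it.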
That’s somewhat better than assembling, say, HTML or SQL out of text fragments, but it’s still not fantastic. A JSON output DSL would be better still—it wouldn’t have to be particularly complicated. (Shame those usually only come paired with parsers, libxo excepted.)
I don’t think our dependency tree is perfect, but I think our dependencies are reasonable overall. We use JSON for transferring metrics data from our NTP daemon to our prometheus metrics daemon. We made this split for security reasons: why have all the attack surface of an HTTP server in your NTP daemon? That didn’t make sense to us. So we added a readonly unix socket to our NTP daemon that, on connecting, dumps a JSON blob and then closes the connection (i.e. doing as little as possible), which is then usable by our client tool and by our prometheus metrics daemon. That data transfer uses JSON, but could have used any data format. We’d be happy to accept pull requests to replace it with something else, but given budget and time constraints, I think what we came up with is pretty reasonable.
Probably, but we still need to parse that string on the client side as well. If you’re willing to do the work, I’m sure we would accept a pull request for it! There are just so many things to do in so little time, unfortunately. I think reducing our dependencies is a good thing, but the dependencies we use for JSON parsing/writing are so commonly used in Rust, and the way we use them hopefully prevents any major security issues, that I don’t think this should be a high priority for us right now compared to the many other things we could be doing.
It's an entirely relevant response... they are saying that what you call a poorly thought out ad-hoc format is still better than JSON in any form or dependency.
It could instead be YAML or TOML. The point wasn’t specifically and only a JSON dependency; it was taking a dependency in general to do something useful and standardised.
> yet another poorly thought out, ad-hoc homegrown config file format
OpenBSD style ntpd.conf:
servers 0.gentoo.pool.ntp.org
servers 1.gentoo.pool.ntp.org
servers 2.gentoo.pool.ntp.org
servers 3.gentoo.pool.ntp.org
constraints from "https://www.google.com"
listen on *
I mean, there's always the possibility that they used a common, well known and pretty decent config file format. In this particular case, this shouldn't be the thing that differentiates your ntpd implementation anyways.
That config file perfectly illustrates the point. There's no need for it to be custom, and require me to waste time learning its syntax when it could just be JSON or TOML. Honestly I would even take YAML over that and YAML is the worst.
You still have to learn the syntax even if it is expressed in json or yaml. Perhaps stating the obvious, but not every json object is a valid ntp configuration.
The configuration object will always, by definition, be specific to ntpd. Expressing it as plain text allows for a trivial parser, without any of the security implications of wrapping it in a general-purpose format ("should this string be escaped?", "what should we do with invalid UTF-8?").
The simpler format has survived over thirty years, is trivial for anyone to parse, and does not bring in any dependencies that need maintaining. That should count for something.
Sure you have to learn how to configure things but you don't have to learn basic syntax like "how do I escape a string".
The fact that it has survived tells you nothing other than it's not so completely awful that someone went through the pain of fixing it. That doesn't mean it is good. There are plenty of awful things that survive because replacing them is painful due to network effects. Bash for example.
Poorly thought out, ad-hoc homegrown config file format, please. Every time.
1. Code doesn't change at the whims of others.
2. The entire parser for an INI-style config can be in about 20 lines of C
3. Attacker doesn't also get to exploit code you've never read in the third party dependency (and its dependencies! The JSON dependency now wants to pull in the ICU library... I guess you're linking to that, too)
4. The complexity of a config file is usually format-independent; the feature-set of the format itself only adds complexity rather than taking it away. To put it another way, is this any saner...
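For scale, the kind of parser point 2 has in mind can be sketched as follows (in Python rather than C, purely for brevity; the C version is longer but the same shape):

```python
def parse_ini(text: str) -> dict:
    """Minimal INI-style parser: [sections], key = value, # comments.
    Returns a dict keyed by (section, key) pairs."""
    config, section = {}, ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
        elif "=" in line:
            key, _, value = line.partition("=")
            config[(section, key.strip())] = value.strip()
    return config

conf = parse_ini("""
# example config
[server]
listen = *
pool = 0.gentoo.pool.ntp.org
""")
assert conf[("server", "listen")] == "*"
```

No quoting, no escaping, no nesting: that's the trade being argued about, fewer features in exchange for a parser you can read in one sitting.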
1. I can pin my json parser dependency and literally never update it again
2. And how many times have we seen 20 lines of C backfire with some sort of memory safety issue.
3. First off, I'd go out on a limb and say the number of attacks via a well-established (or even a naive) Rust JSON parsing library is dwarfed by the number of attacks via ad-hoc config parsers written in C with some overlooked memory safety issue.
4. Usually being the key word: tons of ad-hoc config formats have weird shit in them. With JSON (or YAML/TOML) you know what you're getting into, and you immediately know what you are and aren't able to do.
My argument was not that sudoers is good - it's crazy overcomplicated.
My argument was data interchange format standards are orthogonal to config files. They don't have the same goals.
A programmer who thinks "I'll use JSON|YAML|TOML for my config file" hasn't solved the first problem (what should the config look like, in a way that makes sense and is easily readable, understandable, and updateable by the user), and has added a second problem before even starting on the first: whatever the config looks like, it now also has to be 100% compliant with some data interchange format and support all its features. That's going to require a lot of code, and then we get into whether you write the compliant parser/generator yourself, or someone else does and you do or don't audit every line of it. And on top of that you add an additional pile of code to parse/generate whatever your actual config format is.
Rust's stdlib is (at least notionally) forever. Things in the standard library (including core and alloc) get deprecated but must be maintained forever, which means that "at this point" isn't enough.
In 2003 those programs would have used XML, in 1993 probably .INI files. Are you sure that despite all its shortcomings JSON is the end of history? I don't believe you.
If you want "at this point" you can, as software does today, just use a crate. Unlike the stdlib, if next week Fonzie files are huge and by 2026 "nobody" is using JSON because Fonzie is cool, the JSON config crate merely becomes less popular.