This is a good question to ask, especially in the age of everything pulling in every possible dependency just to get one library function or an `isNumeric()` convenience function.
The answer is that there is observability functionality which provides its results as JSON output via a UNIX socket[0]. As far as I can see, there's no other JSON functionality anywhere else in the code, so this is just to allow for easily querying (and parsing) the daemon's internal state.
(I'm not convinced that JSON is the way to go here, but that's the answer to the question)
If the pieces of state are all well known at build time - and trusted in terms of their content - it may be feasible to print the JSON 'manually', as it were, instead of needing to pull in a JSON library.
> You're gonna have a hard time exploiting a text file output that happens to be JSON.
If you’re not escaping double quotes in strings in your hand-rolled JSON output, and some string you’re outputting happens to be something an attacker can control, then the attacker can inject arbitrary JSON. Which probably won’t compromise the program doing the outputting, but it could cause whatever reads the JSON to do something unexpected, which might be a vulnerability, depending on the design of the system.
If you are escaping double quotes, then you avoid most problems, but you also need to escape control characters to ensure the JSON isn’t invalid. And also check for invalid UTF-8, if you’re using a language where strings aren’t guaranteed to be valid UTF-8. If an attacker can make the output invalid JSON, then they can cause a denial of service, which is typically not considered a severe vulnerability but is still a problem. Realistically, this is more likely to happen by accident than because of an attacker, but then it’s still an annoying bug.
Oh, and if you happen to be using C and writing the JSON to a fixed-size buffer with snprintf (I’ve seen this specific pattern more than once), then the output can be silently truncated, which could also potentially allow JSON injection.
Handling all that correctly doesn’t require that much code, but it’s not completely trivial either. In practice, when I see code hand-roll JSON output, it usually doesn’t even bother escaping anything. Which is usually fine, because the data being written is usually not attacker-controlled at all. For now. But code has a tendency to get adapted and reused in unexpected ways.
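For reference, the escaping rules described above come down to only a few cases. A minimal sketch in Python (the same logic ports directly to C); the round-trip check at the bottom uses the stdlib parser to confirm the output is valid JSON:

```python
import json

def escape_json_string(s: str) -> str:
    """Minimal JSON string escaper: double quotes, backslashes, and
    C0 control characters, which must not appear raw in JSON strings."""
    out = ['"']
    for ch in s:
        if ch == '"':
            out.append('\\"')
        elif ch == '\\':
            out.append('\\\\')
        elif ch < ' ':  # control characters get \u00XX escapes
            out.append('\\u%04x' % ord(ch))
        else:
            out.append(ch)
    out.append('"')
    return ''.join(out)

# The stdlib parser accepts every output and round-trips the value:
for s in ['plain', 'he said "hi"', 'tab\there', 'null\x00byte', 'back\\slash']:
    assert json.loads(escape_json_string(s)) == s
```

This covers quoting and control characters; the invalid-UTF-8 case doesn't arise in Python, since its strings are already valid Unicode, but in C you'd need an explicit validation pass on top.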
Hand-rolling TSV is no better. The average TSV generator pays no mind to data cleaning, and quoting/escaping is non-standard, so what the other side will do with it is basically Russian roulette.
Using C0 control codes is likely safer, at least in the sense that you will probably think to check for them, and there is no reason whatsoever for them to appear in user data.
> If the pieces of state are all well known at build time - and trusted in terms of their content
... then use a library, because you should not rely on the assumption that the next developer adding one more piece of state to this code will magically remember to validate it against the JSON spec.
No magic necessary. Factor your hand-rolling into a function that returns a string (instead of printing, as in the example), and write a test that parses its return value with a proper JSON library. Assert that the parsing was successful and that the extracted values are correct. Ideally you'd use a property test.
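That pattern might look like this (a Python sketch with hypothetical field names; the emitter is hand-rolled, the test parses its output with the stdlib JSON library):

```python
import json

def render_stats(peers: int, offset_ms: float, refid: str) -> str:
    # Hand-rolled JSON output, no library on the producing side.
    return '{"peers": %d, "offset_ms": %.3f, "refid": "%s"}' % (peers, offset_ms, refid)

def test_render_stats_is_valid_json():
    out = render_stats(4, 1.25, "GPS")
    parsed = json.loads(out)  # must not raise
    assert parsed == {"peers": 4, "offset_ms": 1.25, "refid": "GPS"}

test_render_stats_is_valid_json()
```

A property test would additionally throw arbitrary strings at `refid`, which is exactly where this naive emitter fails: any input containing a quote breaks the output, and the test catches it.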
That’s somewhat better than assembling, say, HTML or SQL out of text fragments, but it’s still not fantastic. A JSON output DSL would be better still—it wouldn’t have to be particularly complicated. (Shame those usually only come paired with parsers, libxo excepted.)
I don’t think our dependency tree is perfect, but I think our dependencies are reasonable overall. We use JSON for transferring metrics data from our NTP daemon to our prometheus metrics daemon. We made this split for security reasons: why have all the attack surface of an HTTP server in your NTP daemon? That didn’t make sense to us. So we added a readonly unix socket to our NTP daemon that, on connecting, dumps a JSON blob and then closes the connection (i.e. doing as little as possible), which is then usable by our client tool and by our prometheus metrics daemon. That data transfer uses JSON, but could have used any data format. We’d be happy to accept pull requests to replace it with something else, but given budget and time constraints, I think what we came up with is pretty reasonable.
Probably, but we still need to parse that string on the client side as well. If you’re willing to do the work, I’m sure we would accept a pull request for it! There are just so many things to do in so little time, unfortunately. I think reducing our dependencies is a good thing, but the dependencies we use for JSON parsing/writing are so commonly used in Rust, and the way we use them hopefully prevents any major security issues, that I don’t think this should be a high priority for us right now compared to the many other things we could be doing.
It's an entirely relevant response... they are saying that what you call a poorly thought out ad-hoc format is still better than JSON in any form or dependency.
It could instead be YAML or TOML. The point wasn’t specifically and only a JSON dependency; it was taking a dependency in general to do something useful and standardised.
> yet another poorly thought out, ad-hoc homegrown config file format
OpenBSD style ntpd.conf:
servers 0.gentoo.pool.ntp.org
servers 1.gentoo.pool.ntp.org
servers 2.gentoo.pool.ntp.org
servers 3.gentoo.pool.ntp.org
constraints from "https://www.google.com"
listen on *
I mean, there's always the possibility that they used a common, well known and pretty decent config file format. In this particular case, this shouldn't be the thing that differentiates your ntpd implementation anyways.
That config file perfectly illustrates the point. There's no need for it to be custom, and require me to waste time learning its syntax when it could just be JSON or TOML. Honestly I would even take YAML over that and YAML is the worst.
You still have to learn the syntax even if it is expressed in json or yaml. Perhaps stating the obvious, but not every json object is a valid ntp configuration.
The configuration object will always, by definition, be specific to ntpd. Expressing it as plain text allows for a trivial parser, without any of the security implications of wrapping it in a general-purpose format ("should this string be escaped?", "what should we do with invalid UTF-8?").
The simpler format has survived over thirty years, is trivial for anyone to parse, and does not bring in any dependencies that need maintaining. That should count for something.
Sure you have to learn how to configure things but you don't have to learn basic syntax like "how do I escape a string".
The fact that it has survived tells you nothing other than it's not so completely awful that someone went through the pain of fixing it. That doesn't mean it is good. There are plenty of awful things that survive because replacing them is painful due to network effects. Bash for example.
Poorly thought out, ad-hoc homegrown config file format, please. Every time.
1. Code doesn't change at the whims of others.
2. The entire parser for an INI-style config can be in about 20 lines of C
3. Attacker doesn't also get to exploit code you've never read in the third party dependency (and its dependencies! The JSON dependency now wants to pull in the ICU library... I guess you're linking to that, too)
4. The complexity of a config file is usually format-independent; the feature-set of the format itself only adds complexity rather than taking it away. To put it another way, is this any saner...
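For scale, the kind of parser point 2 has in mind can be sketched as follows (in Python rather than C, purely for brevity; the C version is longer but the same shape):

```python
def parse_ini(text: str) -> dict:
    """Minimal INI-style parser: [sections], key = value, # comments.
    Returns a dict keyed by (section, key) pairs."""
    config, section = {}, ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
        elif "=" in line:
            key, _, value = line.partition("=")
            config[(section, key.strip())] = value.strip()
    return config

conf = parse_ini("""
# example config
[server]
listen = *
pool = 0.gentoo.pool.ntp.org
""")
assert conf[("server", "listen")] == "*"
```

No quoting, no escaping, no nesting: that's the trade being argued about, fewer features in exchange for a parser you can read in one sitting.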
1. I can pin my json parser dependency and literally never update it again
2. And how many times have we seen 20 lines of C backfire with some sort of memory safety issue.
3. First off, I'd go out on a limb and say the number of attacks via a well-established (or even a naive) Rust JSON parsing library is dwarfed by the number of attacks via ad-hoc config parsers written in C with some overlooked memory safety issue.
4. Usually being the key word: tons of ad-hoc config formats have weird shit in them. With JSON (or YAML/TOML) you know what you're getting into, and you immediately know what you are and aren't able to do.
My argument was not that sudoers is good - it's crazy overcomplicated.
My argument was data interchange format standards are orthogonal to config files. They don't have the same goals.
A programmer who thinks "I'll use JSON|YAML|TOML for my config file" hasn't solved the first problem (what should the config look like, in a way that makes sense and is easily readable, understandable, and updateable by the user), and has added a second problem before even starting on the first: whatever the config looks like, it now also has to be 100% compliant with some data interchange format and support all its features. That's going to require a lot of code, and then we get into whether you write the compliant parser/generator yourself, or someone else does and you do or don't audit every line of it. And on top of that you add an additional pile of code to parse/generate whatever your actual config format is.
Rust's stdlib is (at least notionally) forever. Things in the standard library (including core and alloc) get deprecated but must be maintained forever, which means that "at this point" isn't enough.
In 2003 those programs would have used XML, in 1993 probably .INI files. Are you sure that despite all its shortcomings JSON is the end of history? I don't believe you.
If you want "at this point" you can, as software does today, just use a crate. Unlike the stdlib, if next week Fonzie files are huge and by 2026 "nobody" is using JSON because Fonzie is cool, the JSON config crate merely becomes less popular.