There is no 'printf' (netmeister.org)
325 points by pr0zac on Oct 21, 2021 | hide | past | favorite | 147 comments


There is 'printf'. It's just that printf (and the rest of the standard library) is technically as much a part of the C language as the language grammar itself, and C compilers are welcome to use innate knowledge of those functions for optimizations. The other place you typically see this is calls to functions like memcpy/memset being elided to inline vector ops or CISC copies, or on simpler systems, large manual zeroing and copying being elided the other way to a memset or memcpy call.

C compilers will typically have an escape hatch for envs like deeply embedded systems and kernels like gcc's -ffreestanding and -fno-builtin that says "but for real though, don't assume std lib functions exist or you know what they are based on the function's name".

<rust_task_force> One of my favorite parts of rust as someone who uses it for deeply embedded systems is the separation of core and std (where core is the subset of std that only requires memcpy, memset, and one other I'm forgetting). The rest of the standard library is ultimately an optional part of the language, with compiler optimizations focused on general benefits rather than baked-in knowledge of how something like printf works. no_std is such a nicer env than the half-done ports of newlib or pdclib that everyone uses in C embedded land. </rust_task_force>


> C compilers will typically have an escape hatch for envs like deeply embedded systems and kernels like gcc's -ffreestanding and -fno-builtin that says "but for real though, don't assume std lib functions exist or you know what they are based on the function's name".

GCC requires standard library functions even in freestanding environments! It will still emit calls to those functions in certain circumstances.

https://gcc.gnu.org/onlinedocs/gcc/Standards.html

> GCC requires the freestanding environment provide memcpy, memmove, memset and memcmp.

> if __builtin_trap is used, and the target does not implement the trap pattern, then GCC emits a call to abort.


This comment isn't meant to disagree with what you said but is more like a related ramble.

At least in C++, memcpy is special because it can be used to reinterpret the bits of one object as another. (In C you can use a union, and as of C++20 you can use the std::bit_cast function, but that either just calls memcpy or needs its own compiler magic.)

    double d = 1.0;
    // type punning with union, allowed in C but not C++
    union {
        double d;
        int64_t i;
    } u;
    u.d = d;
    int64_t y1 = u.i; // undefined behaviour
    // cast not allowed, not even reinterpret_cast
    int64_t y2 = *reinterpret_cast<int64_t*>(&d); // undefined behaviour
    // memcpy - allowed! (if sizeof() params is the same)
    int64_t y3;
    static_assert(sizeof(y3) == sizeof(d));
    memcpy(&y3, &d, sizeof(y3));  // well defined!
    // C++20 bit_cast (probably just a convenience wrapper around memcpy)
    int64_t y4 = std::bit_cast<int64_t>(d);  // well defined
It seems wasteful to create a separate object in memory and copy bits from an old object, when all you want to do is reinterpret the bits of the existing chunk of memory. But memcpy doesn't do that (or at least might not), because it's optimised into a simple cast under the hood, but without the undefined behaviour. So memcpy looks like a standard library function, but in a way it's more like a compiler intrinsic. It would be possible to create a fully conformant version of memcpy, even in C++, by casting both parameters to char*, which is special with respect to the aliasing rules, and copying char-by-char. But I suppose that would be harder to optimise into a simple cast.


What a pain, right? Why can't they just let us do what they know we want to do? It's such an absurd situation.

We have double precision floating point numbers. Sometimes we want to look at the bits. Sometimes we want to modify those bits. Would it kill them to let us cast the thing? I will never understand this.


The requirements for memcpy, memmove, memset and memcmp stem from the fact that statements like "char arr[100] = {0};" or "struct somestruct some_a = some_b;" cause implicit memsets and memcpys, in cases where it can't be handled within less instructions. These functions are required by the freestanding part of the C standard, too, anyway.


> implicit memsets and memcpys, in cases where it can't be handled within less instructions

Yeah, I know. Still, I didn't expect GCC to generate code like this in freestanding environments. There is no standard library, so the compiler should not implicitly emit any calls to standard library functions at all. That ought to have freed me to do whatever I want, including implementing those functions differently or with different names.

> These functions are required by the freestanding part of the C standard, too, anyway.

What? If I remember correctly, only a very small list of headers are required in freestanding environments and string.h was not among them.


The "no_std" idea is one that has washed up over and over again, over the last half-century. ISO Standards typically call it "freestanding".

It has invariably failed to yield the promised benefits. It hangs around, anyway, cluttering up Standards. People waste endless hours proposing alterations to it for the next Standard edition, to be likewise ignored by implementers and informed users of whichever language.

I don't doubt that "no_std" will similarly float along, and even be taken up into whichever Standard takes hold, but it must identically fail to deliver to actual system developers, in practically useful terms, whatever benefit its promoters promise. A mirage may be beautiful, but it does not slake thirst.


no_std is already in ubiquitous use in the embedded rust community and does exactly what it says on the tin. Having a no_std feature is extremely common to allow (restricted) crate use in that context.


Quick Summary:

The C compiler optimizer replaces printf("Hello World!\n") with puts("Hello World!") (puts appends the newline itself), and the implicit return from main() changes from 13 (the return value of printf) to 10 (the return value of puts)


Calls on puts you say?


In other words long volatility


The behavior is optional.


Brilliant.


C is Citigroup


No, C stands for Capitalism.


This sounds like a compiler optimization bug: if the expected behavior of main without a return is that it returns the result of the last function call, then the return value of the last function called in main is used, and the call should not be optimized away.


As far as I understand, there is no “expected behavior of main without a return” under ANSI; it’s undefined behavior as (according to this version of the C standard) a function with an int return type must return an int. The fact that it happens to return the result of last function call is more-or-less a coincidence because the compiler doesn’t bother to clear out or reuse the return value register.

If this seems unintuitive, consider a more general example:

int foo(int x) { if (x >= 0) return 123; if (x < 0) return 456; }

What’s the “expected” return value if neither of the if-statements match? That seems like a nonsensical question because the conditions are clearly mutually exclusive, but if you’re a dumb enough compiler (or the conditions are more complex) you might not be able to prove that. So should it return zero? Should the compiler insert code to save and remember the result of the last function call? According to the C standard, the compiler doesn’t have to handle this situation; it is free to assume that the branches are indeed mutually exclusive, and it will most likely do whatever produces the easiest/fastest code in that case.

In the OP’s example, it seemingly just decides to return void (leaving whatever garbage was previously in the return value register for the caller to discover). But it could also have assumed that main must infinitely loop (since there’s no return statement), and omitted the return instruction altogether causing execution to “fall off” the end of the function into whatever happened to be next in memory.


> As far as I understand, there is no “expected behavior of main without a return” under ANSI; it’s undefined behavior

You are right. There are still a lot of so-called developers who cannot understand the concept of _undefined_ behavior.


No, it's not undefined behavior. Only the status returned to the host environment is undefined. (And in C99 and later, the status is 0.)

As for your example:

int foo(int x) { if (x >= 0) return 123; if (x < 0) return 456; }

that has undefined behavior only if the caller uses the value. (That's to cater to ancient C code written before the void type was introduced.)


Any such expectation is unfounded. The C90 standard says:

"A return from the initial call to the main function is equivalent to calling the exit function with the value returned by the main function as its argument. If the main function executes a return that specifies no value, the termination status returned to the host environment is undefined."


Yeah honestly this blog post just seems to be highlighting a bug in gcc when it is compiling to an older version of C. Not sure why the author chose to frame it as some big point about the printf function.


Not a bug in GCC. The program GCC tries to compile is invalid, and the C standard says GCC can do anything it pleases in that case. So GCC is fine; just fix your program.


I think that's because the only way to see puts() return value after calling printf is in this weird situation. If you actually assign the return value somewhere and read it, the compiler can't replace it with puts and you'll get the return value you expect. But in this case the compiler is tricked into exposing that it's really calling puts().


When I do it on Linux, with gcc 10.2.1, it always returns 0, regardless of "-ansi", "-O2", or what-have-you. So, the author's experience is either peculiar to some old version of cc he used, or to a quirk of BSD.

The ISO C++ Standard dictates that if main fails to return a result, the program result is zero. And, g++ does this.

[Edit: I'm wrong... with "gcc -ansi -std=c90", with or without optimization, I get the 13! But not 10.]


> the expected behavior

Which part of "undefined behavior" is not clear to you?


Huh, this is pretty great; I've always fussily used fputs() when I'm just printing static strings, and apparently I don't need to bother, since the compiler will just do it for me.


It doesn't hurt to use (f)puts anyway when you intend to print a static string. This way you avoid extra overhead when using compilers that don't do this or building with flags that disable this optimization, as well as just clarify your intent.


Typically if you're calling printf you are not concerned about overhead.



This is a bit like saying there is no '+';

Because if you put in

    return 1+2+3;

And look at the assembly code, you will see that the compiler generated something like

    return 6;

The compiler is allowed to take advantage of the standard to substitute in more efficient code that does the same thing.

IIRC, for C++, it would actually be ok if std::vector was implemented completely as a compiler intrinsic with no actual header file. (No compiler I am aware of actually does it that way).


Code that does

  #include <vector>
must compile, so that header must exist (whether it is stored in a file is the implementer’s choice; AFAIK, the standard carefully avoids the use of the term ‘header file’).

Also, I think code that doesn’t do that include must fail to compile when it tries to use std::vector. So, logically, that header must exist.


Well, not really. The preprocessor is part of the compiler, so it only needs to set a flag to tell the compiler proper to enable std::vector.


Is there more info on this? I remember this from Common Lisp (though the details evade me): the compiler can take advantage of certain specific functions and rely on them being "open coded", i.e. it can produce more efficient code by replacing them with something more suitable... http://www.sbcl.org/manual/#Open-Coding-and-Inline-Expansion

https://www.thecodingforums.com/threads/what-is-the-meaning-...


yeah but everyone knows that "there is no +"; it's an operator, and in C operators are special and expected to not necessarily do C-function-ey things, e.g. "take arguments of different types and add them successfully". Not everyone is aware that C has "anointed functions" (including, I believe, malloc) that the compiler is allowed to fiddle with.


> more efficient code that does the same thing

In this case, it produces a different result.


It produces a different UB, which is still UB.

Furthermore, observability would be defined in terms of the C abstract machine; "observing" by decompiling the program is out of scope.


oh right

> But what if you're not using C99 or newer?

UB - that takes all the fun out of it.


> But what if you're not using C99 or newer?

If you're using C90, but under an implementation that supports C99, that implementation should obey all the new rules in all areas where there is no conflict between C90 and C99.

The ISO C90 standard is obsolescent, so the fact that dropping off the end of main with no return value is an unspecified termination status is an obsolescent requirement (or non-requirement).

It is possible for a conforming C90 implementation to return 0 in that case. A conforming C99 implementation must do that as a conformance requirement. There is no good reason for a C99 implementation behaving in a C90 compatibility mode to simply drop non-conflicting C99 requirements and revert a behavior such as this.

This should be a don't care issue, unless you're targeting a bona fide nothing-but-C90 implementation.

However, if you are telling your implementation to be C90, why are you doing that? If it's not just some nerd gesture to show your contempt for C99, and you really care about portability to C90 implementations, then you probably want to be returning an explicit 0 from your main. Or even just to show contempt for C99, really, you should be returning that 0.


Regardless of how you think a modern compiler in C99 mode should behave, the fact is that "gcc -std=c90" generates code that returns an arbitrary value from main if no return or exit is executed.

clang doesn't do this as far as I can tell -- though it's difficult to be sure, since returning a status of 0 is valid C90 behavior.


I would say that is a bug.

More than 22 years ago, ISO C introduced this requirement for a reason. The reason was almost certainly the desire to fix that issue for as many C programs as possible, and there was no intent that there be an exception for C programs that happen to request C90 compatibility in their accompanying Makefile.

An indeterminate termination status causes real problems: for instance, if such a program is run from a POSIX script in "set -e" mode, the script will randomly terminate at that point. It has consequences.

This requirement plugged a hole; when those programs are recompiled with compiler which adopts the requirement, that issue is fixed.

From the developer's perspective, using GCC in C90 mode does mean they should be prepared to conform to C90 and not do things like that falling off the end of main if there is any risk that the termination status matters. In my original comment above, I made remarks to that effect.

But that developer isn't the only stake holder. A developer working in C90 can still accidentally forget the return, and some downstream user who doesn't even program in C is affected by that.

We also have to think from the point of view of that downstream user building and operating a program received from such a developer. The user wants a program that has a successful termination status when it terminates normally and de facto successfully; the user doesn't care whether that comes from the compiler, or whether the program takes care of it.

It's pretty shoddy for that not to come from the compiler, almost a quarter century after it was declared that, in the C language, "int main(void) {}" is now a successfully terminating program!

When you request dialect compatibility from an implementation, you don't want old bugs to come back, unless they are specifically required. Your downstream user doesn't want that, certainly.

If you really want old bugs, you go get the actual old compiler; -std=c90 passed to GCC 11 does not mean "emulate every detail of GCC 1.x".


But there aren't a lot of good reasons to compile with C90 or older these days (for portability you can just restrict yourself to the portable subset). I suspect the flag is there to support old unmaintained programs and it makes sense not to change their behaviour (because nobody is going to fix bugs in them), as long as it is not a big burden on the compiler. In fact, think of old K&R C instead of C90.

I wouldn't be surprised if there are old unix tools that relied on propagating the result of the last operation executed into main. In fact sometimes gcc has added explicit flags to enable this kind of "traditional" behaviour.

Clang was written well after C99, so it didn't make sense to implement the old traditional behaviour there in the first place.


> aren't a lot of good reasons to compile with C90

Sure there are, like:

- Project doesn't want mixed declarations and statements creeping into the code base

- Project doesn't want variable-length arrays creeping into the code base.

- Project doesn't want new footguns in preprocessing.

- Project hates mixtures of // and /* ... */ comments.

I think the Linux kernel is still using C90 (with GNU extensions).

> tools that relied on propagating the result of the last operation executed into main.

In all the situations in which the compiler has enough information to insert a "return 0", it could emit a diagnostic (if it is in C90 mode).

   foo.c:17: warning: main returns without a value [-std=c90, -Wmain-return]
The user is alerted to this and can look at the code and decide whether it's just missing a "return last_operation();" or whether it should be "(void) last_operation(); return 0;".

-std=c90 is not a good option for requesting "I want the random contents of the return register left behind by the last function called in main to be the return value of main, if possible".


> There is no good reason for a C99 implementation behaving in a C90 compatibility mode to simply drop non-conflicting C99 requirements and revert a behavior such as this.

Compatibility with code that depends on a behaviour of a previous version of the compiler is reason enough.

If compatibility is dropped you now have a very subtle bug which is hard to discover and understand unless you read about that particular GNUism before. It's also a perfect setup for Schrödinger's bug.


There's no compatibility requirement here, because the old behavior is UB (both in theory and in practice).


The old behavior is not undefined behavior. In C90, falling off the end of main() returns an undefined status to the environment. The behavior of the program is otherwise well defined.

It's true that an implementation that follows the C99 and later requirements also conforms to the looser C90 requirements.

But if a compiler chooses not to do so, I don't see it as a bug. If I use "-std=c90" or equivalent, I'm specifically not asking for C99 or later semantics; I want the compiler to conform to the C90 standard. If I use "-std=c90" and my program assumes C99 or later semantics, that's a bug in my program, not in the compiler.


If you specify "-std=c90", all you are saying, for this case, is that you don't care what value is returned. If you don't care, then zero is as good as any other value, so there is no reason for the compiler to have a different code path for that case. There are always good reasons not to have poorly exercised code paths.


Earlier versions of gcc didn't support C99 or later at all. Making reaching the closing "}" do an implicit "return 0;" was a new language feature, and it was tied to the (non-default) "-std=c9x" or later "-std=c99" option.

Some existing code might have depended (foolishly IMHO) on the arbitrary non-zero value returned from main. Even if that's not the case, if you're programming in C90, having your program indicate that it failed is a good reminder to add the "return 0;".

I've never bothered to check exactly how "gcc -std=c90" handles this, but having it intentionally return a non-zero value in C90 mode would have been a good choice, since it would make the bug of omitting the return or exit more visible. gcc can also warn about not returning a value from a non-void function.

Whatever it returns in C90 mode, it's not a bug, and so the gcc maintainers have probably decided that they have more important things to do than change the behavior. They might have made a different choice if they were implementing "-std=c90" from scratch today.


What do you suppose has changed between when they did, and today, that could lead to a different choice?


Apparently the compiler writers thought there was, and the compiler writers get to define semantics above and beyond what's in the C spec.


I don't know how it works for C, but in C++ some (but not all!) fixes are recognized as "defects" in the standard and implementers are encouraged to implement the fixes in previous standard versions too.


My compilers call printf just fine until you enable optimizations. -O0 adds references to printf, -O1 and higher switches to puts.

I was kind of surprised about the fact that there was no warning about the missing return from main(). Normally, I'd expect the compiler to complain that a supposed int returning function does not return anything, because that would normally be undefined behaviour.


If you use GCC, enable -Wall and you'll get the diagnostic you want:

    /tmp/cp.c: In function 'main':
    /tmp/cp.c:4:1: warning: control reaches end of non-void function [-Wreturn-type]
        4 | }
          | ^
If you use Clang, there is no diagnostic even with -Wall (or even -Weverything), but it looks like Clang always implements the implicit `return 0;` from C99 regardless of the `-std` setting.


To clarify, gcc -Wall only produces the warning with -std=c89/gnu89 (or GCC < 5 without -std=c99/gnu99). As with clang, it implements the implicit `return 0;` for C99 and later. Example: https://godbolt.org/z/Y7Wx5Y89q


> I was kind of surprised about the fact that there was no warning about the missing return from main(). Normally, I'd expect the compiler to complain that a supposed int returning function does not return anything, because that would normally be undefined behaviour.

If main() returns without a return statement, it is defined to return 0 by the specification. (§5.1.2.2.3 of http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2596.pdf, but this hasn't changed in quite some time).


The article already explains this. What you said is only true in C99 (and they indeed show the return value of the overall program is 0 when compiled in that mode). In C89 it's undefined behaviour, so a warning when you do this would be very sensible.


There is no printf because there is no defined behavior.

To call a variadic function, you must have a prototype declaration in scope. (A correct prototype, needless to say.)

A non-prototype (i.e. old-style) declaration cannot declare a variadic function such as printf, so no such declaration can be correct.

If the function is not declared, then a declaration will effectively be assumed for the call, deduced from the types of the actual arguments, and a return value of int. In this case, the function will be "implicitly declared" to look something like

   int printf(char *);
but that is not the correct declaration for printf. Therefore, the call has undefined behavior.

The optimizer is allowed to rewrite this to:

   puts("daemons are running in your nose");
Likewise, it is allowed to rewrite it to:

   puts("Hello World!");


No, this optimization also happens when you `#include <stdio.h>`. More accurately, GCC (among others) evidently assumes that printf is the standard C library function whether or not a prototype is declared, and replaces it with puts under that assumption plus the following additional conditions:

- No other arguments are present.

- The format argument has no format specifiers and ends with `\n`.

- The return value is unused. (So `return printf("...\n");` would have prevented this optimization.)


Moar please. I'm loving these counterintuitive C optimization gotchas lately[1]. They are like little brain teasers.

1. https://news.ycombinator.com/item?id=28930271


About a year ago there was something of a "joke isEven() implementation discourse" on Twitter, which eventually evolved into a sort of informal optimizer abuse contest. For example:

https://twitter.com/zeuxcg/status/1291872698453258241

https://twitter.com/jckarter/status/1428071485827022849


OK, those are horrifying and fascinating, and they basically break my brain.

Is there a explanation somewhere of why the first one "works"? The second one I think is the compiler assuming the default case will never be hit since it'll result in infinite recursion, which is UB under C++, so it's basically assuming 0<=x<=3 and optimizing from there. Is that correct?

The first one I'm less certain about. The only thing I can think of is that the compiler deduces an upper limit of INT_MAX - 1 to avoid signed overflow, and then somehow figuring out the true/false pattern from there? Still a bit of a gap in my understanding there.


Optimizers have to keep the same input/output pairs unless there is undefined behavior. In the second function the truth table looks like:

    in    | out
    ----------
    0b000 | 1
    0b001 | 0
    0b010 | 1
    0b011 | 0
    0b100 | don't care
          .
          .
          .
    MAX   | don't care
The compiler just chooses the most efficient way it knows to get the filled out entries correct which happens to be:

    in    | ~in[0]
    ----------
    0b000 | 1
    0b001 | 0
    0b010 | 1
    0b011 | 0
    0b100 | 1
          .
          .
          .
    MAX   | 1
It would have been just as valid to do:

    in    | in[2] or ~in[0]
    ----------
    0b000 | 1
    0b001 | 0
    0b010 | 1
    0b011 | 0
    0b100 | 1
    0b101 | 1
          .
          .
          .
    MAX   | 1
The first function's table looks like:

    in    | out
    ----------
    0b000 | 1
    0b001 | don't care
    0b010 | don't care
          .
          .
          .
    MAX   | don't care
And the compiler still likes the even check in this case, which makes sense.


The first function (the `n == 0 || !isEven(n+1)` recursive function) has defined behavior for negative numbers. That's probably why it compiled to an even number check.


Hmm, you're right that there's definitely more to the first function than coincidence. I dug a bit deeper and, surprisingly, when n is unsigned the function is still well defined and correct. [0] AFAIK clang will usually just treat a signed overflow as an unsigned overflow when optimizing, as it simplifies things. So my bet would be that clang is just casting n to unsigned and performing some inductive reasoning similar to the proof I wrote below. Pretty impressive on the part of clang. [1]

[0]: I did the proof below for anyone interested:

    int isEven(unsigned int n) {
        return n == 0 || !isEven(n + 1);
    }

    # assumptions
    MAX + 1 --> 0      # from the C standard
    MAX is odd         # true for most machines

    # base case
    isEven(0) --> True # trivial short circuit

    # base case
    isEven(MAX):
    ==> return MAX == 0 || !isEven(MAX+1) # MAX is not zero
    ==> return False || !isEven(MAX+1)    # MAX + 1 rolls over
    ==> return False || !isEven(0)
    ==> return False || !True
    ==> return False || False
    ==> return False
    ==> isEven(MAX) --> False

    # recursive steps
    isEven(n) where 0 < n < MAX
    ==> return n == 0 || !isEven(n + 1)   # n cannot be zero
    ==> return False || !isEven(n + 1)    # False || anything is tautological
    ==> return !isEven(n + 1)

    which will turn into an inverter chain of length MAX - n until you reach isEven(MAX)

    - an odd length inverter chain is the same as a single inversion
    - an even length inverter chain is the identity
    (can be proven by induction trivially)

    isEven(n) where 0 < n < MAX and n is even
    ==> return !isEven(n + 1)     # MAX - n is odd, when n is even; replace with single inversion before isEven(MAX)
    ==> return !isEven(MAX)
    ==> !(False) --> True

    isEven(n) where 0 < n < MAX and n is odd
    ==> return !isEven(n + 1)    # MAX - n is even, when n is odd; replace with identity of isEven(MAX)
    ==> return isEven(MAX)
    ==> False --> False

    All cases of n are accounted for so the function is correct (and equivalent to return n & 1 == 0).
[1]: Yup clang just recognizes this optimization: https://gcc.godbolt.org/z/Tc1MTa6nj


> AFAIK clang will usually just treat a signed overflow as an unsigned overflow when optimizing as it simplifies things.

This was a bit of a surprise to me; I thought that Clang could use the presence of signed overflow to infer bounds, like when it "breaks" naïve overflow checks. I didn't know about the treat-signed-as-unsigned behavior you described.

If that's what clang is doing, that's pretty neat! Quite a bit of reasoning to work through there.

Thanks for taking the time to explain!


Treat-signed-as-unsigned is what you get "by default" with a two's complement representation of integers, if you just use ADD/SUB opcodes regardless of the type (i.e. the cheapest/fastest way to do it).


Right, but I had thought that clang would have done something involving deducing the bounds of n, which could be used to avoid overflow in later optimizations.

Now that I think about it, though, the key there is "could" - overflow could be avoided by deducing bounds, but if said deduction can't occur, then what you describe sounds like the most natural action to take as long as there aren't other issues.


max(unsigned int) being odd is also required by the C standard; C17 6.2.6.2/1:

"For unsigned integer types other than unsigned char, the bits of the object representation shall be divided into two groups: value bits and padding bits (there need not be any of the latter). If there are N value bits, each bit shall represent a different power of 2 between 1 and 2^(N-1), so that objects of that type shall be capable of representing values from 0 to 2^N - 1 using a pure binary representation; this shall be known as the value representation."


My guess: since overflowing int is UB, and the only value of n that stops the recursion is zero, the compiler assumes that n must be zero and checks accordingly.

That doesn’t explain why it uses test dil, 1 instead of test dil, dil or cmp 0 or whatever.


The compiler cannot assume that much, because the argument is a signed integer (negative integers will not overflow and do have well-defined behaviour).


The rabbit hole goes deeper than that: https://gcc.godbolt.org/z/Tc1MTa6nj


That is a well defined function. And indeed implements isEven. Because unsigned int has defined overflow semantics.

Essentially, it will eventually overflow and hit the correct base-case for 0.


It's all fun and games until you write (or review) C/C++ test cases for a compiler or disassembler ;-) It never ceased to amaze me how good the compiler was at figuring out that I had actually written a very complicated "return 0".


So, why does puts do "return r ? EOF : '\n';"? Some backwards compatibility? Or is there a logical reason for that?


It's what historic Unix did: https://github.com/v7unix/v7unix/blob/master/v7/usr/src/libc...

Why did it do that? I'm not sure, but at the time C did not have 'void' functions: every function returned a value. They probably wanted to make the behavior of the stdlib functions deterministic, even if the return value was useless and undocumented.


That particular implementation probably returns the result of the last fputc() or equivalent that it called.

puts() returns EOF (typically -1) on error, or some unspecified non-negative value on success.

fputc() returns EOF on error or the written character, treated as an unsigned char and converted to int, on success.

Don't expect all puts() implementations to do the same thing. For example, the glibc implementation appears to return the number of characters written on success. Implementations are free to rely on implementation-defined behavior. User code that's intended to be portable cannot.


That particular implementation (NetBSD's) (which is transcribed into the article) does something more optimized than making repeated calls to `putchar()`.

But as pdw's link shows, what you suggest is exactly what the historical implementation was. So NetBSD is simply matching historical Unix.


For what it's worth, musl's implementation returns 0 on success.


Per the man:

> puts() and fputs() return a nonnegative number on success, or EOF on error.

r is the result of the write; if it's nonzero, the write failed and thus so did puts.


Yeah, but I think the question was why EOF and "\n". It could as easily just return 1 or -1 for example, and it would make more sense I think.


puts() always adds a line termination so success means that '\n' is the last char for that implementation.


That's more of an implementation detail, probably a BC remnant from when it called putchar in a loop and you'd get the result of the last putchar.


Compiler optimization can sometimes cause unpredictable or even incorrect behavior. Below is a blob of C code for the TI MSP430 compiler that exemplifies at least one of TI's optimization bugs:

    // Define Common Communications Frame
    typedef volatile union commFrameType
    {
      struct
      {
        unsigned SyncHeader:16;
        unsigned MessageID:8;
        unsigned short MessageData[msgDataSize];  // ID-unique data
        unsigned CRC:8;             // LSB of CCITT-16 for above data
      } __attribute__ ((packed)) Frame;
      unsigned char  b[16];         // Accessible as raw bytes as well
      unsigned short w[8];          // Accessible as raw words as well
      unsigned long  l[4];          // Accessible as raw long words as well
    } __attribute__ ((packed)) CommFrame;

    static CommFrame IpcMessage = { FRAME_SYNC_R, IpcBlankMessage };

    // If frame was accepted into TX queue, prepare next frame for transmission
    // IpcMessage.Frame.MessageID++;    // Bump up to next message type
    // IpcMessage.Frame.MessageID += 1;
    // Either of the two commented-out lines above causes a bizarre linker
    // error if used instead of the line below:
    IpcMessage.Frame.MessageID = IpcMessage.Frame.MessageID + 1;  // Bump up to next message type

The MSP-430 is a 16-bit microcontroller, and the packed CommFrame structure puts Frame.MessageID on an odd-byte boundary. Some processors might raise a SIGBUS, but TI says that it's okay to access a byte on an odd address boundary.

It's pretty silly that i++; and i+=1; don't work, but i=i+1; is just fine.


'unsigned MessageID:8;' isn't the same as 'unsigned char MessageId'


I agree. It's explicitly an unsigned 8-bit integer (bit field). A 'char' can have a different number of bits on different architectures.


Imagine somebody thought omitting the return statement and doing whatever the compiler likes is a good feature to have.


Not that odd if you know the evolution of the language... and what it actually means for modern programmers.

The very first C compilers ran on a PDP-11 in just a few dozen kilobytes of memory. The entire emphasis was on minimalism, and that meant that things like type enforcement were left to the human.

One of the things the earliest language was missing was the "void" type. If you didn't give a return type to a function, it just defaulted to returning "int". Therefore it was totally normal to have a function fail to return anything.

Since there was no distinction between "function returning int" and "function returning nothing", there was nothing stopping your program from using the value returned... it just meant you got whatever happened to be in the right register.

What does it mean for us today? Basically nothing. "void" was added 30+ years ago when ANSI C appeared. The language couldn't just break the old behavior but in any rational environment you enable enough compiler warnings to avoid these ancient quirks entirely:

  $ gcc -Wall -c a.c
  a.c:1:1: warning: return type defaults to ‘int’ [-Wimplicit-int]
      1 | foo() {}
        | ^~~
  a.c: In function ‘foo’:
  a.c:1:8: warning: control reaches end of non-void function [-Wreturn-type]
      1 | foo() {}
        |        ^


OK, but this doesn’t do wonders as a counterargument that C is just a big pile of historical baggage.


But of course it is; did you expect anything else from a language that's almost 50 years old by now? It's not any different from, say, Common Lisp in that regard.

This particular quirk is a few years older than C itself, actually! C was based on a similar language called B. Now B was originally written for PDP-7 and similar machines, which could only address words in memory, not individual bytes. So it was designed "typeless" - the only and always-implicit type in that language is the machine word, so e.g. (a+b) always adds two ints, and (*a) always dereferences a pointer. Otherwise, the syntax was very similar to C:

   max(a, b) {
      if (a > b)
         return a;
      else
         return b;
   }

   swap(a, b) { 
      auto c = *a;
      *a = *b;
      *b = c;
   }

   main() {
      auto a, b;
      swap(&a, &b);
   }
Since all functions took zero or more words as arguments, and returned a word, there was no need for function prototypes, either, or for function pointer types - if you used () on something, it was treated as a function / function pointer with the corresponding number of parameters. Even labels / goto worked like that.

When they got a PDP-11, which had byte-addressed memory, this simplistic approach no longer worked - they had to distinguish char/int and char*/int*. So C added types and pointers - but kept int as the default type, as well as the keyword "auto" (which is completely redundant in C if you specify the type), so that existing B code could be easily ported. For the same reason, they allowed calling functions without declaring them, assuming int as return type.

This is also presumably why pre-ANSI K&R C had that weird syntax for parameter types:

   max(a, b)
      int a;
      int b;
   { ... }
If you omit the types, they all default to ints, and it becomes identical to the B declaration!

That implicit int rule made it into the ANSI C89 / ISO C90 standard, since it was still common enough in K&R C code floating around at the time, and they were trying not to break things too much. It eventually got removed in C99, except that "short" and "long" still allow omitting the following "int". The useless "auto" is still around, though it might eventually get repurposed the same way it was in C++.


You might be surprised. Half the time I say that, I get a comment like yours. The other half of respondents go on about how it’s nonsense and that C is perfectly [modern/simple/usable/reasonable].


You can just send them to argue with Ritchie himself :)

"C is quirky, flawed, and an enormous success."

https://www.bell-labs.com/usr/dmr/www/chist.html


Like the article says, this got cleaned up over 20 years ago in C99.

But sure, I wouldn't recommend anyone choose C as a language for new software unless they have an extremely good reason, or if you really love stuff like implementing every data structure you use from scratch and using void* as a kind of wildly unsafe generic.


> I wouldn't recommend anyone choose C as a language for new software

Unless you are doing firmware where all those constraints are still around, and you have to deal with whatever compiler the vendor supplies for their microcontroller, which will almost certainly be some variant of C.


Thankfully there are vendors that still bother to sell Basic and Pascal compilers for tiny microcontrollers.

But yeah, using them makes you a special snowflake; still, those vendors are in business to this day.


It is the opposite. The original C compiler produced instructions that mapped very closely to the higher-level language.

As such, it produces code based on what you tell it, no more, no less. If you don't tell it what value to return from a subroutine, it does not generate the code to do so. It was quite common to have a function that did not return a value.

The case shown here is not a case of the compiler doing 'what it likes', but the compiler not emitting any code for it. The fact that the return value from the function called last is still in the register used to return values when the function returns is simply a result of no code that touches that register in between.

As someone else pointed out, if you look at the code generated for a PDP-11 all the quirky things like, pointers with pre or post increment operators make much more sense as they emit instructions that do just that.


Is it undefined behavior?


Suppose you wrote the same thing in assembly language, where by convention you return the value from a subroutine in the AX register. This function doesn't return anything useful, so you don't 'MOV AX,blah', yet your caller expects something in it. Is that undefined?

Does the AX register exist in the caller? Yes.

Does it have a value? Yes.

Is its 'value' undefined? Yes.

Is there any undefined 'behaviour' of anything you told it to do? Not to me. Subtle, maybe.


I mean what the C standard says about it.


Like Scala?


pretty sure scala (and most FP) has a well-defined "what to do when you leave off the return statement", not one that "is up to the compiler"


Not just FP. "Return the value of the last statement" is fairly common in imperative languages as well. Off the top of my head, both Perl and R do so as well.


Apparently there is no available capacity for that site either.


https://search.yahoo.com/ for

There is no 'printf'

and look through the cache


[off topic] I always wondered how '%n' is used in production code.



Those answers were so frustrating to read. The question is explicitly asking "Why does this exist? When would you need it?" and yet most of the answers are "This is what it does." (sometimes with a few snide RTFMs thrown in for good measure).


That was exactly what I was asking for. I know how it works, but I would like to know how it is used by pros (if used at all).


When you call printf you will actually issue a system call to write to stdout. Does stdlib directly issue system calls or does it use some inline assembly?


What do you feel the difference is between "directly issuing system calls" and "some inline assembly"?


Using some system interrupts vs calling some driver.


To issue system interrupts (afaik) it would have to use inline assembly. Or non-inline assembly, linked to the rest of the library.

C doesn't have an 'int' (call interrupt) primitive.

So the answer to your question is yes.


> puts(3) only returns "a nonnegative integer on success and EOF on error"

How does it decide which nonnegative integer to return?


That's answered below:

> On success, puts(3) appears to return '\n', the newline or line feed (LF) character, which has ASCII value... 10.

But note that that isn't standard behavior. The language in POSIX[1] is identical to that in the blog post. `puts` is free to return whatever nonnegative number it wants on success.

[1]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/p...


It's arbitrary. The article shows an implementation that returns 10 (ASCII '\n'). But the spec says it doesn't matter, so you should only test that the result isn't EOF (i.e., that it's nonnegative) to detect success.


The correct implementation is obviously to return 1 on success !


The correct implementation is to return any non-negative int value you like.


"Pop quiz! What will the following program return?

   int main() {
           printf("Hello World!\n");
   }
Easy, right? It's gotta be 0, because since at least ISO/IEC 9899:1999 (aka "C99"), main shall implicitly return 0 if you, the forgetful programmer, didn't bother to explicitly return a value:"

As a hobbyist programmer I write small programs and run them on older computers running a variety of OSes. Among other things, I use -std=c89 for various reasons. Thus I know the answer to the pop quiz, for me, is not zero. Being forgetful triggers a warning.

   cat <<eof >1.c
   int main() {
           printf("Hello World!\n");
   }
   eof
   cc -Wall -std=c89 -pedantic -ansi 1.c
   ./a.out

   1.c: In function 'main':
   1.c:4:10: warning: implicit declaration of function 'printf' [-Wimplicit-function-declaration]
       4 |          printf("Hello World!\n");
         |          ^~~~~~
   1.c:4:10: warning: incompatible implicit declaration of built-in function 'printf'
   1.c:1:1: note: include '<stdio.h>' or provide a declaration of 'printf'
     +++ |+#include <stdio.h>
       1 | 
   1.c:5:2: warning: control reaches end of non-void function [-Wreturn-type]
       5 |  }
         |  ^
As a learning exercise I try to silence the warnings one by one.

Instead of just including <stdio.h>, which teaches me nothing, I find the prototype string and insert it into the .c file using a short script, something like the following:

    #!/bin/sh
    test $# = 1||exec echo usage: $0 function
    grep -r " $1(" /usr/include/* 2>/dev/null|sed 's/.*://;s/ *//' 
This enables me to learn that the printf prototype uses __restrict instead of restrict, which leads to a portability note:

On Linux, __restrict is defined in /usr/include/features.h

On NetBSD, __restrict is defined in /usr/include/sys/cdefs.h

Now, with the printf() prototype included

   sed \$r1.c <<eof >2.c
   int printf(const char *restrict, ...); 
   eof
   cc -Wall -std=c89 -pedantic -ansi 2.c
   ./a.out

   2.c: In function 'main':
   2.c:5:2: warning: control reaches end of non-void function [-Wreturn-type]
       5 |  }
         |  ^
The -std=c89 option enables me to learn to use exit(). Using a return statement will also suppress the warning. For the main() function, however (cf. a subroutine inside main()), I use exit() instead of return.

   sed 4r2.c <<eof >3.c
           exit(0);
   eof
   cc -Wall -std=c89 -pedantic -ansi 3.c
   ./a.out

   3.c: In function 'main':
   3.c:5:35: warning: implicit declaration of function 'exit' [-Wimplicit-function-declaration]
       5 |          printf("Hello World!\n");exit(0);
         |                                   ^~~~
   3.c:5:35: warning: incompatible implicit declaration of built-in function 'exit'
   3.c:1:1: note: include '<stdlib.h>' or provide a declaration of 'exit'
     +++ |+#include <stdlib.h>
       1 | 
Find the prototype for exit() and add it in, instead of just blindly including <stdlib.h>.

   sed 2r3.c <<eof >4.c
   void exit(int);
   eof
   cc -Wall -std=c89 -pedantic -ansi 4.c
   ./a.out


    "Pop quiz! What will the following program return?
       int main() {
           printf("Hello World!\n");
       }
In C99 or later, it produces a mandatory diagnostic because `printf` is not declared.

In C90, it has undefined behavior because a variadic function is called with no visible prototype.


Tldr:

Not returning a value from main() is undefined behavior in ANSI C, so the compiler will do whatever it likes. It will return 0 or 42 or crash. In this case, gcc just replaces printf with something else it likes more.


No, it's not undefined behavior.

An aside: "ANSI C" usually refers to the C language as defined by the last standard directly published by ANSI, in 1989 -- but the ANSI organization has adopted each new ISO C standard after publication, so the C standard currently recognized by ANSI is ISO C 2017. We're not going to get people to change what they mean by "ANSI C", so I suggest referring instead to "C89" or "C90".

In C90, not returning a value from main() causes an undefined status to be returned to the environment. It does not cause undefined behavior. The behavior of such a program is otherwise well defined. (C99 and later made reaching the closing "}" of the main function do an implicit "return 0;" instead, a rule borrowed from C++.)


> It does not cause undefined behavior.

I haven't read the standard. I am just stating what TFA says. So are you saying it is inaccurate?


Yes, it's inaccurate.

Here's what the C90 standard says:

5.1.2.2.3 Program termination

A return from the initial call to the main function is equivalent to calling the exit function with the value returned by the main function as its argument. If the main function executes a return that specifies no value, the termination status returned to the host environment is undefined.


"I wanna be close to the metal. I wanna be in complete control. I wanna code in C!"

Yeah. Right.


In other news: man discovers compilers do things you don't expect when you operate out of spec.

Jeeze, don't tell him about semihost-supporting compilers like IAR or Keil, where printf can be one of several different things depending on your target or what configuration options are set for debugging.


Compiler optimization is really annoying sometimes. The compiler is allowed to assume that functions with the same name as standard library functions behave according to the standard. It will swap out printf calls for puts when the format string contains no conversion specifications. The compiler just knows it can do that. Sure, it optimizes things, but it gets to the point where the code no longer reflects what's written in the source file. Try to hook into printf and it won't work, because the program isn't actually calling printf.

Even on freestanding environments they can and will generate calls to memcpy and memmove. That's insane...


Why is that a problem? (That's a serious question. I'm not suggesting that it isn't a problem. Apparently it is for you.)

If I write printf("Hello, world\n"), all I care about is that those characters are written to the standard output stream when I run the program -- and that's all the language standard specifies. I rarely even look at the assembly or machine code.

As someone else mentioned, "gcc -fno-builtin" inhibits optimizations like this. Other compilers are likely to have similar options.


Because it's surprising and breaks our expectations. The author of this article clearly expected a call to printf to be present in the generated code.

> all I care about is that those characters are written to the standard output stream when I run the program

Sometimes people care about a lot more. Such as the ability to hook into a specific function or ensuring the compiler doesn't generate calls to certain functions.

> that's all the language standard specifies

The author clearly cared about the undefined behavior. He had a mental model of what would happen that was perfectly reasonable. It was constantly invalidated by the optimizer.

> I rarely even look at the assembly or machine code.

I do. It's very jarring when you write some code and the compiler deletes some of your calls and reorders the rest. It can really complicate debugging sessions.

> gcc -fno-builtin

Yeah, that's become a standard flag for me. Just checked the documentation, and it turns out it's also implied by -ffreestanding. Freestanding C is a better language than hosted C anyway, just because it gets rid of all the libc cruft.


There is no "undefined behavior" in the sense that the C standard uses that term. The choice of which function to generate a call to in the generated assembly or machine code is not "behavior".

The way I think of it is that a C program specifies run-time behavior (which consists mostly of I/O). Any generated code is just a way to achieve that. C is not some kind of assembly language. The standard says nothing about CPU instructions or registers.

If you want to use C source code as a way to generate specific assembly or machine code, you'll have to go beyond what the language standard guarantees. If you need to do that, and you have an implementation that helps you with it, that's great.

I wonder if there's enough demand for a language that's independent of the target processor but still guarantees that a source call to a given function actually results in a call to that function, that a "+" operator results in a single addition CPU instruction, and so forth. It's not something I'd have any use for myself, but that doesn't mean it wouldn't be valuable. C isn't that language, but something similar to C might be.


That's exactly why we have specifications and standards - so that you don't have to guess and build mental models on flimsy assumptions, and know exactly what to expect.


The standard leaves so much stuff undefined or implementation defined there's no way to fully understand what's going on anyway. The only mental model you can form is of the very limited non-existent abstract machine, anything else you have to guess or go to great pains to avoid. It just isn't very useful to think in those terms, so many intuitions just aren't possible. Turns the language into this huge minefield.

I've given up on that. Standards don't compile code so they don't really matter in the end. I'm interested in what my compilers do and the code they generate. I've found I can just tell them to define the formerly undefined behavior, significantly improving the language as a result. No strict aliasing, forcing signed integers to wrap around as you'd expect them to, etc.


It sounds like what you really want is a high-level portable assembler. Which, to be fair, is one of the niches that C has occupied... but I'm not convinced it's optimally designed for that in general, even leaving UB aside.

But back in DOS days, there was something called Sphinx C--: https://bkhome.org/archive/goosee/cmm/c--doc.htm. A modern cross-arch reincarnation of that could be interesting.


> high-level portable assembler

Yeah.

More precisely, what I want is something that:

1. Gives me simple native code ELFs

2. Containing no symbols other than the functions I defined

3. That can interface directly with the Linux kernel with zero dependencies

If I bend C enough it turns into something resembling that. Freestanding C, a couple flags to fix the language and the compiler's inline assembly to fulfill the 3rd requirement. I agree that it's not perfect for the role but C compilers are way too important for me to simply disregard them and look for or invent a new language. I'd be giving up too much.

I wish the newer C standards took the time to define previously undefined behavior instead of adding even more cruft to the standard library. C11 is just the opposite of what I wanted.


Reducing UB also reduces optimization opportunities, and these days, outside of embedded, C (or C++) is usually used because it's fast - there are better options if that's not a concern. So I don't think that the base C standard will ever do that. But there can be standards derived from it, which provide more rigorous guarantees at the cost of perf.

Then you have embedded, where the norm is either gcc (which loves to optimize away UB), or bespoke compilers produced by the hardware manufacturer that are usually full of weird bugs in any case.

Given that, is C really that important? I can see the ability to parse C headers as somewhat useful for the sake of interop, just so you could use all the libraries (and not just syscalls); but aside from that?


Do you also expect `sqrt(a)` to be `call sqrt` and not `sqrtss xmm0`?


> Do you also expect `sqrt(a)` to be `call sqrt` and not `sqrtss xmm0`?

Of course. Who else is going to set errno when I pass a negative number?


Ok, so make the question one of `call sqrt` vs `sqrtss xmm0` + a test that sets errno in case of negative then?

Or even simpler: do you demand that `x = sqrt(4)` entails emitting code that calls `sqrt`?


Nobody has to, if you pass it an unsigned int. Would you still expect it to be a function call then?


So the compiler actually can't optimize because of legacy errno cruft? That's hilarious...


The compiler respects the C standard by default. According to it, a conversion-free printf() is equivalent to puts() under some conditions, but sqrt() is not equivalent to the bare instruction.

If you don't like C, you can use a dialect of it by setting some flags: -fno-math-errno if you want sqrt optimized anyway, or -fno-builtin if you don't want the compiler to replace function calls.


I like C for the most part. I don't like the standard library. It's got stuff like errno, a thread local global variable, and that's just the simplest problem. Freestanding C turned out to be a far better language just because it has no legacy cruft weighing it down.

> -fno-math-errno

That's interesting, didn't know about it.


No. I expect that from I/O functions.


How is it different? Why do you draw the line at I/O function?


Because it doesn't really matter how the program calculates mathematical functions as long as the result is correct. Whether it's a function or an instruction, it's inconsequential.

I/O on the other hand is a lot more interesting. They're a very common source of undefined behavior, bugs and interception via custom shared objects. I'd rather the compiler touched them as little as possible. Actually I'd rather not use them at all. I find system calls to be much more ergonomic.


For me, it doesn't matter whether printf("hello\n") works via a call to printf or a call to puts. If those 6 characters I specified are written to standard output, I don't care how it was done.

Of course I care if I want to use the value returned by printf, but if my code uses that value the optimization isn't performed.

I think the semantic gap between assembly language and C is much wider than a lot of people realize. I wonder if a language somewhere in the middle of that gap would be useful to some people. (It probably wouldn't be useful to me.)


What about datetime, random, system, etc?


They're all system calls that are better served by calling operating system functions directly. I'm not sure how the compiler could possibly optimize those away, though. Maybe the random function could become a very untrustworthy instruction?


Random number generation generally isn't handled by system calls. It's just computation. (Some languages do provide support for hardware [pseudo-]random number generation. C doesn't.)

More generally, using system calls directly for things like getting the current time makes sense only if you don't care about portability.

C can support both portable and non-portable code. Both are important.

I get the impression that C is too high-level, and assembly language is too tedious, to fully support your requirements. I sympathize with your situation, but it's not one that most programmers share.


> I get the impression that C is too high-level, and assembly language is too tedious, to fully support your requirements.

Definitely seems that way to me as well. Still, if I bend C enough it turns into something that's almost what I want. I just wish GCC had a couple more flags to control code generation in these surprising cases.



