C11: A New C Standard Aiming at Safer Programming (smartbear.com)
113 points by AlexeyBrin on Jan 28, 2013 | hide | past | favorite | 57 comments


    //C11, safe version of strcat
    errno_t strcat_s(char * restrict s1, 
                     rsize_t s1max, 
                     const char * restrict s2);
strcat_s() copies no more than s1max bytes to s1. The second function, strcpy_s(), requires that s1max isn't bigger than the size of s2 in order to prevent an out-of-bounds read:

    //C11, safe version of strcpy
    errno_t strcpy_s(char * restrict s1, 
                     rsize_t s1max, 
                     const char * restrict s2);
Originally, all of the bounds-checking libraries were developed by Microsoft's Visual C++ team. The C11 implementation is similar but not identical.

There are so many problems with this. Yet another slightly different string manipulation function? Why not standardize on one of the already existing ones, such as strlcat/strlcpy? I can see people making some big mistakes with strcat_s, since the size passed is the number of unused characters left in s1, not the size of s1. And strcpy_s can cause a segfault if given an s1max that is greater than the size of s2. Why not only copy up to the first null character?

Also, these functions have the same name as the VC++ functions, but behave differently. In VC++, strcat_s takes the size of s1, not the space remaining. People are going to google for strcat_s, read the MSDN docs, and unknowingly add buffer overflows to their code.

Finally, these functions have annoying behavior. If they hit the limits passed to them, they erase s1. No best-effort. No copy whatever fits. Just destroy the data in the destination string.

strlcat/strlcpy solve all of the problems I've mentioned. See http://www.courtesan.com/todd/papers/strlcpy.html for more info about them. It's sad to see them only supported by *BSD and OS X.


strlcpy is flawed. And oftentimes silent truncation is not a "best effort" but a huge hole, especially when strcpy is blindly replaced with strlcpy. Even for new code, strlcpy is flawed.


Ulrich Drepper? Is that you?

There are valid reasons not to use strlcpy, but we need something like it. What we do not need is this _s shit, which has all the same problems (which sometimes matter, and sometimes don't) WHILE BEING MASSIVELY HARDER TO USE. And what's the advantage? Just that it returns an errno. That's fine, but what about the rest of it? Just read the instructions for it - it's crap! (I can imagine why they think they want it to work this way, because it gives you a fighting chance against non-0-terminated source strings, but it's just never going to work properly if you've only got one size argument.)

Using strlcpy, by contrast, is simplicity itself.

About 99% of the time when you're using strlcpy (and strlcat), you can just do your calls, passing the same size value into each one (how convenient), then check the length of the result using strlen. Is it one less than the size of the buffer? Yes? Then assume the result was truncated, and disregard it, assuming you even care about that. This is very straightforward to do, and (unlike this _s junk) doesn't require you to go updating counts or checking maximum values or any of that error-prone nonsense. Just one check at the end, rather than a bunch of updates and stuff scattered all through your string code.

strlcpy! Just say yes!


As far as I'm aware Ulrich Drepper hates strcpy_s just as much as everyone else does. Certainly glibc had no plans to implement it when last I checked.


Whatever. Ulrich Drepper is/was an asshat for numerous reasons, but he was not the only one unhappy with strlcpy. The best part is that as of a year ago, OpenSSH itself still contained buggy uses of strlcpy, at least one of them with a security problem.

I agree that the _s functions' API is crap and quite bizarre (notice I did not say it wasn't), but that doesn't mean another flawed solution should have been chosen instead.


It's a very rare situation in which truncation is worse than a buffer overflow. I'd also bet that erasing the destination string is more liable to cause problems than truncating it. For example, the programmer might have pointers to locations in the destination string passed to str*cat. All of a sudden, these pointers are to garbage data.

strlcpy's behavior is the same as snprintf: return what the length would have been if there was enough room. That way one can recover from the error and realloc enough space. C programmers are used to this pattern.


Why the hell didn't they just pick the existing MSVS semantics? This type of garbage causes huge problems when dealing with cross-platform code, in a number of subtle cases, usually with string manipulation.


Yeah, it's hard to believe that they deliberately picked a syntax that would compile but cause crashes and security issues for the people already using these functions.

At least now we have an ironclad reason not to ever use this garbage.

[edit: I almost wonder if someone on the committee deliberately put this in to sabotage the whole "safe C string functions" farce. Reminds me of a Simpsons episode:

Speaker: Then it is unanimous, we are going to approve the bill to evacuate the town of Springfield in the great state of—

Congressman: Wait a second, I want to tack on a rider to that bill – $30 million of taxpayer money to support the perverted arts.

Speaker: All in favor of the amended Springfield-slash-pervert bill? [entire Congress boos] Bill defeated. [gavel]

http://en.wikiquote.org/wiki/The_Simpsons/Season_6 ]


The big problem with C99 was that MS refused to implement it. If C11 is going to take off, it will depend on MS updating their C compiler. Previously they have said that they don't support C development on Windows except on the device driver level, and would prefer everyone use C++ or one of the .NET languages, so I'm not hopeful about C11's chances.


Why? No serious C development happens with MSVC today, and yet C is still one of the most widely used programming languages. The world has long since moved on, and C11 will be adopted by gcc, clang, icc, IBM's compilers, Pelles C and others.

As a professional C programmer, I really couldn't care less whether or not Microsoft decides to implement C11 or remain in the dark ages. That ship sailed years ago.


Clang has implemented a good chunk of C11; I would say you can use C11 on OS X, iOS, Linux, etc.

You can even use Clang on Windows if you need to (MinGW, Cygwin, or compiled with Visual Studio). I'm not sure whether you can integrate Clang into VS, for example, or whether it is possible to access the OS C libraries from Clang.


I'm not seeing this as much of a problem. Windows is a dying platform because of a multitude of bad decisions like these. If they choose not to implement it, less software will be made for Windows, accelerating its demise.


People will continue writing software for Windows for as long as it remains the dominant platform. What happens is that they'll use other languages such as C# (which is perfectly fine for them).


Of course people will continue to write software for the platform. I never said otherwise.

But there is a ton of software being written right now that doesn't target Windows, with no plans to target Windows. The only way Windows users will ever be able to use that software is if Microsoft makes it easy enough to port.

The nail in the coffin will be when most of the games on Steam do not work in Windows. With most business software becoming web-based, the only holdouts left will be old businesses locked into their enterprise crapware.


A few adventurous folks wrote a c99-to-c89 compiler, to solve the issues with MSVC.

https://github.com/rbultje/c99-to-c89

(see http://blogs.gnome.org/rbultje/2012/09/27/microsoft-visual-s...)


The removal of variable-length arrays is the main thing from C99 I miss in C11 (though it seems compilers can continue to support them if they wish). Imo they're fairly elegant in many scenarios, and without them, C programmers tend to make heavier use of magic #define'd array sizes in order to retain the notational convenience of auto arrays.


Have they been removed? I thought they were just downgraded from a mandatory feature to an optional feature.


True, I'm unclear on what precisely that means. C traditionally hasn't had subsetting, so "standard C" code is supposed to compile correctly on any conforming implementation. If your code is "standard C99", any conformant C99 compiler is supposed to compile it correctly; and otherwise, you can't assume that (for example, you may be consciously using a GCC extension, which is ok, or relying on implementation-defined behavior, which is usually bad).

But if you write C11 that uses VLAs, it seems like it's still standard C11, sort of: VLAs aren't merely a vendor-specific extension, like the GCC extensions are, but an optional C11 feature which hasn't been deprecated. Nonetheless, a conforming C11 compiler doesn't have to accept your code, so in that sense your code isn't really "standard C11". Will there be more fine-grained names for the different subsets of the standard features, so you can say that such code is compliant with "maxi-C11" but not with "core C11" or something?


I write C for a living and I've never ever used that feature. I don't think I've seen it used outside of C99 tutorials either.

Can you give us a practical example of some code where variable-length arrays prove really useful?


I've mostly used them in two cases:

1. Multidimensional arrays where the sizes are quasi-fixed but not known at compile time. Comes up in various kinds of grid-simulation code, for example. You could malloc() in this case, but then you lose the notational convenience of 2d arrays.

2. Refactoring code that uses #define'd constants, to take runtime instead of compile-time parameters. For example, a common first-pass to said grid-simulation code is to have some WIDTH and HEIGHT magic numbers, which you might later regret and want to make into command-line parameters. Refactoring is trivial if you have VLAs.

It also seems cleaner (to me) semantically. I can see why, implementation-wise, classic C auto arrays required constant sizes. But from a slightly higher-level perspective, whether a variable is auto or not, and whether its size is a compile-time constant or not, feel like independent decisions. It feels particularly messy that you have different notation for arrays whose sizes are known at compile time; the syntactic sugar there has some rough edges. I also find it cleaner if explicit malloc()s are used only when something beyond boilerplate is going on, letting the compiler handle the boilerplate case of allocating/deallocating a conceptually block-scoped array of size N.


I usually don't let them creep into the API, but I use them for convenience within a function that needs direct access to the multidimensional structure, e.g.

    void func(int M, int N, int P, double *a_flat) {
      double (*a)[N][P] = (double (*)[N][P])a_flat;
      // use a[i+1][j+2][k-1] instead of a_flat[((i+1)*N+(j+2))*P+k-1].
      ...
    }


Variable-length arrays are great for printf-like functions.


How so? Do you mind giving an example of a printf like function that needs variable length arrays?


I wouldn't call it a "need", more like "occasionally desirable". The short version is that it's a convenient method of letting the compiler handle resource management when you have a variable number of values and sufficient stack space. As others pointed out, it's particularly handy for multi-dimensional arrays.

There are also some possible applications here: http://gustedt.wordpress.com/2011/01/13/vla-as-function-argu...

In fairness, I've used them only rarely, but those few times I've used them have saved me a bit of grief.

I could easily live without them if required...


Are you perhaps thinking of variable length argument lists?


Not quite; those are handy too. Admittedly, varargs is more useful than vla, but vla can be useful for these functions as well.


The issue with VLAs (variable length arrays) is that what tends to happen is that programmers create a denial-of-service bug. It's easy to do: just create a VLA without first checking that the size isn't too big. Then invalid (or sometimes even valid) input may lead you to use too much stack space, and, boom! Segfault. These segfaults can be hard to reproduce because stack sizes can vary between machines or depending on which code path calls the function. This is why Google banned VLAs in their coding standard.

If you're absolutely sure that you can handle the power, you can always call alloca and get some stack-allocated space, even with ancient C compilers. Just make sure to enforce a reasonable upper limit. Also, never mix C99-style arrays and alloca.


I am not excited about any of the new features (except maybe aligned_malloc) and I write C for a living. I think you get to a point with a language like C where I actually _don't_ want much change; everything works just the way it is, and we have dealt with its warts for decades. I am actually a little leery of the thread additions and the atomic additions. I feel that shouldn't be a part of the language specification; I would much rather see that incorporated at a higher level.


Presumably you are familiar with Boehm's famous "Threads Cannot be Implemented as a Library" [1]. Meanwhile, pthreads require heavyweight OS involvement and implementing a simple spinlock requires inline assembly or non-portable compiler builtins. There is still no widely-used portable library of atomics (the Linux kernel is the most complete I know of, but their atomic support isn't trivial to extract for use outside the kernel, and it was only written for one compiler (gcc)). I'm definitely looking forward to stdatomic.h. Unfortunately, compilers are already mis-applying [2] the macros that are supposed to be available for determining the presence of thread and atomic support, so, as usual, configure tests are needed for everything.

[1] http://www.hpl.hp.com/techreports/2004/HPL-2004-209.html [2] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53769


Widely used? Not yet. But meet http://concurrencykit.org/


That's an interesting project, but it doesn't qualify as portable yet since it's currently entirely dependent on gcc inline assembly syntax and doesn't support a number of important architectures, including ARM. Thanks for sharing though, I'll keep an eye on it.


It is _portable_. It's just not intensively _ported_ yet.

Archs: Power, SPARCv9, x86, x86-64. Compilers: gcc, clang, icc, suncc.

Adding a wildly different compiler shouldn't be the end of the world but it would need some build sys work and a new compiler wrapper.

Samy was focusing on a few archs that help debug the code. I got him access to the GCC build farm, but the ARM boxes there were fairly weak, as I remember. Access to good hardware might help him move it along.


For those of us who do numerical programming, the new complex initializer macros are a huge improvement; previously it was quite painful to initialize complex values with an infinite imaginary part without resorting to non-portable extensions.

It's a minor fix, but it's vastly better than what was there previously, and it's exactly the sort of change that works really well in C; incremental fixes that are carefully thought out and easy for both implementors and users to adopt.


I am actually a little leery of the thread additions and the atomic additions. I feel that shouldn't be a part of the language specification and I would much rather see that incorporated at a higher level.

What higher level? POSIX? Personally, I believe things like atomicity and alignment do belong into the language specification.

Also keep in mind that the multithreading support as specified by C11 is closely aligned with C++11.


We already had aligned_malloc; it was just called posix_memalign.


It's been a while since I used C very much, but news like this plus the great new 21st Century C, http://shop.oreilly.com/product/0636920025108.do from Ben Klemens and O'Reilly has me reconsidering that. That book says those who complain the standard doesn't do enough should look at POSIX and libraries like GLib for help.


I find it interesting that this snuck up on me. I heard dozens of things about C++0x for years before it happened, but C11 seems like it's already been published, and this is literally the first I've heard of it.


Same here, but it seems kind of justified. C11 looks a lot less ambitious than C++11. Just tweaks outside of the threading stuff, and I suspect most thread users will stick with their existing feature-rich but nonportable libraries.

C++ has always had the feeling of a language that isn't quite finished, so there's a lot of interest in where it's going next. C has the feeling of a language that does exactly what it sets out to do, so it's not really going anywhere.


The article is about C (as in C11 standard) and not about C++ (as in C++11).


Which is exactly what the OP is talking about. There was a lot of buzz about C++11 before it was released, yet there was very little buzz about C11 (or at least, not something the OP noticed).


While the C++ community is gossiping about language features, the C community is busy writing software to get things done. =)


While this is flippant (and amusing) there's an underlying truth here somewhere - I've really not met many C programmers who care about new language features.

C++ folks do seem to care a lot about which features of boost are going to be folded into the language spec and runtimes and what else has been dreamt up for their baby, and how they're going to be able to be more efficient, safer and gosh-darnit just all-round smarter in future.

C programmers mumble something about being busy and already having the tools they need to do everything, though (grudgingly) that __VA_ARGS__ preprocessor extension from '99 was quite handy, I suppose...

(I'm a C programmer by trade and history, working on C++ at the moment, just in case I pissed anyone off enough to start a holy war :)


There was a lot of hype about C++0x.

There has not been much about C11.


It's sad that C99 was never widely adopted. It's 2013 and MSVC still doesn't implement it. I really like the constructs that were added, especially with regard to named structure and array initialization. It manages to make programming in C just that little bit friendlier, more powerful and less over-verbose.


It's not clear what you mean by "never widely adopted". With the exception of MSVC, are there any mainstream C compilers that don't try to follow the C99 spec?


Microsoft doesn't make a C compiler. They have a C++ compiler that can also compile C90.


Sorry that I was unclear. I always use a C99 supporting (gcc/clang) compiler for my own C projects.

However, the problem is that because MS doesn't adopt it, a lot of companies and projects still have coding standards that forbid the use of C99 constructs, because of the (perceived) need to compile for Windows with MSVC some day. So MS not adopting C99 is a problem for adoption, even though most compilers support it fine.


"Finding C99-compliant implementations is a challenge even today."

"Originally, all of the bounds-checking libraries were developed by Microsoft's Visual C++ team. The C11 implementation is similar but not identical."

Picking a nit, but shouldn't this say "The C11 standard ..."? If C99 implementations are hard to find, I'd think it would be more so for C11.

Are there any C11 implementations? Or do you just pick the implementation and libraries that implements that largest set of C11 that are important to you?


Clang supports some parts of C11, see http://clang.llvm.org/docs/LanguageExtensions.html#c11 for details.


Restrict is part of the C99 standard


Geez, is there a single non-troll comment on that website?


Smartbear make code review tools; I wonder if their audience really really likes to nitpick things.


I was referring to the anti-C comments on the linked article, but I think it came off like I was criticizing the whole website.


Might as well just use Go.


You can use Go where you use Python, Ruby, and maybe for some of the things that get done in C++. But it isn't C-friendly, and it has a GC, so for the things people still use C and C++ for these days, you probably won't be able to use Go even if you want to.

If the Plan 9 guys took the Ken Thompson C compiler from Plan 9 and changed that C just a little bit, taking care that a plain C header could be included from that code the same way we can do with C++,

only then would it be possible to "just use Go" (Plan 9 C can be used with gcc via "-fplan9-extensions").


Loved the direct struct assignment. Is it plan9 only?


Do you mean using anonymous structs for struct composition? You can do something similar with "-std=c11 -fms-extensions".

The only thing missing compared to Plan 9 C is the automatic casting from the composition to the base struct; you just cast to the base struct yourself when you need to reference only the parent struct, for polymorphism. (Ken Thompson called this effect "poor man's classes".)

Macros can be handy to hide these nasty casts away when not using the Plan 9 extension.

The nice thing about "-std=c11 -fms-extensions" is that it works with gcc, clang and VC.



