It’s unlikely that it’ll ever get dramatically better. It’s already been heavily optimized, and the Rust compiler now has more parallelism than pretty much any other mainstream compiler. Language design choices make Rust more challenging to compile than a language (like Go) that is specifically designed for fast compilation.
I don't agree. There are a lot of things on the table, performance-wise.
1. The compiler could ship binary artifacts, which would avoid all compilation of build scripts/proc macros, and allow those to be compiled with performance optimizations enabled. This would be huge on its own.
2. Cranelift could potentially improve backend codegen compile times significantly as well.
3. Link times are still suboptimal; mold is promising here.
We can definitely still get significant wins out of the compiler.
Pretty sure compile times can get cut in half (or better) with those changes.
Maybe, but even twice as fast would still make it a “slow compiling language” in comparison to a “fast compiling language” like Go or Pascal.
This is not a knock on Rust—I doubt it’s possible to do what Rust does—including zero overhead abstractions—in a fast compiling language. Go certainly pays a performance penalty with things like boxed generics.
Twice as fast (or more) is just what I'm aware of in terms of "things that are possible to do today but aren't the default/ would take work to hack in". I don't even know what other options there are beyond that.
But sure, twice as fast isn't fast, it's just faster. My point is that we're not at the point of serious diminishing returns, there's tons of stuff left to do.
If there were a magic pot of gold, it would be technically possible to precompile every crate version with every rustc version on every supported platform and distribute those prebuilt rlibs to users through cargo. That would help with first compile times when using the standard tooling, and not just for proc macros.
Different people with different use cases have different complaints. I haven't quantified it, but I've certainly seen complaints about both cases from different people.
Go’s generics aren’t boxed. At least, not in the sense that Java’s are. For example, you can write generic functions that operate over slices of unboxed values.
Still worse than true monomorphized generics in C#, which also has fast compile times (by virtue of being JIT-compiled, though even its AOT target compiles faster than Rust once you've downloaded the dependencies).
Go’s implementation strategy for generics is essentially monomorphization plus some obvious code-size optimizations (e.g. don’t generate different code for different pointer types, given that they all have the same underlying representation). Do you have a specific scenario in mind where Go’s implementation strategy carries a significant performance penalty? I think there may be some misconceptions in this thread about how Go’s implementation actually works.
It is a knock on Rust. The circumstances of Rust's state of existence in 2023, as a language created in this millennium but not in the last decade, are absurd.
> I doubt it’s possible to do what Rust does—including zero overhead abstractions—in a fast compiling language
People packaging releases for software written in Rust, and passive consumers who find themselves downloading a project repo to compile from source for whatever reason (e.g. because the creators don't publish binary releases), don't need the things from the Rust toolchain that active contributors to a given project (who want type-system diagnostics, etc.) need from it.
I'd call this oversight a massive lack of imagination on the part of TPTB, but that would be wrong, because there is no need to imagine the differences between these use cases. They exist. An adequate toolchain for dealing with projects written in Rust—despite the deliberate decisions made during language design that led to these problems—does not.
> 2. Cranelift could potentially improve backend codegen compile times significantly as well.
I've been told that the Cranelift team (at least for the time being) doesn't intend to focus on the optimizer to the degree needed to be competitive with LLVM's optimizers (which would also be a huge effort). So if you want faster compile times, you have to take significant performance hits, which for a lot of code compiled in CI is not a trade-off people are willing to make.
Yes, to be clear, Cranelift would be suitable for dev and test builds, you'd likely use llvm for release builds. So in your CI builds you'll almost certainly stick to llvm.
Beyond specific optimization and implementation details of a compiler, the three variables of "compilation speed", "generated code optimization" and "language expressiveness" are fundamentally in tension. In order to move one axis you have to affect one or both of the other two.
It would be great if people would pay Rui to make mold versions for Windows and Mac, which ideally would be required before making it a part of the official Rust toolchain.
He got the monetization the wrong way around, IMO. Most CI is on Linux, but most developers are on Windows or macOS, so he should've made the Linux builds paid while keeping the local developer builds on Windows and macOS free.
I doubt that anyone cares all that much about linking times in the CI. And even if someone does, it's probably an individual developer or team, ie, someone without decision power to pay for something as niche as a linker.
Also, mold was designed as an alternative to gold/lld, so it needed to be open source and free on its main platform: Linux.
I care deeply about linking times on CI. It's very frustrating having your code all build and run tests locally just to wait a long time for it to pass all of the CI barriers. Plus, CI builds often go stale much faster, so you're looking at much longer build times without caches.
Sure, but you're not really contradicting me unless you're able to get your company to pay for faster tooling. And if you can, why haven't you already?
Well, yes, that is the crux of this argument, that one can convince their employer to use mold. Otherwise, what is the point of using it? Desktop users by and large will not notice a small 3-5% improvement in compile times while those that pay for CI will.
Well, CI is where the costs are, and if the application is big enough, even a few percent reduction via faster linking times would equate to lower costs, while in contrast, developers won't really care or notice a few percent reduction on their local machine.
It's AGPL on Linux now, and they sell commercial licenses for companies that won't touch that license, and they were contemplating earlier making mold only available under a non-free source available license like BSL, so there's no "requirement" as such that it be free and open source, even on Linux.
> Most CI is on Linux, but most developers are on Windows or macOS
Do you have any data on this? Maybe it's industry-dependent, but I hardly know any Windows developers (let alone macOS, that's almost nil) outside of video games and web dev. 100% of the Rust devs I know use Linux, to stay on topic.
Data that most people don't use Linux as their day-to-day desktop OS for development? I suppose you can just look at desktop Linux statistics, which shows <5% usage. In my experience, most use macOS, or Windows via WSL2, which does use Linux but I am not sure if that is actually reflected in any desktop OS statistics.
I agree with this assessment, despite the optimism of some others. C++ has had slow compile times since forever, and so will Rust. Rust does a lot more work at compile time than most other popular languages. And it's largely stuff that's fundamental to the language. For example, besides borrow checking, the de facto default way to do polymorphism/generic programming in Rust is at compile time via what is essentially code-gen. In Java if you write `void useFoo(Foo foo)`, it'll compile quickly and will use runtime polymorphism to make sure that the argument is a subtype of `Foo`; in Rust if you write `fn use_foo(foo: impl Foo)`, the compiler is going to spit out a `use_foo` definition for each concrete type that is passed to `use_foo`. That takes time.
That being said, I definitely find the trade-off worth it. Though, I've never been the kind of programmer that desires the constant iteration and feedback of something like "REPL driven development".
> C++ has had slow compile times since forever, and so will Rust.
Rust has a massive advantage, which is having a 'sanctioned' package manager and build-time capabilities. A huge part of Rust's slowdown is due to:
a) Having to compile build scripts
b) Those build scripts being built without optimizations (making them hundreds of times slower at runtime)
If cargo + crates.io supported pre-built dependencies, that would be a massive optimization.
This isn't theoretical or optimistic, it's just a fact: we can already see this by compiling build and proc-macro crates with optimizations; it's just not the default, and they still have to be compiled once. If you remove that compilation time entirely, again, it's not theoretical: it turns N time spent on those deps into zero.
There is easily a 200% performance win available, just from the known optimizations that are on the table.
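For reference, the non-default setting being described is already available in Cargo today via profile build overrides; a minimal sketch (see the Cargo profile documentation for your toolchain version):

```toml
# In Cargo.toml: compile build scripts and proc macros (and their
# dependencies) with optimizations, even for dev builds. They still
# have to be compiled once, but they run much faster afterwards.
[profile.dev.build-override]
opt-level = 3
```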
Rust has another advantage in the language itself: generic code can be type-checked and (partially) optimized before being instantiated.
When you export a generic function in C++, every file that pulls it in has to re-parse it, and every instantiation has to re-type-check it. C++20 modules should help with the first part, but they can't help with the second (and neither can concepts). Further, separate translation units can wind up duplicating the same instantiations, which the linker has to deduplicate.
When you export a generic function in Rust, by the time it gets pulled in somewhere else, it takes the form of pre-parsed, pre-type-checked MIR. It can also be pre-optimized, so type-independent optimization work is shared between instantiations. The compiler can also tell, before instantiation, which type parameters a function does not actually depend on, and essentially erase them ("polymorphization"). Further, Rust's compilation model reduces the redundant duplicate instantiations C++ does, both by using larger translation units and by automatically sharing any instantiations in dependencies with their dependents (though you can do this by hand in C++).
(Incidentally, these differences also apply to inline functions: in C++ you wind up putting their definitions in headers and recompiling them from scratch over and over; in Rust they are shared in MIR form.)
> we can already see this by compiling build and proc macro crates with optimizations, it's just not the default and they still have to be compiled once.
I'm hopeful something like watt (https://github.com/dtolnay/watt) will land in Cargo that'll allow us to ship pre-compiled wasm blobs for proc-macros so we can just have sandboxed binaries.
I think the whole point is to prevent build scripts from doing arbitrary things. The sandbox should give access to the source code being built, record changes to these files (and/or new files generated in the same directories), and that's about it.
C++ compile times are awful inasmuch as you have to recompile multiple times, because the "template barf" makes finding root causes very challenging, especially when there are multiple problems.
Rust makes the problems easier to fix, IMHO. So, maybe even with same (or slightly longer) compile times, you'll hopefully have faster time to delivery.
In fact, in my experience, Rust has faster time to delivery than any other language I've used. It takes forever to compile, but I have so many fewer runtime bugs that have to be caught (hopefully) by testing, that it still comes out ahead, overall (again, for me and my various projects).
I also find write-time to not be as slow as others complain about, except when it comes to async/futures where it is, indeed, pretty rough. But, if I sit and think about how many times I have to flip back and forth between my code and some library code to try and guess what exceptions it may or may not throw in other languages or whether something could be null or not, I find that the dev times aren't so much better in these other languages as people sometimes claim.
Sure, if you're a fulltime JavaScript dev with 10 years of experience, you might remember things like the fact that calling the Array constructor with 0 or more than 1 argument creates an array with those values, but calling it with exactly 1 number creates an empty array with that length. But since I have to switch between many languages regularly, my time to delivery suffers significantly from nonsense like that. Likewise, it suffers from NPEs in Java, double-frees in C++, Kotlin's inane idea to use exceptions for errors and coroutine control flow, etc., etc.
I just want to note that I fully agree that Rust is, ultimately, an extremely productive language. In my considerable experience with Rust it is the most productive language I have ever written code with professionally.
The fact that my only complaint is that compile times are slower than I'd like should be seen as high praise.
It’s not really that Go is better designed for fast compilation; it's just a plain language where the compiler can spit out vaguely optimized code and call it a day.
Rust’s unique feature itself fundamentally depends on extensive static analysis. It’s not a design choice, it is pretty much what Rust is - a low-level language without a GC that is still memory safe. The price for that is hefty compile times.
> It’s not really that go is better designed for fast compilation
One of the explicit goals, by Go's creators, was fast build times. I still remember Rob Pike introducing Go during an all-hands at Google, where he talked about the very long build times for C++ and Java in Google's monorepo, and then showed some promising demos. (Most of us rolled our eyes at it then, because it was just a "hello world", but it's quite impressive how the language has evolved and remained true to its goals.)
> - it is just a plain language where the compiler can just spit out vaguely optimized code, and call it a day.
It's a simple language, but I wouldn't call it plain, nor characterize the optimizers that way.
Go is not faster at compilation than Java, which was not particularly designed for fast builds.
Also, as can be seen, Go is not a well-designed language, having warts we've known about for 50 years. I would take the creators’ claims with a huge grain of salt.
But Java is inspired by Smalltalk, which is a late-binding language that defers most things to runtime. I believe in Java you can generate bytecode directly as you’re parsing the source file.
Java is compiled to bytecode, for later compilation to machine code at runtime (JIT). Go is compiled AOT, straight to machine code. It makes no sense to compare them.
Unless you meant that Java's AOT compilation is faster than Go's?
The parent comment explicitly mentioned that Java is slow at compilation, which is just false.
Also, there are single-pass compilers that produce machine code, they are not fundamentally slower than a byte code generator. Of course extensive optimizations will be more expensive.
I do think highly of Rob Pike and Ken Thompson for their IT work, but they are simply not good at language design, which just shows that PL design is quite unlike working on an OS.
Both statements, because unless otherwise qualified you're comparing apples to oranges when you say Java compiles as fast as Go. There's always going to be more overhead on running the Java bytecode on the JVM than there will be when running the native instructions generated by a compiler (even as "unoptimized" as Go is).
And someone that makes that assertion with a straight face without this caveat is not someone that should be dissing Rob Pike about language design.
Profiling the compilation process suggests that this isn't the case. Rust's higher level passes are rarely the dominant part of execution time.
Check out https://github.com/lqd/rustc-benchmarking-data/tree/main/res... and the other benchmarks in that repository for some data on how real-world crates' compile time is spent. You'll find that backend code generation and optimization dominate most crates' compile times. There are a few exceptions: particularly macro-heavy crates, and a couple of crates with deeply nested types that hit some quadratic behavior in the compiler. But overall, the backend is still the largest piece.
The front end is time-consuming enough that replacing the backend with something lightweight like Go’s wouldn’t get you a 5-10x improvement, which is what I think you’d need to really move the needle on user perception. Moreover, a lot of the backend slowdown is due to front-end choices like monomorphization, which generates large amounts of intermediate code that must then be optimized away.
I doubt that a hypothetical version of Rust that avoided monomorphization would compile any faster. I remember doing experiments to that effect in the early days and found that monomorphization wasn't really slower. That's because all the runtime bookkeeping necessary to operate on value types generically adds up to a ton of code that has to be optimized away, and it ends up a wash in the end. As a point of comparison, Swift does all this bookkeeping, and it's not appreciably faster to compile than Rust; Swift goes this route for ABI stability reasons, not for compiler performance.
What you would need to go faster would be not only a non-monomorphizing compiler but also boxed types. That would be a very different language, one higher-level than even Go (which monomorphizes generics).
Just wanted to note that Go only does partial monomorphization: it monomorphizes per GC shape ("gcshape stenciling"), not per type. This severely limits the optimization potential and adds a runtime dispatch cost, at least in its initial implementation.
Then there is an open niche for a “development mode”, that outputs barely optimized binaries with proper error handling, fast. (I do know about debug, etc).
It already exists: it's called “debug” mode, and it's what you get when you don't compile in release mode. The biggest problem with debug mode is how slow the unoptimized code is: for back-end stuff it doesn't matter, but for things like gamedev you want your dependencies compiled in release mode. (Fortunately, cargo lets you specify that some deps should be compiled with optimizations even when your project is compiled in debug mode.)
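The Cargo feature being referred to here is profile package overrides; a minimal sketch:

```toml
# In Cargo.toml: keep your own crate in fast, unoptimized debug
# builds, but compile every dependency with optimizations
# (useful for gamedev-style workloads where debug-mode deps are
# too slow to run).
[profile.dev.package."*"]
opt-level = 3
```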
This is a HN thread about a blog post about how compile times have become dramatically better thanks to newly introduced parallelism in an area that was completely single threaded.
> However, at this point the compiler has been heavily optimized and new improvements are hard to find. There is no low-hanging fruit remaining. But there is one piece of large but high-hanging fruit: parallelism.
From discussions I've seen, there's not much high-hanging fruit left either, short of rewriting the entire compiler for better incremental compilation.
I think if you're talking about the compiler getting faster at what it does today, how it does it today, that's true. But that's a heavy constraint. If we got support for binary dependencies, that wouldn't be a compiler optimization in the same sense as parallelism is, but it would radically improve compile times for the average project.
Yeah, but binary dependencies or watt-style precompiled macros aren't going to improve the build times people really care about: incremental build times. The parallel frontend is plausibly the last major improvement we'll see on that front for years.
Incremental matters more than clean build times because (A) you're likely to do a lot more of them, (B) they break developer flow more than waiting on CI does, and (C) at least in theory you can always add more cores to your CI and get reasonable speedups, less so for incremental.
> Yeah, but binary dependencies or watt-style precompiled macros aren't going to get improve the build times people really care about, incremental build times.
Why not? If I add a new struct with `#[derive(serde::Serialize)]` I'll benefit from serde being compiled with optimizations.
> they break developer flow more than waiting on CI does
It might not get 10x better, but 3x isn't outside the realm of possibility. Just swapping the LLVM backend for cranelift can cut compile times in half.
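For anyone who wants to try this, enabling the Cranelift backend for dev builds looks roughly like the following at the time of writing; this is nightly-only and the exact incantation may change, so treat it as a sketch and check the rustc_codegen_cranelift documentation:

```toml
# Cargo.toml (nightly toolchain only; the backend component must be
# installed first, e.g. via rustup's rustc-codegen-cranelift-preview).
cargo-features = ["codegen-backend"]

[profile.dev]
# Use Cranelift for unoptimized dev builds; release builds keep LLVM.
codegen-backend = "cranelift"
```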
The low-hanging fruit is gone but there are lots of hard but likely-significant improvements left on the table.