I've often wondered whether a simple thread-per-goroutine model would work for Go in client-side apps. Sure, you'd lose some of the easy scalability, but I think that mostly matters for server software; client apps don't usually have thousands of concurrent tasks anyway. You could also lower some of the overhead of calling C libraries, which client software does more often.