Netmap is commonly used in HFT as well as packet filtering applications. I believe Verisign is running some of the root DNS servers with netmap as well, getting millions of connections per second.
Barely. On a reasonably configured kernel (you need both syscall auditing and context tracking turned off, which is doable at compile time or runtime), a modern CPU should be able to round-trip a syscall in under 40 ns. At 1M syscalls per second that is 40 ms of CPU time per second, i.e. only about 4% of one core.
(It's slightly worse than that due to extra cache and TLB pressure, but I doubt that matters in this workload.)
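To spell out the arithmetic, here is a quick sketch. The 40 ns and 1M/s figures are the ones quoted above; the function name is made up for illustration, and the cache/TLB effects are deliberately ignored.

```python
def syscall_cpu_fraction(round_trip_ns: float, syscalls_per_sec: float) -> float:
    """Fraction of one core spent purely on syscall entry/exit.

    Hypothetical helper: ignores cache and TLB pressure, so treat the
    result as a lower bound rather than a measurement.
    """
    seconds_per_syscall = round_trip_ns * 1e-9
    return seconds_per_syscall * syscalls_per_sec


# 40 ns per round trip at 1M syscalls per second
print(f"{syscall_cpu_fraction(40, 1_000_000):.1%}")  # prints 4.0%
```

Same formula at 10M syscalls per second gives roughly 40% of a core, which is where the overhead would start to matter.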
There is a project, "Kerlnel" http://kerlnel.org/, that is supposed to be an Erlang instance running on bare metal. The site appears to be down, and the GitHub repo https://github.com/kerlnel hasn't been touched since 2013. Then there is http://erlangonxen.org/, which puts Erlang directly on top of Xen instead of on another operating system.
Yes, given the Solarflare NIC, I was expecting the article to end up with a receiver coded against ef_vi, which exposes the NIC memory directly (but you have to do the IP/UDP handling yourself).
Makes me wonder how often bypassing the kernel is used in production networked applications.