Meh. I don't want background jobs to be a breeze in quite this way. I want background work to live on different compute capacity than http requests, both because they have very different resource usage and because I want to have state or queues in front of background work so there's a well-defined process for retry, error handling, and back-pressure.
I get it, of course, it'd be lovely if these complications didn't exist and the same framework handled everything... but I don't think a programming language or even a runtime gets you there. You also need monitoring and ops processes to be part of the solution.
You're creating a really difficult distributed systems problem then. What happens if the async request gets swallowed? Are you coordinating telemetry and tracing across these compute units?
It's really not that difficult to trace this stuff with modern APM tools. Datadog and Elastic APM support this out of the box for most languages/frameworks. And I'd be surprised if the other big players in the space didn't as well.
"Just use Datadog" is fine for a bigger more established company.
But it's a lot of expense and overhead for someone closer to starting out.
I do not think it's a "home run" kind of win, but it's definitely a "nice to have" that you can just run everything in one place, and worry about splitting it up when (if) you get bigger.
Yeah, that's fine but expensive for the tracing problem. You've completely ignored the harder bit, which, if you have any experience, you know is a motherfucker of a problem: what happens if your async request fails and notifies you, or fails and forgets to notify you, or succeeds but takes way too long?
What sort of state management schemes do you have to put into place to make sure your database isn't full of corrupted data that is going to crash you later (or, worse, silently violate an assumed invariant, and maybe black-hole your money or your customers' money)?
The nice thing is that Elixir gives you this either automatically or with a line or two of code.
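To make the claim concrete, here is a minimal sketch (names and timeouts are illustrative, not from the thread) of how a supervised task makes all three failure modes above observable from the caller: a crash is reported, a hang is bounded by a timeout, and a slow-but-successful run is distinguishable from both.

```elixir
# Start a Task.Supervisor and run background work under it.
{:ok, sup} = Task.Supervisor.start_link()

# async_nolink: a crash in the task will NOT take down the caller,
# but the caller still learns about it via Task.yield.
task = Task.Supervisor.async_nolink(sup, fn -> :work_done end)

result =
  case Task.yield(task, 1_000) || Task.shutdown(task) do
    {:ok, value} -> value              # finished within the deadline
    {:exit, reason} -> {:error, reason} # crashed; caller is notified
    nil -> :timeout                     # took too long; task was shut down
  end
```

The point is that "fails silently" stops being a possibility: every outcome lands in one of those three branches.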
I do have quite extensive experience with it. And it is a difficult problem.
Lots of frameworks/languages deal with them differently (DLQs, auto-retry, etc.), and each approach has its pros and cons. On top of that, you have to weigh how important each task is and decide whether a one-in-a-million loss justifies building in extensive resiliency.
If Elixir does that out of the box with a few lines of code like you say, that's great to hear, and I'd love to try it out one day.
> I want background work to live on different compute capacity than http requests
The magic of Elixir/Erlang is that all of this logic can live in one codebase and you can choose when deploying your distributed cluster of Elixir/Erlang nodes which nodes run which tasks. The nodes are all aware of each other and when something happens that requires execution of that task it happens on the correct nodes because they're the only ones running those processes. Automagically.
You can actually have "background jobs" in very different ways in Elixir.
> I want background work to live on different compute capacity than http requests, both because they have very different resources usage
In Elixir, because of the way the BEAM works (its unit of parallelism is much cheaper and consumes very little memory), incoming HTTP requests and their related workers are far less expensive than in other stacks (for instance Ruby and Python), where it is critical to release HTTP workers quickly and not hold the connection (which is what led to the creation of background-job tools like Resque, DelayedJob, Sidekiq, Celery...).
This means that you can actually hold incoming HTTP connections a lot longer without troubles.
A consequence of this is that implementing "reverse proxies", or anything calling third party servers _right in the middle_ of your own HTTP call, is usually perfectly acceptable (something I've done more than a couple of times, the latest one powering the reverse proxy behind https://transport.data.gouv.fr - code available at https://github.com/etalab/transport-site/tree/master/apps/un...).
So what would be a bad pattern in Python or Ruby (holding the incoming HTTP connection) is not a problem in Elixir.
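A rough sketch of the pattern being described, using only the Erlang standard library's `:httpc` client (the module name and URL are illustrative; in a real app this would run inside a Plug/Phoenix request process, where blocking on the upstream call is cheap):

```elixir
# Start the stdlib HTTP client applications.
:inets.start()
:ssl.start()

defmodule UpstreamFetch do
  # Calls a third-party server right in the middle of handling a
  # request. On the BEAM, holding the connection while we wait here
  # ties up only a lightweight process, not an OS-level worker.
  def fetch(url) do
    case :httpc.request(:get, {String.to_charlist(url), []}, [], []) do
      {:ok, {{_version, 200, _status}, _headers, body}} -> {:ok, to_string(body)}
      other -> {:error, other}
    end
  end
end
```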
> because I want to have state or queues in front of background work so there's a well-defined process for retry, error handling, and back-pressure.
Unless you deal with immediate stuff like reverse proxying or cheap one-off async tasks (like recording a metric), there are also solutions for more "stateful" background work in Elixir.
That's not the only option either: there are other tools dedicated to processing, such as Broadway (https://github.com/dashbitco/broadway), which handles back-pressure, fault tolerance, batching, etc. natively.
This makes it easy to pick the "right tool for the job."
It is also interesting to note there is no need to "go evented" if you need to fetch data from multiple HTTP servers: it can happen in the exact same process (even in a background task attached to your HTTP server), as done here: https://transport.data.gouv.fr/explore (if you zoom in you will see vehicles moving in realtime; ~80 data sources are polled every 10 seconds and broadcast to the visitors via pubsub and websockets).
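A minimal sketch of that "poll many sources without going evented" idea, using `Task.async_stream` (the module name is illustrative, and `fetch_fun` stands in for an actual HTTP fetch; this is not the transport-site code):

```elixir
defmodule Poller do
  # Fetch every source concurrently from one calling process.
  # Results come back in input order; a source that exceeds the
  # timeout is killed and surfaces as {:error, reason} instead of
  # crashing the caller.
  def poll_all(sources, fetch_fun) do
    sources
    |> Task.async_stream(fetch_fun,
      max_concurrency: 10,
      timeout: 5_000,
      on_timeout: :kill_task
    )
    |> Enum.map(fn
      {:ok, result} -> result
      {:exit, reason} -> {:error, reason}
    end)
  end
end
```

The caller stays a plain sequential function; the concurrency is entirely contained in the stream.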
There's nothing stopping you from doing this elsewhere, but it's a real game changer for an early-stage startup when needing a new "service" means adding one file and an extra line in your config.
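To illustrate the "one file and an extra line" claim (module name and the 10-second interval are made up for the example): the file is a GenServer, and the extra line is one child entry in the application's supervision tree.

```elixir
defmodule MyApp.MetricsFlusher do
  # A long-running "service": flushes buffered metrics periodically.
  use GenServer

  def start_link(opts),
    do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  @impl true
  def init(opts) do
    schedule_flush()
    {:ok, opts}
  end

  @impl true
  def handle_info(:flush, state) do
    # ...flush buffered metrics here...
    schedule_flush()
    {:noreply, state}
  end

  defp schedule_flush, do: Process.send_after(self(), :flush, 10_000)
end

# The "extra line": add MyApp.MetricsFlusher to the children list
# in your application.ex, and the runtime supervises and restarts it.
```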
Don't choose Elixir if you just love setting up autoscaled k8s on AWS with redis, RabbitMQ and half a dozen shenanigans for a webapp that serves 10 users a day.
k8s solves a different layer of problems; to me it's a complementary tool to Elixir when there's a need for that layer. Otherwise, yes, it's completely unnecessary.
Why would you want to deploy an Elixir app on a system that adds an unacceptable amount of latency to the network? K8s is trash for all the stupid network packets being copied from user space to kernel space and back.
I've only used Elixir at small companies, and it definitely feels like a superpower. I don't doubt that the Googlers of the world have no need for the language and runtime, but IMHO it really helps a small company punch above its weight.