Always cool to see new cloud GPU offerings, but isn't Lambda Labs' cloud offering roughly equivalent (in the case of V100s) or substantially cheaper (in the case of A100s)?
Thanks for the comment! Yes, Lambda Labs recently lowered pricing, so they're now roughly equivalent to us, and they beat us on some configurations.
I haven't used them, so please fact-check me, but it looks like their machines come with directly attached storage. So if you're using an 8x V100 and want to switch to a 1x RTX 6000, you'd have to spin up a new server and manually migrate your data over.
We built our platform on networked storage. You can spin up a CPU-only instance for $0.027/hour (<$20/month), upload your data, convert it into a GPU instance to train your models, and then convert it back. We frequently see users converting servers from 8x A100s (for training) back to 1x RTX 4000s (for inference). That kind of flexibility saves people time, which equates to money given how expensive ML developers are these days.
(Our networked storage model also enables people to shut off their VMs and save money)
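For what it's worth, the <$20/month figure checks out. A quick back-of-the-envelope sketch (the ~730 hours/month is my assumption; actual billing granularity may differ):

```python
# Back-of-the-envelope cost for "parking" data on a CPU-only instance
# between GPU sessions. The $0.027/hr rate is from the comment above;
# 730 hours/month (8760 hours / 12 months) is an assumed average.
HOURS_PER_MONTH = 8760 / 12  # = 730

cpu_rate_per_hour = 0.027  # $/hr, CPU-only instance

monthly_parking_cost = cpu_rate_per_hour * HOURS_PER_MONTH
print(f"CPU-only parking cost: ${monthly_parking_cost:.2f}/month")
# ≈ $19.71/month, i.e. just under the $20/month claimed
```

So even if you leave the CPU instance running continuously all month, you stay under $20; shutting the VM off entirely (as mentioned above) would cost less still.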
I'm sure Lambda Labs is working on something similar, but based on how they advertise, it seems they're offering dedicated servers.
I think we also have a wider variety of GPUs (10 SKUs with us vs. 4 with them). This lets people switch between, say, an NVIDIA A6000, A5000, and A4000 to truly "right-size" their compute so they don't pay for anything they don't need.
Cost-wise, we also offer better long-term pricing: for example, GeForce 1070s for $100/month in Boston or $150/month in Singapore (Equinix SG1), which is really good pricing for an APAC city in my opinion (https://console.tensordock.com/order_subscription). We're also working on a marketplace that lets compute suppliers list their hardware on our platform, to get closer to the cheapest option for those who really care about cost (https://www.tensordock.com/product-marketplace).
Lambda Labs has a (slow, low-IOPS) cloud filesystem to persist data between instances. Attached storage does not persist, but it is high-bandwidth and high-IOPS, which is a necessity when training small-to-medium-sized models.
https://lambdalabs.com/service/gpu-cloud
Unless I'm missing something?