The key to running LLM services in prod is setting up Gemini on Vertex AI, Anthropic models on AWS Bedrock, and OpenAI models on Azure. It's a completely different world in terms of uptime, latency, and output performance.
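For what it's worth, here's roughly what that split looks like with each vendor's Python SDK. This is just a minimal sketch; the project ID, regions, model IDs, and deployment names are placeholders, not values from anyone's actual setup:

```python
# One client per vendor's "home" cloud. All IDs/regions below are placeholders.
import vertexai
from vertexai.generative_models import GenerativeModel
from anthropic import AnthropicBedrock
from openai import AzureOpenAI

# Gemini on Vertex AI (uses GCP application default credentials)
vertexai.init(project="my-gcp-project", location="us-central1")
gemini = GenerativeModel("gemini-1.5-pro")
print(gemini.generate_content("ping").text)

# Claude on AWS Bedrock (uses the standard AWS credential chain)
claude = AnthropicBedrock(aws_region="us-east-1")
msg = claude.messages.create(
    model="anthropic.claude-3-5-sonnet-20240620-v1:0",
    max_tokens=64,
    messages=[{"role": "user", "content": "ping"}],
)
print(msg.content[0].text)

# OpenAI on Azure (note: "model" is your Azure deployment name, not a raw model ID)
azure = AzureOpenAI(
    azure_endpoint="https://my-resource.openai.azure.com",
    api_key="<key>",
    api_version="2024-06-01",
)
resp = azure.chat.completions.create(
    model="my-gpt-4o-deployment",
    messages=[{"role": "user", "content": "ping"}],
)
print(resp.choices[0].message.content)
```

The nice part is that once each client is wired up, you can route per model family and keep a common message format across all three.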
Have you had any luck getting your Claude quota bumped on Bedrock? I tried working through AWS support but got nowhere. I gave up and used Vertex + Gemini instead.
Does OpenAI on Azure still have that insane latency from content filtering? Last time I checked it added a big hit to time to first token, making Azure hosting impractical for real-time scenarios.
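If you want to re-check, the quickest way I know is to stream a tiny request and time the first content chunk. A rough sketch (endpoint, key, API version, and deployment name are placeholders):

```python
# Quick-and-dirty time-to-first-token check against an Azure OpenAI deployment.
import time
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://my-resource.openai.azure.com",
    api_key="<key>",
    api_version="2024-06-01",
)

start = time.perf_counter()
stream = client.chat.completions.create(
    model="my-gpt-4o-deployment",  # Azure deployment name, not a raw model ID
    messages=[{"role": "user", "content": "ping"}],
    stream=True,
)
for chunk in stream:
    # Azure can emit a leading chunk with empty choices (content-filter
    # annotations), so wait for the first chunk that actually carries text.
    if chunk.choices and chunk.choices[0].delta.content:
        print(f"TTFT: {time.perf_counter() - start:.2f}s")
        break
```

Run it a handful of times and compare against the same prompt on the first-party API to see how much of the gap is the filter versus the deployment itself.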