Target 1: Baseten
2 points by alichraghi
Unless I’m mistaken, this is comparing a single local model with a deployment that has to serve a (presumably) large number of users at scale. I think it’s well known that unscaled deployments can be much faster. The question is to what extent it’s cost-effective and feasible to provide the intended quality of service using deployments that basically load-balance across a fleet of single-machine deployments.
Unless these SAIL guys have access to lots of capital, the main way to monetize this insight is to figure out the cost of a scaled version of this and pitch it to the various companies already in this space.
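To make the cost question concrete, here is a rough back-of-the-envelope sketch. Everything in it is an assumption for illustration: the function name, the hourly price, and the throughput figures are hypothetical and not taken from the blog or from any provider's pricing. It only shows the arithmetic: a machine serving one fast stream can be lower latency per user yet cost far more per token than a machine batching many slower streams.

```python
def cost_per_million_tokens(machine_cost_per_hour: float,
                            tokens_per_second_per_stream: float,
                            concurrent_streams: int) -> float:
    """Serving cost in dollars per one million generated tokens.

    All inputs are hypothetical; plug in measured numbers for a real comparison.
    """
    # Aggregate throughput of one machine over an hour.
    tokens_per_hour = tokens_per_second_per_stream * concurrent_streams * 3600
    return machine_cost_per_hour / tokens_per_hour * 1_000_000


# Hypothetical: one user per machine, fast per-user decoding.
single_stream = cost_per_million_tokens(10.0, 100.0, 1)

# Hypothetical: same machine batching 32 users, each decoding more slowly.
batched = cost_per_million_tokens(10.0, 40.0, 32)

print(f"single-stream: ${single_stream:.2f} per 1M tokens")
print(f"batched:       ${batched:.2f} per 1M tokens")
```

With made-up inputs like these, the per-user speedup and the per-token cost pull in opposite directions, which is the trade-off a fleet of single-machine deployments would have to price in.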
Cost of compute at hyperscalers < cost of compute at neoclouds. On that basis alone they wouldn’t even need to host their own infra (which needs exorbitant amounts of capital); they could just show the disparity and the cost to drive value to the org. Long term, though, I think you’re right and they’ll need to build something on-prem to directly host and compete. The question then becomes exactly what you said: at what scale can they do this? Cool blog though.