Target 1: Baseten

2 points by alichraghi


Student

Unless I’m mistaken, this is comparing a single local model with a deployment that has to serve a (presumably) large number of users at scale. I think it’s well known that unscaled deployments can be much faster. The question is to what extent it’s cost-effective and feasible to provide the intended quality of service using deployments that basically load-balance across a fleet of single-machine deployments.

Unless these SAIL guys have access to lots of capital, the main way to monetize this insight is to figure out what a scaled version of it would cost and pitch that to the various companies already in this space.