LLM 'benchmark' as a 1v1 RTS game where models write code controlling the units
1 point by wherewhy