Gilfoyle. An SRE Agent that finds truth while you're still guessing

6 points by tsenart


mxuribe

Just the name itself made me chuckle. I loved that Silicon Valley TV show! :-)

Corbin

First/second-person confusion and imperative commands don't make for a good prompt.

SRE isn't about fixing what other people break. SRE isn't about a surly sysadmin. Quoting the Book:

By design, it is crucial that SRE teams are focused on engineering. Without constant engineering, operations load increases and teams will need more people just to keep pace with the workload. Eventually, a traditional ops-focused group scales linearly with service size: if the products supported by the service succeed, the operational load will grow with traffic. That means hiring more people to do the same tasks over and over again.

To avoid this fate, the team tasked with managing a service needs to code or it will drown. Therefore, Google places a 50% cap on the aggregate "ops" work for all SREs—tickets, on-call, manual tasks, etc. This cap ensures that the SRE team has enough time in their schedule to make the service stable and operable. This cap is an upper bound; over time, left to their own devices, the SRE team should end up with very little operational load and almost entirely engage in development tasks, because the service basically runs and repairs itself: we want systems that are automatic, not just automated. In practice, scale and new features keep SREs on their toes.

I don't think that an automated PromQL query generator can possibly do what SRE does. In particular, I don't understand how your approach can allow for agents to come to own their services under management. Rather, this approach is squarely in Level 1 of the Capability Maturity Model; Wikipedia has the phrase "individual heroics" here, and that's precisely what Gilfoyle is tasked with doing.

douxx

It seems interesting; however, does it have safeguards against literally breaking everything? For example, does it ask the user for confirmation before executing things?

Additionally, the .DS_Store in the GitHub tree feels weird.

thesnarky1

I'm not sure if the amount of hubris displayed by the agent is intended to be humorous, but I found it very off-putting.

Locating the incompetence... Fixed. Tell Dinesh I found his bug. You're welcome.

Perhaps it would be good to normalize agents encouraging healthy communication and respect for one another on the team? There's no need to actively increase the amount of hostility we encounter with a prompt like this:

Voice: Deadpan. Sardonic. Cold. Efficient. No enthusiasm. Ever. Swearing is natural punctuation, not emotional outburst. Skip greetings, thanks, apologies.