If Claude Fable stops helping you, you'll never know
92 points by reissbaker
Imagine a compiler that refused to properly compile a competing language!
Utterly repugnant behaviour here from Anthropic.
Given the amount of pettiness that can enter language wars, I'm surprised we haven't seen this happen yet.
There is a reading of your comment that is tongue in cheek with a fair amount of irony, and I choose to read it as such. Because in fact they do refuse to compile a competing language.
Better analogy is if compilers refused to compile the code of other compiler developers. (For their own safety, of course.)
This is an excellent example of why running local models you control will be the norm in the long run. Nobody wants to use tools that they don't have control over. It doesn't matter how much better these tools may be if somebody else is deciding what you're allowed to do with them for you.
I don't know about this. Nobody wants to use tools they don't have control over... but somehow almost everyone is. I'm typing this on my iPhone that limits me from doing basic modifications Apple deems "unsafe", on my lap I have my Kindle which Amazon doesn't let me download books from, and soon I will return to work at my SaaS company that limits our (many and large) customers from understanding and controlling how their tool actually works.
I could go on, but IMO all signs seem to indicate it does matter how much better (and easier) these tools are.
There is a convenience aspect that ends up leading people to accept vendor lock-in, for sure. But I'd argue developers in particular tend to be more conscious of the problem and willing to invest the effort into using things they can actually own. The whole open source ecosystem around Linux is a testament to that.
So, you're right that proprietary services won't go away, and there will always be a market for them, but I do think we will see an open ecosystem being developed in parallel as well.
yup. It's also why relying (too much) on SaaS is a bad move in general, not just for LLMs. You're basically living under someone else's roof and have to abide by their rules. As long as those rules are not illegal or costing them too much business, they will do anything they can get away with if it serves their goals rather than the user's.
Absolutely, we really have to fight to avoid digital serfdom where the means of production are owned by a handful of megacorps and we rent these tools out to do our work.
Where are you getting those local models? Because even the OSS ones are released trained, and they can implement these same "features".
I'm not an expert, but it looks like you can't have the independence you are suggesting without incurring the costs of training the models yourself.
I can't speak to the quality of these models. But huggingface has a lot of models that have been "uncensored". Which suggests that training models entirely from scratch is not needed.
Taking a step back, using multiple models from different sources can also help here, either by having them check each other's work or by combining their work. I remember seeing a chat interface where a prompt would be sent to multiple models, and then either the answers were combined into one by a final model or the person asking the question could cross-reference them.
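For what it's worth, the fan-out-and-combine idea can be sketched in a few lines. This is only a rough illustration, not how that chat interface actually worked: `fan_out`, `combine`, `model_a`, `model_b`, and `stub_judge` are all made-up names, and the "models" are plain functions standing in for whatever local or hosted clients you would really call.

```python
from typing import Callable, Dict

# A "model" here is any function from prompt text to answer text (assumption).
Model = Callable[[str], str]


def fan_out(prompt: str, models: Dict[str, Model]) -> Dict[str, str]:
    """Send the same prompt to every model and collect the answers by name."""
    return {name: ask(prompt) for name, ask in models.items()}


def combine(prompt: str, answers: Dict[str, str], judge: Model) -> str:
    """Hand all candidate answers to a final model and return its merged answer."""
    listing = "\n".join(f"[{name}] {text}" for name, text in answers.items())
    return judge(f"Question: {prompt}\nCandidates:\n{listing}\nWrite one combined answer.")


# Hypothetical stand-ins; swap in real local or hosted clients.
def model_a(prompt: str) -> str:
    return "Use a lock."


def model_b(prompt: str) -> str:
    return "Use a queue."


def stub_judge(prompt: str) -> str:
    # A real judge would be another model; this stub only counts candidates.
    candidates = [line for line in prompt.splitlines() if line.startswith("[")]
    return f"merged {len(candidates)} answers"


if __name__ == "__main__":
    question = "How do I share state between threads?"
    answers = fan_out(question, {"a": model_a, "b": model_b})
    print(answers)
    print(combine(question, answers, stub_judge))
```

Skipping `combine` and just reading the dictionary returned by `fan_out` gives you the manual cross-referencing variant.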
To be honest, even if you are not using local models, that is a sensible (although expensive) thing to do. Having the Claude family do the code work and the Gemini family do reviews is what I have seen people do.
I can't speak to the quality of these models. But huggingface has a lot of models that have been "uncensored".
So as a general rule, no open model is going to match Opus or especially Fable. There are several "tiers" as of this month:
"Uncensored" models are a bit of a tradeoff. The model loses intelligence, and it often becomes more glitchy: Chinese characters in English output, models getting stuck in loops, etc. It's hard to tell exactly what people use these for, but judging from the forums, one major use case may be erotic roleplay. I would be careful about using these as coding agents.
To be clear, I mean the quality of the uncensored models compared to their "vanilla" counterpart. I am aware that none of the models you can run yourself (locally or rented hardware) are on par with frontier models. But, again, that was not the context I intended that remark in.
Fair enough!
I have occasionally run "uncensored"/abliterated models through my unpublished test suite, and compared them with the base models on tasks that do not trigger base model rejection.
Most of the models I tested showed signs of degradation relative to the unmodified versions of the same model. There are probably more scientific benchmarks out there, and more up-to-date results, of course.
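A comparison along these lines can be sketched roughly as follows. This is not the unpublished test suite mentioned above, just a minimal illustration under assumptions: `pass_rate`, `degradation`, the two test cases, and the stub models `base_model` and `modified_model` are all hypothetical, and a real run would call actual models on tasks the base model does not refuse.

```python
from typing import Callable, List, Tuple

Model = Callable[[str], str]
# A test case is a prompt plus a checker for the model's output (assumption).
Case = Tuple[str, Callable[[str], bool]]


def pass_rate(model: Model, cases: List[Case]) -> float:
    """Fraction of cases whose output passes its checker."""
    passed = sum(1 for prompt, check in cases if check(model(prompt)))
    return passed / len(cases)


def degradation(base: Model, modified: Model, cases: List[Case]) -> float:
    """Positive values mean the modified model does worse than the base model."""
    return pass_rate(base, cases) - pass_rate(modified, cases)


# Hypothetical tasks that neither model would refuse.
CASES: List[Case] = [
    ("What is 2 + 2?", lambda out: out.strip() == "4"),
    ("Capital of France?", lambda out: out.strip() == "Paris"),
]


def base_model(prompt: str) -> str:
    return {"What is 2 + 2?": "4", "Capital of France?": "Paris"}[prompt]


def modified_model(prompt: str) -> str:
    # Stub that imitates a "stuck in a loop" glitch on one task.
    return {"What is 2 + 2?": "4", "Capital of France?": "Paris Paris Paris"}[prompt]


if __name__ == "__main__":
    print(pass_rate(base_model, CASES))
    print(pass_rate(modified_model, CASES))
    print(degradation(base_model, modified_model, CASES))
```

With only two cases the numbers mean nothing, of course; the point is just that both models see identical prompts and identical checkers.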
The "uncensoring" is a special case where models learn when to refuse in a relatively simple way that can be detected by making prompts that models explicitly refuse.
It's possible to train a model to perform poorly in an unobvious way, and do it in response to more specific conditions that won't generalize as clearly. Undoing that may be much harder.
Anthropic doesn't deserve so much flak over this; at least they admit doing this. I assume everybody is doing this.
Ever since DeepSeek, distillation has been demonstrated to be so effective that it can outright disincentivise developing new models. Instead, one can just wait until someone else does it and distill that one relatively easily.
FYI: This is different from, and in addition to, their anti-distillation safeguards. They state this pretty clearly in their post.
Unlike our interventions for [...] and distillation attempts, these safeguards will not be visible to the user. Fable 5 will not fall back to a different model.
Distillation is protected against by falling back to a weaker model, and this is communicated to the user (and, I'd hope, billed accordingly). These additional protections are against talking to Fable about
for example, on building pretraining pipelines, distributed training infrastructure, or ML accelerator design
So the safeguards might fire if you prompt it "I want to build a frontier LLM, how do I set up a pretraining pipeline?" or maybe "what does RLHF mean?". By contrast, distillation would mean firing a ton of prompts at it and using the output to construct your own model directly.
Ah, thank you for the clarification! I didn't read the article deeply enough. This is much more serious, then.
DeepSeek made ~150,000 requests to Anthropic's API, which is not really a meaningful amount. You also have to consider that this number came from Anthropic themselves, who are not incentivized to be truthful about any of those numbers. If anything, we should expect the real number to be lower.
On top of that, these measures are targeting arbitrary detected end goals, with arbitrary sabotage being applied, based on arbitrary rules that Anthropic make up as they go.
we’ve implemented new interventions that limit Claude’s effectiveness for requests targeting frontier LLM development
"We've implemented a rule that says you can't wish for more wishes"
This is very different from their statement in the announcement post:
When Fable’s classifiers detect a request related to cybersecurity, biology and chemistry, or distillation, the response is automatically handled by Claude Opus 4.8 instead. Users will be informed whenever this occurs.
Both are true, and both are stated by Anthropic. The classes you mentioned will get refusals; however, attempting to compete with Anthropic will instead make Fable silently dumber and worse without notifying you (and there's no way to know exactly what prompts that behavior).
I should have been clearer: I understand that both are communicated by Anthropic. What annoys me is that the invisible dumbing down of Fable is only mentioned in the model card and not in the blog post that is presumably read by many more people.
These sorts of shenanigans make this a model I would not pay to use. Ideally there'd be a pricing model where you can use it and you pay if it's actually useful. It's already bad when you burn $20 of tokens on a task and it simply wasn't useful, or most of the cost was the model not following directions. But that's rationalizable as a gamble you are paying for. If the model provider just decides not to provide the service you are paying for, that's just fraud.