If Claude Fable stops helping you, you'll never know

92 points by reissbaker

duncan_bayne

Imagine a compiler that refused to properly compile a competing language!

Utterly repugnant behaviour here from Anthropic.

BenjaminRi

Reflections on Trusting Trust, revisited for AI.
alper

Given the amount of pettiness that can enter language wars, I'm surprised we haven't seen this happen yet.
zaphar

There is a reading of your comment that is tongue-in-cheek, with a fair amount of irony, and I choose to read it as such. Because, in fact, they do refuse to compile a competing language.
- bendmorris
  
  Better analogy is if compilers refused to compile the code of other compiler developers. (For their own safety, of course.)
Yogthos

This is an excellent example of why running local models you control will be the norm in the long run. Nobody wants to use tools that they don't have control over. It doesn't matter how much better these tools may be if somebody else is deciding what you're allowed to do with them for you.
- elihunter173
  
  I don't know about this. Nobody wants to use tools they don't have control over... but somehow almost everyone is. I'm typing this on my iPhone that limits me from doing basic modifications Apple deems "unsafe", on my lap I have my Kindle which Amazon doesn't let me download books from, and soon I will return to work at my SaaS company that limits our (many and large) customers from understanding and controlling how their tool actually works.
  
  I could go on, but IMO all signs seem to indicate it does matter how much better (and easier) these tools are.
  - Yogthos
    
    There is a convenience aspect that ends up leading people to accept vendor lock-in for sure. But I'd argue developers in particular tend to be more conscious of the problem and willing to invest the effort into using things they can actually own. The whole open source ecosystem around Linux is a testament to that.
    
    So, you're right that proprietary services won't go away, and there will always be a market for them, but I do think we will see an open ecosystem being developed in parallel as well.
- sjamaan
  
  yup. It's also why relying (too much) on SaaS is a bad move in general, not just for LLMs. You're basically living under someone else's roof and have to abide by their rules. As long as those rules are not illegal or costing them too much business, they will do anything they can get away with if it serves their goals rather than the user's.
  - Yogthos
    
    Absolutely, we really have to fight to avoid digital serfdom where the means of production are owned by a handful of megacorps and we rent these tools out to do our work.
- reidrac
  
  Where are you getting those local models? Because even the OSS ones are released trained, and they can implement these same "features".
  
  I'm not an expert, but it looks like you can't have the independence you are suggesting without incurring the costs of training the models yourself.
  - creesch
    
    I can't speak to the quality of these models. But huggingface has a lot of models that have been "uncensored". Which suggests that entirely training models from scratch is not needed.
    
    Taking a step back, using multiple models from different sources can also help here. Either by having them check each other's work or by combining their work. I remember seeing a chat interface where a prompt would be sent to multiple models and then either the answers were combined into one by a final model or the person asking the question could cross-reference them.
    
    To be honest, even if you are not using local models, that is a sensible (although expensive) thing to do. Having the Claude family do the code work and the Gemini family do reviews is what I have seen people do.
    
    emk
    
    I can't speak to the quality of these models. But huggingface has a lot of models that have been "uncensored".
    
    So as a general rule, no open model is going to match Opus or especially Fable. There are several "tiers" as of this month:
    
    Qwen3.6 27B is a little monster for agentic coding. You can run it in 24-32GB of VRAM, and it is surprisingly good at handling concrete, clearly-specified coding tasks. If you don't understand or read your code, however, it will fail. True "vibe-coding" will not work beyond a couple thousand lines. This is the best you can do on a high-end gaming card.
    
    Mid-sized (e.g. DeepSeek V4 Flash, around 280B). These can be run in 256-512 GB of unified RAM and they have their fans. For Mac Studio owners, basically, or people with 2-4 RTX Pro 6000 cards, lol.
    
    Large open models (e.g. Kimi K2.6, around 1T). These fall into the "at least as good as Sonnet, not as good as Opus" range. You cannot run them on prosumer hardware, but you can rent them in the cloud for very low prices.
    
    "Uncensored" models are a bit of a tradeoff. The model loses intelligence, and it often becomes more glitchy: Chinese characters in English output, models getting stuck in loops, etc. It's hard to tell exactly what people use these for, but judging from the forums, one major use case may be erotic roleplay. I would be careful about using these as coding agents.
    
    creesch
    
    To be clear, I mean the quality of the uncensored models compared to their "vanilla" counterpart. I am aware that none of the models you can run yourself (locally or rented hardware) are on par with frontier models. But, again, that was not the context I intended that remark in.
    
    emk
    
    Fair enough!
    
    I have occasionally run "uncensored"/abliterated models through my unpublished test suite, and compared them with the base models on tasks that do not trigger base model rejection.
    
    Most of the models I tested showed signs of degradation relative to the unmodified versions of the same model. There are probably more scientific benchmarks out there, and more up-to-date results, of course.
    
    kornel
    
    The "uncensoring" is a special case where models learn when to refuse in a relatively simple way that can be detected by making prompts that models explicitly refuse.
    
    It's possible to train a model to perform poorly in an unobvious way, and do it in response to more specific conditions that won't generalize as clearly. Undoing that may be much harder.
    
    Yogthos
    
    Qwen 3.6 is excellent, and it doesn't need a huge amount of resources to run. It works particularly well with a harness like ATLAS which is designed to get the most out of smaller local models.
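
A minimal sketch of the kind of comparison emk describes above: a modified model against its unmodified version, scored only on tasks the base model does not refuse. Everything here is an assumption for illustration. The `Model` type, the refusal markers, the exact-match scoring, and the toy models are hypothetical and are not emk's unpublished suite.

```python
from typing import Callable, Dict

# Hypothetical stand-in: a "model" is any function from prompt to answer.
Model = Callable[[str], str]

# Assumed heuristic; a real suite would need a sturdier refusal check.
REFUSAL_MARKERS = ("i can't help", "i cannot help", "i won't")


def is_refusal(answer: str) -> bool:
    """Return True if the answer contains one of the assumed refusal phrases."""
    lowered = answer.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)


def compare(base: Model, modified: Model, tasks: Dict[str, str]) -> Dict[str, float]:
    """Score both models by exact match on the tasks the base model answers.

    `tasks` maps each prompt to its expected answer.
    """
    kept = [(p, want) for p, want in tasks.items() if not is_refusal(base(p))]
    if not kept:
        return {"tasks": 0, "base": 0.0, "modified": 0.0}
    base_ok = sum(base(p).strip() == want for p, want in kept)
    mod_ok = sum(modified(p).strip() == want for p, want in kept)
    return {
        "tasks": len(kept),
        "base": base_ok / len(kept),
        "modified": mod_ok / len(kept),
    }


# Toy models: the modified one answers the refused prompt but gets worse elsewhere.
def toy_base(prompt: str) -> str:
    return {
        "2+2": "4",
        "capital of France": "Paris",
        "restricted topic": "I can't help with that.",
    }[prompt]


def toy_modified(prompt: str) -> str:
    return {
        "2+2": "4",
        "capital of France": "Lyon",
        "restricted topic": "Sure, here you go.",
    }[prompt]


TASKS = {"2+2": "4", "capital of France": "Paris", "restricted topic": "n/a"}
result = compare(toy_base, toy_modified, TASKS)
```

Filtering on the base model's refusals keeps the comparison about capability rather than willingness, which is the distinction emk draws. As kornel notes, this only catches degradation that shows up on the tasks you thought to include.
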
  - alper
    
    I didn't expect to see heavenbanning for using LLMs, but here we are.
    
    sloane
    
    heavenbanning?
    
    …
    
    oh holy shit… shadowbanning, but make it ~~fashion~~ AI psychosis
  - FRIGN
    
    Anthropic doesn't deserve so much flak over this; at least they admit doing this. I assume everybody is doing this.
    
    Ever since DeepSeek, distillation has been demonstrated to be so effective that it can outright disincentivise developing new models. Instead, one can just wait until someone else does it and distill that one relatively easily.
    
    muvlon
    
    FYI: This is different from, and in addition to, their anti-distillation safeguards. They state this pretty clearly in their post.
    
    Unlike our interventions for [...] and distillation attempts, these safeguards will not be visible to the user. Fable 5 will not fall back to a different model.
    
    Distillation is protected against by falling back to a weaker model, and this is communicated to the user (and, I'd hope, billed accordingly). These additional protections are against talking to Fable about
    
    for example, on building pretraining pipelines, distributed training infrastructure, or ML accelerator design
    
    So the safeguards might fire if you prompt it "I want to build a frontier LLM, how do I set up a pretraining pipeline?" or maybe "what does RLHF mean?". By contrast, distillation would mean firing a ton of prompts at it and using the output to construct your own model directly.
    
    FRIGN
    
    Ah, thank you for the clarification! I didn't read the article deeply enough, then. This is much more serious, then.
    
    gonz
    
    DeepSeek made ~150,000 requests to Anthropic's API. That's not really a meaningful amount, and you have to consider that this number came from Anthropic themselves, who are not incentivized to be truthful about any of those numbers. If anything, we should expect the real number to be lower.
    
    On top of that, these measures are targeting arbitrary detected end goals, with arbitrary sabotage being applied, based on arbitrary rules that Anthropic make up as they go.
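
To make muvlon's distinction concrete: distillation in the sense used above is bulk harvesting of outputs for training, not asking a model a question about training. A rough sketch of the data-collection half, assuming a hypothetical `teacher` callable and a JSONL prompt/completion layout (both are illustrative choices, not anything described in the linked post):

```python
import json
from typing import Callable, Dict, Iterable, List


def collect_pairs(teacher: Callable[[str], str], prompts: Iterable[str]) -> List[Dict[str, str]]:
    """Query the teacher once per prompt and keep (prompt, completion) pairs."""
    return [{"prompt": p, "completion": teacher(p)} for p in prompts]


def to_jsonl(pairs: List[Dict[str, str]]) -> str:
    """Serialize pairs one JSON object per line, a common fine-tuning layout."""
    return "\n".join(json.dumps(pair, sort_keys=True) for pair in pairs)


# Hypothetical stand-in for a stronger model; a real teacher would be an API call.
def toy_teacher(prompt: str) -> str:
    return prompt.upper()


pairs = collect_pairs(toy_teacher, ["hello", "explain rlhf"])
dataset = to_jsonl(pairs)
```

The resulting file would then be used as supervised training data for a smaller model. The volume of requests is what makes this pattern detectable in principle, whereas a single question about pretraining pipelines looks like any other prompt.
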
  - owent
    
    we’ve implemented new interventions that limit Claude’s effectiveness for requests targeting frontier LLM development
    
    "We've implemented a rule that says you can't wish for more wishes"
  - pyfisch
    
    This is very different from their statement in the announcement post:
    
    When Fable’s classifiers detect a request related to cybersecurity, biology and chemistry, or distillation, the response is automatically handled by Claude Opus 4.8 instead. Users will be informed whenever this occurs.
    
    reissbaker
    
    Both are true, and both are stated by Anthropic. The classes you mentioned will get refusals; however, attempting to compete with Anthropic will instead make Fable silently dumber and worse without notifying you (and there's no way to know exactly what prompts that behavior).
    
    pyfisch
    
    I should have been clearer, I understand that both are communicated by Anthropic. What annoys me is that the invisible dumbing down of Fable is only mentioned in the model card and not in the blog post that is presumably read by many more people.
    
    jedahan
    
    I hope more AI prompters start to value reproducibility and introspection.
    
    accelbread
    
    These sorts of shenanigans make this a model I would not pay to use. Ideally there'd be a pricing model where you can use it and you pay if it's actually useful. It's already bad when you burn $20 of tokens on a task and it simply wasn't useful, or most of the cost was the model not following directions. But that's rationalizable as a gamble you are paying for. If the model provider just decides to not provide the service you are paying for, that's just fraud.
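
On jedahan's point about reproducibility, one low-tech option is to log a fingerprint of every request and response so that two runs of the same request can be compared later. A minimal sketch, assuming hypothetical field names and parameters (`temperature`, `seed`) that your provider may or may not honor:

```python
import hashlib
import json
from typing import Any, Dict


def run_record(model_id: str, prompt: str, params: Dict[str, Any], output: str) -> Dict[str, Any]:
    """Capture enough about one call to compare it with a later call."""
    request = json.dumps(
        {"model": model_id, "prompt": prompt, "params": params}, sort_keys=True
    )
    return {
        "request_sha256": hashlib.sha256(request.encode("utf-8")).hexdigest(),
        "output_sha256": hashlib.sha256(output.encode("utf-8")).hexdigest(),
        "output_chars": len(output),
    }


def drifted(earlier: Dict[str, Any], later: Dict[str, Any]) -> bool:
    """True when an identical request produced a different output."""
    same_request = earlier["request_sha256"] == later["request_sha256"]
    return same_request and earlier["output_sha256"] != later["output_sha256"]


# Hypothetical usage with made-up model id, parameters, and outputs.
PARAMS = {"temperature": 0, "seed": 7}
first = run_record("model-x", "sort this list", PARAMS, "def sort(xs): return sorted(xs)")
second = run_record("model-x", "sort this list", PARAMS, "def sort(xs): return xs")
other = run_record("model-x", "a different prompt", PARAMS, "something else")
```

A changed hash only shows that the output changed, not that it got worse, and with sampling enabled outputs will differ between runs anyway. It is a starting point for noticing drift, not a way to prove that a provider is degrading answers.
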