Prompt caching: 10x cheaper LLM tokens, but how?
27 points by carlana
This is a really well-explained article. I loved the visuals, and I think it finally made me understand a bit more about how LLMs work under the hood.
Also, the linked transformer explainer site is really cool!
Amazing article! I wonder what they used for the animations.
Phenomenal read. Very clear explanation, and incredible visualisations.