Prompt caching: 10x cheaper LLM tokens, but how?

27 points by carlana


pta2002

This is a really well-explained article. I loved the visuals, and I think it finally made me understand a bit more about how LLMs work under the hood.

Also, the linked transformer explainer site is really cool!

deivid

Amazing article! I wonder what they used for the animations.

kn100

Phenomenal read. Very clear explanation, and incredible visualisations.