Embarrassingly Simple Self-Distillation Improves Code Generation
7 points by mpweiher
Reading this makes me wonder if a not too distant future could have us running useful open source coding models on our own computers instead of big cloud machines. Like a faster path from past supercomputers to the powerhouses we carry around in our pockets now ¯_(ツ)_/¯
If you have a Framework Desktop or similar, you can already do this today. So yes, I think it's coming.
They're already useful - for example Qwen is completely fine for lots of programming tasks.
There are tons of efficiency improvements being made from all angles right now, not just this one. The unfortunate thing is that most of them benefit inference far more than training, so you'll be able to run whatever model you can get your hands on, but training one yourself will probably stay out of reach.