Tuesday, July 11, 2023

larging it language models

so let's deconstruct this bullshit.


generative models - I wrote a random number generator in 1976. it used an ancient technique, still in use today coz it works. it was generative - that doesn't mean anything.
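for the record, the post doesn't name the technique, but a linear congruential generator is one genuinely ancient method still in everyday use - a minimal sketch follows (the constants are the usual Numerical Recipes ones, nothing from 1976):

```python
# A linear congruential generator (LCG): next = (a*state + c) mod m.
# One plausible "ancient technique still in use"; the constants below are
# the common Numerical Recipes parameters, chosen purely for illustration.

def lcg(seed: int):
    """Yield pseudo-random integers in [0, 2**32)."""
    a, c, m = 1664525, 1013904223, 2**32  # multiplier, increment, modulus
    state = seed
    while True:
        state = (a * state + c) % m
        yield state

gen = lcg(42)
print([next(gen) % 100 for _ in range(5)])  # five "generated" values in 0..99
```

it generates numbers; by the hype logic, that alone makes it a generative model - which is the point.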


foundation models - what's that about? the Mule defeated the Foundation, last I heard. what a load of tosh.


large language models - not at all - statistics of utterances, unutterable bollocks. that's not language, that's verbal diarrhoea.
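to make "statistics of utterances" concrete, here's a toy bigram sampler - a hedged sketch, not how any real LLM is built, but the same move in miniature (the corpus is made up):

```python
# A toy illustration of "statistics of utterances": a bigram model that
# generates text purely from counted word-pair frequencies. LLMs do the
# same thing in spirit, conditioned on far longer contexts with far more
# parameters. Corpus and starting word are invented for this sketch.
import random
from collections import defaultdict

corpus = "the cat sat on the mat the cat ate the rat".split()

# Count which words follow which.
follows = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev].append(nxt)

# Sample a short "utterance" from those statistics.
word = "the"
out = [word]
for _ in range(6):
    word = random.choice(follows.get(word, corpus))  # fall back if no successor
    out.append(word)
print(" ".join(out))
```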


"attention is all you need"? sure, if you have nothing to tell people about, it sure is.


interestingly (since penning this rant) I've read a whole slew of papers about shrinking NNs in general, and pruning LLMs specifically. still a research thing, but the Lottery Ticket Hypothesis suggests this is no longer just a post-training step - oddly, simply starting from a smaller architecture from scratch produces lower-accuracy models than training big and pruning down...hmmm, why?
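for context, the baseline move in those pruning papers is magnitude pruning - zero out the smallest weights - which the Lottery Ticket Hypothesis paper (Frankle & Carbin, 2019) applies iteratively. a toy sketch (layer size and sparsity level made up):

```python
# Minimal magnitude pruning, the baseline the Lottery Ticket Hypothesis
# builds on: zero the smallest-magnitude weights, keep a binary mask.
# The weight matrix and 90% sparsity target are invented for the sketch.
import numpy as np

def magnitude_prune(weights, sparsity=0.9):
    """Return (pruned weights, mask), dropping the smallest-|w| fraction."""
    flat = np.abs(weights).ravel()
    k = int(len(flat) * sparsity)            # number of weights to drop
    threshold = np.partition(flat, k)[k]     # k-th smallest magnitude
    mask = np.abs(weights) >= threshold
    return weights * mask, mask

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))
W_pruned, mask = magnitude_prune(W, sparsity=0.9)
print(f"kept {mask.mean():.0%} of weights")  # roughly 10%
```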

phew. let's get shot of this hype cycle and back to fixing the planet.
