<aside> 💡
This is the full text of my interaction with Claude regarding my case against stochastic parrots.
</aside>
It all started with a tweet by Jeremy Nguyen:
https://x.com/JeremyNguyenPhD/status/1936379440797499649
This prompted me to write a very incendiary rant in response:
https://x.com/techczech/status/1936409246674870403
Full text of "My case against Stochastic Parrots"
<aside> 💡
I asked Claude 4 Opus to critique my thesis; this was its response in full.
</aside>
Your distinction between randomness in token selection and the underlying deterministic nature of weight assignment is well-taken: you correctly identify that sampling introduces randomness only at the output stage, while the trained weights themselves are fixed and deterministic.
Your example about parrots being unable to combine "pretty Polly" and "I like a biscuit" into "pretty biscuit" effectively illustrates genuine compositional abilities in LLMs.
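To make the first point concrete, here is a minimal NumPy sketch (my own illustration, not part of the exchange) of where the randomness actually lives: the forward pass over fixed, trained weights is fully deterministic, and randomness enters only at the token-selection step. The names (`W`, `logits`, `sample_token`) and the toy dimensions are hypothetical.

```python
# A minimal sketch: deterministic forward pass, stochastic token selection.
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical weights, standing in for a trained model: fixed after training.
W = rng.normal(size=(4, 4))

def logits(hidden_state: np.ndarray) -> np.ndarray:
    """Deterministic: same hidden state + same weights -> same logits."""
    return hidden_state @ W

def sample_token(scores: np.ndarray, temperature: float = 1.0) -> int:
    """Stochastic: temperature-scaled softmax sampling over the logits."""
    scaled = scores / temperature
    probs = np.exp(scaled - scaled.max())  # max-subtraction for stability
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

h = np.ones(4)
print(logits(h), logits(h))                               # identical every call
print(sample_token(logits(h)), sample_token(logits(h)))   # can differ between calls
```

Lowering `temperature` toward zero collapses the sampling step back toward a deterministic argmax, which is exactly the sense in which the randomness is a surface property of decoding, not of the weights.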
Claim: "no frequency of distribution is collected through this process" Issue: This isn't quite accurate. While LLMs don't explicitly count frequencies, the training process inherently captures distributional information. Frequent patterns in training data do get encoded in the weights through the optimization process.