https://x.com/techczech/status/1936409246674870403
Sorry, no such thing. Anybody who says "stochastic parrots" and really means it does not know what they're talking about.
Here's my attempt at a justification:
A: The limits of the "stochasticity" metaphor
1. Stochastic elements during LLM generation:
In the final stage of token generation, the transformer assigns probabilities to tokens as a way of ordering them, and one of the top candidates is chosen at random to preserve robustness of output. But this is not based on frequency information but rather on geometric relationships in the training data, derived through a process in which a set of geometric weights is assigned to make sure the tokens reproduce the desired output.
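To make that last step concrete, here is a minimal sketch of top-k sampling, assuming the model has already produced one raw score (logit) per candidate token. The function names, the value of `k`, and the toy scores are illustrative assumptions, not any particular model's implementation.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities (used here only to rank and weight tokens)."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_top_k(logits, k=3, temperature=1.0, rng=None):
    """Pick one token index at random from the k highest-probability candidates."""
    rng = rng or random.Random()
    probs = softmax(logits, temperature)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    weights = [probs[i] for i in top]
    return rng.choices(top, weights=weights, k=1)[0]

def greedy(logits):
    """Deterministic alternative: always take the single best-scoring token."""
    return max(range(len(logits)), key=lambda i: logits[i])

# Hypothetical scores for a five-token vocabulary.
toy_logits = [2.0, 1.0, 0.5, -1.0, -3.0]
chosen = sample_top_k(toy_logits, k=3)
```

Note that swapping `sample_top_k` for `greedy` removes the randomness entirely while the scores stay exactly the same: the random draw sits on top of the model's output rather than inside it.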
2. Stochastic elements during LLM training:
The precise numbers representing these weights are derived from backpropagation of feedback from a loss function (loosely described as defining the success of prediction/choice). The weights are assigned by finding a good enough approximation of a global minimum of a complex function. This is assisted by stochastic gradient descent, which is mostly there for computational feasibility reasons (just like the probabilities during generation).
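As a rough illustration of where the randomness enters training, here is a toy sketch of mini-batch stochastic gradient descent fitting two weights to a straight line. Everything in it is an assumption made for illustration: the squared-error loss, the learning rate, the batch size, and the hand-derived gradients. Real LLM training uses backpropagation through many layers and a non-convex loss.

```python
import random

def sgd_fit(data, lr=0.05, epochs=200, batch_size=4, seed=0):
    """Fit y = w*x + b by mini-batch SGD on mean squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    data = list(data)
    for _ in range(epochs):
        # The "stochastic" part: each step sees a random subset of the data,
        # so no single update needs to process the full dataset.
        rng.shuffle(data)
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            # Gradients of the mean squared error over this batch.
            grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / len(batch)
            grad_b = sum(2 * (w * x + b - y) for x, y in batch) / len(batch)
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b

# Hypothetical noise-free data generated from y = 3x + 1.
points = [(x / 10, 3.0 * (x / 10) + 1.0) for x in range(-10, 11)]
w, b = sgd_fit(points)
```

The random shuffling only changes the path the weights take. On this toy problem, different seeds should end up at approximately the same `w` and `b`.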
3. Mistaken frequency assumption and 'stochasticity hyperbole':
Unlike some previous machine learning methods, no frequency distribution is collected through this process, although the weights to some extent reflect the frequencies. So the first part of "stochastic parrots" is misleading because LLMs are not "inherently" stochastic (any more than humans are).
B: The limits of the "parrot" analogy
1. The hyperbole in the parrot analogy
But neither are LLMs parrots, because they can clearly produce text that is not purely reproducing a combination of the inputs (unlike parrots, who cannot combine 'pretty Polly' and 'I like a biscuit' into 'pretty biscuit').
2. Overstating the limits of performance relating to training data
The fact that to some extent LLMs are limited in their capabilities by their training data is completely irrelevant and also trivially applicable to humans. Whatever limitations they have are not those of "parrots".
C: Political dimension of the "stochastic parrot" movement
1. Stochastic parrots are a political movement
Anybody using "stochastic parrots" as a statement about LLMs is doing it for political or aesthetic reasons or to fit into a peer group. Even people who say this and use LLMs do not approach their use of them as if they were actually "stochastic parrots".
2. Ask me how I really feel
Those who refuse or are afraid to use LLMs because they are "stochastic parrots" do it because they were misled by the gang of true "stochastic parrots" believers who have morphed from people concerned about "ethics" into an "anti-technology cult" and have lost all touch with reality.