This post is a little long. Cricket, my cat, has volunteered to help motivate you to go through the full text. If you activate the switch below, she will appear in the middle of some paragraphs as you read. You can also press the button in the top right corner to see her at any moment.
Use Cricket's help
In the section called 'Supervised Learning' I intend to explain how learning works to people unfamiliar with machine learning. If you feel it is too long or not very clear you can skip it. Otherwise, Cricket might come in handy in that section.
Artificial Intelligence (AI) companies are promising the world the
moon and the stars. 'AI will bring us to a world where machines do
all the dull work while humans pursue their passions' they
say. Most
people are aware this is BS. In this piece I explore the nature
of that BS and conclude that stupidity, very human and natural
stupidity, is the driving force of the AI industry.
Of course, extraordinary claims require extraordinary evidence. Throughout the text there are some sections with titles that start with 'Exhibit', where I present evidence of stupidity.
To be clear, I am not saying that AI itself is stupid. It is an
accomplishment of human ingenuity. In a world that made sense it
would actually help us get to utopia. But unfortunately that's not the
world we live in.
Pre-LLMs
Society has been disrupted by generative AI; the ChatGPTs, the Claudes, the Copilots, etc. They are outstanding, but they are also "simply" a clever application of a technique in machine learning (ML) called supervised learning.
You can think of supervised learning as algorithms for pattern recognition. Not only are they helpful for science, they are a scientific and technological achievement in their own right, and they give us immense predictive powers.
Exhibit A: Outside of science, how has supervised learning been
used?
Years before the arrival of generative AI, supervised learning had
already transformed society. Social media companies started to use it
to maximize what they call "user engagement". Our immense predictive
powers were wielded to provide notifications, content feeds and
recommendations tailored to each user to make them stay on social
media apps for as long as possible.
Now you might be thinking: wait a minute, I don't pay for social media. How do they benefit from this? They make money in two ways: with targeted ads and by selling users' data. In other words, they sell your attention and your information. In the social media business model, you are the product.
A supervised learning algorithm, or model, is a mathematical formula containing many numbers called the 'parameters'. This formula is applied to another list of numbers called the 'input' and the result is yet another list of numbers called the 'output'.
These algorithms work with numbers, and that means they can work with any format a computer can handle. Text, images, sound and video are all provided to these models as lists of numbers. They enter the formula, a large number of operations is performed with the parameters, and the result is another list of numbers that is then converted to the desired output format (text, image or whatever).
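To make this concrete, here is a minimal sketch in Python of what such a formula can look like. The parameter values, the bias and the three-number input are all made up for illustration; a real model chains millions of operations like these.

```python
# A toy "model": a formula with parameters, applied to an input list.
# The parameter values here are invented for illustration.

parameters = [0.5, -1.2, 0.3]   # the numbers that training will adjust
bias = 0.1

def model(inputs):
    # Multiply each input number by a parameter, add everything up,
    # then add the bias. One list of numbers in, a number out.
    return sum(p * x for p, x in zip(parameters, inputs)) + bias

print(model([1.0, 2.0, 3.0]))   # prints -0.9
```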
The parameters can usually be any number, but the specific numerical values are very important. Some values make the algorithm useful and make it work as intended; most values make it useless.
It is impossible to know in advance which values make a model useful. They are discovered through a process dubbed 'training'. Beforehand, samples of inputs and corresponding outputs deemed acceptable are prepared. Then a sample of inputs is provided to the model and the result is compared to the corresponding acceptable outputs. Based on this comparison, the values of the parameters are slightly changed so that the next time the same inputs are provided the result will be slightly more similar to the acceptable outputs. This process is repeated many times with more input/acceptable-output samples until some criterion for 'the outputs are good enough in general' is satisfied.
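As an illustration, here is the nudging process sketched on a toy two-parameter formula like the one above. The samples, the learning rate and the nudge rule are made up for a simple linear formula (this is plain gradient descent); deep learning does the same thing at a vastly larger scale.

```python
# A minimal sketch of training. We prepare (input, acceptable output)
# pairs, compare the model's result to the acceptable output, and nudge
# each parameter slightly in the direction that shrinks the difference.

samples = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]
parameters = [0.0, 0.0]          # start from useless values
learning_rate = 0.1

for step in range(1000):         # repeat many times
    for inputs, acceptable in samples:
        output = sum(p * x for p, x in zip(parameters, inputs))
        error = output - acceptable
        # For a linear formula the right nudge is error * input; for
        # deep networks the analogous nudge is computed automatically.
        for i, x in enumerate(inputs):
            parameters[i] -= learning_rate * error * x

print(parameters)   # close to [2.0, -1.0]: values that make the formula useful
```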
These models are often called black boxes. What this expression means is that we cannot explain the connection between the inputs and the outputs in an intelligible way. We can list and show every single mathematical operation that links the inputs to the outputs. But the number of operations involved prevents us from making any sense of it.
This is a big problem when it comes to adjusting these models. Say
for example you have a chatbot that is always swearing at the user and
you want it to stop. To achieve this you have to change the values of
the parameters. Since it is impossible to know the change that is
needed, the only available mechanism is further training. You provide
the chatbot with training sets where there is no swearing and this
moves the parameters to new values that produce less of it; though
there is no guarantee it will stop completely.
Stochastic Parrots
Large Language Models (LLMs) are supervised models. The input is a list of words (expressed as a list of numbers) and the output is a probability distribution for the next word. That is, the model has a list of all existing words and the output is a list of numbers representing the probability of each specific word being the next one.
Say for example you give the input "Sam Altman is a". Words like "the" or "a" or "nevertheless" would have a very small probability. Words like "martian" or "smurf" would have a higher probability because, even though they are implausible, they are at least grammatically correct. Words like "CEO" or "entrepreneur" that describe what he does would have even higher probabilities. A well-trained model would give even higher probabilities to terms like "megalomaniac", "grifter" or "full of shit".
After the probability distribution is computed, a word is chosen based on it. That word is added to the existing text. The process is repeated until a special word meaning 'end of output' is chosen.
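Here is that loop as a minimal Python sketch. The "model" is a hypothetical stand-in that returns hard-coded probabilities over a six-word vocabulary; a real LLM would compute them from its trillions of parameters.

```python
# A sketch of text generation: compute a distribution over the next
# word, pick a word at random based on it, append it, repeat.

import random

VOCAB = ["Sam", "Altman", "is", "a", "grifter", "<end>"]

def next_word_distribution(words):
    # Stand-in for the real model: hard-coded probabilities.
    if words[-1] == "a":
        return [0.0, 0.0, 0.0, 0.0, 0.9, 0.1]
    return [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]

text = ["Sam", "Altman", "is", "a"]
while True:
    probabilities = next_word_distribution(text)
    word = random.choices(VOCAB, weights=probabilities)[0]
    if word == "<end>":   # the special 'end of output' word
        break
    text.append(word)

print(" ".join(text))
```

Run it a few times and you will get either "Sam Altman is a" or "Sam Altman is a grifter", depending on the random draw.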
Seen this way, LLMs are nothing more than a probabilistic trick. This is why Gebru et al. call them Stochastic Parrots.
I focused on LLMs here, but what I have explained applies to generative AI models in general. They are all stochastic parrots in their own way.
Exhibit B: The cost of training generative AI
Like all ML models, stochastic parrots need to be trained. Modern LLMs have trillions of parameters and their training requires a lot of resources. One of those resources is vast amounts of text written by humans.
The way AI companies have decided to go about this is to constantly scrape the internet for content; and they are very agile about it, meaning they just move on without thinking things through or caring about the consequences for people. Their crawlers just take everything, ignoring any license or copyright (some lawsuits related to this: [1][2][3]) and increasing server costs for lots of websites, including Wikipedia. Have you noticed a substantial increase in paywalls and in having to prove you are a human over the last year? That is a consequence of AI web scrapers. These parrots are rude!
Textbook example of stupid and evil!
Once training is over, the LLM is capable of having conversations and it is interesting to interact with, but it still needs to learn to do specific tasks. Say for example you want it to sell some product. Then you do some additional training, providing curated texts containing conversations where the product is being sold. This part is called 'fine-tuning'.
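In code terms, fine-tuning is simply the training loop from earlier run a second time: you start from the parameters the first training produced and feed in a small curated dataset, usually with gentler nudges. A minimal sketch with made-up numbers:

```python
# Fine-tuning sketched on the toy linear model: the same nudging loop,
# continued from already-trained parameters on new curated samples.

def train(parameters, samples, learning_rate, steps=1000):
    parameters = list(parameters)   # work on a copy
    for _ in range(steps):
        for inputs, acceptable in samples:
            output = sum(p * x for p, x in zip(parameters, inputs))
            error = output - acceptable
            for i, x in enumerate(inputs):
                parameters[i] -= learning_rate * error * x
    return parameters

pretrained = train([0.0, 0.0], [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0)], 0.1)
fine_tuned = train(pretrained, [([1.0, 1.0], 3.0)], 0.01)  # the "curated" data
print(pretrained, fine_tuned)
```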
People's interactions with LLMs entail a lot of mathematical operations. Every single generated word, let alone image or video, requires a long and intricate series of computations. Regular processors are insufficient for running them fast enough. They run on GPUs, which are a technological achievement of their own.
AI has taken the credit for many layoffs. It turns out some of them are fake: a phenomenon dubbed AI-Washing, where AI is just a scapegoat for the layoffs. But there are also real AI layoffs.
As impressive as LLMs have been, the probabilistic trick behind them has its limits, and it seems those limits have been reached. The state of affairs: they are good at parroting but they cannot reason ([1], [2]). This puts them far behind human performance on most real-world tasks.
How come AI is taking all those jobs if it doesn't do them so well? It all comes down to the following dilemma: "On the one hand, if I replace my employees with AI the product won't be as good and I will lose some business. On the other hand, I will be saving so much money on salaries. Which option gives me more profit?".
Contrary to popular belief, the people making these decisions are not vampires. Behind the cold-blooded, profit-maximizing layoffs there are warm hearts with human emotions. Hearts that are vulnerable and make mistakes. Hearts that dream of profits and fear missing out.
Layoffs aren't the only impact AI has on employment. It is also transforming it. Lots of jobs now consist of checking and fixing the output of a generative model instead of doing the essence of the craft. Many people choose a profession because they want to do that essential part. Writers want to write, designers want to design, programmers want to code, etc. Their work brings them satisfaction; it is often a source of pride and many times even part of their identity. But now, instead of doing what they signed up for and being a creative force, they become a mere appendage of an algorithm.
Exhibit E: The AI Bubble
What I have said so far paints this picture: there is a new technology that has made us addicted to our screens and shortened our attention spans. It uses vast amounts of scarce resources we need, it is throwing lots of people into unemployment and it is making jobs worse for many others. All so that some wealthy individuals can make some more money.
Business-wise these parrots are not worth the trouble. But the hype is so big that investors and governments insist on pouring more and more money into it. The result is a financial bubble currently estimated to be 4 times bigger than the subprime mortgage bubble behind the 2008 crisis.
If this is the first time you are hearing about the AI financial bubble, I am sorry to be the bearer of bad news: we are fucked! I leave you here a few references on the topic:
One aspect of the bubble is the circular investments among the big AI companies and Nvidia. The following diagram, which is very well explained in this video, was published in the Bloomberg article from the list above. They use the phrase "circular deals", but the proper technical term in Keynesian economics is "circlejerk", and the Bible calls it "incest" and "a sin".
If it is not working and it is not going to work, why is so much money being invested in it? How come there is a bubble?
AI companies are promising to replace employees with machines. Machines that never stop working, never complain, don't get sick and don't take holidays. Huge cost reductions and increases in productivity! The ultimate dream of company owners since the dawn of industrial capitalism. For investors, those dreams of profits and that fear of missing out make it an irresistible temptation. AI companies know it, and their marketing focuses on making the temptation as irresistible as possible.
Here is an interview with Sam Altman from 2019, at a timestamp where he is asked about his plans to make OpenAI profitable. His answer: once we build a very smart AI, we will ask it how to make money. Does this sound like a reasonable business plan? People in the room even laugh at this answer, but that hasn't stopped investors from throwing money at him.
CEOs and top executives of AI companies are constantly 'warning us' that we have to prepare for what is coming, while pretending to be concerned about it. "We are such good people that we warn you as we unleash the technology that will prevent you from putting food on the table, all while failing to make money out of it".
Textbook example of stupid and stupid. I rest my case.