My AI Learning Journey – Part 2 – How Do LLMs Work?

Before getting to the practical aspects of my AI learning journey, I set out to understand how Large Language Models (LLMs) work, what the future potential of the technology might be, and what kind of dangers might await us if things get out of hand. There is a lot of material out there on all of these topics, and the first thing I wanted to find was an explanation of how LLMs work in less than 10 minutes.

The 10 Minute Intro

Here’s a good video on the topic by Matt Penny on YouTube. He has lots of videos on AI topics with more than 10k views each and a subscriber base of 17k at the time of writing. One of his videos has even been viewed more than 200k times. And yet, this particular video on the basics, posted 9 months ago, has less than 500 views. 200k vs. 500 views, I think that says a lot about where the interest of 99.99% of the population lies. Anyway, if you want to understand the basics, including what a ‘token’ is, a word that is used all the time when talking about LLMs, how they work, and how much it costs to run a question through them, then this is your place to go.
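To make the cost aspect concrete: providers typically bill per token, with separate rates for input (prompt) and output (completion) tokens. Here is a minimal sketch of that arithmetic; the per-1k-token prices and the token counts below are illustrative assumptions, not actual rates of any provider.

```python
def estimate_cost(prompt_tokens, completion_tokens,
                  price_in_per_1k=0.001, price_out_per_1k=0.002):
    """Return the cost in dollars of one request, billed per token.

    The default prices are made-up example values.
    """
    return ((prompt_tokens / 1000) * price_in_per_1k
            + (completion_tokens / 1000) * price_out_per_1k)

# A question of ~200 prompt tokens with a ~500 token answer:
cost = estimate_cost(200, 500)
print(f"${cost:.4f}")  # prints $0.0012
```

Note that output tokens are usually priced higher than input tokens, which is why long answers dominate the bill of a short question.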

Long story short, i.e. my personal elevator pitch: All LLMs we use today generate output text based on input text. They do so based on a previous massive learning phase, in which an LLM has learnt which words follow each other in a large text corpus. Large in this context means the better part of the text that can be found on the Internet today. An LLM doesn’t understand the output it produces, it doesn’t reason, it doesn’t think; it applies the previously learnt statistical knowledge of which words follow which.
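The core idea of the elevator pitch, learning which words follow which and then generating text word by word, can be sketched in a few lines. This is a deliberately tiny word-level model over a toy corpus; real LLMs use neural networks over tokens and far larger contexts, but the generate-the-next-word loop is the same in spirit.

```python
import random
from collections import defaultdict, Counter

# Toy corpus; real models train on a large part of the Internet.
corpus = "the cat sat on the mat the cat ate the fish".split()

# "Training": count which word follows each word in the corpus.
follows = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    follows[current][nxt] += 1

def generate(start, length=5):
    """Generate text by repeatedly sampling a statistically likely next word."""
    word, out = start, [start]
    for _ in range(length):
        candidates = follows.get(word)
        if not candidates:
            break  # dead end: this word never had a successor in the corpus
        # Sample the next word proportionally to how often it
        # followed the current word during "training".
        word = random.choices(list(candidates),
                              weights=list(candidates.values()))[0]
        out.append(word)
    return " ".join(out)

print(generate("the"))
```

Running this produces plausible-looking but meaningless text like “the cat sat on the mat”, which illustrates the point above: nothing in the table “understands” what it emits.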

More Details

Next, I wanted to dig a bit deeper, and here is a good article on Medium that describes a number of aspects of LLMs in more detail without mentioning the word ‘token’ a single time. On the one hand, the article describes LLMs in a simpler way than the video linked above, while on the other hand it explains many more ideas around how to get from ‘input tokens’ to ‘output tokens’.

The Future

These days, it feels like a new LLM is being announced on a daily basis and marketed as significantly improved over its predecessor. Also, there is not a single day on which at least a handful of novel and revolutionary applications for LLMs are not discussed in the press. It is easy to lose focus. And while a lot of this is marketing, the fact remains that the technology advances at an incredible rate. Some, like Nobel prize laureate Geoffrey Hinton, also referred to as the godfather of AI, fear that the race to an Artificial General Intelligence (AGI) might lead to a singularity, at which an AI becomes more intelligent than its creators and is capable of improving itself. Once the singularity has been passed, the AI might become more and more intelligent at an accelerated pace, with no human intervention required. This might have a significant impact on society and the human race in general, not all of it necessarily good. Here’s a link to a recent video on YouTube in which Neil deGrasse Tyson interviews Geoff Hinton on the topic. This is kind of a scary outlook. However, I think it is important to understand where this technology might be going, unless limiting factors, such as running out of money and resources in the next couple of years, set a natural limit to the ongoing development.

Next Steps

So much for the theoretical, ethical and societal aspects of this topic. In the following blog posts, I will have a look at this topic in a way that is relevant for me here and now: How can I use local and remote LLMs for use cases that are interesting to me, ranging from search and translation to agentic use of LLMs and AI coding assistance in my software development process, always with an eye on protecting my own data, which should not flow uncontrollably out into some unknown clouds. As you will see, it is going to be a wild ride.
