Definition

Generative artificial intelligence (GenAI) lets a computer create content for you: primarily text, but also images, video, and audio such as synthetic voices. It can even produce PowerPoint presentations.

What generative AI produces is newly generated rather than copied and pasted from what is available online, although a model can occasionally reproduce passages it saw during training. For example, when it writes a story, the software composes the text piece by piece instead of looking it up. That is why we call it intelligent: it almost seems as if the computer can reason and be creative. In reality, a language model simply predicts which word is likely to come next in a sequence.
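
To make the idea of next-word prediction concrete, here is a minimal sketch in Python. It is not how a real LLM works internally: it only counts which word follows which in a tiny made-up text, whereas an LLM uses a neural network trained on vastly more data. The corpus and the function names are assumptions chosen for illustration.

```python
from collections import Counter, defaultdict

# Tiny made-up training text (an assumption for this sketch).
CORPUS = "the cat sat on the mat . the cat sat on the sofa . the dog sat on the mat ."


def train_bigram_counts(text):
    """Count how often each word is followed by each other word."""
    words = text.split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def next_word_probabilities(counts, word):
    """Turn the follow-up counts for `word` into probabilities."""
    followers = counts.get(word, Counter())
    total = sum(followers.values())
    return {w: c / total for w, c in followers.items()}


def predict_next(counts, word):
    """Return the most likely next word, or None if `word` was never seen."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


counts = train_bigram_counts(CORPUS)
print(predict_next(counts, "sat"))             # on
print(next_word_probabilities(counts, "the"))  # probabilities for cat, mat, sofa, dog
```

Even this toy model never copies a whole sentence; it only picks a plausible continuation one word at a time, which is the same basic principle at a much smaller scale.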

A diagram explaining the functioning of Large Language Models (LLMs). It features the words "The," "cat," "sat," and "on" as input tokens that are transformed into a vector representation. The diagram illustrates the LLM process, indicating that it produces a vector representation for the next token, which is then used to predict the most likely words to follow. The left section defines a token as a part of a word, emphasizing its role as the basic unit used by LLMs. The layout visually represents the flow from input tokens to predicted outputs, demonstrating the model's predictive capabilities.
This image is owned by Thomas Van den Bossche, Electronics-ICT teaching team, Odisee University of Applied Sciences
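
The diagram defines a token as a part of a word. The sketch below shows one simple way a word could be split into such parts and mapped to numbers. It uses a greedy longest-match rule over a small hand-written vocabulary, which is an assumption for illustration; real LLM tokenizers learn vocabularies of many thousands of pieces from data, and the vocabulary and function names here are hypothetical.

```python
# Hypothetical mini vocabulary (hand-written for this sketch).
VOCAB = ["un", "believ", "able", "the", "cat", "s", "sat", "on"]


def tokenize_word(word, vocab):
    """Split one word into known pieces, always taking the longest match first."""
    pieces = []
    start = 0
    while start < len(word):
        # Try the longest remaining substring first, then shrink it.
        for end in range(len(word), start, -1):
            piece = word[start:end]
            if piece in vocab:
                pieces.append(piece)
                start = end
                break
        else:
            # No known piece starts here: fall back to a single character.
            pieces.append(word[start])
            start += 1
    return pieces


def encode(text, vocab):
    """Map text to a list of token ids; -1 marks a piece outside the vocabulary."""
    token_ids = []
    for word in text.lower().split():
        for piece in tokenize_word(word, vocab):
            token_ids.append(vocab.index(piece) if piece in vocab else -1)
    return token_ids


print(tokenize_word("unbelievable", VOCAB))  # ['un', 'believ', 'able']
print(encode("The cats sat", VOCAB))         # [3, 4, 5, 6]
```

The numbers returned by `encode` stand in for the step in the diagram where tokens are turned into a representation the model can compute with; in a real model each id is then looked up as a vector.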

Large language models (LLMs) are a form of generative AI that can process and generate text. With this technology, you can answer questions, write creative text such as headlines and blog posts, and translate or summarise text. LLMs can also generate code and detect errors.

LLMs are generally trained on vast amounts of text data, sometimes even petabytes (1 PB = 1,000 TB) of data. This allows them to learn the relationships between sentences, words, and parts of words. They achieve this by processing enormous datasets from the web, including hundreds of thousands of Wikipedia entries, social media posts, and news articles. In a sense, LLMs train themselves: in the initial training phase they are machine learning algorithms that do not require humans to label data. Instead, they derive their own labels from the text itself (for example, the word that actually comes next in a sentence), and they use those labels throughout the rest of the learning process.
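
The idea that a model creates its own labels can be sketched in a few lines of Python. Raw text is cut into pairs of a context and the word that follows it, so the "label" comes straight from the text and no human annotation is needed. The function name, the whitespace splitting, and the context size of three words are assumptions for illustration; real training pipelines work on tokens and much longer contexts.

```python
def make_training_pairs(text, context_size=3):
    """Build (context, label) pairs where the label is simply the next word."""
    words = text.split()
    pairs = []
    for i in range(context_size, len(words)):
        context = words[i - context_size:i]  # the words the model gets to see
        label = words[i]                     # the word it has to predict
        pairs.append((context, label))
    return pairs


pairs = make_training_pairs("The cat sat on the mat")
for context, label in pairs:
    print(context, "->", label)
# ['The', 'cat', 'sat'] -> on
# ['cat', 'sat', 'on'] -> the
# ['sat', 'on', 'the'] -> mat
```

Every sentence in the training data yields many such pairs for free, which is part of why such large datasets can be used without manual labelling.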

Would you like to see step by step how an LLM generates text?
The Financial Times has published an interactive visual that walks through the process.