2 – Understanding Language Models

What is an LLM (Large Language Model)?

A Large Language Model – LLM for short – is an AI model trained specifically on language. “Large” refers both to the enormous amount of training data (large portions of the internet) and to the billions of parameters in the model.

LLMs can understand, summarize, translate, answer questions, and generate text. ChatGPT, Claude, Gemini, and many other products are based on LLMs. They form the basis for almost all modern AI applications that work with text.

What does GPT stand for?

GPT stands for “Generative Pre-trained Transformer.” Each word describes a characteristic:

  • Generative: The model creates (generates) new content.
  • Pre-trained: It has been pre-trained on vast amounts of data before being adapted for specific tasks.
  • Transformer: The underlying architecture – a specific type of neural network developed by Google in 2017.

GPT is a product name from OpenAI. Other companies use similar architectures under their own names.

What is a Transformer?

The Transformer is a neural network architecture introduced in 2017 in a famous Google paper titled “Attention Is All You Need.” It revolutionized language processing because it no longer reads texts word by word, but can grasp relationships between all words simultaneously.

Practically all modern language models – GPT, Claude, Gemini, Llama – are based on the Transformer architecture. It is the most important technical innovation behind the current AI boom.

What is a Token?

A token is the unit in which a language model processes text. A token is not always an entire word – often it is a word fragment, a syllable, or an individual character.

As a rule of thumb: 1 token ≈ ¾ of an English word. A German sentence with 10 words typically has 15–20 tokens. Tokens are relevant because AI services are billed by tokens, and each model has a maximum token limit.
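The rule of thumb above can be turned into a quick estimator. This is a minimal sketch using the common heuristic of roughly 4 characters per token for English text; the function name is illustrative, and real tokenizers (such as OpenAI’s tiktoken library) give exact counts.

```python
def estimate_tokens(text: str) -> int:
    """Rough token count using the ~4-characters-per-token heuristic.

    Only for quick back-of-the-envelope budgeting; an actual tokenizer
    is needed for exact counts and billing.
    """
    return max(1, round(len(text) / 4))

# A 44-character sentence estimates to about 11 tokens:
print(estimate_tokens("The quick brown fox jumps over the lazy dog."))  # → 11
```

Such an estimate is useful when deciding up front whether a document will fit into a model’s limits or roughly what a request will cost.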

What is the Context Window?

The context window is the maximum amount of text that a model can “see” and process simultaneously – measured in tokens. Everything you write to the model, and everything it responds with, must fit into this window.

Current models have context windows ranging from 128,000 to over 1 million tokens. A 128k token window roughly corresponds to a 300-page book. Larger windows allow entire documents, contracts, or codebases to be analyzed at once.
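Because input and output share the same window, applications have to budget space for the answer before sending a request. The sketch below illustrates that check; the function and parameter names are assumptions for illustration, not any provider’s API.

```python
def fits_in_context(prompt_tokens: int, max_response_tokens: int,
                    context_window: int = 128_000) -> bool:
    """Check whether the prompt plus reserved room for the response
    fits into the model's context window.

    The prompt and the model's answer occupy the same window, so the
    answer's maximum length must be reserved up front.
    """
    return prompt_tokens + max_response_tokens <= context_window

print(fits_in_context(120_000, 4_000))  # → True  (124k fits in 128k)
print(fits_in_context(126_000, 4_000))  # → False (130k exceeds 128k)
```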

What is a Hallucination?

A hallucination occurs when an AI claims something that is false – but presents it as a fact. This happens because language models do not “know” what is true. They generate statistically plausible texts. And sometimes what is plausible is simply wrong.

Example: You ask for a court ruling, and the AI invents a case number that does not exist. This is not a bug – it is a fundamental problem of the technology. Therefore, human oversight of AI outputs is indispensable.

What is a Prompt?

A prompt is the input you send to an AI model – meaning your question, your task, or your instruction. It is essentially what you type into the text field of ChatGPT or Claude.

The quality of the answer heavily depends on the prompt. A vague prompt yields a vague answer. The clearer you describe what you want – context, target audience, format, tone – the better the result.

What is Prompt Engineering?

Prompt Engineering is the art of formulating AI inputs to achieve the best possible results. It is not about programming, but about precise communication.

Typical techniques include: clear role assignments (“You are an experienced legal professional”), step-by-step instructions, examples of the desired format, or breaking down complex tasks into sub-steps. Good prompt engineering can make the difference between a usable and an outstanding AI response.
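The techniques above can be combined programmatically. This is a minimal sketch of a prompt template that assembles a role, a step-by-step instruction, and a format example into one input; the section labels and function name are illustrative, not a required standard.

```python
def build_prompt(role: str, task: str, steps: list[str],
                 format_example: str) -> str:
    """Assemble a structured prompt from a role assignment,
    step-by-step instructions, and an example of the desired format."""
    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, 1))
    return (
        f"You are {role}.\n\n"
        f"Task: {task}\n\n"
        f"Work step by step:\n{numbered}\n\n"
        f"Answer in this format:\n{format_example}"
    )

print(build_prompt(
    "an experienced legal professional",
    "Summarize the attached contract clause.",
    ["Identify the parties", "State the obligation", "Flag any risks"],
    "- Parties: ...\n- Obligation: ...\n- Risks: ...",
))
```

Templates like this keep prompts consistent across an application instead of being retyped ad hoc for every request.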

What is a System Prompt?

A system prompt is a hidden instruction sent to the model before the actual conversation begins. It defines the behavior, role, and rules for the entire conversation.

Example: “You are a customer service assistant for Company X. Always respond in German, use informal ‘du’ with the customer, and adhere to the following product information: …” The user does not see this prompt, but it controls how the AI behaves. System prompts are the foundation of every professional AI application.
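In chat-style APIs this separation is typically expressed as a list of role-tagged messages, with the system prompt as the first entry. The sketch below shows that widely used format; exact field names and supported roles vary by provider.

```python
# The system prompt travels as the first message with role "system".
# The application prepends it to every request; the user never sees it
# and only ever writes the "user" entries.
messages = [
    {
        "role": "system",
        "content": (
            "You are a customer service assistant for Company X. "
            "Always respond in German and use informal 'du'."
        ),
    },
    {"role": "user", "content": "Wo finde ich meine Rechnung?"},
]

print(messages[0]["role"])  # → system
```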

What is Temperature in an AI Model?

Temperature is a parameter that controls how “creative” or “random” an AI’s response is. The value typically ranges from 0 to 1, though some APIs allow values up to 2.

  • Temperature 0: The AI always selects the most probable next token. Very consistent, but potentially repetitive.
  • Temperature 1: The AI more often selects less probable tokens. More creative, but less predictable.

For business applications such as contract analysis, a low temperature is recommended. For creative tasks like brainstorming, a higher temperature can be useful.
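Under the hood, temperature divides the model’s raw scores (logits) before they are turned into probabilities. This is a minimal sketch of that mechanism; the logit values are made up for illustration, and temperature 0 is handled in practice as simply picking the top token, since dividing by zero is undefined.

```python
import math

def sample_distribution(logits: list[float], temperature: float) -> list[float]:
    """Softmax over logits scaled by temperature.

    Low temperature sharpens the distribution toward the most likely
    token; high temperature flattens it, making unlikely tokens more
    probable. Temperature must be > 0 here (T = 0 means plain argmax).
    """
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]                  # illustrative scores for three tokens
print(sample_distribution(logits, 0.2))   # near-deterministic: top token dominates
print(sample_distribution(logits, 1.0))   # flatter: sampling becomes more random
```

This is why low temperatures feel consistent and high temperatures feel creative: the same logits yield a sharper or flatter probability distribution to sample from.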

Sixty minutes are enough to understand whether and how AI makes sense for your business.

nuwai develops software, automation, and integrated AI for companies that measure results in daily operations — not in slide decks. We work with real data, clear ownership, and controlled iterations — turning assumptions into verifiable reality.

© 2026. All rights reserved.