Transformer

The transformer is the neural network architecture on which nearly all modern language models are based. Its attention mechanism lets it capture relationships across the entire text at once.

The transformer was introduced in 2017 and replaced older architectures such as RNNs because it processes text in parallel rather than sequentially. Its core is the self-attention mechanism, which assesses how strongly words relate to one another in context. This lets models capture long-range relationships and train efficiently on large amounts of data. Practically all well-known LLMs such as GPT, Claude, and Gemini build on this architecture.

Back to the glossary

From term to practice

Save, version, and share your best prompts with Prompt2Love.

Get started free

We use cookies to improve your experience. Analytics cookies help us improve Prompt2Love. Settings

Related terms

From term to practice