Context Window

The maximum number of tokens an LLM can take into account in a single request, typically counting both the prompt and the generated output.

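Modern models offer windows of roughly 100K to 2M tokens. A larger window reduces the need for retrieval-augmented generation (RAG) when the corpus is small enough to fit in the prompt, but long prompts still cost more, because every token sent is processed and billed, and quality tends to degrade as the input approaches the window limit.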
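
Whether a small corpus can skip RAG comes down to simple arithmetic, sketched below. This is a minimal illustration, not any vendor's API: the function names, the 4-characters-per-token ratio (a common rule of thumb for English text), and the output reserve are all assumptions. For real limits, use the model's own tokenizer and its documented window size.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate (assumption: about 4 characters per token).

    Real tokenizers vary by model and by language, so treat this as a
    ballpark figure only.
    """
    return math.ceil(len(text) / chars_per_token)


def fits_in_window(
    documents: list[str],
    prompt: str,
    window_tokens: int,
    reserved_output_tokens: int = 1024,
) -> bool:
    """Return True if the prompt plus all documents, with room left for
    the model's reply, stays within the context window.

    Assumes input and output share one window, which is common but
    should be checked against the model's documentation.
    """
    used = estimate_tokens(prompt) + sum(estimate_tokens(d) for d in documents)
    return used + reserved_output_tokens <= window_tokens


if __name__ == "__main__":
    corpus = ["a" * 4000, "b" * 4000]   # two documents, ~1000 tokens each
    question = "Summarize the documents."
    print(fits_in_window(corpus, question, window_tokens=100_000))
    print(fits_in_window(corpus, question, window_tokens=2_000))
```

If the check fails, or passes only with little headroom, retrieval or chunking is likely the safer choice, since cost grows with every token sent.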