Context Window

The maximum number of tokens an LLM can process in a single request, typically counting both the prompt and the generated output.

Modern models offer 100K–2M token windows. Bigger windows reduce the need for RAG on small corpora, but longer prompts still cost more and quality can degrade at the edges.
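
As a rough illustration of budgeting against a window, the sketch below checks whether a prompt plus a reserved output allowance fits. It makes two assumptions that are not from this entry: the window is shared between prompt and output (true of many model APIs, but not all), and text averages about 4 characters per token, which is only a rule of thumb for English. The names `estimate_tokens` and `fits_in_window` are hypothetical helpers, not part of any library; a real check would use the model's own tokenizer.

```python
import math

# Assumption: ~4 characters per token, a rough rule of thumb for English text.
# Real tokenizers vary by model and language, so treat this as an estimate only.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate for text (hypothetical heuristic)."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_in_window(prompt: str, max_output_tokens: int, window_tokens: int) -> bool:
    """Check whether the prompt plus the reserved output budget fits the window."""
    return estimate_tokens(prompt) + max_output_tokens <= window_tokens


prompt = "word " * 1000  # 5,000 characters
print(estimate_tokens(prompt))  # 1250
print(fits_in_window(prompt, max_output_tokens=500, window_tokens=2000))  # True
```

Reserving the output budget up front matters because a prompt that fills the window by itself leaves the model no room to respond.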

Related terms