EndPrompt: Efficient Long-Context Extension via Terminal Anchoring

By Han Tian, Luxuan Chen, Xinran Chen, Rui Kong, Fang Wang

HF Daily Papers · May 14, 2026

EndPrompt extends the context window using only short training sequences: it preserves the original context as an intact first segment and appends a terminal prompt whose positional indices lie near the target context length.

Categories: Research

Excerpt

Extending the context window of large language models typically requires training on sequences at the target length, incurring quadratic memory and computational costs that make long-context adaptation expensive and difficult to reproduce. We propose EndPrompt, a method that achieves effective context extension using only short training sequences. The core insight is that exposing a model to long-range relative positional distances does not require constructing full-length inputs: we preserve the original short context as an intact first segment and append a brief terminal prompt as a second segment, assigning it positional indices near the target context length. This two-segment construction introduces both local and long-range relative distances within a short physical sequence while maintaining the semantic continuity of the training text, a property absent in chunk-based simulation approaches that split contiguous context. We provide a theoretical analysis grounded in Rotary Position Embedding and the Bernstein inequality, showing that position interpolation induces a rigorous smoothness constraint over the attention function, with shared Transformer parameters further suppressing unstable extrapolation to unobserved intermediate distances. Applied to LLaMA-family models extending the context window from 8K to 64K, EndPrompt achieves an average RULER score of 76.03 and the highest average on LongBench, surpassing L
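
The two-segment construction described in the excerpt can be illustrated with a small sketch of the position indices alone. This is not the authors' implementation: the function names `endprompt_position_ids` and `causal_relative_distances` are hypothetical, and placing the terminal prompt on the very last indices of the target window is an assumption, since the excerpt only says the indices are "near the target context length". The toy sizes (8, 2, 64) stand in for real token counts.

```python
def endprompt_position_ids(context_len, prompt_len, target_len):
    """Position indices for a short two-segment training sequence.

    Segment 1 (the original context) keeps indices 0 .. context_len - 1.
    Segment 2 (the terminal prompt) is assigned the last prompt_len indices
    of the target window. That exact placement is an assumption of this sketch.
    """
    if context_len + prompt_len > target_len:
        raise ValueError("segments do not fit inside the target window")
    context_ids = list(range(context_len))
    prompt_ids = list(range(target_len - prompt_len, target_len))
    return context_ids + prompt_ids


def causal_relative_distances(position_ids):
    """Distances (query index minus key index) over all causal query/key pairs."""
    return {
        q - k
        for i, q in enumerate(position_ids)
        for k in position_ids[: i + 1]
    }


ids = endprompt_position_ids(context_len=8, prompt_len=2, target_len=64)
dists = causal_relative_distances(ids)

print(ids)          # [0, 1, 2, 3, 4, 5, 6, 7, 62, 63]
print(len(ids))     # 10 physical tokens
print(max(dists))   # 63, the longest distance in a 64-position window
print(30 in dists)  # False: intermediate distances are never observed
```

With these toy sizes, a 10-token sequence exposes the local distances 0 to 7 and the long-range distances 55 to 63, while distances 8 to 54 never appear. Those are the "unobserved intermediate distances" that the excerpt's smoothness analysis is concerned with.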

Read at source: https://arxiv.org/abs/2605.14589