Predictable Confabulations: Factual Recall by LLMs Scales with Model Size and Topic Frequency

By Matthew L. Smith, Jonathan P. Shock, Samuel T. Segun, Iyiola E. Olatunji, Tegawendé F. Bissyandé

arXiv · AI/CL/LG · May 18, 2026

A scaling law shows that factual recall follows a sigmoid in a log-linear combination of model parameter count and topic frequency, explaining 60-94% of the variance.

Categories: Research

Excerpt

While scaling laws govern aggregate large language model performance, no scaling law has linked factual recall to both model size and training-data composition. We evaluated 38 models on over 8,900 scholarly references, each checked by an automated reference verification system. Recall quality follows a sigmoid in the log-linear combination of model parameter count and topic representation in training data. These two variables alone explain 60% of the variance across 16 dense models from four families, rising to 74-94% within individual families. The form matches a superposition-inspired account in which recall is gated by a signal-to-noise ratio: signal strength scales with concept frequency and the noise floor with model capacity.
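
The functional form described in the excerpt, a sigmoid applied to a log-linear combination of parameter count and topic frequency, can be illustrated with a minimal sketch. The function name `recall_quality`, the coefficients `a`, `b`, `c`, their default values, and the choice of base-10 logarithms are all assumptions made for illustration; they are not the paper's fitted values or its exact parameterization.

```python
import math


def recall_quality(n_params: float, topic_freq: float,
                   a: float = -10.0, b: float = 0.8, c: float = 0.6) -> float:
    """Sigmoid of a log-linear combination of model size and topic frequency.

    a, b, c are hypothetical placeholder coefficients, NOT values from the paper.
    n_params and topic_freq must be positive.
    """
    # Log-linear combination of the two predictors (base 10 is an assumption).
    z = a + b * math.log10(n_params) + c * math.log10(topic_freq)
    # Logistic sigmoid maps the combined score into (0, 1).
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    # Illustrative grid: predicted recall rises with either predictor.
    for n_params in (1e9, 1e10, 1e11):
        for topic_freq in (1e2, 1e4, 1e6):
            score = recall_quality(n_params, topic_freq)
            print(f"params={n_params:.0e} freq={topic_freq:.0e} recall={score:.3f}")
```

Under this form, a tenfold increase in either predictor shifts the sigmoid's argument by a fixed amount (`b` or `c`), so a larger model and a more frequent topic can trade off against each other; the real exchange rate depends on the fitted coefficients reported in the paper.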

Read at source: https://arxiv.org/abs/2605.18732v1