Predictable Confabulations: Factual Recall by LLMs Scales with Model Size and Topic Frequency
A scaling law shows that factual recall follows a sigmoid in a log-linear combination of model parameter count and topic frequency, explaining 60-94% of the variance.
Excerpt
While scaling laws govern aggregate large language model performance, no scaling law has linked factual recall to both model size and training-data composition. We evaluated 38 models on over 8,900 scholarly references, each checked by an automated reference verification system. Recall quality follows a sigmoid in the log-linear combination of model parameter count and topic representation in training data. These two variables alone explain 60% of the variance across 16 dense models from four families, rising to 74-94% within individual families. The form matches a superposition-inspired account in which recall is gated by a signal-to-noise ratio: signal strength scales with concept frequency and the noise floor with model capacity.
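The functional form described in the excerpt can be illustrated with a minimal sketch. The excerpt gives only the shape (a sigmoid of a log-linear combination of parameter count and topic frequency), so the function name `recall_quality`, the coefficients `a`, `b`, `c`, and the choice of base-10 logarithms are all assumptions made here for illustration; they are not the paper's fitted values or notation.

```python
import math


def recall_quality(n_params: float, topic_freq: float,
                   a: float = 1.0, b: float = 1.0, c: float = -12.0) -> float:
    """Sigmoid of a log-linear combination of model size and topic frequency.

    a, b, c are hypothetical placeholder coefficients chosen for illustration,
    not values reported in the paper.
    """
    if n_params <= 0 or topic_freq <= 0:
        raise ValueError("n_params and topic_freq must be positive")

    # Log-linear combination of the two predictors (base 10 is an assumption).
    z = a * math.log10(n_params) + b * math.log10(topic_freq) + c

    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


if __name__ == "__main__":
    # Compare a smaller and a larger model on a rare and a common topic.
    for n in (1e9, 7e10):
        for f in (1e1, 1e4):
            print(f"params={n:.0e} freq={f:.0e} recall={recall_quality(n, f):.3f}")
```

With positive `a` and `b`, predicted recall rises monotonically with both model size and topic frequency, and a shortfall in one can be offset by the other; real coefficients would have to be fitted to the evaluation data.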
Read at source: https://arxiv.org/abs/2605.18732v1