The Scaling Properties of Implicit Deductive Reasoning in Transformers

· HF Daily Papers ·

Deep bidirectional Transformers can approach explicit chain-of-thought performance on Horn clause deduction when algorithmic alignment is enforced, though CoT remains needed for depth extrapolation.

Categories: Research

Excerpt

Enrico Vompa, Tanel Tammet

We investigate the scaling properties of implicit deductive reasoning over Horn clauses in depth-bounded Transformers. By systematically decorrelating provability from spurious features and enforcing algorithmic alignment, we find that in sufficiently deep models with a bidirectional prefix mask, implicit reasoning approaches explicit chain-of-thought (CoT) performance across graph topologies and problem widths, though CoT remains necessary for depth extrapolation.
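The listing does not include the paper's code, but the "bidirectional prefix mask" it refers to matches the standard prefix-LM attention pattern: every position attends bidirectionally to the encoded problem (the prefix), while positions after it attend causally. The sketch below illustrates that pattern under those assumptions; the function name, shapes, and use of NumPy are illustrative, not the authors' implementation.

```python
import numpy as np

def prefix_lm_mask(seq_len: int, prefix_len: int) -> np.ndarray:
    """Boolean attention mask for a prefix LM (illustrative sketch).

    mask[i, j] is True when query position i may attend to key position j:
    every position sees the whole prefix (bidirectional), and positions
    after the prefix additionally attend causally to earlier suffix tokens.
    """
    i = np.arange(seq_len)[:, None]  # query positions, shape (seq_len, 1)
    j = np.arange(seq_len)[None, :]  # key positions, shape (1, seq_len)
    return (j < prefix_len) | (j <= i)

# Example: a length-6 sequence whose first 3 tokens form the problem prefix.
# The first 3 rows are fully connected to the prefix; later rows are causal.
print(prefix_lm_mask(6, 3).astype(int))
```

In the implicit-reasoning setting the excerpt describes, the entire problem encoding sits in the prefix, so the model attends over the whole input bidirectionally and must produce the answer without emitting intermediate CoT tokens.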