Knowledge Distillation of Black-Box Large Language Models (2024)
The paper surveys methods for distilling proprietary LLM behavior into smaller models without direct access to weights or training data.
HN · 122 points · 23 comments
Read at source: https://arxiv.org/abs/2401.07013