Self-Harness: Harnesses That Improve Themselves

HN · arXiv

The paper proposes AI evaluation harnesses that iteratively improve themselves, an approach relevant to agent testing and automated benchmark design.

Categories: Research

HN · 83 points · 6 comments

Discussions