VibeBench/VibeSearchBench

By VibeBench · GitHub · LLM repos · May 20, 2026

VibeSearchBench is a benchmark for multi-turn search agents, scored by verifiable knowledge-graph evaluation. The repository is drawing early attention on GitHub.

Categories: OSS & Tools, Research

Excerpt

🔍 The hardest search benchmark in the wild: vague, multi-turn, proactive. 200 long-horizon tasks with persona-driven progressive disclosure, scored by verifiable schema-free knowledge-graph evaluation. No vibes, just triplet F1.

★ 573 · Python · topics: agentic-ai, benchmark, llm, proactive-agent, search, search-agent
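For readers unfamiliar with the metric, here is a minimal sketch of how a triplet F1 score can be computed over knowledge-graph triples. It is an illustration, not the repository's scoring code: the function names `normalize` and `triplet_f1`, the case- and whitespace-insensitive exact matching, and the sample triples are all assumptions made for this example. The benchmark's own matching rules may differ.

```python
from typing import Iterable, Set, Tuple

Triplet = Tuple[str, str, str]  # (subject, relation, object)


def normalize(triplet: Triplet) -> Triplet:
    """Lowercase each part and collapse whitespace (an assumed matching rule)."""
    subj, rel, obj = (" ".join(part.lower().split()) for part in triplet)
    return (subj, rel, obj)


def triplet_f1(
    predicted: Iterable[Triplet], gold: Iterable[Triplet]
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) using exact match on normalized triples."""
    pred: Set[Triplet] = {normalize(t) for t in predicted}
    ref: Set[Triplet] = {normalize(t) for t in gold}
    true_positives = len(pred & ref)
    if true_positives == 0:
        # Covers empty inputs and zero overlap without dividing by zero.
        return 0.0, 0.0, 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(ref)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical data: 4 gold triples, 3 predicted, 2 of which match.
gold = [
    ("Ada Lovelace", "born_in", "London"),
    ("Ada Lovelace", "field", "mathematics"),
    ("Ada Lovelace", "collaborated_with", "Charles Babbage"),
    ("Charles Babbage", "designed", "Analytical Engine"),
]
predicted = [
    ("ada lovelace", "born_in", "London"),          # matches after normalization
    ("Ada Lovelace", "field", "Mathematics"),       # matches after normalization
    ("Ada Lovelace", "born_in", "Paris"),           # wrong object, no match
]

precision, recall, f1 = triplet_f1(predicted, gold)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

With these sample triples, precision is 2/3, recall is 2/4, and F1 is 4/7. Exact set matching is the simplest possible choice; a schema-free evaluation would likely need a more tolerant way of aligning relation names and entities.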

Read at source: https://github.com/VibeBench/VibeSearchBench