2025-06-26

🔍 Mind2Web 2 is released! A rigorous agentic search benchmark with long-horizon tasks and Agent-as-a-Judge evaluation.
