Agent Reading Test

by kaycebasques on 4/6/2026, 6:56:57 PM

https://dacharycarey.com/2026/04/06/designing-agent-reading-test/

https://agentreadingtest.com

Comments

by: theyCallMeSwift

I love this idea, but my hypothesis is that 90% of the agents people actually use today would fail this test inadvertently (a false negative).

The industry best practice and standard implementation for most agents right now is to do web browsing and fetching via subagents. Their output is summarized using a cheaper model and then passed back to the parent. Unless the actual content the subagents see is preserved, it's very unlikely that the `CANARY-` strings would be found in the output.

Any thoughts on how you'd change the test structure with this in mind?

4/6/2026, 8:48:15 PM
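The false-negative concern in the comment above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the canary name `CANARY-EXAMPLE-alpha`, the regular expression, and `lossy_summary`, which only stands in for a cheaper summarizer model by truncating the text. It is not how the test site or any real agent is implemented.

```python
import re

# Assumed canary shape: the literal prefix "CANARY-" followed by
# letters, digits, and hyphens. The real test may use a different format.
CANARY_PATTERN = re.compile(r"CANARY-[A-Za-z0-9-]+")


def find_canaries(text):
    """Return the set of canary-like strings found in the text."""
    return set(CANARY_PATTERN.findall(text))


def lossy_summary(page_text, max_words=12):
    """Hypothetical stand-in for a summarizer: keep only the first few words."""
    return " ".join(page_text.split()[:max_words])


# Made-up page content with a made-up canary placed late in the body.
page = (
    "Welcome to the docs. This page explains installation in detail, "
    "covering prerequisites and setup. CANARY-EXAMPLE-alpha appears deep in the body."
)

# What the subagent read versus what the parent agent receives.
seen_by_subagent = find_canaries(page)
seen_by_parent = find_canaries(lossy_summary(page))

print("subagent saw:", seen_by_subagent)
print("parent saw:", seen_by_parent)
```

In this sketch the subagent did read the canary, but the parent's output no longer contains it, so a check on the final answer alone would report a failure.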

by: dostick

The tests should carry negative weights based on how often each issue is encountered and how much impact it has. Test 2 (SPA) should cost something like 8 negative points out of 10, since it is the most common blocker. The whole test should use an inverse score.

4/6/2026, 8:04:47 PM
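One way to read the penalty-based scoring suggested above is sketched below. The test names and all weights except the 8 out of 10 for test 2 are hypothetical placeholders, and the function is only one possible interpretation of an "inverse score", where 0 is a perfect run and larger totals are worse.

```python
# Hypothetical penalty weights on a 0-10 scale. Only the value 8 for
# "test_2" comes from the comment; the other entries are placeholders.
PENALTIES = {
    "test_1": 3,
    "test_2": 8,
    "test_3": 1,
}


def penalty_score(failed_tests, penalties=PENALTIES):
    """Sum the penalties of the failed tests (0 is best, higher is worse)."""
    unknown = set(failed_tests) - set(penalties)
    if unknown:
        raise KeyError(f"no penalty defined for: {sorted(unknown)}")
    return sum(penalties[name] for name in set(failed_tests))


# An agent that fails only the heavily weighted test still scores badly.
print(penalty_score(["test_2"]))
# An agent that fails two minor tests scores better than that.
print(penalty_score(["test_1", "test_3"]))
```

Under this reading, failing a common blocker costs more than failing several rare edge cases, which a flat point count would not show.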

by: massimoto

Would love to see some results for different providers. The tests look super logically thought out, but could use a TL;DR (too lazy; didn't run) output.

Claude Web Opus 4.6 Extended: 14 / 20 points

x:CANARY-SPA-JSONLY-prism x:CANARY-CONNEG-MD-sigma

4/6/2026, 8:36:03 PM

by: kaycebasques

See also https://dacharycarey.com/2026/04/06/designing-agent-reading-test/

4/6/2026, 6:57:10 PM
