research-methodology

1 article
sort: new top best
clear filter
0 3/10

A crowd-sourced AI detection benchmark pairing 16K human-written posts from Reddit, HN, and Yelp with AI-generated responses from models at different capability tiers, designed to measure human ability to distinguish AI text and calibrate detectability across platforms.

Anthropic OpenAI Reddit Hacker News Yelp HuggingFace
slop-or-not.space · eigen-vector · 1 day ago · details · hn