testing-methodology

2 articles
0 2/10

Cursor describes CursorBench, its internal benchmark suite for evaluating AI coding agents on real developer tasks. The suite is built from actual user sessions and measures multi-dimensional agent behavior beyond simple correctness, which Cursor says gives better model discrimination and developer alignment than public benchmarks like SWE-bench.

Cursor CursorBench SWE-bench Terminal-Bench OpenAI Haiku GPT-5
cursor.com · xdotli · 16 hours ago · details · hn
0 6/10

A practical guide to identifying race conditions in web applications using Burp Suite, demonstrating how multiple simultaneous requests can exploit unsynchronized access to shared resources like account balances and vouchers.

Burp Suite Egor Homakov Starbucks
medium.com · devanshbatham/Awesome-Bugbounty-Writeups · 20 hours ago · details
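
The race described in the summary above (several simultaneous requests hitting an unsynchronized check on a shared balance or voucher) can be sketched in a few lines of Python. This is a minimal illustration and not code from the linked guide: the `Account` class, the `redeem_unsafe` and `redeem_safe` functions, and the amounts are all hypothetical, and a `threading.Barrier` stands in for the timing window that parallel requests would hit on a real server.

```python
import threading


class Account:
    """Hypothetical shared resource, e.g. a gift-card or voucher balance."""

    def __init__(self, balance):
        self.balance = balance
        self.lock = threading.Lock()


def redeem_unsafe(account, amount, barrier):
    # Check: every request reads the balance before any request writes.
    has_funds = account.balance >= amount
    # Assumption for the demo: the barrier forces all threads to finish the
    # check first, making the race window deterministic instead of lucky.
    barrier.wait()
    if has_funds:
        # The write is serialized, but the decision was made on stale data.
        with account.lock:
            account.balance -= amount
    return has_funds


def redeem_safe(account, amount):
    # Check and update happen under one lock, so only one request can win.
    with account.lock:
        if account.balance >= amount:
            account.balance -= amount
            return True
        return False


def run_concurrently(worker, n):
    """Run worker() in n threads and collect the return values."""
    results = [None] * n

    def call(i):
        results[i] = worker()

    threads = [threading.Thread(target=call, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def demo(n=5, balance=100, amount=100):
    unsafe = Account(balance)
    barrier = threading.Barrier(n)
    unsafe_ok = run_concurrently(lambda: redeem_unsafe(unsafe, amount, barrier), n)

    safe = Account(balance)
    safe_ok = run_concurrently(lambda: redeem_safe(safe, amount), n)

    # (unsafe successes, unsafe balance, safe successes, safe balance)
    return sum(unsafe_ok), unsafe.balance, sum(safe_ok), safe.balance


if __name__ == "__main__":
    print(demo())
```

In the unsafe path all five simulated requests pass the check and the balance is overdrawn; in the safe path the check and the update share one critical section, so only one redemption succeeds. On a real web application the equivalent fix usually lives in the database (an atomic conditional update or a row lock) rather than in an in-process lock.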