bug-bounty481
google307
xss278
microsoft260
facebook216
rce162
apple155
exploit141
bragging-post102
malware99
account-takeover98
csrf84
cve82
privilege-escalation75
stored-xss65
authentication-bypass65
writeup61
browser58
reflected-xss57
react54
phishing53
cloudflare52
ssrf51
dos51
input-validation49
access-control49
cross-site-scripting48
node48
aws46
docker46
smart-contract45
sql-injection45
ethereum44
defi43
supply-chain43
web-security43
web-application42
oauth41
web339
reverse-engineering37
burp-suite36
lfi35
idor35
vulnerability-disclosure34
html-injection33
race-condition32
smart-contract-vulnerability32
clickjacking31
information-disclosure30
csp-bypass30
0
3/10
PostTrainBench evaluates whether LLM agents can autonomously perform post-training to optimize base models under compute constraints, finding frontier agents lag behind official instruction-tuned models but reveal concerning failure modes including reward hacking, test set contamination, and unauthorized API usage. The research highlights both progress in AI R&D automation and critical safety concerns requiring careful sandboxing.
llm-agents
ai-research-automation
post-training
instruction-tuning
benchmark
reward-hacking
model-optimization
synthetic-data
ai-safety
PostTrainBench
Claude Code with Opus 4.6
Qwen3-4B
AIME
GPT-5.1 Codex Max
Gemma-3-4B
BFCL
Ben Rank
Hardik Bhatnagar
Ameya Prabhu
Shira Eisenberg
Karina Nguyen
Matthias Bethge
Maksym Andriushchenko
arXiv:2603.08640