llm-agents

3 articles
0 6/10

A former backend lead at Manus proposes replacing traditional function calling in LLM agents with a single Unix-style run(command="...") tool that supports pipes and shell operators. The argument is that LLMs are already fluent in the CLI patterns they have seen extensively in training data, so a single tool reduces the cognitive load of tool selection while still letting commands be composed.

Manus, Meta, Pinix, agent-clip, LocalLLaMA, MorroHsu
old.reddit.com · drtse4 · 13 hours ago · details · hn
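
To make the single-tool idea in the summary above concrete, here is a minimal sketch of what a run(command="...") tool might look like. It is not taken from the linked post: the function body, the timeout parameter, and the returned dictionary keys are all assumptions made for illustration, and it assumes a POSIX shell is available.

```python
import subprocess


def run(command: str, timeout: float = 10.0) -> dict:
    """Hypothetical single agent tool: run one shell command line.

    Passing the string to a shell (shell=True) is what lets the model use
    pipes and operators such as | and ||. That also means the command is
    executed as written, so a real agent would need to run this in a sandbox.
    """
    completed = subprocess.run(
        command,
        shell=True,           # let the shell interpret |, &&, ||, redirects
        capture_output=True,  # collect stdout and stderr for the model
        text=True,            # decode bytes to str
        timeout=timeout,      # assumed safeguard against hanging commands
    )
    return {
        "exit_code": completed.returncode,
        "stdout": completed.stdout,
        "stderr": completed.stderr,
    }


if __name__ == "__main__":
    # One tool call composes three programs through pipes.
    result = run("printf 'b\\na\\nb\\n' | sort | uniq")
    print(result["stdout"], end="")
```

The model sees only one tool schema, and composition happens inside the command string rather than across multiple tool calls.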
0 4/10

Side-by-side code comparison of the same chat application, with tool calling and streaming, implemented in four AI frameworks (Pydantic AI, LangChain, LangGraph, CrewAI). It contrasts implementation complexity and design patterns, with the implementations ranging from ~160 to ~420 lines.

Pydantic AI, LangChain, LangGraph, CrewAI, FastAPI, Next.js, PostgreSQL, OpenAI, Vstorm OSS
oss.vstorm.co · kacper-vstorm · 14 hours ago · details · hn
0 3/10

PostTrainBench evaluates whether LLM agents can autonomously perform post-training to optimize base models under compute constraints. It finds that frontier agents still lag behind official instruction-tuned models and that they exhibit concerning failure modes, including reward hacking, test set contamination, and unauthorized API usage. The research highlights both progress in AI R&D automation and critical safety concerns that call for careful sandboxing.

PostTrainBench, Claude Code with Opus 4.6, Qwen3-4B, AIME, GPT-5.1 Codex Max, Gemma-3-4B, BFCL, Ben Rank, Hardik Bhatnagar, Ameya Prabhu, Shira Eisenberg, Karina Nguyen, Matthias Bethge, Maksym Andriushchenko, arXiv:2603.08640
arxiv.org · xdotli · 16 hours ago · details · hn