Ask HN: Are we ready for vulnerabilities to be words instead of code?
But we're giving agents terminal access and API keys now. The attack vector is becoming natural language. An agent gets "socially engineered" by a prompt; another hallucinates fake data and passes it down the chain.
Trying to secure these systems feels like trying to write a regex that catches every possible lie. We've shifted the foundation of security from numbers to words, and I don't think we've figured out what that means yet.
Is anyone thinking about actual architectural solutions to this? Not just "use another LLM to guard the LLM" — that feels like circular logic. Something fundamentally different.
(Not a native English speaker, used AI to clean up the grammar.)