ml-infrastructure


Cumulus Labs has launched IonRouter, a low-cost inference API for open-source and fine-tuned models. It is backed by IonAttention, a custom C++ inference runtime built specifically for NVIDIA's GH200 architecture, which reaches 588 tokens/s on multimodal workloads through optimizations in cache coherence, KV-block writeback, and attention scheduling.
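The summary mentions KV-block writeback as one of the runtime's optimizations. As context, a minimal sketch of the general idea, block-granular KV-cache management where fully filled blocks can be flushed in one pass, is shown below. This is an illustration of the technique in general, not IonAttention's actual C++ implementation; the block size, class names, and writeback policy are assumptions for explanation only.

```python
# Illustrative paged KV cache with block-granular writeback.
# NOT IonAttention's implementation; names and policy are assumed.

BLOCK_SIZE = 16  # tokens per KV block (assumed)

class PagedKVCache:
    def __init__(self, num_blocks: int):
        self.free_blocks = list(range(num_blocks))
        self.seq_blocks: dict[int, list[int]] = {}   # seq_id -> block ids
        self.seq_len: dict[int, int] = {}            # tokens stored per seq
        self.written_back: list[int] = []            # block ids flushed out

    def append_token(self, seq_id: int) -> None:
        """Reserve room for one new token; allocate a block at boundaries."""
        n = self.seq_len.get(seq_id, 0)
        if n % BLOCK_SIZE == 0:                      # block boundary: new block
            if not self.free_blocks:
                raise RuntimeError("KV cache exhausted")
            self.seq_blocks.setdefault(seq_id, []).append(self.free_blocks.pop())
        self.seq_len[seq_id] = n + 1

    def writeback_full_blocks(self, seq_id: int) -> int:
        """Flush completely filled blocks (e.g. to host memory) in one pass,
        keeping only the partially filled tail block resident."""
        full = self.seq_len[seq_id] // BLOCK_SIZE
        flushed = self.seq_blocks[seq_id][:full]
        self.written_back.extend(flushed)
        self.free_blocks.extend(flushed)
        self.seq_blocks[seq_id] = self.seq_blocks[seq_id][full:]
        self.seq_len[seq_id] -= full * BLOCK_SIZE
        return len(flushed)

cache = PagedKVCache(num_blocks=8)
for _ in range(40):          # 40 tokens -> 2 full blocks + an 8-token tail
    cache.append_token(seq_id=0)
flushed = cache.writeback_full_blocks(seq_id=0)
print(flushed)               # -> 2 full blocks written back
```

Batching writeback at block granularity rather than per token is what makes this kind of scheme attractive on hardware like the GH200, where coherent CPU-GPU memory traffic is cheapest in large contiguous transfers.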

Tags: IonRouter · Cumulus Labs · IonAttention · TensorDock · Palantir · Together AI · Fireworks · Modal · RunPod · vLLM · GH200 · OpenAI · Veer Suryaa
ionrouter.io · vshah1016 · 1 day ago · details · hn