multimodal

2 articles
0 2/10

Mixedbread releases Wholembed v3, a multimodal, multilingual retrieval model that achieves state-of-the-art performance on the LIMIT and BrowseComp-Plus benchmarks. It outperforms existing semantic search models and is the first semantic model to surpass lexical retrieval on structured-text-heavy documents.

Mixedbread · Wholembed v3 · LIMIT · BrowseComp-Plus · Cohere Embed 4 · OpenAI Text Embedding 3 Large · Voyage 4 Large · Gemini Embedding 2 · BM25
mixedbread.com · emschwartz · 1 day ago · details · hn
0 1/10

Cumulus Labs launches IonRouter, a low-cost inference API optimized for open-source and fine-tuned models. It is backed by IonAttention, a custom C++ inference runtime designed specifically for the NVIDIA GH200 architecture, which achieves 588 tokens/s on multimodal workloads through novel optimizations in cache coherence, KV block writeback, and attention scheduling.

IonRouter · Cumulus Labs · IonAttention · TensorDock · Palantir · Together AI · Fireworks · Modal · RunPod · vLLM · GH200 · OpenAI · Veer Suryaa
ionrouter.io · vshah1016 · 1 day ago · details · hn