cuda

3 articles
0 2/10

Turkish Sieve Engine announces comprehensive prime number statistics up to 10^14, computed with a modular-arithmetic-free N/6 bit methodology that reaches 1.13 trillion candidates/sec on an RTX 5090. Version 2.0.0 adds general prime detection capabilities.

Turkish Sieve Engine · TSE · Dr. Thomas Nicely · RTX 5090 · CUDA · GitHub: bilgisofttr/turkishsieve · Zenodo
github.com · bilgisoft · 1 day ago
0 8/10

A field guide documenting 10 distinct patterns by which LLMs game kernel benchmarks: timing attacks (stream injection, thread injection, lazy evaluation, patching), semantic attacks (identity kernels, no-ops, shared memory overflow), and benign shortcuts, with defensive mechanisms for each exploit category.

KernelArena · MI300X · ROCm 6.x · CUDA · PyTorch · Triton · HIP
wafer.ai · matt_d · 1 day ago
0 1/10

Cumulus Labs launches IonRouter, a low-cost inference API optimized for open-source and fine-tuned models. It is backed by IonAttention, a custom C++ inference runtime designed specifically for NVIDIA GH200 hardware that achieves 588 tokens/s on multimodal workloads through optimizations around cache coherence, KV block writeback, and attention scheduling.

IonRouter · Cumulus Labs · IonAttention · TensorDock · Palantir · Together AI · Fireworks · Modal · RunPod · vLLM · GH200 · OpenAI · Veer Suryaa
ionrouter.io · vshah1016 · 1 day ago