nvidia-blackwell

2 articles

0 3/10

NVFP4: Efficient and Accurate Low-Precision Inference

research

NVIDIA introduces NVFP4, a 4-bit floating-point format for NVIDIA Blackwell GPUs designed for efficient low-precision inference. It preserves model accuracy through two-level scaling that combines fine-grained E4M3 block-level scales with an FP32 tensor-level scale, cutting memory footprint by 3.5x versus FP16 with less than 1% accuracy degradation on language models.

quantization low-precision-inference model-compression nvfp4 floating-point-formats nvidia-blackwell tensor-cores ai-optimization fp4 mxfp4 e4m3 hardware-acceleration

NVIDIA, NVIDIA Blackwell, NVFP4, MXFP4, FP4, E4M3, Tensor Cores, Eduardo Alvarez, Omri Almog, Eric Chung, Simon Layton, Dusan Stosic, Ronny Krashinsky, Kyle Aubrey

developer.nvidia.com · tosh · 14 hours ago · details · hn
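
The two-level scaling in the NVFP4 summary above can be illustrated with a small simulation. This is a rough sketch, not NVIDIA's implementation. The summary does not state the block size or the FP4 value grid, so the 16-value block size and the E2M1 magnitude set are assumptions here. For simplicity the block scale is kept in full precision rather than rounded to E4M3, and all function names are hypothetical.

```python
# Assumed E2M1 (4-bit float) magnitudes; one extra bit carries the sign.
FP4_E2M1_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
FP4_MAX = 6.0      # largest E2M1 magnitude
E4M3_MAX = 448.0   # largest finite E4M3 magnitude
BLOCK_SIZE = 16    # assumption: values sharing one block-level scale


def round_to_fp4(x):
    """Round x to the nearest assumed E2M1 value, keeping its sign."""
    mag = min(FP4_E2M1_VALUES, key=lambda v: abs(v - abs(x)))
    return -mag if x < 0 else mag


def quantize_dequantize(values):
    """Simulate two-level scaling: one tensor-level scale, one scale per block."""
    tensor_amax = max(abs(v) for v in values) or 1.0
    # Tensor-level scale chosen so the largest block scale maps to E4M3_MAX.
    tensor_scale = tensor_amax / (FP4_MAX * E4M3_MAX)
    out = []
    for start in range(0, len(values), BLOCK_SIZE):
        block = values[start:start + BLOCK_SIZE]
        block_amax = max(abs(v) for v in block)
        # Block scale expressed relative to the tensor scale.
        # Simplification: real hardware would round this to E4M3.
        stored_scale = (block_amax / FP4_MAX) / tensor_scale if block_amax else 1.0
        effective = stored_scale * tensor_scale
        out.extend(round_to_fp4(v / effective) * effective for v in block)
    return out


def bits_per_value(block_size=BLOCK_SIZE):
    """4 bits per element plus one 8-bit block scale amortized over the block."""
    return 4 + 8 / block_size


def compression_vs_fp16(block_size=BLOCK_SIZE):
    """Ratio of FP16 storage to the assumed NVFP4 storage per value."""
    return 16 / bits_per_value(block_size)
```

Under these assumptions each value costs 4.5 bits, so `compression_vs_fp16()` comes out near 3.56. That is one plausible reading of the roughly 3.5x figure quoted in the summary; the single FP32 tensor-level scale is ignored because it is negligible for large tensors.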

0 5/10

Private LLM Inference on Consumer Blackwell GPUs

research

Systematic benchmarking of NVIDIA Blackwell consumer GPUs for LLM inference across quantization formats and workloads, demonstrating cost-effective private deployment for SMEs with 40-200x lower costs than cloud APIs and sub-second latency for most use cases.

llm-inference gpu-optimization quantization model-deployment privacy performance-benchmarking nvidia-blackwell cost-analysis rag model-serving

NVIDIA Blackwell, RTX 5060 Ti, RTX 5070 Ti, RTX 5090, Qwen3-8B, Gemma3-12B, Gemma3-27B, GPT-OSS-20B, Jonathan Knoop, Hendrik Holtmann

arxiv.org · rohansood15 · 14 hours ago · details · hn