model-training

3 articles

NVIDIA announces a suite of open datasets and training frameworks spanning robotics, autonomous vehicles, synthetic personas, protein modeling, and language-model pre-training: over 2 petabytes of data across 180+ datasets, aimed at reducing AI development bottlenecks.

NVIDIA, Nemotron, GR00T, HuggingFace, GitHub, Runway, CrowdStrike, NTT Data, APTO, AI Singapore, WideLabs, Oxford, Mila, CIFAR, Andrej Karpathy
huggingface.co · gmays · 1 day ago · details · hn

Nvidia announced a $26 billion investment over five years to develop open-weight AI models, positioning itself as a competitor to frontier AI labs. The investment includes the release of Nemotron 3 Super (128B parameters) and aims to establish US-made alternatives to increasingly popular Chinese open-source models while strengthening Nvidia's position as the dominant AI chip manufacturer.

Nvidia, OpenAI, DeepSeek, Meta, Anthropic, Google, Alibaba, Moonshot AI, MiniMax, Nemotron 3 Super, Llama, GPT-OSS, Qwen, Bryan Catanzaro, Mark Zuckerberg, Nathan Lambert, Andy Konwinski
wired.com · bigwheels · 2 days ago · details · hn

A detailed account of troubleshooting open-source ML infrastructure while post-training the 1T-parameter Kimi-K2-Thinking model, exposing undocumented bugs and inefficiencies in HuggingFace Transformers and quantization libraries that can hide several layers deep in the dependency stack.

Kimi-K2-Thinking, HuggingFace, LLaMA-Factory, KTransformers, DeepSeek-V3, PyTorch, vLLM, compressed_tensors, TriviaQA, PEFT, Transformers
workshoplabs.ai · addiefoote8 · 4 days ago · details · hn
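As background for the PEFT-based post-training the last article describes, a minimal sketch of the low-rank (LoRA-style) update that PEFT-family libraries implement. The shapes, names, and hyperparameters here are illustrative assumptions, not taken from the article:

```python
import numpy as np

# LoRA-style adaptation: instead of updating the full frozen weight W
# (d_out x d_in), train two small matrices A (r x d_in) and B (d_out x r).
# The effective weight is W_eff = W + (alpha / r) * B @ A.
rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 8, 16, 2, 4  # illustrative sizes, not from the article

W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable, small random init
B = np.zeros((d_out, r))                    # trainable, zero init

def effective_weight(W, A, B, alpha, r):
    """Merge the low-rank adapter into the base weight."""
    return W + (alpha / r) * B @ A

# With B initialized to zero, the adapter is a no-op before training:
assert np.allclose(effective_weight(W, A, B, alpha, r), W)

# Trainable parameters shrink from d_out*d_in to r*(d_in + d_out):
print(f"full: {W.size} params, LoRA: {A.size + B.size} params")
```

The zero initialization of `B` is the standard trick that makes the adapted model start out identical to the base model, so post-training begins from the pretrained behavior.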