pytorch


A detailed account of troubleshooting open-source ML infrastructure while post-training the 1T-parameter Kimi-K2-Thinking model, exposing undocumented bugs and inefficiencies in HuggingFace Transformers and quantization libraries that can lurk several layers deep in the dependency stack.

Kimi-K2-Thinking HuggingFace LLaMA-Factory KTransformers DeepSeek-V3 PyTorch vLLM compressed_tensors TriviaQA PEFT Transformers
workshoplabs.ai · addiefoote8 · 4 days ago · hn