LoGeR is a novel deep learning architecture from DeepMind and UC Berkeley for 3D geometric reconstruction from extremely long videos (up to 19,000 frames). It uses a hybrid memory module that combines Sliding Window Attention for local precision with Test-Time Training for global consistency, achieving state-of-the-art results on KITTI, 7-Scenes, and long-sequence benchmarks while maintaining sub-quadratic complexity.
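The sub-quadratic claim comes from restricting attention to a local window: each position attends to a fixed number of neighbors, so cost grows as O(n·w) instead of O(n²). A minimal sketch in plain Python, using a toy scalar similarity; the function name, window size, and scoring are illustrative assumptions, not LoGeR's actual kernel:

```python
import math

def sliding_window_attention(seq, window):
    """Toy scalar attention: position i attends only to the last
    `window` positions (itself included), so total cost is
    O(n * window) rather than O(n^2). Illustrative only."""
    out = []
    for i, q in enumerate(seq):
        keys = seq[max(0, i - window + 1): i + 1]   # local context only
        scores = [math.exp(q * k) for k in keys]    # toy similarity
        z = sum(scores)                              # softmax normalizer
        out.append(sum(s * k for s, k in zip(scores, keys)) / z)
    return out

print(sliding_window_attention([0.1, 0.2, 0.3, 0.4], window=2))
```

Each output stays a convex combination of values inside its window, which is why a second, global mechanism (Test-Time Training, in LoGeR's case) is needed to propagate information across the full sequence.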
A deep technical exploration of porting a Flash Attention kernel from GPU (Triton) to TPU using JAX, covering the fundamental differences in programming models, compiler behavior, and hardware architecture. The author details how JAX's functional, immutable paradigm and XLA compilation differ from explicit GPU kernel writing, and includes benchmarks and a custom systolic-array emulator to illustrate TPU data flow.
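The functional/imperative contrast the author describes can be shown without any accelerator at hand. The sketch below mimics JAX's out-of-place indexed update (`x.at[i].set(v)`) in plain Python; the helper `at_set` is an invented stand-in for illustration, not a real JAX API:

```python
def at_set(arr, i, value):
    """Functional, JAX-style update: return a NEW list with index i
    replaced, leaving the input list untouched. Mimics the semantics
    of JAX's `x.at[i].set(v)` (illustrative stand-in, not JAX itself)."""
    return arr[:i] + [value] + arr[i + 1:]

buf = [0.0, 0.0, 0.0]

# Imperative GPU-kernel style: mutate the buffer in place.
buf[1] = 3.14

# Functional JAX style: no mutation; an update returns a fresh value,
# and the compiler (XLA, in JAX's case) decides whether to buffer-reuse.
new_buf = at_set(buf, 2, 2.72)

print(buf)      # in-place write is visible here
print(new_buf)  # fresh value; `buf` is unchanged by at_set
```

This is the core mismatch when porting a Triton kernel, which is written as explicit in-place stores to shared memory, into JAX, where every array operation is a pure function and memory placement is left to the XLA compiler.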