long-context-modeling

1 article

LoGeR is a deep learning architecture from Google DeepMind and UC Berkeley for 3D geometric reconstruction from extremely long videos (up to 19,000 frames). Its hybrid memory module combines Sliding Window Attention for local precision with Test-Time Training for global consistency. It achieves state-of-the-art results on KITTI, 7-Scenes, and long-sequence benchmarks while maintaining sub-quadratic complexity.
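To give a rough sense of why windowed attention stays sub-quadratic at this sequence length, the sketch below counts query-key pairs for full attention versus a sliding window. It is an illustration only, not LoGeR's implementation: the function names, the causal (past-only) window, and the window size of 64 are assumptions, and only the 19,000-frame figure comes from the summary.

```python
def full_attention_pairs(num_frames: int) -> int:
    """Every frame attends to every frame: quadratic in sequence length."""
    return num_frames * num_frames


def sliding_window_pairs(num_frames: int, window: int) -> int:
    """Each frame attends to itself and at most (window - 1) earlier frames.

    Assumes a causal window (hypothetical; the summary does not specify this).
    The count grows linearly with num_frames for a fixed window.
    """
    return sum(min(i + 1, window) for i in range(num_frames))


if __name__ == "__main__":
    n = 19_000   # frame count quoted in the summary
    w = 64       # hypothetical window size, chosen only for illustration
    print("full attention pairs:", full_attention_pairs(n))
    print("sliding window pairs:", sliding_window_pairs(n, w))
```

With a fixed window the pair count scales with the number of frames rather than its square. Local attention alone cannot see beyond its window, which is the gap the summary attributes to the Test-Time Training component.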

LoGeR · Google DeepMind · UC Berkeley · Junyi Zhang · Charles Herrmann · Junhwa Hur · Chen Sun · Ming-Hsuan Yang · Forrester Cole · Trevor Darrell · Deqing Sun · KITTI · VBR · 7-Scenes · ScanNet · TUM-Dynamics
loger-project.github.io · helloplanets · 4 days ago · details · hn