Improving Vision-and-Language Navigation with Image-Text Pairs from the Web • Paper • 2004.14973 • Published Apr 30, 2020
Self-Supervised Any-Point Tracking by Contrastive Random Walks • Paper • 2409.16288 • Published Sep 24, 2024
EXIF as Language: Learning Cross-Modal Associations Between Images and Camera Metadata • Paper • 2301.04647 • Published Jan 11, 2023
VISITRON: Visual Semantics-Aligned Interactively Trained Object-Navigator • Paper • 2105.11589 • Published May 25, 2021
Chasing Ghosts: Instruction Following as Bayesian State Tracking • Paper • 1907.02022 • Published Jul 3, 2019